[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

Paul Rogers (Code Review) Thu, 27 Dec 2018 12:07:04 -0800

Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )


Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
......................................................................


Patch Set 9:

(39 comments)

This will be a great feature. Detailed review from the perspective of 1) 
synchronization and 2) diagnosing issues when they occur in the field.

This is a partial review, will review remaining files separately.

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@302
PS9, Line 302:     if (BackendConfig.INSTANCE.getHMSPollingFrequencyInSeconds() 
<= 0) {
Good check. Better to define this as behavior: to disable the polling, set the 
interval to 0 (or, trivially <0). This would be a good fall-back safety-valve 
for support cases, tests, etc.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@313
PS9, Line 313:       } catch (TException e) {
Nice use of try-with-catch. No need for nested tries. Remove inner catch, move 
catch to outer try.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2317
PS9, Line 2317:   MetastoreEventsProcessor getMetastoreEventProcessor() {
Should this be public? If only protected, please explicitly label it protected.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@112
PS9, Line 112:     synchronized (thriftDb_) { thriftDb_.setMetastore_db(msDb); }
What is the synchronization model?

As written, we lock on the prior DB to set the new one. This means access is 
not synchronized. Better to add synchronized to the method itself, here and 
accessor.

But, more broadly, if we synchronize only at the DB level, we'll have all 
manner of race conditions. One client will grab the DB with one version the 
thrift object, then another will be slotted in underneath.

This probably needs more thought. Is the model documented in the write-up?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@159
PS9, Line 159:     synchronized (thriftDb_) { return thriftDb_.deepCopy(); }
How often do we do this? Do we want this to block on an HMS call about 
parameters?

Maybe a better model is for the RPC calls to do this:

- Obtain a lock
- Get a pointer to the current thrift object
- Release the lock
- Make the RPC

Not that there is no need to copy the thrift object above: the logic here 
implies that the Thrift object is immutable once set. All we are doing is 
making sure we use the same object throughout the RPC.

Still, not convinced we have the right locking semantics.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@164
PS9, Line 164:     synchronized (thriftDb_) { return thriftDb_.getDb_name(); }
Now getting the name is synchronized on an HMS call. That can't be good for 
performance. The planner/analyzer gets the name all the time, so we really 
don't want to do this.

The name should be invariant or all heck breaks loose in the planner. So, if we 
swap out Thrift objects, pull the name outside the thrift object and cache it 
in a final field.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@50
PS9, Line 50:  * in Hive metastore using the the public API 
<code>get_next_notification</code> These events could
Impala generally does not use Javadoc markup, sadly. But, if you want to use it 
(we really should), then a reference to a Java method should be: {@link 
Class#method}.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@63
PS9, Line 63:  *   <li>CREATE_TABLE, CREATE_DATABASE</li>
Wrong form for HTML lists. I think you want a definition list:

<dl>
  <dt>CREATE_TABLE, CREATE_DATABASE<dt>
  <dd>A new table/database is ... </dd>
...

That sad, if Vuk was here, he'd have you rip all this out. So, a broader 
question is: do we want our Javadoc to follow C++ doc standard so the C++ folks 
can follow it, or Javadoc standard so the documentation looks correct when 
formatted? I prefer the Javadoc approach, so would encourage you to keep the 
(corrected) formatting.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@91
PS9, Line 91:   private static final Logger LOG = 
Logger.getLogger(MetastoreEventsProcessor.class);
Statics generally go before instance members. Move this up with the constant 
string.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@106
PS9, Line 106:       ExtendedJSONMessageFactory.getInstance();
Same as above, and below.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@118
PS9, Line 118:     // know  if we need to ignore some of events which are 
already applied in catalogD
To avoid race conditions, separate creation from starting. Have a separate 
method that starts the timer, to be called once the system is ready for action.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@137
PS9, Line 137:   void scheduleAtFixedDelayRate(long pollingFrequencyInSec) {
Please put explicit public/private/protected declarations before each class and 
method so we can determine your intent. Shouldn't this one be private? Or, if 
we make the race-avoiding change above, it would be public, called once.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@145
PS9, Line 145:         LOG.warn(String.format(
LOG.error?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@158
PS9, Line 158:   void processHMSNotificationEvents() throws ImpalaException {
Please document the timer semantics supported here.

This is not synchronized. Say we poll every 5 seconds. Say we get a very slow 
HMS message and the previous poll is still in progress when the next tick 
occurs. Would we get two poll events overlapping? Or, will the timer skip a 
tick if the previous has not yet completed? That is, does the scheduler handle 
this case, or must you handle it here?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@177
PS9, Line 177:       throw new MetastoreNotificationException(
To whom are we throwing this? This method was called by the scheduler, wasn't 
it? Does the scheduler know what to do with this?

Or, should we log the error here?

What happens if the errors continue? (HMS server was shut down on host x and 
moved to host y. It is never coming back up.) Should we retry for x times, then 
turn off polling? Use an exponential back-off?

Whatever we do, please document the error handling strategy.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@186
PS9, Line 186:             .format("Unable to process notification event %d due 
to %s. Ignoring this event...",
If we get this in an escalation, we'll need to debug. At the very least, dump 
some info for the kind of event. Ideally dump things like table name, type of 
change, etc.

Next, document why it is OK to ignore an event and continue. Since we are 
invalidating, the worst that will happen is that something is not invalidated 
that should be.

But, if we ever wanted to replay events, failing on one would be fatal, 
continuing would corrupt the metadata. A fall-back there would be a global 
invalidate for the DB in question.

Should we use that as the fall-back here?

Given the severity, this should be LOG.error.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@198
PS9, Line 198:     LOG.info("Processing event " + event.getEventId() + " : " + 
event.getEventType()
Use this info in the error message above. In a production server, the info 
events may be turned off, we'll get only the error/warn events.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@202
PS9, Line 202:     Db db = catalog_.getDb(dbName);
Are there any events for which the DB name will be missing? System-wide events?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@204
PS9, Line 204:     switch (event.getEventType()) {
Is the event type provided as an enum or just a string? String comparisons 
work, are slower than enum/int comparisons.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@209
PS9, Line 209:         invalidateCatalogTable(table);
No need to invalidate the entire table (potentially 1000s of partitions) 
because one changed. Talk to Todd to see if he can recommend a more focused 
invalidation.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@212
PS9, Line 212:         catalog_.addTable(dbName, tblName);
No need to clutter up the catalog with new tables that no query has requested. 
Maybe just ignore these events. We'll get the table data on the first query for 
that table.

This is particularly true for temp tables and other transient items.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@221
PS9, Line 221:           invalidateCatalogTable(table);
Should the fall-back logic be implemented inside the rename table event method?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@225
PS9, Line 225:         catalog_.removeTable(dbName, tblName);
This one prompts questions about synchronization. Suppose 10 tables are 
removed. Suppose we're planning a query at the same time. The way this works, 
for each removal, we'll lock the catalog, make the change, and unlock. A 
planning session will see multiple changes. The current logic will (I believe) 
fail the plan and restart.

Since we do the 10 removes (or other changes) one-by-one, it means we will fail 
and restart planning for any in-flight queries once for each change (10 times 
here) rather than once for overall poll.

Do we like this? Does this help with the goal of performance? Can we do 
something else?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@235
PS9, Line 235:         catalog_.addDb(dbName, msDb);
Don't add the DB for the same reason cited for tables. If this Impala is used 
for Sales, we really don't need to take space in the cache for DBs created by 
Production, say.

Also, if this is a remove, we should remove the old DB, then add a new one. 
Does this avoid the need for the db Thrift switch code discussed in the DB 
class?

Reason: we need to remove all tables for the DB, and all partitions for all 
those tables. If a DB is dropped and replaced, none of the prior tables 
survive. We should not retain them in our cache.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@238
PS9, Line 238:         if (db != null) {
Let's think a bit. We're checking for null db on every event. We don't care 
about a new DB. So, should we do the "we don't have the DB in our cache so we 
don't care about this event" check before the switch statement?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@261
PS9, Line 261:         LOG.warn(
We'll get thousands of these. Set this to Log.detail/debug (whatever the 
finest-grain level is). We know it is OK to ignore them because you wrote the 
code stating that this is OK, not much the user can do with the warnings other 
than call for help.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@271
PS9, Line 271:    * table in-place
We're just invalidating, not renaming. So, just remove the old table. Impala 
will load the new one on demand when the next query arrives that uses it.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@281
PS9, Line 281:             event.getMessage());
Can deserialization fail? If so, what does that exception due to the scheduler 
when propagated back up through the event thread?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@290
PS9, Line 290:         return false;
Wait, then why did we receive this event? Did something else about the table 
change?

Should we just invalidate the table to be safe if we don't know what changed?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@296
PS9, Line 296:       catalog_.addTable(newTable.getDbName(), 
newTable.getTableName());
Skip this step for reasons given above.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@299
PS9, Line 299:       throw new MetastoreNotificationException(e);
How is this handled?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@304
PS9, Line 304:    * Deserializes the event message and create a database object 
out of it.
nit: Deserializes ... create[s]...


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@315
PS9, Line 315:       if 
("CREATE_DATABASE".equalsIgnoreCase(event.getEventType())) {
The switch statement already parsed out the event type. Looks like we're doing 
it here again so we can share some code at the top and bottom.

Better: one method per event. Put common code into methods, called from the 
handler methods.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@318
PS9, Line 318:                 .getCreateDatabaseMessage(event.getMessage());
This code block appears multiple times. For each I'll ask error handling 
strategy. Cries out for being moved to its own method. Each returns a different 
class. This can be handled with Java generics and passing the class of the 
expected message type.

Which suggests that the method either check the message class type before 
casting, or handle the class cast exception. This will make the code more 
resilient to unexpected changes to the RPC protocol since the protocol itself 
is not versioned (I believe.)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@326
PS9, Line 326:         if 
(!checkSupportedCasesForAlterDatabaseEvent(msDbBefore, msDb)) {
Good error handling. Can be made better. Change the method to 
"validateAlterDatabaseEvent". Have it return void. In the method, parse out 
each error case:

if (first problem occurs) {
  throw Something("describing the first problem"
} else if (second problem occurs) { ...

That way, we'll log the precise failure mode making it easier to diagnose 
issues in the field when all we have are logs.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@343
PS9, Line 343:       throw new MetastoreNotificationException(e);
Tests should trigger each error condition to see how they are reported in the 
log. Determine if we could track down the problem from the log messages. Error 
injection is helpful in such cases. Do we have a preferred mechanism? Else, we 
can invent something.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@365
PS9, Line 365:     if (table == null) {
Why would we call this for a null table? Preconditions instead?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@373
PS9, Line 373:         + " invalidated because the table was recreated in 
metastore");
Is this method only for recreation? Looks like it is called for ALTER_TABLE, 
which is not recreation. Better log message?

Also, how the heck does the catalog know if the table was removed or added? Is 
this the right method to be calling here? Does that method call out to HMS to 
learn about the table? If so, should we just invalidate instead?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@401
PS9, Line 401:   static class MetastoreNotificationException extends 
ImpalaException {
You should have gotten a warning here about lack of a serialization ID, I you 
are using an IDE. To solve that, add the suppress warnings annotation that your 
IDE suggests.

Also, where do we expect to catch the exception? If inside this code only, then 
a nested class is fine. If outside, then moving the class to a separate file is 
a good idea.

Finally, nested classes generally should appear at the top of the file. Mark 
the class as public unless you really want to hide if from other modules.



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 9
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Paul Rogers <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 27 Dec 2018 20:06:35 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

Reply via email to