Hello,

Can you provide your feedback on this issue? We need to make a decision soon.

1) Summary

The current approach updates changeID (the primary key for permission changes
and path changes) manually. In a single transaction, the code reads the max
changeID from the DB, increases it by one, and saves the value in the new
change entry. If two threads add changes to the DB at the same time, a
collision happens (the primary key does not allow multiple entries with the
same value), and one transaction fails. The failed transaction is then
retried multiple times. If the retry count reaches the max value, the
transaction fails.
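
To make the collision concrete, here is a minimal sketch of the
read-max-then-insert pattern. It is not Sentry code: the class
ChangeIdCollisionSketch, its in-memory map standing in for the change table,
and all method names are hypothetical, and the back-off between retries is
left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for the change table; not Sentry code.
final class ChangeIdCollisionSketch {
    // Key = changeID (primary key), value = change payload.
    private final ConcurrentSkipListMap<Long, String> table = new ConcurrentSkipListMap<>();

    // Mimics reading the max changeID; returns 0 when the table is empty.
    public long readMaxChangeId() {
        Map.Entry<Long, String> last = table.lastEntry();
        return last == null ? 0L : last.getKey();
    }

    // Mimics an insert under a primary-key constraint: false means a key collision.
    public boolean insert(long changeId, String change) {
        return table.putIfAbsent(changeId, change) == null;
    }

    // Re-reads the max and retries on collision, up to maxRetries extra attempts.
    public boolean addChangeWithRetry(String change, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            long candidate = readMaxChangeId() + 1;
            if (insert(candidate, change)) {
                return true;
            }
            // The real code would back off here before retrying.
        }
        return false;
    }
}
```

If two writers both call readMaxChangeId() before either inserts, they
compute the same candidate, so the second insert() returns false and that
writer has to go through addChangeWithRetry() again.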

In our stress testing on a single Sentry server, with 15 clients doing
grant/revoke operations concurrently, we saw multiple transaction failures,
and the exponential-back-off retry increased the latency of every
transaction in Sentry. We have a serious performance issue when saving
permission and path updates.

2) Potential solutions

2.1) Find out why we have collisions on a single Sentry server even with
synchronization on saving updates. Once we find the cause, fix it.
+ Follows the existing approach. Does not introduce big changes to the code
base.
- Needs time to investigate why synchronization at the application level on a
single Sentry server does not prevent key collisions.
- Does not scale. All updates are serialized, so there is not much
concurrency.
- Still has key collision exceptions and transaction failures when more than
one Sentry server is deployed.
- Transaction failure on collision increases the time to execute a
transaction.
- It is confusing to customers that there are transaction failures in normal
operation. This increases support cases.

2.2) Auto-increment changeID and send delta changes as much as possible

The patch achieves a performance increase of 5 times or more over the
current approach.

It contains the following changes:

   - revert sentry-1795 (so that changeID is auto-incremented. This avoids key
   collisions and is the main reason for the performance improvement)
   - revert sentry-1824 (no need to synchronize when changeID is
   auto-incremented)
   - get a continuous delta list from SentryStore even when the delta list
   has a hole (for example, if the list is 1,2,3,5,6, return 1,2,3. If the
   hole is at the front of the list, return a full snapshot)

+ Relatively small changes. Verified to work with good performance when there
is no transaction failure.

+ When the hole in the delta list is temporary (a transaction is in flight),
returning the continuous delta list is an effective way to deal with the
hole. Most likely, the hole will disappear the next time HDFS requests
changes.

- When there is a transaction failure (the hole in changeID is permanent),
Sentry sends back a full snapshot, which is very expensive and may exhaust
memory for big customers. If we could detect a permanent hole, we would not
need to send the full snapshot.
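
The continuous-delta rule in 2.2 can be sketched roughly as follows. This is
only an illustration under my own assumptions, not the patch itself: the
class ContinuousDeltaSketch and the method continuousDeltas are hypothetical
names, the changeIDs are assumed to arrive sorted ascending, and an empty
Optional is used to mean "send a full snapshot".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the continuous-delta rule; not the actual SentryStore code.
final class ContinuousDeltaSketch {
    /**
     * sortedIds: changeIDs currently stored, sorted ascending.
     * lastProcessedId: the last changeID the requester has already applied.
     * Returns the consecutive run starting at lastProcessedId + 1, or
     * Optional.empty() when the hole is at the front (full snapshot needed).
     */
    public static Optional<List<Long>> continuousDeltas(List<Long> sortedIds, long lastProcessedId) {
        List<Long> result = new ArrayList<>();
        long expected = lastProcessedId + 1;
        boolean holeAtFront = false;
        for (long id : sortedIds) {
            if (id < expected) {
                continue; // already processed by the requester
            }
            if (id != expected) {
                // A gap: it is "at the front" only if nothing was collected yet.
                holeAtFront = result.isEmpty();
                break;
            }
            result.add(id);
            expected++;
        }
        return holeAtFront ? Optional.empty() : Optional.of(result);
    }
}
```

With the example from the list above, stored IDs 1,2,3,5,6 and nothing
processed yet give back 1,2,3. If the stored IDs start at 2, the result is
empty and the caller falls back to a full snapshot.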

2.3) Use timestamp to sort the changes

a) Use a timestamp, such as MSentryPermChange.createTimeMs or
MSentryPathChange.createTimeMs, to sort the entries. If more than one entry
has the same timestamp, use changeID to break the tie.
b) HDFS asks for updates using these timestamp values instead of changeID.
The Sentry server sends back changes at and after this timestamp. HDFS keeps
the list of changeIDs associated with the requesting timestamp and skips
entries already processed. This handles the situation where more than one
entry has the same timestamp, and some were sent in the previous request
while others need to be sent in the next request.
c) changeID is the primary key, used only to uniquely identify the entry; it
is not required to be sequential or consecutive.
d) Purge the change entries in the DB using the timestamp instead of
changeID. For example, keep 3 polling intervals of entries to allow HDFS to
get the changes before they are purged.
+ Sentry only sends a full snapshot to HDFS the first time HDFS starts, and
then always sends delta changes to HDFS.
+ High concurrency. Scales well with a large number of clients.
- Relatively big code change to the API between the Sentry server and the
Sentry plugin at HDFS.
- No easy way to detect that HDFS has received and processed all updates.
3) Decisions

My suggestion is that we take approach 2.2) for the short term and accept the
cost of a full snapshot when there is a transaction failure, and take
approach 2.3) as the long-term solution.

Thanks,

Lina
