kevinrr888 commented on PR #4524: URL: https://github.com/apache/accumulo/pull/4524#issuecomment-2271363148
I have compiled and wrote all the follow on issues for this to be merged. This covers all the issues I marked in this PR as `TODO 4131` and all the unresolved review comments. _**Here is a synopsis of the follow on issues I will create:**_ * Refactor MultipleStoresIT to function more similarly to other Fate tests like FateIT * Add FateKey toString() method * Move MetaFateStore to org.apache.accumulo.core.fate.zookeeper * Deprecate AbstractFateStore.createDummyLockID(): this includes creating a path in ZK for utilities to get a ZK lock, changing `admin fate fail` and `admin fate delete` commands to get a LockID at this path and no longer require the Manager to be down, and only use a LockID for a store if write operations are expected: the store should fail on write operations if writes are not expected. * Replace AFS.verifyReserved with a condition on the RESERVATION_COLUMN to verify that it is reserved * Make WorkFinder and the TransactionRunners critical to the Manager * Refactor how the RESERVATION_COLUMN works for UserFateStore: create/delete column on reserve/unreserve * Additional fate test case from https://github.com/apache/accumulo/pull/4524#discussion_r1702301996 _**Here are some questions I still have:**_ 1) This was an existing TODO that you had marked in your "distributed FATE" WIP PR: ``` // TODO 4131 // TODO make the max time a function of the number of concurrent callers, as the number of // concurrent callers increases then increase the max wait time // TODO could support signaling within this instance for known events // TODO made the maxWait low so this would be responsive... that may put a lot of load in // the case there are lots of things waiting... ``` I made the max wait = curr num callers in seconds. Does this address the first TODO? For the other two TODOs, I was not sure what was to be done, so I left them. Can these be safely deleted without creating follow on issue(s)? If follow on issues should be made, perhaps it would be best for you to write these or explain it here because I am not sure what should be done. 2) Re one of your review comments: "Not a change to make in this PR. Thinking if we push all of the Fate data into a single node it would be nice to use JSON. That would make the data stored in ZK more human readable and it would make it easier to serialize and de-serialize." Could you elaborate on what you mean by "all of the Fate data into a single node". By "all" do you mean the Repos, TxInfo, Status, Reservation, and FateKey would be in one ZK node? Writing the issue, I want to be sure I'm understanding correctly. 3) Can https://github.com/apache/accumulo/pull/4524#discussion_r1663064467 be resolved? Or should an issue be made? 4) There was a TODO regarding whether the ZooUtil.LockID was the best type for the lock. Should I make a follow on issue regarding this or is ZooUtil.LockID okay? 5) Can this https://github.com/apache/accumulo/pull/4524#discussion_r1663064467 be resolved or should a follow on issue be made? 6) Can these be resolved: https://github.com/apache/accumulo/pull/4524#discussion_r1670934639, https://github.com/apache/accumulo/pull/4524#discussion_r1695958819 7) The DeadReservationCleaner runs every 30 seconds. Is this okay? Should this be longer? Shorter? Once the above questions are answered, I can complete my full list of new follow-on issues. This will allow me to remove all the `TODO 4131` in this PR and all the comments in this PR can be resolved as all of these will be addressed or have a follow on issue created for them. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
