[ https://issues.apache.org/jira/browse/THRIFT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Duxbury closed THRIFT-862. -------------------------------- Resolution: Fixed Fix Version/s: 0.6 Assignee: Ning Liang (was: Bryan Duxbury) I just committed this. Thanks for the patch Ning! > Async client issues / improvements > ---------------------------------- > > Key: THRIFT-862 > URL: https://issues.apache.org/jira/browse/THRIFT-862 > Project: Thrift > Issue Type: Task > Components: Java - Library > Affects Versions: 0.4 > Reporter: Ning Liang > Assignee: Ning Liang > Fix For: 0.6 > > Attachments: thrift-862-v2.diff, thrift-862-v3.diff, thrift-862.diff > > > From Thrift-845: > Hey Eric, > You bring up some good points. We have indirect communication between the > client manager and client going through the method call object, which we can > refactor into direct communication. I'd like to keep the state machine > separate from the async client manager, though - the client manager only > coordinates transitioning the method call on ready. Responses inline for the > other comments: > 1. Timeouts per-transition aren't really what I want as a consumer of this > api. It looks like there are about five states, which as a consumer I need to > know since the effective timeout on the whole call is then 5x the one I set. > I'd rather be guaranteed I will be called back one way or another within the > timeout i set, counting time i spend in the selector's queue, etc. Seems like > that would be easy to implement. > * The naming is misleading - we can have idle socket timeout or an > overall method call timeout (more like a deadline). I coded for the former > case, so we can modify to the latter or rename. Modifying is trivial - we > just need to track the method call start time, as opposed to the last action > time. The effective deadline is arbitrarily long, since we're updating the > last action timestamp on each socket read/write, which doesn't necessarily > correspond to a transition between method states. > 2. You should not be allowed to set a timeout lower than > TAsyncClientManager.SELECT_TIME, and the value there should be much lower, > maybe 5ms? Because its per-transition as above, the true minimum timeout for > an entire call is 1s if each of its states took 200ms. We've actually run > clients with lower than 200ms timeouts. > * Agreed, though we won't be able to get to 5ms resolution. The reason is > because the JVM makes no guarantees on actual sleep time. Maybe a better > solution would be to use Selector.selectNow(), which returns immediately with > all ready keys, making our timeout granularity as small as possible. > 3. TAsyncMethodCall.onError does not do anything to remove itself from > TAsyncClientManager.timeoutWatchSet, so unless it shows up in selectedKeys() > it will stay there until its timeout. Not a big deal, but could make that set > prohibitively large if there are many errors within a timeout interval. > * TAsyncClientManager removes the method after a transition if the > method's client.hasError(). Again, somewhat confusing since it's relying on > the method calling the client's onError. > 4. We call TAsyncMethodCall.transition before adding it to timeoutWatchSet, > so lastTransitionTime is initialized before it is inspected. But, this is a > somewhat dangerous pattern to allow an uninitialized primitive to ever be > accessible. You should probably initialize it to the current time or a > sentinel on instantiation. > * Current time is good - if we end up using the deadline definition of > timeout, we don't even have to update the value. > 5. it might be safer for lastTransitionTime = System.currentTimeMillis() to > be at the top of transition(SelectionKey key) so you can guarantee the > timestamp is updated even for errant transitions...don't think it matter's > functionally in the current code, but it could in refactorings. you don't > want to inadvertently obscure an error with a timeout. > * Good point. I wanted the timestamp to reflect the last socket activity > - putting at the beginning of transition could make it inaccurate on long > socket reads/writes. Maybe we can use the finally block? > 6. TNonblockingServerSocket is the only non-test, non-tutorial thrift source > file that uses System.err.println instead of LOGGER.warn > * We can easily fix. > 7. You might want to catch Exception instead of Throwable? I don't think you > intend or want to catch things like OutOfMemoryError, etc. The docs say "An > Error is a subclass of Throwable that indicates serious problems that a > reasonable application should not try to catch" > * Also easily fixed, though I want to make sure we didn't do this for a > reason earlier. Bryan, why Throwable vs Exception? > 8. TAsyncClientManager.call should really throw an exception if its > SelectThread is not running (because its run method did not catch an > unexpected Throwable) instead of inserting it into a queue that will grow > unbounded and cause an OOM. You probably want a catch and throw block in run > that cleans up by calling onError on all pending calls and sets its done > flag. You probably also want to do this on ClosedSelectorException instead of > just catching and logging it in an infinite loop. It's odd that you catch it > on selectedKeys() but not select(). > * That catch got folded into the catch(Throwable) > 9. timeoutIdleMethods would be a bit faster if it only called > currentTimeMillis() once instead of per-check > * Good point. I was going for accuracy but even for large timeout watch > sets the iteration time should be negligible. > 10. Probably an unnecessary optimization, but instead of relying on > Object.hashCode for removing timed out methods, you could identify them by > their start timestamp from currentTimeMillis() + a few bits of a static > atomic sequence counter, put them in a TreeSet, and shortcut the iteration > for timed out methods when you hit the first one that's not too old. > * Definitely, this was the original plan. We'll get it in on next > iteration. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.