[ https://issues.apache.org/jira/browse/HDFS-11028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813059#comment-15813059 ]
James Clampffer commented on HDFS-11028:
----------------------------------------

In case we needed another example of why "delete this" and tenuous resource management in C++ are a recipe for pain: it looks like this can leak memory if the FileSystem destructor is called while it waits on the asio DNS resolver. The bug existed before this patch, but the cancel test executable in my patch provides a simple reproducer.

Situation: In common/namenode_info.cc, BulkResolve takes a set of NamenodeInfo objects and does a DNS lookup on each host to get a vector of endpoints. To be fast, the function runs an arbitrary number of async lookups in parallel and joins at the end to keep the API reasonably simple to use. To keep track of the multiple pipelines, a vector of std::pair<Pipeline<T>*, std::shared_ptr<std::promise<Status>>> is set up. Each pair represents a continuation pipeline that's doing the resolve work and the std::promise that will eventually contain the result status, assuming the continuation runs to completion. This seemed like a reasonable way to encapsulate async work using continuations that needed to be joined, but it turns out to be incredibly difficult to clean up if it's been interrupted.
-Can't simply call delete on the Pipeline<T> pointers contained in the vector, because the continuation may have already called "delete this". If it has self-destructed, the pointer remains non-null, so double deleting will break things.
-Can't loop through the vector and call cancel on all the Pipelines, because some may have been destructed via "delete this". If the malloc implementation is being generous the call might fail with a __cxa_pure_virtual error, but it's more likely to just trash the heap.
-Can't check the status of the Pipeline, because it's wrapped in a promise, so that will just block.
Possible fixes:
-Add a pointer-to-a-flag to the continuation state so the pipeline can indicate that it self-destructed, and make sure the ResolveContinuation can actually deal with cancel semantics.
-Rewrite the DNS lookup to allocate memory correctly and call the asio functions directly.

I'll file another jira to rewrite the DNS resolution code, as I don't think an issue that's been around for so long should block this. The temporary fix is to avoid deleting the FileSystem object immediately after cancel. The pipeline will clean itself up when the resolver returns, but this risks invalid writes if the vector of endpoints disappears, since it's holding a back_inserter; i.e. it's a dangling pointer issue obfuscated by piles of abstraction.

> libhdfs++: FileHandleImpl::CancelOperations needs to be able to cancel
> pending connections
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11028
>                 URL: https://issues.apache.org/jira/browse/HDFS-11028
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-11028.HDFS-8707.000.patch, HDFS-11028.HDFS-8707.001.patch, HDFS-11028.HDFS-8707.002.patch
>
> Cancel support is now reasonably robust except for the case where a FileHandle
> operation ends up causing the RpcEngine to try to create a new RpcConnection.
> In HA configs it's common to have something like 10-20 failovers and a 20
> second failover delay (no exponential backoff just yet). This means that all
> of the functions with synchronous interfaces can still block for many minutes
> after an operation has been canceled, and often the cause is
> something trivial like a bad config file.
> The current design makes this sort of thing tricky to do because the
> FileHandles need to be individually cancelable via CancelOperations, but they
> share the RpcEngine that does the async magic.
> Updated design:
> The original design would end up forcing lots of reconnects. That's not a huge
> issue on an unauthenticated cluster, but on a kerberized cluster it's a recipe
> for Kerberos thinking we're attempting a replay attack.
> User-visible cancellation and internal resource cleanup are separable
> issues. The former can be implemented by atomically swapping the callback of
> the operation to be canceled with a no-op callback. The original callback is
> then posted to the IoService with an OperationCanceled status, and the user is
> no longer blocked. For RPC cancels this is sufficient: it's not expensive to
> keep a request around a little bit longer, and when it's eventually invoked or
> timed out it invokes the no-op callback and is ignored (other than a trace
> level log notification). Connect cancels push a flag down into the RPC
> engine to kill the connection and make sure it doesn't attempt to reconnect.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)