[jira] [Updated] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI
[ https://issues.apache.org/jira/browse/HDFS-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-12134: --- Fix Version/s: HDFS-8707 > libhdfs++: Add a synchronization interface for the GSSAPI > - > > Key: HDFS-12134 > URL: https://issues.apache.org/jira/browse/HDFS-12134 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Major > Fix For: HDFS-8707 > > Attachments: HDFS-12134.HDFS-8707.000.patch, > HDFS-12134.HDFS-8707.001.patch, HDFS-12134.HDFS-8707.002.patch, > HDFS-12134.HDFS-8707.003.patch, HDFS-12134.HDFS-8707.004.patch > > > Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe. There needs to > be a way for a client application to share a lock with this library in order > to prevent race conditions. It can be done using event callbacks through the > C API but we can provide something more robust (RAII) in the C++ API. > Proposed client supplied lock, pretty much the C++17 lockable concept. Use a > default if one isn't provided. This would be scoped at the process level > since it's unlikely that multiple instances of libgssapi unless someone puts > some effort in with dlopen/dlsym. > {code} > class LockProvider > { > virtual ~LockProvider() {} > // allow client application to deny access to the lock > virtual bool try_lock() = 0; > virtual void unlock() = 0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI
[ https://issues.apache.org/jira/browse/HDFS-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-12134: --- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for taking a final look at this [~mdeepak], I committed the patch to HDFS-8707. > libhdfs++: Add a synchronization interface for the GSSAPI > - > > Key: HDFS-12134 > URL: https://issues.apache.org/jira/browse/HDFS-12134 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-12134.HDFS-8707.000.patch, > HDFS-12134.HDFS-8707.001.patch, HDFS-12134.HDFS-8707.002.patch, > HDFS-12134.HDFS-8707.003.patch, HDFS-12134.HDFS-8707.004.patch > > > Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe. There needs to > be a way for a client application to share a lock with this library in order > to prevent race conditions. It can be done using event callbacks through the > C API but we can provide something more robust (RAII) in the C++ API. > Proposed client supplied lock, pretty much the C++17 lockable concept. Use a > default if one isn't provided. This would be scoped at the process level > since it's unlikely that multiple instances of libgssapi unless someone puts > some effort in with dlopen/dlsym. > {code} > class LockProvider > { > virtual ~LockProvider() {} > // allow client application to deny access to the lock > virtual bool try_lock() = 0; > virtual void unlock() = 0; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI
[ https://issues.apache.org/jira/browse/HDFS-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-12134: --- Attachment: HDFS-12134.HDFS-8707.004.patch It seems like one of the recent patches to the build system sometimes lets the hadoop-hdfs-native-client test only run the libhdfs (jvm/jni) side of things. I'll look into what could be doing this. Doing a clean rebuild with maven showed a deterministic error in a test that wasn't modified to reflect one of the code changes [~mdeepak] suggested. patch 004 should address it. > libhdfs++: Add a synchronization interface for the GSSAPI > - > > Key: HDFS-12134 > URL: https://issues.apache.org/jira/browse/HDFS-12134 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-12134.HDFS-8707.000.patch, > HDFS-12134.HDFS-8707.001.patch, HDFS-12134.HDFS-8707.002.patch, > HDFS-12134.HDFS-8707.003.patch, HDFS-12134.HDFS-8707.004.patch > > > Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe. There needs to > be a way for a client application to share a lock with this library in order > to prevent race conditions. It can be done using event callbacks through the > C API but we can provide something more robust (RAII) in the C++ API. > Proposed client supplied lock, pretty much the C++17 lockable concept. Use a > default if one isn't provided. This would be scoped at the process level > since it's unlikely that multiple instances of libgssapi unless someone puts > some effort in with dlopen/dlsym. > {code} > class LockProvider > { > virtual ~LockProvider() {} > // allow client application to deny access to the lock > virtual bool try_lock() = 0; > virtual void unlock() = 0; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI
[ https://issues.apache.org/jira/browse/HDFS-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-12134: --- Attachment: HDFS-12134.HDFS-8707.003.patch Thanks for reviewing again [~mdeepak]. I also have some burn-in now with external tests that used to hit gssapi issues and it looks like this took care of them. Reuploading the same patch since it looks like the CI build had issues. If that goes well I'll commit this to HDFS-8707. > libhdfs++: Add a synchronization interface for the GSSAPI > - > > Key: HDFS-12134 > URL: https://issues.apache.org/jira/browse/HDFS-12134 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-12134.HDFS-8707.000.patch, > HDFS-12134.HDFS-8707.001.patch, HDFS-12134.HDFS-8707.002.patch, > HDFS-12134.HDFS-8707.003.patch > > > Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe. There needs to > be a way for a client application to share a lock with this library in order > to prevent race conditions. It can be done using event callbacks through the > C API but we can provide something more robust (RAII) in the C++ API. > Proposed client supplied lock, pretty much the C++17 lockable concept. Use a > default if one isn't provided. This would be scoped at the process level > since it's unlikely that multiple instances of libgssapi unless someone puts > some effort in with dlopen/dlsym. > {code} > class LockProvider > { > virtual ~LockProvider() {} > // allow client application to deny access to the lock > virtual bool try_lock() = 0; > virtual void unlock() = 0; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI
[ https://issues.apache.org/jira/browse/HDFS-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-12134: --- Attachment: HDFS-12134.HDFS-8707.002.patch Forgot to update comments in header for last patch > libhdfs++: Add a synchronization interface for the GSSAPI > - > > Key: HDFS-12134 > URL: https://issues.apache.org/jira/browse/HDFS-12134 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-12134.HDFS-8707.000.patch, > HDFS-12134.HDFS-8707.001.patch, HDFS-12134.HDFS-8707.002.patch > > > Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe. There needs to > be a way for a client application to share a lock with this library in order > to prevent race conditions. It can be done using event callbacks through the > C API but we can provide something more robust (RAII) in the C++ API. > Proposed client supplied lock, pretty much the C++17 lockable concept. Use a > default if one isn't provided. This would be scoped at the process level > since it's unlikely that multiple instances of libgssapi unless someone puts > some effort in with dlopen/dlsym. > {code} > class LockProvider > { > virtual ~LockProvider() {} > // allow client application to deny access to the lock > virtual bool try_lock() = 0; > virtual void unlock() = 0; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI
[ https://issues.apache.org/jira/browse/HDFS-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-12134: --- Attachment: HDFS-12134.HDFS-8707.001.patch Thanks for the review [~mdeepak]! I have a new patch that should address most of your comments. bq. include/hdfspp/locks.h: Remove additional variable finalize_ and just use gssapiMtx for checks I left finalize in so a user can tell if the locks have already been set to something else. Moved to using the member mutex of LockManager in the same way LogManager works; I'm not sure we'd wan't to use the LockManager::gssapiMutex for this with nothing else synchronizing it. bq. Can we remove InitLocks method and initialize via options? This will prevent misuse of the API by invoking InitLocks multiple times. It's possible, but I think it would just move misuse to the options object. It's not a singleton and it's intended to represent some configuration data. If you have a long running process that connects to a few different clusters I think it's likely that you'll have multiple Options objects so you'd need the same sort of logic to check if it's already been set. bq. Make bool _locked an std::atomic_bool _locked use volatile to be safe for other Lock class fields. Turned this into a normal bool and just use lock guards with LockManager::state_lock. That forces memory fences which should take care of your concern. Volatile only forces loads and stores to be generated but doesn't prevent instruction or access reorder and cache coherence is depended on the architecture. bq. lib/common/locks.cc: 51 should be _mtx.try_lock() Good point. After looking at the sasl_engine code a bit more I decided that just dropping try_lock into the current code can lead to unsafe situations. If we can't acquire the lock during object destruction we don't attempt to retry locking later. I changed Mutex::try_lock to mutex::lock to make it clear that you should always get the lock if you wait long enough, let me know if this works for you. > libhdfs++: Add a synchronization interface for the GSSAPI > - > > Key: HDFS-12134 > URL: https://issues.apache.org/jira/browse/HDFS-12134 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-12134.HDFS-8707.000.patch, > HDFS-12134.HDFS-8707.001.patch > > > Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe. There needs to > be a way for a client application to share a lock with this library in order > to prevent race conditions. It can be done using event callbacks through the > C API but we can provide something more robust (RAII) in the C++ API. > Proposed client supplied lock, pretty much the C++17 lockable concept. Use a > default if one isn't provided. This would be scoped at the process level > since it's unlikely that multiple instances of libgssapi unless someone puts > some effort in with dlopen/dlsym. > {code} > class LockProvider > { > virtual ~LockProvider() {} > // allow client application to deny access to the lock > virtual bool try_lock() = 0; > virtual void unlock() = 0; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI
[ https://issues.apache.org/jira/browse/HDFS-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-12134: --- Status: Patch Available (was: Open) > libhdfs++: Add a synchronization interface for the GSSAPI > - > > Key: HDFS-12134 > URL: https://issues.apache.org/jira/browse/HDFS-12134 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-12134.HDFS-8707.000.patch > > > Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe. There needs to > be a way for a client application to share a lock with this library in order > to prevent race conditions. It can be done using event callbacks through the > C API but we can provide something more robust (RAII) in the C++ API. > Proposed client supplied lock, pretty much the C++17 lockable concept. Use a > default if one isn't provided. This would be scoped at the process level > since it's unlikely that multiple instances of libgssapi unless someone puts > some effort in with dlopen/dlsym. > {code} > class LockProvider > { > virtual ~LockProvider() {} > // allow client application to deny access to the lock > virtual bool try_lock() = 0; > virtual void unlock() = 0; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI
[ https://issues.apache.org/jira/browse/HDFS-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-12134: --- Attachment: HDFS-12134.HDFS-8707.000.patch First pass at this work. Don't have kerberized CI tests so I'm going to get some burn in on some code that can do concurrent SASL calls in a way that causes issues. Does include some basic unit tests. 2 parts to this: -Add a LockManager that can take pluggable locks from a user, or default to a basic one derived from std::mutex. Right now it only includes a GSSAPI mutex but more can be added if we want locking around config file access and things like that. -Wrap all calls into Cyrus SASL and GSasl APIs with lock guards to prevent concurrent access. Docs (or lack of docs) for both libraries seem to defer to looking at the implementation about which bits are thread safe so all calls are protected for now. I think in most cases it's library initialization and sasl context initialization that are unsafe but the other calls shouldn't block and I'd rather err on the side of caution here. Generally the process-wide mutex is acquired once per sasl API call and released right away. The only exception to this is GSasl initialization and cleanup does two calls and it made sense to do them as an atomic operation. And reviews or feedback is welcome. Eventually we need to get rid of the GSasl engine due to license conflicts (ASF vs GPL) but this didn't seem like the place to do it. > libhdfs++: Add a synchronization interface for the GSSAPI > - > > Key: HDFS-12134 > URL: https://issues.apache.org/jira/browse/HDFS-12134 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-12134.HDFS-8707.000.patch > > > Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe. There needs to > be a way for a client application to share a lock with this library in order > to prevent race conditions. It can be done using event callbacks through the > C API but we can provide something more robust (RAII) in the C++ API. > Proposed client supplied lock, pretty much the C++17 lockable concept. Use a > default if one isn't provided. This would be scoped at the process level > since it's unlikely that multiple instances of libgssapi unless someone puts > some effort in with dlopen/dlsym. > {code} > class LockProvider > { > virtual ~LockProvider() {} > // allow client application to deny access to the lock > virtual bool try_lock() = 0; > virtual void unlock() = 0; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org