Re: [Pki-devel] [PATCH] 0134 Block reads during reload of LDAP-based profiles

2016-09-14 Thread Fraser Tweedale
On Wed, Sep 14, 2016 at 07:16:32PM -0500, Endi Sukma Dewata wrote:
> On 9/14/2016 7:14 AM, Fraser Tweedale wrote:
> > Hi team,
> > 
> > The attached patch fixes (yet another) race condition in
> > LDAPProfileSubsystem.
> > 
> > https://fedorahosted.org/pki/ticket/2453
> > 
> > Additional context: https://fedorahosted.org/freeipa/ticket/6274
> > 
> > Thanks,
> > Fraser
> 
> The patch looks fine, but probably it can be simplified like this:
> 
> class LDAPProfileSubsystem {
> 
> void init() {
> 
> // load initial profiles
> repository = new LDAPProfileRepository();
> repository.load();
> 
> // monitor profile changes in the background
> monitor = new Thread(repository);
> monitor.start();
> }
> 
> IProfile getProfile(id) {
> return repository.getProfile(id);
> }
> }
> 
> class LDAPProfileRepository {
> 
> LinkedHashMap profiles = ...
> 
> void synchronized load() {
> 
> // create persistent search
> conn = dbFactory.getConn();
> results = conn.search(...);
> 
> // get number of profiles
> entry = results.next();
> numProfiles = entry.getAttribute("numSubordinates");
> 
> for (i=0; i // read profile
> entry = results.next();
> readProfile(entry);
> }
> }
> 
> void synchronized readProfile() {
> ...
> }
> 
> IProfile synchronized getProfile(id) {
> return profiles.get(id);
> }
> 
> void run() {
> 
> while (true) {
> try {
> // process profile changes
> while (results.hasMoreElements()) {
> entry = results.next();
> ...
> }
> } catch (...) {
> // reconnect
> load();
> }
> }
> }
> }
> 
> So the load() will block during initialization and will also block readers
> during reload after reconnect. We probably can replace "synchronized" with
> ReadWriteLock to allow concurrent readers.
> 
Yep, that's a good approach.

> Feel free to push the patch as is (assuming it's well tested). We can make
> further improvements later on.
> 
> One thing though, I highly suggest that we fix this issue on both Fedora and
> RHEL/CentOS platforms. The patch is non-trivial, so the behavior could be
> different if not applied consistently. Since PKI is developed mainly on
> Fedora but used on different platforms, it would be much easier to
> troubleshoot issues by keeping the behavior consistent across platforms,
> especially on anything related to concurrency.
> 
> We don't need to create new builds for all platforms at the same time, but
> we should at least push this patch to all 10.3 branches so it can be picked
> up in the next 10.3 build of the corresponding platform.
> 
The patch is (at this stage) not destined for 10.3 at all.  I'd
prefer to push it to master to be included in Fedora when 10.4 gets
released, and other platforms' builds whenever they rebase.

I might go ahead and implement your suggested change before merging,
too, although probably as a second patch.

Thanks for reviewing!

Cheers,
Fraser

___
Pki-devel mailing list
Pki-devel@redhat.com
https://www.redhat.com/mailman/listinfo/pki-devel


[Pki-devel] [PATCH] 0134 Block reads during reload of LDAP-based profiles

2016-09-14 Thread Fraser Tweedale
Hi team,

The attached patch fixes (yet another) race condition in
LDAPProfileSubsystem.

https://fedorahosted.org/pki/ticket/2453

Additional context: https://fedorahosted.org/freeipa/ticket/6274

Thanks,
Fraser
From 24a5ad6f84387055468e0125df90fea6635da484 Mon Sep 17 00:00:00 2001
From: Fraser Tweedale 
Date: Wed, 14 Sep 2016 19:39:36 +1000
Subject: [PATCH] Block reads during reload of LDAP-based profiles

LDAP disconnect (e.g. due to DS restart) causes LDAPProfileSubsystem
to drop all its profiles and reload them.  If a profile is read
during this time, e.g. to issue a certificate, it might not have
been reloaded thus causing the operation to fail.

Introduce the AsyncLoader class which allows a consumer to await the
completion of a (re)load, if one is happening.  Update the
getProfile and getProfileIds method to use it.

The existing 'initialLoadDone' CountDownLatch for blocking
LDAPProfileSubsystem init until the inital load of profiles is
completed was subsumed by AsyncLoader.

Fixes: https://fedorahosted.org/pki/ticket/2453
---
 .../src/com/netscape/certsrv/util/AsyncLoader.java | 86 ++
 .../cmscore/profile/LDAPProfileSubsystem.java  | 59 ++-
 2 files changed, 127 insertions(+), 18 deletions(-)
 create mode 100644 base/common/src/com/netscape/certsrv/util/AsyncLoader.java

diff --git a/base/common/src/com/netscape/certsrv/util/AsyncLoader.java 
b/base/common/src/com/netscape/certsrv/util/AsyncLoader.java
new file mode 100644
index 
..39f8efd3272607ed6ac219b1b42bf9a4cb076a80
--- /dev/null
+++ b/base/common/src/com/netscape/certsrv/util/AsyncLoader.java
@@ -0,0 +1,86 @@
+// --- BEGIN COPYRIGHT BLOCK ---
+// This program is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License as published by
+// the Free Software Foundation; version 2 of the License.
+//
+// This program is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this program; if not, write to the Free Software Foundation, Inc.,
+// 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+//
+// (C) 2016  Red Hat, Inc.
+// All rights reserved.
+// --- END COPYRIGHT BLOCK ---
+
+package com.netscape.certsrv.util;
+
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.locks.ReentrantLock;
+
+/** A locking mechanism for loading or reloading an initially
+ * unknown number of items.
+ *
+ * The "producer" is the thread that loads items, informing the
+ * Loader when each item is loaded and how many items there are
+ * (when that fact becomes known).
+ *
+ * Other threads can await the completion of a (re)loading
+ * process.
+ */
+public class AsyncLoader {
+private CountDownLatch producerInitialised = new CountDownLatch(1);
+private ReentrantLock loadingLock = new ReentrantLock();
+private Integer numItems = null;
+private int numItemsLoaded = 0;
+
+/**
+ * Acquire the lock as a producer.
+ */
+public void startLoading() {
+numItems = null;
+numItemsLoaded = 0;
+loadingLock.lock();
+producerInitialised.countDown();
+}
+
+/**
+ * Increment the number of items loaded by 1.  If the number
+ * of items is known and that many items have been loaded,
+ * unlock the loader.
+ */
+public void increment() {
+numItemsLoaded += 1;
+checkLoadDone();
+}
+
+/**
+ * Set the number of items.  If the number of items already
+ * loaded is equal to or greater than the number, unlock the
+ * loader.
+ */
+public void setNumItems(Integer n) {
+numItems = n;
+checkLoadDone();
+}
+
+private void checkLoadDone() {
+if (numItems != null && numItemsLoaded >= numItems) {
+while (loadingLock.isHeldByCurrentThread())
+loadingLock.unlock();
+}
+}
+
+public void awaitLoadDone() throws InterruptedException {
+/* A consumer may await upon the Loader immediately after
+ * starting the producer.  To ensure that the producer
+ * has time to acquire the lock, we use a CountDownLatch.
+ */
+producerInitialised.await();
+loadingLock.lock();
+loadingLock.unlock();
+}
+}
diff --git 
a/base/server/cmscore/src/com/netscape/cmscore/profile/LDAPProfileSubsystem.java
 
b/base/server/cmscore/src/com/netscape/cmscore/profile/LDAPProfileSubsystem.java
index 
6dea1a0d88beaefeea489ea58ad9ad13d2da8bd7..fd5aa64eed8385ad18a307b6addaee6222d9f9cf
 100644
--- 
a/base/server/cmscore/src/com/netscape/cmscore/profile/LDAPProfileSubsystem.java
+++