[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r428457444



##
File path: 
solr/solrj/src/test/org/apache/solr/client/solrj/cloud/autoscaling/TestPolicy.java
##
@@ -1709,13 +1708,134 @@ public DistribStateManager getDistribStateManager() {
 assertEquals(2, s1.getRefCount());
 
 s2[0].release();
-assertFalse(sessionRef.getSessionWrapper() == 
PolicyHelper.SessionWrapper.DEFAULT_INSTANCE);
+assertFalse(sessionRef.isEmpty());
 s1.release();
-assertTrue(sessionRef.getSessionWrapper() == 
PolicyHelper.SessionWrapper.DEFAULT_INSTANCE);
+assertTrue(sessionRef.isEmpty());
 
 
   }
 
+  @Test

Review comment:
   Yes. Not sure it's worth it. We'll see what others think.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r428457206



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/PolicyHelper.java
##
@@ -382,45 +383,78 @@ static MapWriter loggingInfo(Policy policy, 
SolrCloudManager cloudManager, Sugge
   }
 
   public enum Status {
-NULL,
-//it is just created and not yet used or all operations on it has been 
completed fully
-UNUSED,
-COMPUTING, EXECUTING
+/**
+ * A command is actively using and modifying the session to compute 
placements
+ */
+COMPUTING,
+/**
+ * A command is not done yet processing its changes but no longer updates 
or even uses the session
+ */
+EXECUTING
   }
 
   /**
-   * This class stores a session for sharing purpose. If a process creates a 
session to
-   * compute operations,
-   * 1) see if there is a session that is available in the cache,
-   * 2) if yes, check if it is expired
-   * 3) if it is expired, create a new session
-   * 4) if it is not expired, borrow it
-   * 5) after computing operations put it back in the cache
+   * This class stores sessions for sharing purposes. If a process requires a 
session to
+   * compute operations:
+   * 
+   * see if there is an available non expired session in the cache,
+   * if yes, borrow it.
+   * if no, create a new one and borrow it.
+   * after computing (update) operations are done, {@link 
#returnSession(SessionWrapper)} back to the cache so it's
+   * again available for borrowing.
+   * after all borrowers are done computing then executing with the 
session, {@link #release(SessionWrapper)} it,
+   * which removes it from the cache.
+   * 
*/
   static class SessionRef {
+/**
+ * Lock protecting access to {@link #sessionWrapperSet} and to {@link 
#creationsInProgress}
+ */
 private final Object lockObj = new Object();
-private SessionWrapper sessionWrapper = SessionWrapper.DEFAULT_INSTANCE;
 
+/**
+ * Sessions currently in use in {@link Status#COMPUTING} or {@link 
Status#EXECUTING} states. As soon as all
+ * uses of a session are over, that session is removed from this set. 
Sessions not actively in use are NOT kept around.
+ *
+ * Access should only be done under the protection of {@link 
#lockObj}
+ */
+private Set sessionWrapperSet = 
Collections.newSetFromMap(new IdentityHashMap<>());
+
+
+/**
+ * Number of sessions currently being created but not yeet present in 
{@link #sessionWrapperSet}.

Review comment:
   Think different





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r428457066



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/PolicyHelper.java
##
@@ -382,45 +383,78 @@ static MapWriter loggingInfo(Policy policy, 
SolrCloudManager cloudManager, Sugge
   }
 
   public enum Status {
-NULL,
-//it is just created and not yet used or all operations on it has been 
completed fully
-UNUSED,
-COMPUTING, EXECUTING
+/**
+ * A command is actively using and modifying the session to compute 
placements
+ */
+COMPUTING,
+/**
+ * A command is not done yet processing its changes but no longer updates 
or even uses the session
+ */
+EXECUTING
   }
 
   /**
-   * This class stores a session for sharing purpose. If a process creates a 
session to
-   * compute operations,
-   * 1) see if there is a session that is available in the cache,
-   * 2) if yes, check if it is expired
-   * 3) if it is expired, create a new session
-   * 4) if it is not expired, borrow it
-   * 5) after computing operations put it back in the cache
+   * This class stores sessions for sharing purposes. If a process requires a 
session to
+   * compute operations:
+   * 
+   * see if there is an available non expired session in the cache,
+   * if yes, borrow it.
+   * if no, create a new one and borrow it.
+   * after computing (update) operations are done, {@link 
#returnSession(SessionWrapper)} back to the cache so it's
+   * again available for borrowing.
+   * after all borrowers are done computing then executing with the 
session, {@link #release(SessionWrapper)} it,
+   * which removes it from the cache.
+   * 
*/
   static class SessionRef {
+/**
+ * Lock protecting access to {@link #sessionWrapperSet} and to {@link 
#creationsInProgress}
+ */
 private final Object lockObj = new Object();
-private SessionWrapper sessionWrapper = SessionWrapper.DEFAULT_INSTANCE;
 
+/**
+ * Sessions currently in use in {@link Status#COMPUTING} or {@link 
Status#EXECUTING} states. As soon as all
+ * uses of a session are over, that session is removed from this set. 
Sessions not actively in use are NOT kept around.
+ *
+ * Access should only be done under the protection of {@link 
#lockObj}
+ */
+private Set sessionWrapperSet = 
Collections.newSetFromMap(new IdentityHashMap<>());
+
+
+/**
+ * Number of sessions currently being created but not yeet present in 
{@link #sessionWrapperSet}.
+ *
+ * Access should only be done under the protection of {@link 
#lockObj}
+ */
+private int creationsInProgress = 0;
 
 public SessionRef() {
 }
 
-
-//only for debugging
-SessionWrapper getSessionWrapper() {
-  return sessionWrapper;
+// used only by tests
+boolean isEmpty() {
+  synchronized (lockObj) {
+return sessionWrapperSet.isEmpty();
+  }
 }
 
 /**
  * All operations suggested by the current session object
  * is complete. Do not even cache anything
  */
 private void release(SessionWrapper sessionWrapper) {
+  boolean present;
   synchronized (lockObj) {
-if (sessionWrapper.createTime == this.sessionWrapper.createTime && 
this.sessionWrapper.refCount.get() <= 0) {
-  log.debug("session set to NULL");
-  this.sessionWrapper = SessionWrapper.DEFAULT_INSTANCE;
-} // else somebody created a new session b/c of expiry . So no need to 
do anything about it
+present = sessionWrapperSet.remove(sessionWrapper);
+  }
+  if (!present) {
+log.warn("released session {} not found in session set", 
sessionWrapper.getCreateTime());
+  } else {
+  TimeSource timeSource = 
sessionWrapper.session.cloudManager.getTimeSource();

Review comment:
   Look ok to me





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r428455296



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/PolicyHelper.java
##
@@ -429,87 +463,149 @@ private void release(SessionWrapper sessionWrapper) {
  * The session can be used by others while the caller is performing 
operations
  */
 private void returnSession(SessionWrapper sessionWrapper) {
-  TimeSource timeSource = sessionWrapper.session != null ? 
sessionWrapper.session.cloudManager.getTimeSource() : TimeSource.NANO_TIME;
+  boolean present;
   synchronized (lockObj) {
 sessionWrapper.status = Status.EXECUTING;
-if (log.isDebugEnabled()) {
-  log.debug("returnSession, curr-time {} sessionWrapper.createTime {}, 
this.sessionWrapper.createTime {} "
-  , time(timeSource, MILLISECONDS),
-  sessionWrapper.createTime,
-  this.sessionWrapper.createTime);
-}
-if (sessionWrapper.createTime == this.sessionWrapper.createTime) {
-  //this session was used for computing new operations and this can 
now be used for other
-  // computing
-  this.sessionWrapper = sessionWrapper;
+present = sessionWrapperSet.contains(sessionWrapper);
 
-  //one thread who is waiting for this need to be notified.
-  lockObj.notify();
-} else {
-  log.debug("create time NOT SAME {} ", 
SessionWrapper.DEFAULT_INSTANCE.createTime);
-  //else just ignore it
-}
+// wake up single thread waiting for a session return (ok if not woken 
up, wait is short)
+// Important to wake up a single one, otherwise of multiple waiting 
threads, all but one will immediately create new sessions
+lockObj.notify();
   }
 
+  // Logging
+  if (present) {
+if (log.isDebugEnabled()) {
+  log.debug("returnSession {}", sessionWrapper.getCreateTime());
+}
+  } else {
+log.warn("returning unknown session {} ", 
sessionWrapper.getCreateTime());
+  }
 }
 
-
-public SessionWrapper get(SolrCloudManager cloudManager) throws 
IOException, InterruptedException {
+/**
+ * Method returning an available session that can be used for {@link 
Status#COMPUTING}, either from the
+ * {@link #sessionWrapperSet} cache or by creating a new one. The status 
of the returned session is set to {@link Status#COMPUTING}.
+ *
+ * Some waiting is done in two cases:
+ * 
+ *   A candidate session is present in {@link #sessionWrapperSet} but 
is still {@link Status#COMPUTING}, a random wait
+ *   is observed to see if the session gets freed to save a session 
creation and allow session reuse,
+ *   It is necessary to create a new session but there are already 
sessions in the process of being created, a
+ *   random wait is observed (if no waiting already occurred waiting for a 
session to become free) before creation
+ *   takes place, just in case one of the created sessions got used then 
{@link #returnSession(SessionWrapper)} in the meantime.
+ * 
+ *
+ * The random wait prevents the "thundering herd" effect when all threads 
needing a session at the same time create a new
+ * one even though some differentiated waits could have led to better 
reuse and less session creations.
+ *
+ * @param allowWait usually true except in tests that know 
there's no point in waiting because nothing
+ *  will happen...
+ */
+public SessionWrapper get(SolrCloudManager cloudManager, boolean 
allowWait) throws IOException, InterruptedException {
   TimeSource timeSource = cloudManager.getTimeSource();
+  long oldestUpdateTimeNs = 
TimeUnit.SECONDS.convert(timeSource.getTimeNs(), TimeUnit.NANOSECONDS) - 
SESSION_EXPIRY;
+  int zkVersion = 
cloudManager.getDistribStateManager().getAutoScalingConfig().getZkVersion();
+
   synchronized (lockObj) {
-if (sessionWrapper.status == Status.NULL ||
-sessionWrapper.zkVersion != 
cloudManager.getDistribStateManager().getAutoScalingConfig().getZkVersion() ||
-TimeUnit.SECONDS.convert(timeSource.getTimeNs() - 
sessionWrapper.lastUpdateTime, TimeUnit.NANOSECONDS) > SESSION_EXPIRY) {
-  //no session available or the session is expired
-  return createSession(cloudManager);
-} else {
+SessionWrapper sw = getAvailableSession(zkVersion, oldestUpdateTimeNs);
+
+// Best case scenario: an available session
+if (sw != null) {
+  if (log.isDebugEnabled()) {
+log.debug("reusing session {}", sw.getCreateTime());
+  }
+  return sw;
+}
+
+// Wait for a while before deciding what to do if waiting could help...
+if ((creationsInProgress != 0 || hasCandidateSession(zkVersion, 
oldestUpdateTimeNs)) && 

[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r427930548



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/PolicyHelper.java
##
@@ -429,87 +440,124 @@ private void release(SessionWrapper sessionWrapper) {
  * The session can be used by others while the caller is performing 
operations
  */
 private void returnSession(SessionWrapper sessionWrapper) {
-  TimeSource timeSource = sessionWrapper.session != null ? 
sessionWrapper.session.cloudManager.getTimeSource() : TimeSource.NANO_TIME;
+  boolean present;
   synchronized (lockObj) {
 sessionWrapper.status = Status.EXECUTING;
-if (log.isDebugEnabled()) {
-  log.debug("returnSession, curr-time {} sessionWrapper.createTime {}, 
this.sessionWrapper.createTime {} "
-  , time(timeSource, MILLISECONDS),
-  sessionWrapper.createTime,
-  this.sessionWrapper.createTime);
-}
-if (sessionWrapper.createTime == this.sessionWrapper.createTime) {
-  //this session was used for computing new operations and this can 
now be used for other
-  // computing
-  this.sessionWrapper = sessionWrapper;
+present = sessionWrapperSet.contains(sessionWrapper);
 
-  //one thread who is waiting for this need to be notified.
-  lockObj.notify();
-} else {
-  log.debug("create time NOT SAME {} ", 
SessionWrapper.DEFAULT_INSTANCE.createTime);
-  //else just ignore it
-}
+// wake up single thread waiting for a session return (ok if not woken 
up, wait is short)
+lockObj.notify();
   }
 
+  // Logging
+  if (present) {
+if (log.isDebugEnabled()) {
+  log.debug("returnSession {}", sessionWrapper.getCreateTime());
+}
+  } else {
+log.warn("returning unknown session {} ", 
sessionWrapper.getCreateTime());
+  }
 }
 
 
-public SessionWrapper get(SolrCloudManager cloudManager) throws 
IOException, InterruptedException {
+public SessionWrapper get(SolrCloudManager cloudManager, boolean 
allowWait) throws IOException, InterruptedException {

Review comment:
   Thanks. I feel it makes the flow a bit harder to read and the savings 
are not huge so I prefer to stick to the original structure of this method.
   (the memory impact is negligible IMO. There's also an additional call to 
hasNonExpiredSession in the proposal but again no big deal)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r427924889



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/PolicyHelper.java
##
@@ -429,87 +440,124 @@ private void release(SessionWrapper sessionWrapper) {
  * The session can be used by others while the caller is performing 
operations
  */
 private void returnSession(SessionWrapper sessionWrapper) {
-  TimeSource timeSource = sessionWrapper.session != null ? 
sessionWrapper.session.cloudManager.getTimeSource() : TimeSource.NANO_TIME;
+  boolean present;
   synchronized (lockObj) {
 sessionWrapper.status = Status.EXECUTING;
-if (log.isDebugEnabled()) {
-  log.debug("returnSession, curr-time {} sessionWrapper.createTime {}, 
this.sessionWrapper.createTime {} "
-  , time(timeSource, MILLISECONDS),
-  sessionWrapper.createTime,
-  this.sessionWrapper.createTime);
-}
-if (sessionWrapper.createTime == this.sessionWrapper.createTime) {
-  //this session was used for computing new operations and this can 
now be used for other
-  // computing
-  this.sessionWrapper = sessionWrapper;
+present = sessionWrapperSet.contains(sessionWrapper);
 
-  //one thread who is waiting for this need to be notified.
-  lockObj.notify();
-} else {
-  log.debug("create time NOT SAME {} ", 
SessionWrapper.DEFAULT_INSTANCE.createTime);
-  //else just ignore it
-}
+// wake up single thread waiting for a session return (ok if not woken 
up, wait is short)
+lockObj.notify();
   }
 
+  // Logging
+  if (present) {
+if (log.isDebugEnabled()) {
+  log.debug("returnSession {}", sessionWrapper.getCreateTime());
+}
+  } else {
+log.warn("returning unknown session {} ", 
sessionWrapper.getCreateTime());
+  }
 }
 
 
-public SessionWrapper get(SolrCloudManager cloudManager) throws 
IOException, InterruptedException {
+public SessionWrapper get(SolrCloudManager cloudManager, boolean 
allowWait) throws IOException, InterruptedException {
   TimeSource timeSource = cloudManager.getTimeSource();
+  long oldestUpdateTimeNs = 
TimeUnit.SECONDS.convert(timeSource.getTimeNs(), TimeUnit.NANOSECONDS) - 
SESSION_EXPIRY;
+  int zkVersion = 
cloudManager.getDistribStateManager().getAutoScalingConfig().getZkVersion();
+
   synchronized (lockObj) {
-if (sessionWrapper.status == Status.NULL ||
-sessionWrapper.zkVersion != 
cloudManager.getDistribStateManager().getAutoScalingConfig().getZkVersion() ||
-TimeUnit.SECONDS.convert(timeSource.getTimeNs() - 
sessionWrapper.lastUpdateTime, TimeUnit.NANOSECONDS) > SESSION_EXPIRY) {
-  //no session available or the session is expired
+// If nothing in the cache can possibly work, create a new session
+if (!hasNonExpiredSession(zkVersion, oldestUpdateTimeNs)) {
   return createSession(cloudManager);
-} else {
+}
+
+// Try to find a session available right away
+SessionWrapper sw = getAvailableSession(zkVersion, oldestUpdateTimeNs);
+
+if (sw != null) {
+  if (log.isDebugEnabled()) {
+log.debug("reusing session {}", sw.getCreateTime());
+  }
+  return sw;
+} else if (allowWait) {
+  // No session available, but if we wait a bit, maybe one can become 
available
+  // wait 1 to 10 secs in case a session is returned. Random to spread 
wakeup otherwise sessions not reused
+  long waitForMs = (long) (Math.random() * 9 * 1000 + 1000);
+
+  if (log.isDebugEnabled()) {
+log.debug("No sessions are available, all busy COMPUTING. starting 
wait of {}ms", waitForMs);
+  }
   long waitStart = time(timeSource, MILLISECONDS);
-  //the session is not expired
-  log.debug("reusing a session {}", this.sessionWrapper.createTime);
-  if (this.sessionWrapper.status == Status.UNUSED || 
this.sessionWrapper.status == Status.EXECUTING) {
-this.sessionWrapper.status = Status.COMPUTING;
-return sessionWrapper;
-  } else {
-//status= COMPUTING it's being used for computing. computing is
-if (log.isDebugEnabled()) {
-  log.debug("session being used. waiting... current time {} ", 
time(timeSource, MILLISECONDS));
-}
-try {
-  lockObj.wait(10 * 1000);//wait for a max of 10 seconds
-} catch (InterruptedException e) {
-  log.info("interrupted... ");
-}
+  try {
+lockObj.wait(waitForMs);
+  } catch (InterruptedException e) {
+Thread.currentThread().interrupt();
+  }
+
+  if 

[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r427923626



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/PolicyHelper.java
##
@@ -429,87 +440,124 @@ private void release(SessionWrapper sessionWrapper) {
  * The session can be used by others while the caller is performing 
operations
  */
 private void returnSession(SessionWrapper sessionWrapper) {
-  TimeSource timeSource = sessionWrapper.session != null ? 
sessionWrapper.session.cloudManager.getTimeSource() : TimeSource.NANO_TIME;
+  boolean present;
   synchronized (lockObj) {
 sessionWrapper.status = Status.EXECUTING;
-if (log.isDebugEnabled()) {
-  log.debug("returnSession, curr-time {} sessionWrapper.createTime {}, 
this.sessionWrapper.createTime {} "
-  , time(timeSource, MILLISECONDS),
-  sessionWrapper.createTime,
-  this.sessionWrapper.createTime);
-}
-if (sessionWrapper.createTime == this.sessionWrapper.createTime) {
-  //this session was used for computing new operations and this can 
now be used for other
-  // computing
-  this.sessionWrapper = sessionWrapper;
+present = sessionWrapperSet.contains(sessionWrapper);
 
-  //one thread who is waiting for this need to be notified.
-  lockObj.notify();
-} else {
-  log.debug("create time NOT SAME {} ", 
SessionWrapper.DEFAULT_INSTANCE.createTime);
-  //else just ignore it
-}
+// wake up single thread waiting for a session return (ok if not woken 
up, wait is short)
+lockObj.notify();
   }
 
+  // Logging
+  if (present) {
+if (log.isDebugEnabled()) {
+  log.debug("returnSession {}", sessionWrapper.getCreateTime());
+}
+  } else {
+log.warn("returning unknown session {} ", 
sessionWrapper.getCreateTime());
+  }
 }
 
 
-public SessionWrapper get(SolrCloudManager cloudManager) throws 
IOException, InterruptedException {
+public SessionWrapper get(SolrCloudManager cloudManager, boolean 
allowWait) throws IOException, InterruptedException {
   TimeSource timeSource = cloudManager.getTimeSource();
+  long oldestUpdateTimeNs = 
TimeUnit.SECONDS.convert(timeSource.getTimeNs(), TimeUnit.NANOSECONDS) - 
SESSION_EXPIRY;
+  int zkVersion = 
cloudManager.getDistribStateManager().getAutoScalingConfig().getZkVersion();
+
   synchronized (lockObj) {
-if (sessionWrapper.status == Status.NULL ||
-sessionWrapper.zkVersion != 
cloudManager.getDistribStateManager().getAutoScalingConfig().getZkVersion() ||
-TimeUnit.SECONDS.convert(timeSource.getTimeNs() - 
sessionWrapper.lastUpdateTime, TimeUnit.NANOSECONDS) > SESSION_EXPIRY) {
-  //no session available or the session is expired
+// If nothing in the cache can possibly work, create a new session
+if (!hasNonExpiredSession(zkVersion, oldestUpdateTimeNs)) {
   return createSession(cloudManager);
-} else {
+}
+
+// Try to find a session available right away
+SessionWrapper sw = getAvailableSession(zkVersion, oldestUpdateTimeNs);
+
+if (sw != null) {
+  if (log.isDebugEnabled()) {
+log.debug("reusing session {}", sw.getCreateTime());
+  }
+  return sw;
+} else if (allowWait) {
+  // No session available, but if we wait a bit, maybe one can become 
available
+  // wait 1 to 10 secs in case a session is returned. Random to spread 
wakeup otherwise sessions not reused
+  long waitForMs = (long) (Math.random() * 9 * 1000 + 1000);
+
+  if (log.isDebugEnabled()) {
+log.debug("No sessions are available, all busy COMPUTING. starting 
wait of {}ms", waitForMs);
+  }
   long waitStart = time(timeSource, MILLISECONDS);
-  //the session is not expired
-  log.debug("reusing a session {}", this.sessionWrapper.createTime);
-  if (this.sessionWrapper.status == Status.UNUSED || 
this.sessionWrapper.status == Status.EXECUTING) {
-this.sessionWrapper.status = Status.COMPUTING;
-return sessionWrapper;
-  } else {
-//status= COMPUTING it's being used for computing. computing is
-if (log.isDebugEnabled()) {
-  log.debug("session being used. waiting... current time {} ", 
time(timeSource, MILLISECONDS));
-}
-try {
-  lockObj.wait(10 * 1000);//wait for a max of 10 seconds
-} catch (InterruptedException e) {
-  log.info("interrupted... ");
-}
+  try {
+lockObj.wait(waitForMs);
+  } catch (InterruptedException e) {
+Thread.currentThread().interrupt();
+  }
+
+  if 

[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r427921563



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/PolicyHelper.java
##
@@ -429,87 +440,124 @@ private void release(SessionWrapper sessionWrapper) {
  * The session can be used by others while the caller is performing 
operations
  */
 private void returnSession(SessionWrapper sessionWrapper) {
-  TimeSource timeSource = sessionWrapper.session != null ? 
sessionWrapper.session.cloudManager.getTimeSource() : TimeSource.NANO_TIME;
+  boolean present;
   synchronized (lockObj) {
 sessionWrapper.status = Status.EXECUTING;
-if (log.isDebugEnabled()) {
-  log.debug("returnSession, curr-time {} sessionWrapper.createTime {}, 
this.sessionWrapper.createTime {} "
-  , time(timeSource, MILLISECONDS),
-  sessionWrapper.createTime,
-  this.sessionWrapper.createTime);
-}
-if (sessionWrapper.createTime == this.sessionWrapper.createTime) {
-  //this session was used for computing new operations and this can 
now be used for other
-  // computing
-  this.sessionWrapper = sessionWrapper;
+present = sessionWrapperSet.contains(sessionWrapper);
 
-  //one thread who is waiting for this need to be notified.
-  lockObj.notify();
-} else {
-  log.debug("create time NOT SAME {} ", 
SessionWrapper.DEFAULT_INSTANCE.createTime);
-  //else just ignore it
-}
+// wake up single thread waiting for a session return (ok if not woken 
up, wait is short)
+lockObj.notify();
   }
 
+  // Logging
+  if (present) {
+if (log.isDebugEnabled()) {
+  log.debug("returnSession {}", sessionWrapper.getCreateTime());
+}
+  } else {
+log.warn("returning unknown session {} ", 
sessionWrapper.getCreateTime());
+  }
 }
 
 
-public SessionWrapper get(SolrCloudManager cloudManager) throws 
IOException, InterruptedException {
+public SessionWrapper get(SolrCloudManager cloudManager, boolean 
allowWait) throws IOException, InterruptedException {
   TimeSource timeSource = cloudManager.getTimeSource();
+  long oldestUpdateTimeNs = 
TimeUnit.SECONDS.convert(timeSource.getTimeNs(), TimeUnit.NANOSECONDS) - 
SESSION_EXPIRY;
+  int zkVersion = 
cloudManager.getDistribStateManager().getAutoScalingConfig().getZkVersion();
+
   synchronized (lockObj) {
-if (sessionWrapper.status == Status.NULL ||
-sessionWrapper.zkVersion != 
cloudManager.getDistribStateManager().getAutoScalingConfig().getZkVersion() ||
-TimeUnit.SECONDS.convert(timeSource.getTimeNs() - 
sessionWrapper.lastUpdateTime, TimeUnit.NANOSECONDS) > SESSION_EXPIRY) {
-  //no session available or the session is expired
+// If nothing in the cache can possibly work, create a new session
+if (!hasNonExpiredSession(zkVersion, oldestUpdateTimeNs)) {
   return createSession(cloudManager);
-} else {
+}
+
+// Try to find a session available right away
+SessionWrapper sw = getAvailableSession(zkVersion, oldestUpdateTimeNs);
+
+if (sw != null) {
+  if (log.isDebugEnabled()) {
+log.debug("reusing session {}", sw.getCreateTime());
+  }
+  return sw;
+} else if (allowWait) {
+  // No session available, but if we wait a bit, maybe one can become 
available
+  // wait 1 to 10 secs in case a session is returned. Random to spread 
wakeup otherwise sessions not reused
+  long waitForMs = (long) (Math.random() * 9 * 1000 + 1000);
+
+  if (log.isDebugEnabled()) {
+log.debug("No sessions are available, all busy COMPUTING. starting 
wait of {}ms", waitForMs);
+  }
   long waitStart = time(timeSource, MILLISECONDS);
-  //the session is not expired
-  log.debug("reusing a session {}", this.sessionWrapper.createTime);
-  if (this.sessionWrapper.status == Status.UNUSED || 
this.sessionWrapper.status == Status.EXECUTING) {
-this.sessionWrapper.status = Status.COMPUTING;
-return sessionWrapper;
-  } else {
-//status= COMPUTING it's being used for computing. computing is
-if (log.isDebugEnabled()) {
-  log.debug("session being used. waiting... current time {} ", 
time(timeSource, MILLISECONDS));
-}
-try {
-  lockObj.wait(10 * 1000);//wait for a max of 10 seconds
-} catch (InterruptedException e) {
-  log.info("interrupted... ");
-}
+  try {
+lockObj.wait(waitForMs);
+  } catch (InterruptedException e) {
+Thread.currentThread().interrupt();
+  }
+
+  if 

[GitHub] [lucene-solr] murblanc commented on a change in pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-20 Thread GitBox


murblanc commented on a change in pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#discussion_r427919669



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/PolicyHelper.java
##
@@ -382,45 +383,55 @@ static MapWriter loggingInfo(Policy policy, 
SolrCloudManager cloudManager, Sugge
   }
 
   public enum Status {
-NULL,
-//it is just created and not yet used or all operations on it has been 
completed fully
-UNUSED,
-COMPUTING, EXECUTING
+COMPUTING, // A command is actively using and modifying the session to 
compute placements
+EXECUTING // A command is not done yet processing its changes but no 
longer uses the session
   }
 
   /**
-   * This class stores a session for sharing purpose. If a process creates a 
session to
-   * compute operations,
-   * 1) see if there is a session that is available in the cache,
-   * 2) if yes, check if it is expired
-   * 3) if it is expired, create a new session
-   * 4) if it is not expired, borrow it
-   * 5) after computing operations put it back in the cache
+   * This class stores sessions for sharing purposes. If a process requirees a 
session to

Review comment:
   Thanks. I have the MacBook Pro butterfly keyboard, it's a catastrophe!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org