Github user nsoft commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/433#discussion_r213397248
  
    --- Diff: 
solr/core/src/java/org/apache/solr/update/processor/TimeRoutedAliasUpdateProcessor.java
 ---
    @@ -230,6 +188,95 @@ public void processAdd(AddUpdateCommand cmd) throws 
IOException {
         }
       }
     
    +
    +  private String createCollectionsIfRequired(Instant docTimestamp, String 
targetCollection, String printableId) {
    +    // Even though it is possible that multiple requests hit this code in 
the 1-2 sec that
    +    // it takes to create a collection, it's an established anti-pattern 
to feed data with a very large number
    +    // of client connections. This in mind, we only guard against spamming 
the overseer within a batch of
    +    // updates. We are intentionally tolerating a low level of redundant 
requests in favor of simpler code. Most
    +    // super-sized installations with many update clients will likely be 
multi-tenant and multiple tenants
    +    // probably don't write to the same alias. As such, we have deferred 
any solution to the "many clients causing
    +    // collection creation simultaneously" problem until such time as 
someone actually has that problem in a
    +    // real world use case that isn't just an anti-pattern.
    +    try {
    +      CreationType creationType = requiresCreateCollection(docTimestamp, 
timeRoutedAlias.getPreemptiveCreateWindow());
    +      switch (creationType) {
    +        case SYNCHRONOUS:
    +          // This next line blocks until all collections required by the 
current document have been created
    +          return maintain(targetCollection, docTimestamp, printableId, 
false);
    +        case ASYNC_PREEMPTIVE:
    +          // Note: creating an executor and throwing it away is slightly 
expensive, but this is only likely to happen
    +          // once per hour/day/week (depending on time slice size for the 
TRA). If the executor were retained, it
    +          // would need to be shut down in a close hook to avoid test 
failures due to thread leaks which is slightly
    +          // more complicated from a code maintenance and readability 
stand point. An executor must used instead of a
    +          // thread to ensure we pick up the proper MDC logging stuff from 
ExecutorUtil. T
    +          if (preemptiveCreationExecutor == null) {
    +            DefaultSolrThreadFactory threadFactory = new 
DefaultSolrThreadFactory("TRA-preemptive-creation");
    +            preemptiveCreationExecutor = 
newMDCAwareSingleThreadExecutor(threadFactory);
    +            preemptiveCreationExecutor.execute(() -> {
    +              maintain(targetCollection, docTimestamp, printableId, true);
    +              preemptiveCreationExecutor.shutdown();
    +              preemptiveCreationExecutor = null;
    +            });
    +          }
    +          return targetCollection;
    +        case NONE:
    +          return targetCollection; // just for clarity...
    +        default:
    +          return targetCollection; // could use fall through, but fall 
through is fiddly for later editors.
    +      }
    +      // do nothing if creationType == NONE
    +    } catch (SolrException e) {
    +      throw e;
    +    } catch (Exception e) {
    +      throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, e);
    +    }
    +  }
    +
    +  /**
    +   * Determine if the a new collection will be required based on the 
document timestamp. Passing null for
    +   * preemptiveCreateInterval tells you if the document is beyond all 
existing collections with a response of
    +   * {@link CreationType#NONE} or {@link CreationType#SYNCHRONOUS}, and 
passing a valid date math for
    +   * preemptiveCreateMath additionally distinguishes the case where the 
document is close enough to the end of
    +   * the TRA to trigger preemptive creation but not beyond all existing 
collections with a value of
    +   * {@link CreationType#ASYNC_PREEMPTIVE}.
    +   *
    +   * @param routeTimestamp The timestamp from the document
    +   * @param preemptiveCreateMath The date math indicating the {@link 
TimeRoutedAlias#preemptiveCreateMath}
    +   * @return a {@code CreationType} indicating if and how to create a 
collection
    +   */
    +  private CreationType requiresCreateCollection(Instant routeTimestamp,  
String preemptiveCreateMath) {
    --- End diff --
    
    +1


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to