[GitHub] drill pull request #984: DRILL-5783 Made a unit test for generated Priority ...

paul-rogers Wed, 18 Oct 2017 15:24:24 -0700

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/984#discussion_r145292056
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/TopN/TopNBatch.java
 ---
    @@ -335,20 +336,42 @@ private void purge() throws SchemaChangeException {
         logger.debug("Took {} us to purge", 
watch.elapsed(TimeUnit.MICROSECONDS));
       }
     
    -  public PriorityQueue createNewPriorityQueue(FragmentContext context, 
List<Ordering> orderings,
    -                                                     VectorAccessible 
batch, MappingSet mainMapping, MappingSet leftMapping, MappingSet rightMapping)
    -          throws ClassTransformationException, IOException, 
SchemaChangeException{
    -    CodeGenerator<PriorityQueue> cg = 
CodeGenerator.get(PriorityQueue.TEMPLATE_DEFINITION, 
context.getFunctionRegistry(), context.getOptions());
    +  private PriorityQueue createNewPriorityQueue(VectorAccessible batch, int 
limit)
    +    throws SchemaChangeException, ClassTransformationException, 
IOException {
    +    return createNewPriorityQueue(
    +      mainMapping, leftMapping, rightMapping, context.getOptionSet(), 
context.getFunctionRegistry(), context.getDrillbitContext().getCompiler(),
    +      config.getOrderings(), batch, unionTypeEnabled, codegenDump, limit, 
oContext.getAllocator(), schema.getSelectionVectorMode());
    +  }
    +
    +  public static MappingSet createMainMappingSet() {
    +    return new MappingSet((String) null, null, 
ClassGenerator.DEFAULT_SCALAR_MAP, ClassGenerator.DEFAULT_SCALAR_MAP);
    +  }
    +
    +  public static MappingSet createLeftMappingSet() {
    +    return new MappingSet("leftIndex", null, 
ClassGenerator.DEFAULT_SCALAR_MAP, ClassGenerator.DEFAULT_SCALAR_MAP);
    +  }
    +
    +  public static MappingSet createRightMappingSet() {
    +    return new MappingSet("rightIndex", null, 
ClassGenerator.DEFAULT_SCALAR_MAP, ClassGenerator.DEFAULT_SCALAR_MAP);
    +  }
    +
    +  public static PriorityQueue createNewPriorityQueue(
    +    MappingSet mainMapping, MappingSet leftMapping, MappingSet 
rightMapping,
    --- End diff --
    
    One thought worth sharing here. Today, our code cache uses source code as 
the key to look for an existing compiled version of a class. But, this is 
inefficient: we have to generate the source code, transform it, and store two 
copies to act as keys.
    
    Better would be to have a "class definition" object that holds all 
parameters needed to generate the code. Store these in the cache. Likely, these 
will be much smaller than the generated code. Using these avoids the need to 
generate the code to see if we already have a copy of that code.
    
    The class definition idea works as long as the definition is a closed set: 
everything needed to generate the code is in that one definition object.
    
    Your function here is close: it passes in a closed set of items. The open 
bits are the expression and system/session options. For options, we can copy 
the few values we need into our definition. The expression might require a bit 
more thought to capture.
    
    So, if we can evolve your idea further, we can get a potential large 
per-schema performance improvement by avoiding code generation when doing a 
repeat query.
    
    Too much to fix here; but something to think about moving forward.

---

[GitHub] drill pull request #984: DRILL-5783 Made a unit test for generated Priority ...

Reply via email to