GitHub user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/984#discussion_r144945872
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/TopN/PriorityQueue.java
---
@@ -20,22 +20,58 @@
import org.apache.drill.exec.compile.TemplateClassDefinition;
import org.apache.drill.exec.exception.SchemaChangeException;
import org.apache.drill.exec.memory.BufferAllocator;
-import org.apache.drill.exec.ops.FragmentContext;
import org.apache.drill.exec.physical.impl.sort.RecordBatchData;
import org.apache.drill.exec.record.VectorContainer;
import org.apache.drill.exec.record.selection.SelectionVector4;
public interface PriorityQueue {
- public void add(FragmentContext context, RecordBatchData batch) throws SchemaChangeException;
- public void init(int limit, FragmentContext context, BufferAllocator allocator, boolean hasSv2) throws SchemaChangeException;
- public void generate() throws SchemaChangeException;
- public VectorContainer getHyperBatch();
- public SelectionVector4 getHeapSv4();
- public SelectionVector4 getFinalSv4();
- public boolean validate();
- public void resetQueue(VectorContainer container, SelectionVector4 vector4) throws SchemaChangeException;
- public void cleanup();
-
- public static TemplateClassDefinition<PriorityQueue> TEMPLATE_DEFINITION = new TemplateClassDefinition<PriorityQueue>(PriorityQueue.class, PriorityQueueTemplate.class);
+ /**
--- End diff --
TopN has a priority queue, and so does the sort. Are there others? Should we
have a single, shared implementation that we can optimize and test once and
then reuse in each operator? Or is the TopN version different enough from the
sort version that we need two? If we do need two, should the improvements made
here (especially for code gen) also be applied to the sort version?
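
To make the "single, shared implementation" idea concrete, here is a rough
sketch of what the common core might look like. It is illustrative only:
`BoundedHeap` is a hypothetical name, it is not part of Drill, and it works on
plain Java objects with a `Comparator` rather than on value vectors, SV4s, or
generated code. Note that `PriorityQueue` below is `java.util.PriorityQueue`,
not the Drill interface in this diff.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical shared helper (not Drill code): retains the `limit` smallest
// elements seen so far, as ranked by `order`.
final class BoundedHeap<T> {
  private final int limit;
  private final Comparator<T> order;
  // java.util.PriorityQueue is a min-heap. Reversing the comparator puts the
  // worst retained element at the head, where it is cheap to inspect and evict.
  private final PriorityQueue<T> heap;

  BoundedHeap(int limit, Comparator<T> order) {
    if (limit < 1) {
      throw new IllegalArgumentException("limit must be positive");
    }
    this.limit = limit;
    this.order = order;
    this.heap = new PriorityQueue<>(limit, order.reversed());
  }

  // Adds an item, evicting the current worst one if the heap is already full
  // and the new item ranks ahead of it.
  void add(T item) {
    if (heap.size() < limit) {
      heap.add(item);
    } else if (order.compare(item, heap.peek()) < 0) {
      heap.poll();
      heap.add(item);
    }
  }

  // Returns the retained items in `order` and empties the heap.
  List<T> drainSorted() {
    List<T> out = new ArrayList<>(heap);
    heap.clear();
    out.sort(order);
    return out;
  }
}
```

In this sketch the only operator-specific piece is the comparator, which is
roughly where the generated comparison code would plug in. Whether the sort's
merge queue can share the same core depends on how different its access
pattern really is, which is the question above.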
---