Jianfeng Jia has posted comments on this change.

Change subject: Implemented the memory-bounded HashGroupby and HashJoin for 
BigObject
......................................................................


Patch Set 7:

(11 comments)

@Yingyi, I'd like to address the IFrame related changes to a separate patch. 
Otherwise this one will become too big.

https://asterix-gerrit.ics.uci.edu/#/c/398/7/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java:

Line 256:                 true);
> Do we always need to sort the output?
No. We don't need to sort if using ExternalHashGroupby operator. Did you mean 
that "true"? That "true" is about "isFinalStage".


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/comm/FixedSizeFrame.java
File 
hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/comm/FixedSizeFrame.java:

Line 43:     public ByteBuffer getBuffer() {
> What is the user-level abstraction after this change?
I've updated all the IFrame to the IFrameReader. Haven't done it for the 
IFrameWriter. I think it should be a separate patch, because it will be a big 
change again.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAppenderWrapper.java
File 
hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAppenderWrapper.java:

Line 45:     public IFrameTupleAppender getFrameTupleAppender() {
> This method is never called by others. 
Done


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/AbstractTupleBufferAccessor.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/AbstractTupleBufferAccessor.java:

Line 132:         throw new IllegalAccessError("Should never call this reset");
> Again, about the abstraction, should the user-code, e.g., operators and fun
I will address this as a separate CR.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/DeletableFramePool.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/DeletableFramePool.java:

Line 54:     public ByteBuffer allocateFrame(int frameSize) throws 
HyracksDataException {
> Should this return IFrame, if IFrame is the abstraction that hides all the 
Will address this issue in another CR.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/FrameFreeSlotLastFit.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/FrameFreeSlotLastFit.java:

Line 45:         frameSpaces = new FrameSpace[initial_frame_number];
> variable name should be camel case.
Done


Line 55:         for (int i = size - 1; i >= 0; i--) {
> Can that be a binary search, or based on a SortedMap?
This why it is called "lastFit". BinarySearch is introduced in "bestFit" code.
Actually, based on my experiment the lastFit is the best. BinarySearch will 
introduce more overhead. Especially when we assume most of the case is the 
normal object case, the size of each Frame is equal, then picking the last one 
is the most efficient way.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IDeletableFramePool.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IDeletableFramePool.java:

Line 26:     void deAllocateBuffer(ByteBuffer buffer);
> Should that be
Will cover IFrame related changes in a separate patch.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IFramePool.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IFramePool.java:

Line 40:     ByteBuffer allocateFrame(int frameSize) throws 
HyracksDataException;
> Should this return IFrame?
Will cover the IFrame related change in a separate patch


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IPartitionedTupleBufferManager.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IPartitionedTupleBufferManager.java:

Line 61:     void flushPartition(int pid, IFrameWriter writer) throws 
HyracksDataException;
> Please add a comment here about what this function does. How does it differ
Done


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/NonSerializableHashTable.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/NonSerializableHashTable.java:

Line 24: public class NonSerializableHashTable implements 
IBatchDeleteTable<TuplePointer> {
> Why do we switch to this NoSerializable implementation for ExternalGroupBy?
removed


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/398
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I248f3a374fdacad7d57e49cf18d8233745e55460
Gerrit-PatchSet: 7
Gerrit-Project: hyracks
Gerrit-Branch: master
Gerrit-Owner: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Pouria Pirzadeh <[email protected]>
Gerrit-Reviewer: Preston Carman <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-HasComments: Yes

Reply via email to