Yingyi Bu has posted comments on this change.

Change subject: Impelemented the memory-bounded HashGroupby and HashJoin for 
BigObject
......................................................................


Patch Set 7:

(10 comments)

Here are some high-level comments:

1. What is the current abstraction for user code, e.g., operators and 
functions?  I prefer to have IFrame be the only abstraction that user code need 
to deal with.  It seems to me that we should wrap a byte buffer inside IFrame 
and IFrame has the ByteBuffer interface but hides the details of re-sizing.  Is 
it do-able?

2. It looks to me switching ExternalGroupByOperatorDescriptor to use an object 
based hash table implementation will lead to performance regressions.  But 
maybe I have a wrong understanding of the code.

Detailed comments are inlined.

https://asterix-gerrit.ics.uci.edu/#/c/398/7/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java:

Line 23: import java.util.List;
Please use the AsterixDB codestyle as described in 
https://asterixdb.incubator.apache.org/dev-setup.html.

I guess both Till and Ian know how to set the style up in IntelligentJ:-)


Line 256:                 true);
Do we always need to sort the output?


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/comm/FixedSizeFrame.java
File 
hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/comm/FixedSizeFrame.java:

Line 43:     public ByteBuffer getBuffer() {
What is the user-level abstraction after this change?
Will an operator implementor still see ByteBuffer or s/he should only see 
IFrame?  My naive guess is that a user doesn't need to know ByteBuffer any 
more, but that seems not true?


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAppenderWrapper.java
File 
hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAppenderWrapper.java:

Line 45:     public IFrameTupleAppender getFrameTupleAppender() {
This method is never called by others. 
 Should we remove it?


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/AbstractTupleBufferAccessor.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/AbstractTupleBufferAccessor.java:

Line 132:         throw new IllegalAccessError("Should never call this reset");
Again, about the abstraction, should the user-code, e.g., operators and 
functions see ByteBuffer at all?


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/DeletableFramePool.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/DeletableFramePool.java:

Line 54:     public ByteBuffer allocateFrame(int frameSize) throws 
HyracksDataException {
Should this return IFrame, if IFrame is the abstraction that hides all the 
details regarding to dynamic-resizing of ByteBuffers.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/FrameFreeSlotLastFit.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/FrameFreeSlotLastFit.java:

Line 55:         for (int i = size - 1; i >= 0; i--) {
Can that be a binary search, or based on a SortedMap?


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IDeletableFramePool.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IDeletableFramePool.java:

Line 26:     void deAllocateBuffer(ByteBuffer buffer);
Should that be
deAllocateFrame(IFrame frame)?

I prefer to have only one abstraction for user code, either ByteBuffer or 
IFrame.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IFramePool.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IFramePool.java:

Line 40:     ByteBuffer allocateFrame(int frameSize) throws 
HyracksDataException;
Should this return IFrame?


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/NonSerializableHashTable.java
File 
hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/NonSerializableHashTable.java:

Line 24: public class NonSerializableHashTable implements 
IBatchDeleteTable<TuplePointer> {
Why do we switch to this NoSerializable implementation for ExternalGroupBy?
It seems to me this class will have object creation problems...


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/398
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I248f3a374fdacad7d57e49cf18d8233745e55460
Gerrit-PatchSet: 7
Gerrit-Project: hyracks
Gerrit-Branch: master
Gerrit-Owner: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Ian2 Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Pouria Pirzadeh <[email protected]>
Gerrit-Reviewer: Preston Carman <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-HasComments: Yes

Reply via email to