leerho opened a new pull request #319:
URL: https://github.com/apache/incubator-datasketches-java/pull/319


   This makes some significant API changes in the Tuple package hierarchy that 
will require a bump to a 2.0.0 release.
   
   1. The first task was to clean up the root tuple directory, which was 
cluttered with two different groups of classes that made it difficult for 
anyone to figure out what was going on.  One group of classes forms the base 
generic classes of the tuple sketch on which the concrete extensions "adouble" 
(a single double), "aninteger" (a single integer), and "strings" (array of 
strings) depend.  These three concrete extensions are already in their own 
subdirectories. 
   
   The second, larger group of classes was a dedicated non-generic 
implementation of the tuple sketch, which implemented an array of doubles.  All 
of these classes had "ArrayOfDoubles" in their name.  These classes shared no 
code with the root generic tuple classes except for a few methods in the 
SerializerDeserializer and the Util classes.  By making a few methods public, I 
was able to move all of the "ArrayOfDoubles" classes into their own 
subdirectory.  This is an incompatible API change, which will force us to 
move to 2.0.0 for the next version.  Now the tuple root directory is much 
cleaner and easier to navigate and understand.  There are several reasons this 
separate dedicated implementation exists. First, we felt that a configurable 
array of doubles would be a relatively common use case.  Second, we wanted a 
full concrete example of the tuple sketch to show what one would look like, 
including both on-heap and off-heap variants.  It is this ArrayOfDoubles 
implementation that has been integrated into Druid, for example. 
   
   2.  The next task was to update the set operations to allow integration with 
the Theta sketches. It turns out that modifying the generic Union and 
Intersection classes only required adding one method to each.  I did some minor 
code cleanup and code documentation at the same time.
   
   The AnotB operator is another story.  We have never been really happy with 
how this was implemented the first time.  The current API is clumsy, so I have 
taken the opportunity to redesign the API for this class.  The current API 
methods are still present but are now deprecated.  With the redesigned class the 
user has several ways of performing AnotB.
   
   As stateless operations:
   
       - With Tuple: resultSk = aNotB(skTupleA, skTupleB);
       - With Theta: resultSk = aNotB(skTupleA, skThetaB);
   
   As stateful, sequential operations:
   
       - void setA(skTupleA);
       - void notB(skTupleB);   or   void notB(skThetaB);   //These are interchangeable.
       - ...
       - void notB(skTupleB);   or   void notB(skThetaB);   //These are interchangeable.
       - resultSk = getResult(reset = false);  //This allows getting an intermediate result.
       - void notB(skTupleB);   or   void notB(skThetaB);   //Continue...
       - resultSk = getResult(reset = true);  //This returns the result and clears the internal state to empty.
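   
   To make the difference between the stateless and stateful call sequences 
concrete, here is a minimal toy model in Java.  It is only a sketch under 
stated assumptions: ToyAnotB is a hypothetical class, not the library's AnotB, 
and plain sets of hash values stand in for the Tuple and Theta sketches.  
Summaries, theta, and the Theta overload of notB are not modeled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical toy stand-in for the redesigned AnotB call pattern.
 * A Set<Long> plays the role of a sketch; this is NOT the library class.
 */
final class ToyAnotB {
  // Internal state: the running result of A and-not B1 and-not B2 ...
  private final Set<Long> state = new HashSet<>();

  /** Stateless form: computes A and-not B without using instance state. */
  static Set<Long> aNotB(Set<Long> a, Set<Long> b) {
    Set<Long> result = new HashSet<>(a);
    result.removeAll(b);
    return result;
  }

  /** Stateful form, step 1: replace the internal state with A. */
  void setA(Set<Long> a) {
    state.clear();
    state.addAll(a);
  }

  /** Stateful form, step 2..n: remove the entries of B from the state. */
  void notB(Set<Long> b) {
    state.removeAll(b);
  }

  /** Returns a copy of the current result; if reset is true, empties the state. */
  Set<Long> getResult(boolean reset) {
    Set<Long> result = new HashSet<>(state);
    if (reset) {
      state.clear();
    }
    return result;
  }

  public static void main(String[] args) {
    ToyAnotB op = new ToyAnotB();
    op.setA(new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L, 5L)));
    op.notB(new HashSet<>(Arrays.asList(2L)));
    Set<Long> intermediate = op.getResult(false); // state is kept
    op.notB(new HashSet<>(Arrays.asList(4L)));
    Set<Long> finalResult = op.getResult(true);   // state is cleared
    System.out.println(intermediate.size() + " then " + finalResult.size());
  }
}
```

   In this toy model, getResult(false) hands back a copy so that later notB 
calls do not alter a result the caller already holds.  Whether the real class 
copies or shares its internal data is not stated here.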
   
   
   The test class for AnotB was also completely rewritten.
   
   
   
   3. The Intersection code required a major rewrite.  There were too many 
problems to list, but this rewrite should clear up some major discrepancies and 
errors caused by improper handling of null and empty inputs.  


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


