leerho commented on pull request #319:
URL: 
https://github.com/apache/incubator-datasketches-java/pull/319#issuecomment-637035665


   This is an excellent review! Thank you!  I think the protocol is for you to 
"Resolve" the issues you found. 
   
   One issue you didn't bring up is the policy on how _nulls_ should be 
interpreted by the different set operations.  So far, our policy has been to 
treat nulls as equivalent to the Empty Sketch, i.e., Theta = 1.0 and count = 0. 
With this policy, nulls and empties have no effect on a Union.  However, 
submitting a null or empty to an Intersection, or as the first argument to an 
AnotB operation, has a major impact on the result: the result is empty.  
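
   To make the asymmetry concrete, here is a minimal sketch of the policy. It uses exact `java.util.Set` values as stand-ins for sketches, and the class and method names (`NullPolicyDemo`, `orEmpty`, `union`, `intersect`, `aNotB`) are hypothetical, not the library's API. It only illustrates the "null is treated as empty" rule.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: exact sets stand in for sketches.
class NullPolicyDemo {

  // A null input is read as the empty set (the "null == Empty Sketch" policy).
  static Set<Long> orEmpty(Set<Long> s) {
    return (s == null) ? Collections.<Long>emptySet() : s;
  }

  // Null or empty inputs contribute nothing to a union.
  static Set<Long> union(Set<Long> a, Set<Long> b) {
    Set<Long> out = new TreeSet<>(orEmpty(a));
    out.addAll(orEmpty(b));
    return out;
  }

  // A null or empty input on either side forces an empty intersection.
  static Set<Long> intersect(Set<Long> a, Set<Long> b) {
    Set<Long> out = new TreeSet<>(orEmpty(a));
    out.retainAll(orEmpty(b));
    return out;
  }

  // A null or empty first argument forces an empty difference;
  // a null or empty second argument leaves the first unchanged.
  static Set<Long> aNotB(Set<Long> a, Set<Long> b) {
    Set<Long> out = new TreeSet<>(orEmpty(a));
    out.removeAll(orEmpty(b));
    return out;
  }

  public static void main(String[] args) {
    Set<Long> s = new TreeSet<>(Arrays.asList(1L, 2L, 3L));
    System.out.println(union(s, null));     // prints [1, 2, 3]
    System.out.println(intersect(s, null)); // prints []
    System.out.println(aNotB(null, s));     // prints []
    System.out.println(aNotB(s, null));     // prints [1, 2, 3]
  }
}
```

   In this model a single null anywhere in a chain of intersections silently empties the final result, which is the case where the policy is most debatable.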
   
   My colleagues and I have had many discussions on this issue, and I think 
where we have left it for now is that consistency in implementing a policy 
is very important.  Currently, our policy has been: We try really hard to not 
return nulls as they can propagate in other calculations and become bugs that 
are difficult to trace.  We also prefer to ignore nulls on input, if possible, 
because real data is dirty and has lots of nulls and missing values.  Also we 
prefer not to throw exceptions on receiving a null, if possible, because 
throwing exceptions in a system environment is a big headache for system 
operators.  
   
   However, the intersection and difference operators are a quandary.  If a 
user does not like our treatment of nulls, they can catch nulls before entering 
the set operator.  If we returned nulls instead, the user would have to catch 
these null cases on output.  There is no perfect answer here.  
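
   As a rough sketch of the first option, a caller who wants stricter behavior could wrap any binary set operator in a guard that rejects nulls on input. The `NullGuard` class and `rejectNulls` helper below are hypothetical names introduced for illustration. Only `java.util.Objects.requireNonNull` and `java.util.function.BinaryOperator` come from the JDK.

```java
import java.util.Objects;
import java.util.function.BinaryOperator;

// Hypothetical caller-side guard: fail fast on null inputs instead of
// letting the wrapped operator treat them as empty.
class NullGuard {

  static <T> BinaryOperator<T> rejectNulls(BinaryOperator<T> op) {
    return (a, b) -> {
      Objects.requireNonNull(a, "first argument is null");
      Objects.requireNonNull(b, "second argument is null");
      return op.apply(a, b);
    };
  }

  public static void main(String[] args) {
    // Integer addition stands in for a set operation here.
    BinaryOperator<Integer> strictSum = rejectNulls(Integer::sum);
    System.out.println(strictSum.apply(2, 3)); // prints 5
    try {
      strictSum.apply(null, 3);
    } catch (NullPointerException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

   The guard lives entirely in the caller's code, so the operator's own null policy can stay consistent for everyone else.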
   
   I would be interested in your view.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]