leerho commented on pull request #319: URL: https://github.com/apache/incubator-datasketches-java/pull/319#issuecomment-637035665
This is an excellent review! Thank you! I think the protocol is for you to "Resolve" the issues you found.

One issue you didn't bring up is the policy on how _nulls_ should be interpreted by the different set operations. So far, our policy has been to treat nulls as equivalent to the Empty Sketch, i.e., Theta = 1.0 and count = 0. With this policy, nulls and empties have no effect on a Union. However, submitting a null or empty to an Intersection, or as the first argument to an AnotB operation, has a major impact on the result.

My colleagues and I have had many discussions on this issue, and I think where we have left it for now is that consistency in implementing a policy is very important. Currently, our policy has been: We try really hard not to return nulls, as they can propagate into other calculations and become bugs that are difficult to trace. We also prefer to ignore nulls on input, if possible, because real data is dirty and has lots of nulls and missing values. We also prefer not to throw exceptions on receiving a null, if possible, because throwing exceptions in a system environment is a big headache for system operators.

However, the intersection and difference operators are a quandary. If a user does not like our treatment of nulls, they can catch nulls before calling the set operator. If we returned nulls, the user would have to catch these null cases on output. There is no perfect answer here. I would be interested in your view.
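To make the null-as-empty policy described above concrete, here is a minimal sketch that uses exact `java.util.Set` values as stand-ins for sketches. The class `NullPolicyDemo` and its methods `orEmpty`, `union`, `intersect`, and `aNotB` are hypothetical names chosen for illustration; they are not the DataSketches API and say nothing about how the library implements these operations.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: exact sets stand in for sketches.
// None of these names belong to the DataSketches API.
final class NullPolicyDemo {

    // The policy under discussion: a null input is read as the empty set.
    static Set<String> orEmpty(Set<String> s) {
        return s == null ? Collections.<String>emptySet() : s;
    }

    // A null or empty input leaves the union unchanged.
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> out = new HashSet<>(orEmpty(a));
        out.addAll(orEmpty(b));
        return out; // never null
    }

    // A null or empty input on either side forces an empty result.
    static Set<String> intersect(Set<String> a, Set<String> b) {
        Set<String> out = new HashSet<>(orEmpty(a));
        out.retainAll(orEmpty(b));
        return out; // never null
    }

    // A null first argument forces an empty result;
    // a null second argument leaves the first argument unchanged.
    static Set<String> aNotB(Set<String> a, Set<String> b) {
        Set<String> out = new HashSet<>(orEmpty(a));
        out.removeAll(orEmpty(b));
        return out; // never null
    }

    public static void main(String[] args) {
        Set<String> users = new HashSet<>(Arrays.asList("ann", "bob"));
        System.out.println(union(users, null).size());     // prints 2
        System.out.println(intersect(users, null).size()); // prints 0
        System.out.println(aNotB(null, users).size());     // prints 0
        System.out.println(aNotB(users, null).size());     // prints 2
    }
}
```

In this toy model, a caller who disagrees with the policy can still check for null before calling `intersect` or `aNotB`, which is the input-side check mentioned above; no caller ever has to check the result for null.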
