orhankislal opened a new pull request #484: MFVSketch: Remove duplicate results on GPDB
URL: https://github.com/apache/madlib/pull/484

JIRA: MADLIB-1412

The mfvsketch merge function did not filter out duplicate entries when a value appeared in the MFV list of more than one segment. This commit fixes the issue by keeping a list of the values already added and filtering out duplicates.

We also noticed that memcmp() was used to compare two Datums for equality. A note in the postgres/gpdb source says this is unsafe, so we switched to the postgres function datumIsEqual(), which can handle either little-endian or big-endian byte order, making the comparison more portable.

Co-authored-by: Orhan Kislal <[email protected]>

Closes #

- [ ] Add the module name, JIRA# to PR/commit and description.
- [ ] Add tests for the change.
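
For readers who want a picture of the duplicate filtering described above, here is a minimal sketch of the idea in plain C. It is not the MADlib code: the `mfv_entry` type and the `already_added` and `merge_unique` helpers are hypothetical names made up for illustration, values are plain `long`s rather than Datums, and the equality test is a simple `==` where the actual change calls datumIsEqual().

```c
#include <stddef.h>

/* Hypothetical entry type: one (value, count) pair of an MFV list. */
typedef struct {
    long value;
    long count;
} mfv_entry;

/* Return 1 if `value` is already among the first `n` entries of `out`. */
static int already_added(const mfv_entry *out, size_t n, long value)
{
    for (size_t i = 0; i < n; i++) {
        if (out[i].value == value)
            return 1;
    }
    return 0;
}

/*
 * Append the entries of `src` (one segment's MFV list) to `out`,
 * skipping any value that `out` already holds.
 * Returns the new number of entries in `out`.
 * The caller must ensure `out` has room for out_len + src_len entries.
 */
static size_t merge_unique(mfv_entry *out, size_t out_len,
                           const mfv_entry *src, size_t src_len)
{
    for (size_t i = 0; i < src_len; i++) {
        if (!already_added(out, out_len, src[i].value))
            out[out_len++] = src[i];
    }
    return out_len;
}
```

Calling `merge_unique` once per segment, with `out` starting empty, leaves each value in the merged list at most once. The linear scan is chosen for brevity and assumes the lists are short.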
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
