eyala opened a new pull request, #46:
URL: https://github.com/apache/datafu/pull/46

   Adds a new method for de-duplicating records without losing any "real" data.
   
   For example, suppose a server creates events with an autogenerated event id, and events are sometimes duplicated. You don't want duplicate rows that differ only in their event ids, but if any of the other fields are distinct you want to keep the rows (with their original event ids); otherwise you could just drop the event id column and de-duplicate. Without such a method, keeping at least one event id per group of duplicates means tediously listing all the other columns.
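
   The intended semantics can be illustrated with a minimal sketch in plain Python. This is not the code in the pull request: the function name `dedup_by_all_except`, the sample rows, and the column names are all hypothetical, and it assumes every column value is hashable.

```python
def dedup_by_all_except(rows, ignored):
    """Keep one full row per distinct combination of the non-ignored columns.

    Hypothetical helper for illustration only; not the API added by the PR.
    """
    seen = set()
    kept = []
    for row in rows:
        # Build the key from every column except the ignored ones,
        # so the caller never has to list the remaining columns by hand.
        key = tuple(sorted((k, v) for k, v in row.items() if k not in ignored))
        if key not in seen:
            seen.add(key)
            kept.append(row)  # the whole row survives, original event id included
    return kept


# Hypothetical sample data: the first two events differ only in event_id.
events = [
    {"event_id": "a1", "user": "u1", "action": "click"},
    {"event_id": "b2", "user": "u1", "action": "click"},
    {"event_id": "c3", "user": "u1", "action": "scroll"},
]

result = dedup_by_all_except(events, {"event_id"})
```

   Here the caller names only the column to ignore. The first row seen in each group of duplicates is kept, so `result` still carries a real event id for every distinct event.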
   
   JIRA: https://issues.apache.org/jira/browse/DATAFU-177


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@datafu.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
