Hi everyone,

All three issues have been completed and merged. Is there anything else
worth waiting for, or can we start a release?

Cheers,
Eyal

On 2024/12/03 15:41:00 Eyal Allweil wrote:
> Hi all,
> 
> We've had a relatively inactive year, but now we have three issues/pull
> requests that need to be reviewed. Once we merge them I think we can
> release a new version, which will support Spark all the way to Spark 3.4.x
> and bring us up to date with their releases. I'm writing a short
> description here - anyone who can, please review them. If you don't have
> time for an in-depth code review, checking the interface/documentation of
> the two new methods is also important.
> 
> The issues are:
> 
> DATAFU-176 <https://issues.apache.org/jira/browse/DATAFU-176> - do
> dedupTopN with combiner. This is like our dedupTopN, but uses the combiner
> to deal with extreme skew efficiently. A use case that came up at PayPal.
> 
> DATAFU-177 <https://issues.apache.org/jira/browse/DATAFU-177> - Add
> dedupByAllExcept. This method is for deduplicating otherwise identical rows
> with differing ids. Also a use case that came up at PayPal.
> 
> DATAFU-179 <https://issues.apache.org/jira/browse/DATAFU-179> - support
> Spark 3.3.x and 3.4.x. Self-explanatory. I did this one, and it didn't take
> much, but I'd be glad for some good double-checking.
> 
> A note to interested non-committers - *we welcome your review and comments*!
> Feel free to write in either the Jira issues or the GitHub PRs.
> 
> Cheers,
> Eyal
> 
