Hi all,

Thought it would be a good idea to broadcast some backlog issues that
interested contributors can pick up. I went through these issues and left
some brief comments to help anyone who is unfamiliar with the relevant
parts of the codebase.

The idea here is to raise visibility on some old issues that might not be
simple enough to warrant the "good first issue" label, so existing
contributors looking to dive a bit deeper into the codebase can pick these
up.

List as follows:

   - Add retract_batch method for median accumulator:
   https://github.com/apache/datafusion/issues/7664
      - The most straightforward and easiest (I hope!)
   - Support ANY operator: https://github.com/apache/datafusion/issues/2548
      - Bit more tricky, especially in terms of code organization
   - Support ALL operator: https://github.com/apache/datafusion/issues/2547
      - Quite similar to above
   - var(distinct) support: https://github.com/apache/datafusion/issues/2410
      - Should be straightforward, lots of existing implementations to
      reference; most tedious part would probably be the tests
   - array_union and array_intersect cannot handle NULL columnar data:
   https://github.com/apache/datafusion/issues/9706
      - Good debugging opportunity here


I'll be happy to help review any PRs for the above and provide guidance
where possible.

If there's interest in this, I can try to do it on a somewhat regular
basis, grooming some old backlog issues to see if we can draw some
attention to them.

Cheers, Jeffrey
