Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
@ioana-delaney Thanks for the review!
I'll reply to a few points first and add the tests you mentioned later.
4. This feature is motivated by the JIRA for bucketed (and, of course, sorted)
tables. In that case, the Filter can be applied to the sorted data, so we can
leverage the sort order to optimize the filtering. Another case I can think of
is cached data: when you cache sorted data, as I did in the current tests,
the Filter won't be pushed down and will work on the sorted data directly.
5. Yeah. From the point of view of the `StopAfter` operator, it only cares
whether the child is sorted or not. If the bucketed table is sorted too, it can
be supported. Of course, I will add a test for it.
6. Yes, it does. Currently, if data has been inserted into or appended to the
bucketed table, we can't guarantee its sort order, so this optimization will be
skipped.
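
To illustrate the idea behind point 4, here is a minimal sketch in plain Python of why a filter over sorted data can stop early. It is not Spark code and not the implementation in this PR. It assumes a range predicate such as `value < bound` on the sort column, and the names `filter_sorted_less_than` and `counting` are hypothetical helpers made up for the example.

```python
from itertools import takewhile
from typing import Iterable, Iterator, List


def filter_sorted_less_than(rows: Iterable[int], bound: int) -> Iterator[int]:
    """Yield rows < bound, assuming rows are sorted in ascending order.

    Because the input is sorted, the first row that fails the predicate
    implies every later row fails too, so scanning can stop there.
    """
    return takewhile(lambda r: r < bound, rows)


def counting(rows: Iterable[int], counter: List[int]) -> Iterator[int]:
    """Wrap an iterable and count how many rows were actually read."""
    for r in rows:
        counter[0] += 1
        yield r


scanned = [0]
result = list(filter_sorted_less_than(counting(range(1000), scanned), 10))
print(result)      # the ten matching rows
print(scanned[0])  # rows read before stopping, far fewer than 1000
```

If the sort order cannot be guaranteed (as in point 6), this shortcut is unsafe and a full scan is needed, which is why the optimization has to be skipped in that case.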