Thanks for the detailed reply, Andrew. I was also considering default methods, but it turns out that Filter is not an interface, but an abstract class, so it doesn't apply.
Children not implementing a marker interface or marker method would inherit the marker method implementation from the closest parent the same way they would inherit the marker interface, so I think they are equivalent in this aspect, too. I think that marker interface(s) and overridable non-abstract getter(s) in Filter are mostly equivalent from both logical and source compatibility aspects. The only difference is that the marker interfaces cannot be removed in a subclass, while the getter can be overridden anywhere, but with well-chosen defaults it shouldn't be much of a limitation. Now that I think about it, we could cache the markers' values in an array when creating the filter lists, so even the cost of looking them up doesn't matter as it wouldn't happen in the hot code path. Using the marker interfaces is more elegant, and discourages problematic subclassing, so I am leaning towards that. Istvan On Wed, May 29, 2024 at 2:30 AM Andrew Purtell <apurt...@apache.org> wrote: > Actually source compatibility with default methods would be fine too. I > forget this is the main reason default methods were invented. The code of > derived classes would not need to be changed, unless the returned value of > the new method should be changed, and this is no worse than having a marker > interface, which would also require code changes to implement non-default > behaviors. > > A marker interface does remain as an option. It might make a difference in > chained use cases. Consider a chain of filter instances that mixes derived > code that is unaware of isHinting() and base code that is. The filter chain > can be examined for the presence or absence of the marker interface and > would not need to rely on every filter in the chain passing return values > of isHinting back. > > Marker interfaces can also be added to denote stateful or stateless > filters, if distinguishing between them would be useful, perhaps down the > road. > > On Tue, May 28, 2024 at 5:13 PM Andrew Purtell <apurt...@apache.org> > wrote: > > > I think you've clearly put a lot of time into the analysis and it is > > plausible. > > > > Adding isHinting as a default method will preserve binary compatibility. > > Source compatibility for derived custom filters would be broken though > and > > that probably prevents this going back into a releasing code line. > > > > Have you considered adding a marker interface instead? That would > preserve > > both source and binary compatibility. It wouldn't require any changes to > > derived custom filters. A runtime instanceof test would determine if the > > filter is a hinting filter or not. No need for a new method, default or > > otherwise. > > > > On Tue, May 28, 2024 at 12:41 AM Istvan Toth <st...@apache.org> wrote: > > > >> I have recently opened HBASE-28622 > >> <https://issues.apache.org/jira/browse/HBASE-28622> , which has turned > >> out > >> to be another aspect of the problem discussed in HBASE-20565 > >> <https://issues.apache.org/jira/browse/HBASE-20565> . > >> > >> The problem is discussed in detail in HBASE-20565 > >> <https://issues.apache.org/jira/browse/HBASE-20565> , but it boils down > >> to > >> the API design decision that the filters returning SEEK_NEXT_USING_HINT > >> rely on filterCell() getting called. > >> > >> On the other hand, some filters maintain an internal row state that sets > >> counters for calls of filterCell(), which interacts with the results of > >> previous filters in a filterList. > >> > >> When filters return different results for filterRowkey(), then filters > >> returning SEEK_NEXT_USING_HINT that have returned false must have > >> filterCell() called, otherwise the scan will degenerate into a full > scan. > >> > >> On the other hand, filters that maintain an internal row state must only > >> be > >> called if all previous filters have INCLUDEed the Cell, otherwise their > >> internal state will be off. (This still has caveats, as described in > >> HBASE-20565 <https://issues.apache.org/jira/browse/HBASE-20565>) > >> > >> In my opinion, the current code from HBASE-20565 > >> <https://issues.apache.org/jira/browse/HBASE-20565> strikes a bad > balance > >> between features, as while it fixes some use cases for row stateful > >> filters, it also often negates the performance benefits of the filters > >> providing hints, which in practice makes them unusable in many filter > list > >> combinations. > >> > >> Without completely re-designing the filter system, I think that the best > >> solution would be adding a method to distinguish the filters that can > >> return hints from the rest of them. (This was also suggested in > >> HBASE-20565 > >> <https://issues.apache.org/jira/browse/HBASE-20565> , but it was not > >> implemented) > >> > >> In theory, we have four combinations of hinting and row stateful > filters, > >> but currently we have no filters that are both hinting and row stateful, > >> and I don't think that there is valid use case for those. The ones that > >> are > >> neither hinting nor stateful could be handled as either, but treating > them > >> as non-hinting seems faster. > >> > >> Once we have that, we can improve the filterList behaviour a lot: > >> - in filterRowKey(), if any hinting filter returns false, then we could > >> return false > >> - in filterCell(), rather than returning on the first non-include > result, > >> we could process the remaining hinting filters, while skipping the > >> non-hinting ones. > >> > >> The code changes are minimal, we just need to add a new method like > >> isHinting() to the Filter class, and change the above two methods. > >> > >> We could add this even in 2.5, by defaulting isHinting() to return false > >> in > >> the Filter class, which would preserve the current API and behaviour for > >> existing custom filters. > >> > >> I was looking at it from the AND filter perspective, but if needed, > >> similar > >> changes could be made to the OR filter. > >> > >> What do you think ? > >> Is this a good idea ? > >> > >> Istvan > >> > > > -- > Best regards, > Andrew > > Unrest, ignorance distilled, nihilistic imbeciles - > It's what we’ve earned > Welcome, apocalypse, what’s taken you so long? > Bring us the fitting end that we’ve been counting on > - A23, Welcome, Apocalypse > -- *István Tóth* | Sr. Staff Software Engineer *Email*: st...@cloudera.com cloudera.com <https://www.cloudera.com> [image: Cloudera] <https://www.cloudera.com/> [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image: Cloudera on Facebook] <https://www.facebook.com/cloudera> [image: Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera> ------------------------------ ------------------------------