Hi thanks all for your discussion.

What is inputIndex in NestedFieldReferenceExpression?

I know inputIndex has special usage in FieldReferenceExpression, but
it is only for Join operators, and it is only for SQL optimization. It
looks like there is no requirement for Nested.

Best,
Jingsong

On Mon, Aug 28, 2023 at 1:13 PM Venkatakrishnan Sowrirajan
<vsowr...@asu.edu> wrote:
>
> Thanks for all the feedback and discussion everyone. Looks like we have
> reached a consensus here.
>
> Just to summarize:
>
> 1. Introduce a new *ReferenceExpression* (or *BaseReferenceExpression*)
> abstract class which will be extended by both *FieldReferenceExpression*
> and *NestedFieldReferenceExpression* (to be introduced as part of this FLIP)
> 2. No need of *supportsNestedFilters *check as the current
> *SupportsFilterPushDown* should already ignore unknown expressions (
> *NestedFieldReferenceExpression* for example) and return them as
> *remainingFilters.
> *Maybe this should be clarified explicitly in the Javadoc of
> *SupportsFilterPushDown.
> *I will file a separate JIRA to fix the documentation.
> 3. Refactor *SupportsProjectionPushDown* to use *ReferenceExpression *instead
> of existing 2-d arrays to consolidate and be consistent with other
> Supports*PushDown APIs - *outside the scope of this FLIP*
> 4. Similarly *SupportsAggregatePushDown* should also be evolved whenever
> nested fields support is added to use the *ReferenceExpression - **outside
> the scope of this FLIP*
>
> Does this sound good? Please let me know if I have missed anything here. If
> there are no concerns, I will start a vote tomorrow. I will also get the
> FLIP-356 wiki updated. Thanks everyone once again!
>
> Regards
> Venkata krishnan
>
>
> On Thu, Aug 24, 2023 at 8:19 PM Becket Qin <becket....@gmail.com> wrote:
>
> > Hi Jark,
> >
> > How about having a separate NestedFieldReferenceExpression, and
> > > abstracting a common base class "ReferenceExpression" for
> > > NestedFieldReferenceExpression and FieldReferenceExpression? This makes
> > > unifying expressions in
> > > "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> > > ...)"
> > > possible.
> >
> >
> > I'd be fine with this. It at least provides a consistent API style /
> > formality.
> >
> >  Re: Yunhong,
> >
> > 3. Finally, I think we need to look at the costs and benefits of unifying
> > > the SupportsFilterPushDown and SupportsProjectionPushDown (or others)
> > from
> > > the perspective of interface implementers. A stable API can reduce user
> > > development and change costs, if the current API can fully meet the
> > > functional requirements at the framework level, I personal suggest
> > reducing
> > > the impact on connector developers.
> > >
> >
> > I agree that the cost and benefit should be measured. And the measurement
> > should be in the long term instead of short term. That is why we always
> > need to align on the ideal end state first.
> > Meeting functionality requirements is the bare minimum bar for an API.
> > Simplicity, intuitiveness, robustness and evolvability are also important.
> > In addition, for projects with many APIs, such as Flink, a consistent API
> > style is also critical for the user adoption as well as bug avoidance. It
> > is very helpful for the community to agree on some API design conventions /
> > principles.
> > For example, in this particular case, via our discussion, hopefully we sort
> > of established the following API design conventions / principles for all
> > the Supports*PushDown interfaces.
> >
> > 1. By default, expressions should be used if applicable instead of other
> > representations.
> > 2. In general, the pushdown method should not assume all the pushdowns will
> > succeed. So the applyX() method should return a boolean or List<X>, to
> > handle the cases that some of the pushdowns cannot be fulfilled by the
> > implementation.
> >
> > Establishing such conventions and principles demands careful thinking for
> > the aspects I mentioned earlier in addition to the API functionalities.
> > This helps lower the bar of understanding, reduces the chance of having
> > loose ends in the API, and will benefit all the participants in the project
> > over time. I think this is the right way to achieve real API stability.
> > Otherwise, we may end up chasing our tails to find ways not to change the
> > existing non-ideal APIs.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Fri, Aug 25, 2023 at 9:33 AM yh z <zhengyunhon...@gmail.com> wrote:
> >
> > > Hi, Venkat,
> > >
> > > Thanks for the FLIP, it sounds good to support nested fields filter
> > > pushdown. Based on the design of flip and the above options, I would like
> > > to make a few suggestions:
> > >
> > > 1.  At present, introducing NestedFieldReferenceExpression looks like a
> > > better solution, which can fully meet our requirements while reducing
> > > modifications to base class FieldReferenceExpression. In the long run, I
> > > tend to abstract a basic class for NestedFieldReferenceExpression and
> > > FieldReferenceExpression as u suggested.
> > >
> > > 2. Personally, I don't recommend introducing *supportsNestedFilters() in
> > > supportsFilterPushdown. We just need to better declare the return value
> > of
> > > the method *applyFilters.
> > >
> > > 3. Finally, I think we need to look at the costs and benefits of unifying
> > > the SupportsFilterPushDown and SupportsProjectionPushDown (or others)
> > from
> > > the perspective of interface implementers. A stable API can reduce user
> > > development and change costs, if the current API can fully meet the
> > > functional requirements at the framework level, I personal suggest
> > reducing
> > > the impact on connector developers.
> > >
> > > Regards,
> > > Yunhong Zheng (Swuferhong)
> > >
> > >
> > > Venkatakrishnan Sowrirajan <vsowr...@asu.edu> 于2023年8月25日周五 01:25写道:
> > >
> > > > To keep it backwards compatible, introduce another API *applyAggregates
> > > > *with
> > > > *List<ReferenceExpression> *when nested field support is added and
> > > > deprecate the current API. This will by default throw an exception. In
> > > > flink planner, *applyAggregates *with nested fields and if it throws
> > > > exception then *applyAggregates* without nested fields.
> > > >
> > > > Regards
> > > > Venkata krishnan
> > > >
> > > >
> > > > On Thu, Aug 24, 2023 at 10:13 AM Venkatakrishnan Sowrirajan <
> > > > vsowr...@asu.edu> wrote:
> > > >
> > > > > Jark,
> > > > >
> > > > > How about having a separate NestedFieldReferenceExpression, and
> > > > >> abstracting a common base class "ReferenceExpression" for
> > > > >> NestedFieldReferenceExpression and FieldReferenceExpression? This
> > > makes
> > > > >> unifying expressions in
> > > > >>
> > "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> > > > >> ...)"
> > > > >> possible.
> > > > >
> > > > > This should be fine for *SupportsProjectionPushDown* and
> > > > > *SupportsFilterPushDown*. One concern in the case of
> > > > > *SupportsAggregatePushDown* with nested fields support (to be added
> > in
> > > > > the future), with this proposal, the API will become backwards
> > > > incompatible
> > > > > as the *args *for the aggregate function is
> > > > *List<FieldReferenceExpression>
> > > > > *that needs to change to *List<ReferenceExpression>*.
> > > > >
> > > > > Regards
> > > > > Venkata krishnan
> > > > >
> > > > >
> > > > > On Thu, Aug 24, 2023 at 1:18 AM Jark Wu <imj...@gmail.com> wrote:
> > > > >
> > > > >> Hi Becket,
> > > > >>
> > > > >> I think it is the second case, that a FieldReferenceExpression is
> > > > >> constructed
> > > > >> by the framework and passed to the connector (interfaces listed by
> > > > >> Venkata[1]
> > > > >> and Catalog#listPartitionsByFilter). Besides, understanding the
> > nested
> > > > >> field
> > > > >> is optional for users/connectors (just treat it as an unknown
> > > expression
> > > > >> if
> > > > >> the
> > > > >> connector doesn't want to support it).
> > > > >>
> > > > >> If we extend FieldReferenceExpression, in the case of "where
> > > col.nested
> > > > >
> > > > >> 10",
> > > > >> for the connectors already supported filter/delete pushdown, they
> > may
> > > > >> wrongly
> > > > >> pushdown "col > 10" instead of "nested > 10" because they still
> > treat
> > > > >> FieldReferenceExpression as a top-level column. This problem can be
> > > > >> resolved
> > > > >> by introducing an additional "supportedNestedPushdown" for each
> > > > interface,
> > > > >> but that method is not elegant and is hard to remove in the future,
> > > and
> > > > >> this could
> > > > >> be avoided if we have a separate NestedFieldReferenceExpression.
> > > > >>
> > > > >> If we want to extend FieldReferenceExpression, we have to add
> > > > protections
> > > > >> for every related API in one shot. Besides, FieldReferenceExpression
> > > is
> > > > a
> > > > >> fundamental class in the planner, we have to go through all the code
> > > > that
> > > > >> is using it to make sure it properly handling it if it is a nested
> > > field
> > > > >> which
> > > > >> is a big effort for the community.
> > > > >>
> > > > >> If we were designing this API on day 1, I fully support merging them
> > > in
> > > > a
> > > > >> FieldReferenceExpression. But in this case, I'm thinking about how
> > to
> > > > >> provide
> > > > >> users with a smooth migration path, and allow the community to
> > > gradually
> > > > >> put efforts into evolving the API, and not block the "Nested Fields
> > > > Filter
> > > > >> Pushdown"
> > > > >> requirement.
> > > > >>
> > > > >> How about having a separate NestedFieldReferenceExpression, and
> > > > >> abstracting a common base class "ReferenceExpression" for
> > > > >> NestedFieldReferenceExpression and FieldReferenceExpression? This
> > > makes
> > > > >> unifying expressions in
> > > > >>
> > "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> > > > >> ...)"
> > > > >> possible.
> > > > >>
> > > > >> Best,
> > > > >> Jark
> > > > >>
> > > > >> On Thu, 24 Aug 2023 at 07:00, Venkatakrishnan Sowrirajan <
> > > > >> vsowr...@asu.edu>
> > > > >> wrote:
> > > > >>
> > > > >> > Becket and Jark,
> > > > >> >
> > > > >> >  Deprecate all the other
> > > > >> > > methods except tryApplyFilters() and tryApplyProjections().
> > > > >> >
> > > > >> > For *SupportsProjectionPushDown*, we still need a
> > > > >> > *supportsNestedProjections* API on the table source as some of the
> > > > table
> > > > >> > sources might not be able to handle nested fields and therefore
> > the
> > > > >> Flink
> > > > >> > planner should not push down the nested projections or else the
> > > > >> > *applyProjection
> > > > >> > *API has to be appropriately changed to return
> > > > >> > *unconvertibleProjections *similar
> > > > >> > to *SupportsFilterPushDown*.
> > > > >> >
> > > > >> > Or we have to introduce two different applyProjections()
> > > > >> > > methods for FieldReferenceExpression /
> > > > NestedFieldReferenceExpression
> > > > >> > > respectively.
> > > > >> >
> > > > >> > Agree this is not preferred. Given that *supportNestedProjections
> > > > >> *cannot
> > > > >> > be deprecated/removed based on the current API form, extending
> > > > >> > *FieldReferenceExpression* to support nested fields should be
> > okay.
> > > > >> >
> > > > >> > Another alternative could be to change *applyProjections *to take
> > > > >> > List<ResolvedExpression> and on the connector side they choose to
> > > > handle
> > > > >> > *FieldReferenceExpression* and *NestedFieldReferenceExpression *as
> > > > >> > applicable and return the remainingProjections. In the case of
> > > nested
> > > > >> field
> > > > >> > projections not supported, it should return them back but only
> > > > >> projecting
> > > > >> > the top level fields. IMO, this is also *not preferred*.
> > > > >> >
> > > > >> > *SupportsAggregatePushDown*
> > > > >> >
> > > > >> > *AggregateExpression *currently takes in a list of
> > > > >> > *FieldReferenceExpression* as args for the aggregate function, if
> > in
> > > > >> future
> > > > >> > *SupportsAggregatePushDown* adds support for aggregate pushdown on
> > > > >> nested
> > > > >> > fields then the AggregateExpression API also has to change if a
> > new
> > > > >> > NestedFieldReferenceExpression is introduced for nested fields.
> > > > >> >
> > > > >> > If we add a
> > > > >> > > flag for each new filter,
> > > > >> > > the interface will be filled with lots of flags (e.g.,
> > > > >> supportsBetween,
> > > > >> > > supportsIN)
> > > > >> >
> > > > >> > In an ideal situation, I completely agree with you. But in the
> > > current
> > > > >> > state, *supportsNestedFilters* can act as a bridge to reach the
> > > > eventual
> > > > >> > desired state which is to have a clean and consistent set of APIs
> > > > >> > throughout all Supports*PushDown.
> > > > >> >
> > > > >> > Also shared some thoughts on the end state API
> > > > >> > <
> > > > >> >
> > > > >>
> > > >
> > >
> > https://urldefense.com/v3/__https://docs.google.com/document/d/1stLRPKOcxlEv8eHblkrOh0Zf5PLM-h76WMhEINHOyPY/edit?usp=sharing__;!!IKRxdwAv5BmarQ!ZZ2nS1PYlXLnEGFcikS3NsYG7tMaV3wU_z7FmvihNwQBmoLZk2WmcpuRWszK0FFmsInh9A6cndkJrQ$
> > > > >> > >
> > > > >> > with extension to the *FieldReferenceExpression* to support nested
> > > > >> fields.
> > > > >> > Please take a look.
> > > > >> >
> > > > >> > Regards
> > > > >> > Venkata krishnan
> > > > >> >
> > > > >> > On Tue, Aug 22, 2023 at 5:02 PM Becket Qin <becket....@gmail.com>
> > > > >> wrote:
> > > > >> >
> > > > >> > > Hi Jark,
> > > > >> > >
> > > > >> > > Regarding the migration path, it would be useful to scrutinize
> > the
> > > > use
> > > > >> > case
> > > > >> > > of FiledReferenceExpression and ResolvedExpressions. There are
> > two
> > > > >> kinds
> > > > >> > of
> > > > >> > > use cases:
> > > > >> > >
> > > > >> > > 1. A ResolvedExpression is constructed by the user or connector
> > /
> > > > >> plugin
> > > > >> > > developers.
> > > > >> > > 2. A ResolvedExpression is constructed by the framework and
> > passed
> > > > to
> > > > >> > user
> > > > >> > > or connector / plugin developers.
> > > > >> > >
> > > > >> > > For the first case, both of the approaches provide the same
> > > > migration
> > > > >> > > experience.
> > > > >> > >
> > > > >> > > For the second case, generally speaking, introducing
> > > > >> > > NestedFieldReferenceExpression and extending
> > > > FieldReferenceExpression
> > > > >> > would
> > > > >> > > have the same impact for backwards compatibility.
> > > > >> SupportsFilterPushDown
> > > > >> > is
> > > > >> > > a special case here because understanding the filter expressions
> > > is
> > > > >> > > optional for the source implementation. In other use cases, if
> > > > >> > > understanding the reference to a nested field is a must have,
> > the
> > > > user
> > > > >> > code
> > > > >> > > has to be changed, regardless of which approach we take to
> > support
> > > > >> nested
> > > > >> > > fields.
> > > > >> > >
> > > > >> > > Therefore, I think we have to check each public API where the
> > > nested
> > > > >> > field
> > > > >> > > reference is exposed. If we have many public APIs where
> > > > understanding
> > > > >> > > nested fields is optional for the user  / plugin / connector
> > > > >> developers,
> > > > >> > > having a separate NestedFieldReferenceExpression would have a
> > more
> > > > >> smooth
> > > > >> > > migration. Otherwise, there seems to be no difference between
> > the
> > > > two
> > > > >> > > approaches.
> > > > >> > >
> > > > >> > > Migration path aside, the main reason I prefer extending
> > > > >> > > FieldReferenceExpression over a new
> > NestedFieldReferenceExpression
> > > > is
> > > > >> > > because this makes the SupportsProjectionPushDown interface
> > > simpler.
> > > > >> > > Otherwise, we have to treat it as a special case that does not
> > > match
> > > > >> the
> > > > >> > > overall API style. Or we have to introduce two different
> > > > >> > applyProjections()
> > > > >> > > methods for FieldReferenceExpression /
> > > > NestedFieldReferenceExpression
> > > > >> > > respectively. This issue further extends to implementation in
> > > > >> addition to
> > > > >> > > public API. A single FieldReferenceExpression might help
> > simplify
> > > > the
> > > > >> > > implementation code a little bit. For example, in a recursive
> > > > >> processing
> > > > >> > of
> > > > >> > > a row with nested rows, we may not need to switch between
> > > > >> > > FieldReferenceExpression and NestedFieldReferenceExpression
> > > > depending
> > > > >> on
> > > > >> > > whether the record being processed is a top level record or
> > nested
> > > > >> > record.
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > >
> > > > >> > > Jiangjie (Becket) Qin
> > > > >> > >
> > > > >> > >
> > > > >> > > On Tue, Aug 22, 2023 at 11:43 PM Jark Wu <imj...@gmail.com>
> > > wrote:
> > > > >> > >
> > > > >> > > > Hi Becket,
> > > > >> > > >
> > > > >> > > > I totally agree we should try to have a consistent API for a
> > > final
> > > > >> > state.
> > > > >> > > > The only concern I have mentioned is the "smooth" migration
> > > path.
> > > > >> > > > The FiledReferenceExpression is widely used in many public
> > APIs,
> > > > >> > > > not only in the SupportsFilterPushDown. Yes, we can change
> > every
> > > > >> > > > methods in 2-steps, but is it good to change API back and
> > forth
> > > > for
> > > > >> > this?
> > > > >> > > > Personally, I'm fine with a separate
> > > > NestedFieldReferenceExpression
> > > > >> > > class.
> > > > >> > > > TBH, I prefer the separated way because it makes the reference
> > > > >> > expression
> > > > >> > > > more clear and concise.
> > > > >> > > >
> > > > >> > > > Best,
> > > > >> > > > Jark
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > On Tue, 22 Aug 2023 at 16:53, Becket Qin <
> > becket....@gmail.com>
> > > > >> wrote:
> > > > >> > > >
> > > > >> > > > > Thanks for the reply, Jark.
> > > > >> > > > >
> > > > >> > > > > I think it will be helpful to understand the final state we
> > > want
> > > > >> to
> > > > >> > > > > eventually achieve first, then we can discuss the steps
> > > towards
> > > > >> that
> > > > >> > > > final
> > > > >> > > > > state.
> > > > >> > > > >
> > > > >> > > > > It looks like there are two proposed end states now:
> > > > >> > > > >
> > > > >> > > > > 1. Have a separate NestedFieldReferenceExpression class;
> > keep
> > > > >> > > > > SupportsFilterPushDown and SupportsProjectionPushDown the
> > > same.
> > > > >> It is
> > > > >> > > > just
> > > > >> > > > > a one step change.
> > > > >> > > > >    - Regarding the supportsNestedFilterPushDown() method, if
> > > our
> > > > >> > > contract
> > > > >> > > > > with the connector developer today is "The implementation
> > > should
> > > > >> > ignore
> > > > >> > > > > unrecognized expressions by putting them into the remaining
> > > > >> filters,
> > > > >> > > > > instead of throwing exceptions". Then there is no need for
> > > this
> > > > >> > > method. I
> > > > >> > > > > am not sure about the current contract. We should probably
> > > make
> > > > it
> > > > >> > > clear
> > > > >> > > > in
> > > > >> > > > > the interface Java doc.
> > > > >> > > > >
> > > > >> > > > > 2. Extend the existing FiledReferenceExpression class to
> > > support
> > > > >> > nested
> > > > >> > > > > fields; SupportsFilterPushDown only has one method of
> > > > >> > > > > applyFilters(List<ResolvedExpression>);
> > > > SupportsProjectionPushDown
> > > > >> > only
> > > > >> > > > has
> > > > >> > > > > one method of
> > applyProjections(List<FieldReferenceExpression>,
> > > > >> > > DataType).
> > > > >> > > > > It could just be two steps if we are not too obsessed with
> > the
> > > > >> exact
> > > > >> > > > names
> > > > >> > > > > of "applyFilters" and "applyProjections". More specifically,
> > > it
> > > > >> takes
> > > > >> > > two
> > > > >> > > > > steps to achieve this final state:
> > > > >> > > > >     a. introduce a new method
> > > > >> > tryApplyFilters(List<ResolvedExpression>)
> > > > >> > > > to
> > > > >> > > > > SupportsFilterPushDown, which may have
> > > FiledReferenceExpression
> > > > >> with
> > > > >> > > > nested
> > > > >> > > > > fields. The default implementation throws an exception. The
> > > > >> runtime
> > > > >> > > will
> > > > >> > > > > first call tryApplyFilters() with nested fields. In case of
> > > > >> > exception,
> > > > >> > > it
> > > > >> > > > > calls the existing applyFilters() without including the
> > nested
> > > > >> > filters.
> > > > >> > > > > Similarly, in SupportsProjectionPushDown, introduce a
> > > > >> > > > > tryApplyProjections<List<NestedFieldReference> method
> > > returning
> > > > a
> > > > >> > > Result.
> > > > >> > > > > The Result also contains the accepted and unapplicable
> > > > >> projections.
> > > > >> > The
> > > > >> > > > > default implementation also throws an exception. Deprecate
> > all
> > > > the
> > > > >> > > other
> > > > >> > > > > methods except tryApplyFilters() and tryApplyProjections().
> > > > >> > > > >     b. remove the deprecated methods in the next major
> > version
> > > > >> bump.
> > > > >> > > > >
> > > > >> > > > > Now the question is putting the migration steps aside, which
> > > end
> > > > >> > state
> > > > >> > > do
> > > > >> > > > > we prefer? While the first end state is acceptable for me,
> > > > >> > personally,
> > > > >> > > I
> > > > >> > > > > prefer the latter if we are designing from scratch. It is
> > > clean,
> > > > >> > > > consistent
> > > > >> > > > > and intuitive. Given the size of Flink, keeping APIs in the
> > > same
> > > > >> > style
> > > > >> > > > over
> > > > >> > > > > time is important. The migration is also not that
> > complicated.
> > > > >> > > > >
> > > > >> > > > > Thanks,
> > > > >> > > > >
> > > > >> > > > > Jiangjie (Becket) Qin
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > On Tue, Aug 22, 2023 at 2:23 PM Jark Wu <imj...@gmail.com>
> > > > wrote:
> > > > >> > > > >
> > > > >> > > > > > Hi Venkat,
> > > > >> > > > > >
> > > > >> > > > > > Thanks for the proposal.
> > > > >> > > > > >
> > > > >> > > > > > I have some minor comments about the FLIP.
> > > > >> > > > > >
> > > > >> > > > > > 1. I think we don't need to
> > > > >> > > > > > add SupportsFilterPushDown#supportsNestedFilters() method,
> > > > >> > > > > > because connectors can skip nested filters by putting them
> > > in
> > > > >> > > > > > Result#remainingFilters().
> > > > >> > > > > > And this is backward-compatible because unknown
> > expressions
> > > > were
> > > > >> > > added
> > > > >> > > > to
> > > > >> > > > > > the remaining filters.
> > > > >> > > > > > Planner should push predicate expressions as more as
> > > possible.
> > > > >> If
> > > > >> > we
> > > > >> > > > add
> > > > >> > > > > a
> > > > >> > > > > > flag for each new filter,
> > > > >> > > > > > the interface will be filled with lots of flags (e.g.,
> > > > >> > > supportsBetween,
> > > > >> > > > > > supportsIN).
> > > > >> > > > > >
> > > > >> > > > > > 2. NestedFieldReferenceExpression#nestedFieldName should
> > be
> > > an
> > > > >> > array
> > > > >> > > of
> > > > >> > > > > > field names?
> > > > >> > > > > > Each string represents a field name part of the field
> > path.
> > > > Just
> > > > >> > keep
> > > > >> > > > > > aligning with `nestedFieldIndexArray`.
> > > > >> > > > > >
> > > > >> > > > > > 3. My concern about making FieldReferenceExpression
> > support
> > > > >> nested
> > > > >> > > > fields
> > > > >> > > > > > is the compatibility.
> > > > >> > > > > > It is a public API and users/connectors are already using
> > > it.
> > > > >> > People
> > > > >> > > > > > assumed it is a top-level column
> > > > >> > > > > > reference, and applied logic on it. But that's not true
> > now
> > > > and
> > > > >> > this
> > > > >> > > > may
> > > > >> > > > > > lead to unexpected errors.
> > > > >> > > > > > Having a separate NestedFieldReferenceExpression sounds
> > > safer
> > > > to
> > > > >> > me.
> > > > >> > > > > Mixing
> > > > >> > > > > > them in a class may
> > > > >> > > > > >  confuse users what's the meaning of getFieldName() and
> > > > >> > > > getFieldIndex().
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > Regarding using NestedFieldReferenceExpression in
> > > > >> > > > > > SupportsProjectionPushDown, do you
> > > > >> > > > > > have any concerns @Timo Walther <twal...@apache.org> ?
> > > > >> > > > > >
> > > > >> > > > > > Best,
> > > > >> > > > > > Jark
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > On Tue, 22 Aug 2023 at 05:55, Venkatakrishnan Sowrirajan <
> > > > >> > > > > vsowr...@asu.edu
> > > > >> > > > > > >
> > > > >> > > > > > wrote:
> > > > >> > > > > >
> > > > >> > > > > > > Sounds like a great suggestion, Becket. +1. Agree with
> > > > >> cleaning
> > > > >> > up
> > > > >> > > > the
> > > > >> > > > > > APIs
> > > > >> > > > > > > and making it consistent in all the pushdown APIs.
> > > > >> > > > > > >
> > > > >> > > > > > > Your suggested approach seems fine to me, unless anyone
> > > else
> > > > >> has
> > > > >> > > any
> > > > >> > > > > > other
> > > > >> > > > > > > concerns. Just have couple of clarifying questions:
> > > > >> > > > > > >
> > > > >> > > > > > > 1. Do you think we should standardize the APIs across
> > all
> > > > the
> > > > >> > > > pushdown
> > > > >> > > > > > > supports like SupportsPartitionPushdown,
> > > > >> SupportsDynamicFiltering
> > > > >> > > etc
> > > > >> > > > > in
> > > > >> > > > > > > the end state?
> > > > >> > > > > > >
> > > > >> > > > > > > The current proposal works if we do not want to migrate
> > > > >> > > > > > > > SupportsFilterPushdown to also use
> > > > >> > NestedFieldReferenceExpression
> > > > >> > > > in
> > > > >> > > > > > the
> > > > >> > > > > > > > long term.
> > > > >> > > > > > > >
> > > > >> > > > > > > Did you mean *FieldReferenceExpression* instead of
> > > > >> > > > > > > *NestedFieldReferenceExpression*?
> > > > >> > > > > > >
> > > > >> > > > > > > 2. Extend the FieldReferenceExpression to support nested
> > > > >> fields.
> > > > >> > > > > > > >     - Change the index field type from int to int[].
> > > > >> > > > > > >
> > > > >> > > > > > >     - Add a new method int[] getFieldIndexArray().
> > > > >> > > > > > > >     - Deprecate the int getFieldIndex() method, the
> > code
> > > > >> will
> > > > >> > be
> > > > >> > > > > > removed
> > > > >> > > > > > > in
> > > > >> > > > > > > > the next major version bump.
> > > > >> > > > > > >
> > > > >> > > > > > > I assume getFieldIndex would return fieldIndexArray[0],
> > > > right?
> > > > >> > > > > > >
> > > > >> > > > > > > Thanks
> > > > >> > > > > > > Venkat
> > > > >> > > > > > >
> > > > >> > > > > > > On Fri, Aug 18, 2023 at 4:47 PM Becket Qin <
> > > > >> becket....@gmail.com
> > > > >> > >
> > > > >> > > > > wrote:
> > > > >> > > > > > >
> > > > >> > > > > > > > Thanks for the proposal, Venkata.
> > > > >> > > > > > > >
> > > > >> > > > > > > > The current proposal works if we do not want to
> > migrate
> > > > >> > > > > > > > SupportsFilterPushdown to also use
> > > > >> > NestedFieldReferenceExpression
> > > > >> > > > in
> > > > >> > > > > > the
> > > > >> > > > > > > > long term.
> > > > >> > > > > > > >
> > > > >> > > > > > > Did you mean *FieldReferenceExpression* instead of
> > > > >> > > > > > > *NestedFieldReferenceExpression*?
> > > > >> > > > > > >
> > > > >> > > > > > > >
> > > > >> > > > > > > > Otherwise, the alternative solution briefly mentioned
> > in
> > > > the
> > > > >> > > > rejected
> > > > >> > > > > > > > alternatives would be the following:
> > > > >> > > > > > > > Phase 1:
> > > > >> > > > > > > > 1. Introduce a supportsNestedFilters() method to the
> > > > >> > > > > > > SupportsFilterPushdown
> > > > >> > > > > > > > interface. (same as current proposal).
> > > > >> > > > > > > > 2. Extend the FieldReferenceExpression to support
> > nested
> > > > >> > fields.
> > > > >> > > > > > > >     - Change the index field type from int to int[].
> > > > >> > > > > > >
> > > > >> > > > > > >     - Add a new method int[] getFieldIndexArray().
> > > > >> > > > > > > >     - Deprecate the int getFieldIndex() method, the
> > code
> > > > >> will
> > > > >> > be
> > > > >> > > > > > removed
> > > > >> > > > > > > in
> > > > >> > > > > > > > the next major version bump.
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > 3. In the SupportsProjectionPushDown interface
> > > > >> > > > > > > >     - add a new method
> > > > >> > > > > applyProjection(List<FieldReferenceExpression>,
> > > > >> > > > > > > > DataType), with default implementation invoking
> > > > >> > > > > > applyProjection(int[][],
> > > > >> > > > > > > > DataType)
> > > > >> > > > > > > >     - deprecate the current applyProjection(int[][],
> > > > >> DataType)
> > > > >> > > > method
> > > > >> > > > > > > >
> > > > >> > > > > > > > Phase 2 (in the next major version bump)
> > > > >> > > > > > > > 1. remove the deprecated methods.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Phase 3 (optional)
> > > > >> > > > > > > > 1. deprecate and remove the supportsNestedFilters() /
> > > > >> > > > > > > > supportsNestedProjection() methods from the
> > > > >> > > SupportsFilterPushDown
> > > > >> > > > /
> > > > >> > > > > > > > SupportsProjectionPushDown interfaces.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Personally I prefer this alternative. It takes longer
> > to
> > > > >> finish
> > > > >> > > the
> > > > >> > > > > > work,
> > > > >> > > > > > > > but the API eventually becomes clean and consistent.
> > > But I
> > > > >> can
> > > > >> > > live
> > > > >> > > > > > with
> > > > >> > > > > > > > the current proposal.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Thanks,
> > > > >> > > > > > > >
> > > > >> > > > > > > > Jiangjie (Becket) Qin
> > > > >> > > > > > > >
> > > > >> > > > > > > > On Sat, Aug 19, 2023 at 12:09 AM Venkatakrishnan
> > > > Sowrirajan
> > > > >> <
> > > > >> > > > > > > > vsowr...@asu.edu> wrote:
> > > > >> > > > > > > >
> > > > >> > > > > > > > > Gentle ping for reviews/feedback.
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > On Tue, Aug 15, 2023, 5:37 PM Venkatakrishnan
> > > > Sowrirajan <
> > > > >> > > > > > > > vsowr...@asu.edu
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > wrote:
> > > > >> > > > > > > > >
> > > > >> > > > > > > > > > Hi All,
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > > I am opening this thread to discuss FLIP-356:
> > > Support
> > > > >> > Nested
> > > > >> > > > > Fields
> > > > >> > > > > > > > > > Filter Pushdown. The FLIP can be found at
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > >
> > > > >> > > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > >
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-356*3A*Support*Nested*Fields*Filter*Pushdown__;JSsrKysr!!IKRxdwAv5BmarQ!clxXJwshKpn559SAkQiieqgGe0ZduXCzUKCmYLtFIbQLmrmEEgdmuEIM8ZM1M3O_uGqOploU4ailqGpukAg$
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > > This FLIP adds support for pushing down nested
> > > fields
> > > > >> > filters
> > > > >> > > > to
> > > > >> > > > > > the
> > > > >> > > > > > > > > > underlying TableSource. In our data lake, we find
> > a
> > > > lot
> > > > >> of
> > > > >> > > > > datasets
> > > > >> > > > > > > > have
> > > > >> > > > > > > > > > nested fields and also user queries with filters
> > > > >> defined on
> > > > >> > > the
> > > > >> > > > > > > nested
> > > > >> > > > > > > > > > fields. This would drastically improve the
> > > performance
> > > > >> for
> > > > >> > > > those
> > > > >> > > > > > sets
> > > > >> > > > > > > > of
> > > > >> > > > > > > > > > queries.
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > > Appreciate any comments or feedback you may have
> > on
> > > > this
> > > > >> > > > > proposal.
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > > > Regards
> > > > >> > > > > > > > > > Venkata krishnan
> > > > >> > > > > > > > > >
> > > > >> > > > > > > > >
> > > > >> > > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > >
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >

Reply via email to