Thanks Tison! One consideration we need to have is that we cannot have a
breaking non-backwards compatible change for existing users with custom
payloads.
It will be a rough experience.. Guessing some effort would be needed to
upgrade existing payload classes right?

On Wed, Apr 22, 2020 at 10:14 PM tison <[email protected]> wrote:

> Thanks for your feedback!
>
> Here is a quick summary about the type parameter of HoodieRecordPayload.
>
> 1. The type parameter is used to refer "self type" which is generally good
> for
> access subclass methods/fields without unchecked casting.
>
> 2. However, the concrete payload is accessed by `combineAndGetUpdateValue`
> or `getInsertValue` which doesn't get benefit from the type parameter. BTW,
> the
> return type of these two methods suffers from unchecked casting also, which
> is
> filed as HUDI-834[1].
>
> 3. So, the only usage of the type parameter(self type) is `preCombine`
> method.
> Though, at the real use point we pass raw typed HoodieRecordPayload as
> the parameter[2]. I'm no sure the behavior when casting failed and want to
> ask for advice. So far, we cast at runtime and a cast exception will be
> thrown
> if failed.
>
> Given the cases above, I tend to change the definition
> of HoodieRecordPayload
> as
>
>   public interface HoodieRecordPayload { }
>
> if we can well-define how to do in `preCombine` when type mismatch. The
> alternative,
> i.e., a full paramterized self type type parameter will affect quite a wide
> range of code
> where we have to cascade change other raw usages such as
> HoodieTable/HoodieRecord/HoodieWriteHandle and so on; while the self type
> parameter
> doesn't bring a lot benefit.
>
> Best,
> tison.
>
> [1] https://issues.apache.org/jira/browse/HUDI-834
> [2] org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:114
>
>
>
> Vinoth Chandar <[email protected]> 于2020年4月23日周四 上午12:04写道:
>
> > +1 raising a JIRA and summarizing some findings would be great.
> >
> > On Wed, Apr 22, 2020 at 8:38 AM leesf <[email protected]> wrote:
> >
> > > hi tison,
> > >
> > > It is always better to make the codebase more clear, so it would be
> great
> > > if you would do an investigation.
> > >
> > > tison <[email protected]> 于2020年4月22日周三 下午2:42写道:
> > >
> > > > Hi Vinoth,
> > > >
> > > > >much of the code actually does not depend on the templatized type at
> > all
> > > >
> > > > Agree.
> > > >
> > > > Let's say I'm ok with untyped HoodieRecordPayload since all payload
> is
> > > > effectively untyped(Object) which we deal with type and cast at
> > runtime.
> > > >
> > > > Though, it is noisy we implement it in a half-generic form.
> Meanwhile,
> > > > wildcard doesn't work for every case since <? capture
> > > HoodieRecordPayload>
> > > > is NOT compatible with <? capture HoodieRecordPayload> and any
> > > > exact T extends HoodieRecordPayload.
> > > >
> > > > We don't actually use the type parameter heavily, so it is an
> > alternative
> > > > that we define HoodieRecordPayload just
> > > >
> > > >   public class HoodieRecordPayload { }
> > > >
> > > > If the community think it is a worth effort, I'm glad to do more
> > > > investigation
> > > > and evaluate its impact, also find a practical way.
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > >
> > > > Vinoth Chandar <[email protected]> 于2020年4月22日周三 下午2:22写道:
> > > >
> > > > > Hi Tison,
> > > > >
> > > > > Thanks for raising this.. In most places doing a HoodieTable<?>
> > > wildcard
> > > > > should be totally acceptable, since much of the code actually does
> > not
> > > > > depend on the templatized type at all..
> > > > >
> > > > > Def, worth taking another look holistically and see if we can
> address
> > > > > this..
> > > > >
> > > > > My 2c.
> > > > > Vinoth
> > > > >
> > > > >
> > > > > On Mon, Apr 20, 2020 at 11:32 PM tison <[email protected]>
> wrote:
> > > > >
> > > > > > Hi developers,
> > > > > >
> > > > > > Recently when I went through Hudi codebase, I found some annoying
> > > > warning
> > > > > > 'raw use of
> > > > > > parameterized class' and some other generic type related.
> > > > > >
> > > > > > It seems the root of this wide usage of raw type is
> > > HoodieRecordPayload
> > > > > > where
> > > > > >
> > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload>
> > > > > >
> > > > > > should have been
> > > > > >
> > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> > > > > >
> > > > > > I'm not sure what community think of this wide usage of raw type
> > and
> > > > > given
> > > > > > it affect most of our
> > > > > > code I'd like to start this thread for advice. Personally
> explicit
> > > type
> > > > > > signature benefits from strong
> > > > > > type system but it might cause some signature breaking changes as
> > > well
> > > > as
> > > > > > require a huge
> > > > > > engineering effort.
> > > > > >
> > > > > > Shall Hudi live with these raw types? Or we will have a plan
> > migrate
> > > to
> > > > > > explicit generic type?
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to