Hi Tison,

But custom payloads are commonly used to plugin logic to merge base and
delta records. So, yes, its already out there :/

Let me take a closer pass at this and circle back.

Thanks
Vinoth

On Thu, Apr 23, 2020 at 7:44 AM tison <[email protected]> wrote:

> Thanks for your reply Vinoth.
>
> First of all I'd like to know about "custom payloads". Previously
> I tend to regard such payloads as internal classes. If we already
> exposing it to users, it is possibly we can do little thing to change
> anything about the interface. A reasonable way would be we
> deprecate HoodieRecordPayload which lacks of type design
> and do adapt logic with a new payload class. Although, I think it
> would be then useless to do it and we have to live with current
> HoodieRecordPayload. Given it is one of primary class and already
> user facing, we might live with it and the raw usage forever.
>
> Best,
> tison.
>
>
> Vinoth Chandar <[email protected]> 于2020年4月23日周四 下午10:31写道:
>
> > Thanks Tison! One consideration we need to have is that we cannot have a
> > breaking non-backwards compatible change for existing users with custom
> > payloads.
> > It will be a rough experience.. Guessing some effort would be needed to
> > upgrade existing payload classes right?
> >
> > On Wed, Apr 22, 2020 at 10:14 PM tison <[email protected]> wrote:
> >
> > > Thanks for your feedback!
> > >
> > > Here is a quick summary about the type parameter of
> HoodieRecordPayload.
> > >
> > > 1. The type parameter is used to refer "self type" which is generally
> > good
> > > for
> > > access subclass methods/fields without unchecked casting.
> > >
> > > 2. However, the concrete payload is accessed by
> > `combineAndGetUpdateValue`
> > > or `getInsertValue` which doesn't get benefit from the type parameter.
> > BTW,
> > > the
> > > return type of these two methods suffers from unchecked casting also,
> > which
> > > is
> > > filed as HUDI-834[1].
> > >
> > > 3. So, the only usage of the type parameter(self type) is `preCombine`
> > > method.
> > > Though, at the real use point we pass raw typed HoodieRecordPayload as
> > > the parameter[2]. I'm no sure the behavior when casting failed and want
> > to
> > > ask for advice. So far, we cast at runtime and a cast exception will be
> > > thrown
> > > if failed.
> > >
> > > Given the cases above, I tend to change the definition
> > > of HoodieRecordPayload
> > > as
> > >
> > >   public interface HoodieRecordPayload { }
> > >
> > > if we can well-define how to do in `preCombine` when type mismatch. The
> > > alternative,
> > > i.e., a full paramterized self type type parameter will affect quite a
> > wide
> > > range of code
> > > where we have to cascade change other raw usages such as
> > > HoodieTable/HoodieRecord/HoodieWriteHandle and so on; while the self
> type
> > > parameter
> > > doesn't bring a lot benefit.
> > >
> > > Best,
> > > tison.
> > >
> > > [1] https://issues.apache.org/jira/browse/HUDI-834
> > > [2]
> > org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:114
> > >
> > >
> > >
> > > Vinoth Chandar <[email protected]> 于2020年4月23日周四 上午12:04写道:
> > >
> > > > +1 raising a JIRA and summarizing some findings would be great.
> > > >
> > > > On Wed, Apr 22, 2020 at 8:38 AM leesf <[email protected]> wrote:
> > > >
> > > > > hi tison,
> > > > >
> > > > > It is always better to make the codebase more clear, so it would be
> > > great
> > > > > if you would do an investigation.
> > > > >
> > > > > tison <[email protected]> 于2020年4月22日周三 下午2:42写道:
> > > > >
> > > > > > Hi Vinoth,
> > > > > >
> > > > > > >much of the code actually does not depend on the templatized
> type
> > at
> > > > all
> > > > > >
> > > > > > Agree.
> > > > > >
> > > > > > Let's say I'm ok with untyped HoodieRecordPayload since all
> payload
> > > is
> > > > > > effectively untyped(Object) which we deal with type and cast at
> > > > runtime.
> > > > > >
> > > > > > Though, it is noisy we implement it in a half-generic form.
> > > Meanwhile,
> > > > > > wildcard doesn't work for every case since <? capture
> > > > > HoodieRecordPayload>
> > > > > > is NOT compatible with <? capture HoodieRecordPayload> and any
> > > > > > exact T extends HoodieRecordPayload.
> > > > > >
> > > > > > We don't actually use the type parameter heavily, so it is an
> > > > alternative
> > > > > > that we define HoodieRecordPayload just
> > > > > >
> > > > > >   public class HoodieRecordPayload { }
> > > > > >
> > > > > > If the community think it is a worth effort, I'm glad to do more
> > > > > > investigation
> > > > > > and evaluate its impact, also find a practical way.
> > > > > >
> > > > > > Best,
> > > > > > tison.
> > > > > >
> > > > > >
> > > > > > Vinoth Chandar <[email protected]> 于2020年4月22日周三 下午2:22写道:
> > > > > >
> > > > > > > Hi Tison,
> > > > > > >
> > > > > > > Thanks for raising this.. In most places doing a HoodieTable<?>
> > > > > wildcard
> > > > > > > should be totally acceptable, since much of the code actually
> > does
> > > > not
> > > > > > > depend on the templatized type at all..
> > > > > > >
> > > > > > > Def, worth taking another look holistically and see if we can
> > > address
> > > > > > > this..
> > > > > > >
> > > > > > > My 2c.
> > > > > > > Vinoth
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Apr 20, 2020 at 11:32 PM tison <[email protected]>
> > > wrote:
> > > > > > >
> > > > > > > > Hi developers,
> > > > > > > >
> > > > > > > > Recently when I went through Hudi codebase, I found some
> > annoying
> > > > > > warning
> > > > > > > > 'raw use of
> > > > > > > > parameterized class' and some other generic type related.
> > > > > > > >
> > > > > > > > It seems the root of this wide usage of raw type is
> > > > > HoodieRecordPayload
> > > > > > > > where
> > > > > > > >
> > > > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload>
> > > > > > > >
> > > > > > > > should have been
> > > > > > > >
> > > > > > > >   HoodieRecordPayload<T extends HoodieRecordPayload<T>>
> > > > > > > >
> > > > > > > > I'm not sure what community think of this wide usage of raw
> > type
> > > > and
> > > > > > > given
> > > > > > > > it affect most of our
> > > > > > > > code I'd like to start this thread for advice. Personally
> > > explicit
> > > > > type
> > > > > > > > signature benefits from strong
> > > > > > > > type system but it might cause some signature breaking
> changes
> > as
> > > > > well
> > > > > > as
> > > > > > > > require a huge
> > > > > > > > engineering effort.
> > > > > > > >
> > > > > > > > Shall Hudi live with these raw types? Or we will have a plan
> > > > migrate
> > > > > to
> > > > > > > > explicit generic type?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > tison.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to