Hi Tison, But custom payloads are commonly used to plugin logic to merge base and delta records. So, yes, its already out there :/
Let me take a closer pass at this and circle back. Thanks Vinoth On Thu, Apr 23, 2020 at 7:44 AM tison <[email protected]> wrote: > Thanks for your reply Vinoth. > > First of all I'd like to know about "custom payloads". Previously > I tend to regard such payloads as internal classes. If we already > exposing it to users, it is possibly we can do little thing to change > anything about the interface. A reasonable way would be we > deprecate HoodieRecordPayload which lacks of type design > and do adapt logic with a new payload class. Although, I think it > would be then useless to do it and we have to live with current > HoodieRecordPayload. Given it is one of primary class and already > user facing, we might live with it and the raw usage forever. > > Best, > tison. > > > Vinoth Chandar <[email protected]> 于2020年4月23日周四 下午10:31写道: > > > Thanks Tison! One consideration we need to have is that we cannot have a > > breaking non-backwards compatible change for existing users with custom > > payloads. > > It will be a rough experience.. Guessing some effort would be needed to > > upgrade existing payload classes right? > > > > On Wed, Apr 22, 2020 at 10:14 PM tison <[email protected]> wrote: > > > > > Thanks for your feedback! > > > > > > Here is a quick summary about the type parameter of > HoodieRecordPayload. > > > > > > 1. The type parameter is used to refer "self type" which is generally > > good > > > for > > > access subclass methods/fields without unchecked casting. > > > > > > 2. However, the concrete payload is accessed by > > `combineAndGetUpdateValue` > > > or `getInsertValue` which doesn't get benefit from the type parameter. > > BTW, > > > the > > > return type of these two methods suffers from unchecked casting also, > > which > > > is > > > filed as HUDI-834[1]. > > > > > > 3. So, the only usage of the type parameter(self type) is `preCombine` > > > method. > > > Though, at the real use point we pass raw typed HoodieRecordPayload as > > > the parameter[2]. I'm no sure the behavior when casting failed and want > > to > > > ask for advice. So far, we cast at runtime and a cast exception will be > > > thrown > > > if failed. > > > > > > Given the cases above, I tend to change the definition > > > of HoodieRecordPayload > > > as > > > > > > public interface HoodieRecordPayload { } > > > > > > if we can well-define how to do in `preCombine` when type mismatch. The > > > alternative, > > > i.e., a full paramterized self type type parameter will affect quite a > > wide > > > range of code > > > where we have to cascade change other raw usages such as > > > HoodieTable/HoodieRecord/HoodieWriteHandle and so on; while the self > type > > > parameter > > > doesn't bring a lot benefit. > > > > > > Best, > > > tison. > > > > > > [1] https://issues.apache.org/jira/browse/HUDI-834 > > > [2] > > org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:114 > > > > > > > > > > > > Vinoth Chandar <[email protected]> 于2020年4月23日周四 上午12:04写道: > > > > > > > +1 raising a JIRA and summarizing some findings would be great. > > > > > > > > On Wed, Apr 22, 2020 at 8:38 AM leesf <[email protected]> wrote: > > > > > > > > > hi tison, > > > > > > > > > > It is always better to make the codebase more clear, so it would be > > > great > > > > > if you would do an investigation. > > > > > > > > > > tison <[email protected]> 于2020年4月22日周三 下午2:42写道: > > > > > > > > > > > Hi Vinoth, > > > > > > > > > > > > >much of the code actually does not depend on the templatized > type > > at > > > > all > > > > > > > > > > > > Agree. > > > > > > > > > > > > Let's say I'm ok with untyped HoodieRecordPayload since all > payload > > > is > > > > > > effectively untyped(Object) which we deal with type and cast at > > > > runtime. > > > > > > > > > > > > Though, it is noisy we implement it in a half-generic form. > > > Meanwhile, > > > > > > wildcard doesn't work for every case since <? capture > > > > > HoodieRecordPayload> > > > > > > is NOT compatible with <? capture HoodieRecordPayload> and any > > > > > > exact T extends HoodieRecordPayload. > > > > > > > > > > > > We don't actually use the type parameter heavily, so it is an > > > > alternative > > > > > > that we define HoodieRecordPayload just > > > > > > > > > > > > public class HoodieRecordPayload { } > > > > > > > > > > > > If the community think it is a worth effort, I'm glad to do more > > > > > > investigation > > > > > > and evaluate its impact, also find a practical way. > > > > > > > > > > > > Best, > > > > > > tison. > > > > > > > > > > > > > > > > > > Vinoth Chandar <[email protected]> 于2020年4月22日周三 下午2:22写道: > > > > > > > > > > > > > Hi Tison, > > > > > > > > > > > > > > Thanks for raising this.. In most places doing a HoodieTable<?> > > > > > wildcard > > > > > > > should be totally acceptable, since much of the code actually > > does > > > > not > > > > > > > depend on the templatized type at all.. > > > > > > > > > > > > > > Def, worth taking another look holistically and see if we can > > > address > > > > > > > this.. > > > > > > > > > > > > > > My 2c. > > > > > > > Vinoth > > > > > > > > > > > > > > > > > > > > > On Mon, Apr 20, 2020 at 11:32 PM tison <[email protected]> > > > wrote: > > > > > > > > > > > > > > > Hi developers, > > > > > > > > > > > > > > > > Recently when I went through Hudi codebase, I found some > > annoying > > > > > > warning > > > > > > > > 'raw use of > > > > > > > > parameterized class' and some other generic type related. > > > > > > > > > > > > > > > > It seems the root of this wide usage of raw type is > > > > > HoodieRecordPayload > > > > > > > > where > > > > > > > > > > > > > > > > HoodieRecordPayload<T extends HoodieRecordPayload> > > > > > > > > > > > > > > > > should have been > > > > > > > > > > > > > > > > HoodieRecordPayload<T extends HoodieRecordPayload<T>> > > > > > > > > > > > > > > > > I'm not sure what community think of this wide usage of raw > > type > > > > and > > > > > > > given > > > > > > > > it affect most of our > > > > > > > > code I'd like to start this thread for advice. Personally > > > explicit > > > > > type > > > > > > > > signature benefits from strong > > > > > > > > type system but it might cause some signature breaking > changes > > as > > > > > well > > > > > > as > > > > > > > > require a huge > > > > > > > > engineering effort. > > > > > > > > > > > > > > > > Shall Hudi live with these raw types? Or we will have a plan > > > > migrate > > > > > to > > > > > > > > explicit generic type? > > > > > > > > > > > > > > > > Best, > > > > > > > > tison. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
