Thanks Tison! One consideration we need to have is that we cannot have a breaking non-backwards compatible change for existing users with custom payloads. It will be a rough experience.. Guessing some effort would be needed to upgrade existing payload classes right?
On Wed, Apr 22, 2020 at 10:14 PM tison <[email protected]> wrote: > Thanks for your feedback! > > Here is a quick summary about the type parameter of HoodieRecordPayload. > > 1. The type parameter is used to refer "self type" which is generally good > for > access subclass methods/fields without unchecked casting. > > 2. However, the concrete payload is accessed by `combineAndGetUpdateValue` > or `getInsertValue` which doesn't get benefit from the type parameter. BTW, > the > return type of these two methods suffers from unchecked casting also, which > is > filed as HUDI-834[1]. > > 3. So, the only usage of the type parameter(self type) is `preCombine` > method. > Though, at the real use point we pass raw typed HoodieRecordPayload as > the parameter[2]. I'm no sure the behavior when casting failed and want to > ask for advice. So far, we cast at runtime and a cast exception will be > thrown > if failed. > > Given the cases above, I tend to change the definition > of HoodieRecordPayload > as > > public interface HoodieRecordPayload { } > > if we can well-define how to do in `preCombine` when type mismatch. The > alternative, > i.e., a full paramterized self type type parameter will affect quite a wide > range of code > where we have to cascade change other raw usages such as > HoodieTable/HoodieRecord/HoodieWriteHandle and so on; while the self type > parameter > doesn't bring a lot benefit. > > Best, > tison. > > [1] https://issues.apache.org/jira/browse/HUDI-834 > [2] org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java:114 > > > > Vinoth Chandar <[email protected]> 于2020年4月23日周四 上午12:04写道: > > > +1 raising a JIRA and summarizing some findings would be great. > > > > On Wed, Apr 22, 2020 at 8:38 AM leesf <[email protected]> wrote: > > > > > hi tison, > > > > > > It is always better to make the codebase more clear, so it would be > great > > > if you would do an investigation. > > > > > > tison <[email protected]> 于2020年4月22日周三 下午2:42写道: > > > > > > > Hi Vinoth, > > > > > > > > >much of the code actually does not depend on the templatized type at > > all > > > > > > > > Agree. > > > > > > > > Let's say I'm ok with untyped HoodieRecordPayload since all payload > is > > > > effectively untyped(Object) which we deal with type and cast at > > runtime. > > > > > > > > Though, it is noisy we implement it in a half-generic form. > Meanwhile, > > > > wildcard doesn't work for every case since <? capture > > > HoodieRecordPayload> > > > > is NOT compatible with <? capture HoodieRecordPayload> and any > > > > exact T extends HoodieRecordPayload. > > > > > > > > We don't actually use the type parameter heavily, so it is an > > alternative > > > > that we define HoodieRecordPayload just > > > > > > > > public class HoodieRecordPayload { } > > > > > > > > If the community think it is a worth effort, I'm glad to do more > > > > investigation > > > > and evaluate its impact, also find a practical way. > > > > > > > > Best, > > > > tison. > > > > > > > > > > > > Vinoth Chandar <[email protected]> 于2020年4月22日周三 下午2:22写道: > > > > > > > > > Hi Tison, > > > > > > > > > > Thanks for raising this.. In most places doing a HoodieTable<?> > > > wildcard > > > > > should be totally acceptable, since much of the code actually does > > not > > > > > depend on the templatized type at all.. > > > > > > > > > > Def, worth taking another look holistically and see if we can > address > > > > > this.. > > > > > > > > > > My 2c. > > > > > Vinoth > > > > > > > > > > > > > > > On Mon, Apr 20, 2020 at 11:32 PM tison <[email protected]> > wrote: > > > > > > > > > > > Hi developers, > > > > > > > > > > > > Recently when I went through Hudi codebase, I found some annoying > > > > warning > > > > > > 'raw use of > > > > > > parameterized class' and some other generic type related. > > > > > > > > > > > > It seems the root of this wide usage of raw type is > > > HoodieRecordPayload > > > > > > where > > > > > > > > > > > > HoodieRecordPayload<T extends HoodieRecordPayload> > > > > > > > > > > > > should have been > > > > > > > > > > > > HoodieRecordPayload<T extends HoodieRecordPayload<T>> > > > > > > > > > > > > I'm not sure what community think of this wide usage of raw type > > and > > > > > given > > > > > > it affect most of our > > > > > > code I'd like to start this thread for advice. Personally > explicit > > > type > > > > > > signature benefits from strong > > > > > > type system but it might cause some signature breaking changes as > > > well > > > > as > > > > > > require a huge > > > > > > engineering effort. > > > > > > > > > > > > Shall Hudi live with these raw types? Or we will have a plan > > migrate > > > to > > > > > > explicit generic type? > > > > > > > > > > > > Best, > > > > > > tison. > > > > > > > > > > > > > > > > > > > > >
