Hi!
I want to participate in Hudi development more actively, and I need your advice 
on how to start.
At the moment I'm not familiar enough with Hudi to fix complex bugs, develop 
outstanding new features, or make global optimizations by myself. I only take 
simple tickets from Jira, or create my own and solve them. 
While solving these easy tasks I come across lots of small code smells 
(such as typos, string concatenation in logging, useless 
variables/fields/arguments/exceptions, missing annotations, raw type usage, 
etc.) that I would like to fix immediately. But the community does not welcome 
mixing such refactoring with the implementation of the target Jira task in a 
single PR.
I would also like to understand the code and Hudi's functionality better so 
that I can make more serious contributions in the future. 
And while figuring out the Hudi codebase, I want not only to improve my own 
understanding but also to do something useful for the project. 

So my intention is to get to know Hudi as a whole, starting with 
micro-refactoring. My plan is:
1. Start simple: carefully review the whole codebase and methodically make lots 
of trivial cosmetic micro-fixes that make the code cleaner (examples of such 
improvements are listed above);
2. During step 1, note the places in the code (methods, classes, families) that 
need more complex refactoring or should be optimized;
3. Do the refactoring/optimizations from step 2, creating a separate Jira task 
and PR for each case (or a MINOR PR if there are not many changes);
4. .......
5. PROFIT!!!!!!
Regarding step 1, I have some questions for you:
1. Do we need such a cleanup?
2. If yes, what is the best approach to contributing lots of safe, non-breaking 
micro-fixes that clean up the code? I mean how to divide such changes into Jira 
tasks and PRs.
3. What is an acceptable number of files changed in a single MINOR PR?
If I make one PR per module, then even a medium-sized module will produce a 
diff so large that a reviewer won't like it at all (and will never approve it). 
If I split the changes further into multiple PRs, there will be too many 
trivial PRs that put extra load on CI.

Patiently waiting for your advice.
