+1. The name hudi-delta gives me the feeling that it has something to do with
other frameworks... I’d vote for another name: hudi-deltastreamer,
hudi-streamer, or hudi-stream.

On Wed, Mar 4, 2020 at 2:29 AM vino yang <[email protected]> wrote:

> Hi folks,
>
> Currently, the content of hudi-utilities looks a bit mixed. Broadly, it
> falls into two groups:
>
>
>    - Delta streamer and its related packages (e.g. deltastreamer, sources,
>    schema, transform), which exist to serve delta streamer.
>    - Utility tools such as HDFSParquetImporter, HiveIncrementalPuller,
>    HoodieCleaner, and so on.
>
>
> We are trying to refactor the business logic that is tied to the compute
> engine. Delta streamer will be affected, especially the sources package,
> which is the starting point of a Spark/Flink job. Restructuring the module
> would make that work clearer and more focused.
>
> I would like to propose restructuring the hudi-utilities module.
> Delta streamer is a great feature of Hudi, yet its logic lives largely
> inside hudi-utilities. Can we raise its prominence by making delta
> streamer a module of its own? It could be named e.g. hudi-delta or something
> else. hudi-utilities would then be a true utilities module hosting the
> HDFSParquetImporter, HiveIncrementalPuller, and HoodieCleaner tools.
>
> In short, the restructuring would involve:
>
>
>    - Create a new module named “hudi-delta” (or another name?) and move the
>    deltastreamer, sources, schema, transform … packages into it.
>    - Leave HDFSParquetImporter, HiveIncrementalPuller, HoodieCleaner … in
>    their current place (the utilities module).
>
> What do you think?
>
> Any comments and suggestions are welcome and appreciated.
>
> Best,
> Vino
>
