Thanks!

nishith agarwal <n3.nas...@gmail.com> wrote on Wed, Feb 27, 2019 at 2:56 PM:

> Hi Kaka,
>
> Hudi sizes files for you automatically. As you ingest more inserts, existing
> small files are grown toward the target size. You can tune this with two
> configs:
>
> https://hudi.apache.org/configurations.html#withStorageConfig -> This
> config allows you to set a max size for your output file (limitFileSize).
> https://hudi.apache.org/configurations.html#compactionSmallFileSize -> This
> config allows you to set the size below which a file is treated as small
> and is automatically topped up with new inserts.
>
> As you can guess, limitFileSize should be >= compactionSmallFileSize.
> Hope this helps.
>
> Thanks,
> Nishith
>
> On Tue, Feb 26, 2019 at 6:52 PM kaka chen <kaka11.c...@gmail.com> wrote:
>
> > Hi All,
> >
> > I found that an insert generates at least one new file for every Spark or
> > Spark Streaming batch.
> > Is this the expected result? If so, how can I control these small files?
> > Does Hudi provide any tools to compact them?
> >
> > Thanks,
> > Frank
> >
>
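
The interaction between the two settings discussed above can be sketched in a few lines of Python. This is a simplified illustration of the idea only, not Hudi's actual implementation: the function name assign_inserts and all of its parameters are hypothetical, sizes are in arbitrary units, and every record is assumed to have the same size. Files below the small-file limit are topped up toward the maximum file size first, and only the leftover records go into new files.

```python
from typing import List, Tuple


def assign_inserts(
    existing_file_sizes: List[int],
    num_records: int,
    avg_record_size: int,
    small_file_limit: int,
    max_file_size: int,
) -> Tuple[List[int], List[int]]:
    """Return (sizes of existing files after the batch, sizes of new files).

    Hypothetical helper for illustration; not part of any Hudi API.
    """
    if avg_record_size <= 0 or max_file_size < avg_record_size:
        raise ValueError("max_file_size must hold at least one record")
    if small_file_limit > max_file_size:
        raise ValueError("small_file_limit must not exceed max_file_size")

    updated = list(existing_file_sizes)
    remaining = num_records

    # Top up files that are below the small-file limit, up to the max size.
    for i, size in enumerate(updated):
        if remaining == 0:
            break
        if size < small_file_limit:
            capacity = (max_file_size - size) // avg_record_size
            take = min(capacity, remaining)
            updated[i] = size + take * avg_record_size
            remaining -= take

    # Whatever is left over is written to new files of at most max size.
    records_per_new_file = max_file_size // avg_record_size
    new_files: List[int] = []
    while remaining > 0:
        take = min(records_per_new_file, remaining)
        new_files.append(take * avg_record_size)
        remaining -= take

    return updated, new_files


if __name__ == "__main__":
    print(assign_inserts([20, 100, 120], 150, 1, 100, 120))
```

Under these assumptions, a batch that fits into the spare capacity of existing small files creates no new file at all, which is why the maximum file size needs to be at least as large as the small-file limit.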
