Thanks!

nishith agarwal <n3.nas...@gmail.com> wrote on Wed, Feb 27, 2019 at 2:56 PM:
> Hi Kaka,
>
> Hudi automatically does file sizing for you. As you ingest more inserts, the
> existing files will be automatically sized. You can play with a few configs:
>
> https://hudi.apache.org/configurations.html#withStorageConfig -> This
> config allows you to set a max size for your output file.
> https://hudi.apache.org/configurations.html#compactionSmallFileSize -> This
> config allows you to set a minimum file size; files smaller than this will
> be automatically sized.
>
> As you can guess, limitFileSize >= compactionSmallFileSize.
> Hope this helps.
>
> Thanks,
> Nishith
>
> On Tue, Feb 26, 2019 at 6:52 PM kaka chen <kaka11.c...@gmail.com> wrote:
>
> > Hi All,
> >
> > I found that Insert generates at least one new file for each Spark or
> > Spark Streaming batch.
> > Is this the expected result? If it is, how can I control these small
> > files? Does Hudi provide some tools to compact them?
> >
> > Thanks,
> > Frank
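
For reference, the two settings discussed in the quoted reply can be sketched as a small options map. This is only an illustration under stated assumptions: the property keys `hoodie.parquet.max.file.size` and `hoodie.parquet.small.file.limit` are assumed to be the keys behind `limitFileSize` and `compactionSmallFileSize` (check the configurations page for your Hudi version), the helper name `hudi_file_sizing_options` is hypothetical, and the 120 MB / 100 MB values are example numbers, not recommendations from the thread.

```python
MB = 1024 * 1024


def hudi_file_sizing_options(max_file_bytes, small_file_limit_bytes):
    """Build writer options for Hudi file sizing (hypothetical helper).

    The key names are assumptions; verify them against the Hudi
    configuration docs for the version you run.
    """
    # Constraint stated in the thread: the max file size must be at least
    # as large as the small-file threshold.
    if small_file_limit_bytes > max_file_bytes:
        raise ValueError(
            "max file size must be >= small file limit "
            f"({max_file_bytes} < {small_file_limit_bytes})"
        )
    return {
        # Upper bound for an output file (limitFileSize in the thread).
        "hoodie.parquet.max.file.size": str(max_file_bytes),
        # Files below this size are candidates for receiving new inserts
        # (compactionSmallFileSize in the thread).
        "hoodie.parquet.small.file.limit": str(small_file_limit_bytes),
    }


# Example values only.
options = hudi_file_sizing_options(120 * MB, 100 * MB)
for key, value in sorted(options.items()):
    print(f"{key}={value}")
```

The resulting map would then be passed to the Spark writer's options alongside the usual Hudi write settings; that step is left out here because it needs a Spark session with Hudi on the classpath.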