Hi Sky,

As far as I know, it is currently not possible to customize the data file names automatically with each insert (someone can correct me if I'm wrong).

As for the filename convention, it's basically:

<fragment_instance_id>_<unique_number>_data.<file_number_written_by_the_same_sink>.parq
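
To make the convention above concrete, here is a minimal Python sketch that builds and parses a name of that shape. It is only an illustration of the pattern as described in this message, not Impala's actual code (that lives in hdfs-table-sink.cc). The function names, the example id string, and the assumption that the unique number and file number are plain decimal integers are all hypothetical.

```python
import re

# Hypothetical helper: assembles a name following the pattern
# <fragment_instance_id>_<unique_number>_data.<file_number>.parq
def build_data_file_name(fragment_instance_id: str,
                         unique_number: int,
                         file_number: int,
                         extension: str = "parq") -> str:
    return f"{fragment_instance_id}_{unique_number}_data.{file_number}.{extension}"


# Assumption: unique_number and file_number are decimal integers.
_NAME_RE = re.compile(
    r"^(?P<fragment_instance_id>.+)"
    r"_(?P<unique_number>\d+)_data"
    r"\.(?P<file_number>\d+)"
    r"\.(?P<extension>\w+)$"
)


# Hypothetical helper: splits a file name back into its components.
def parse_data_file_name(name: str) -> dict:
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognized data file name: {name!r}")
    parts = match.groupdict()
    parts["unique_number"] = int(parts["unique_number"])
    parts["file_number"] = int(parts["file_number"])
    return parts


if __name__ == "__main__":
    # The id string below is made up for illustration.
    name = build_data_file_name("a1b2c3d4-e5f6a7b8", 12345, 0)
    print(name)
    print(parse_data_file_name(name))
```

Because the fragment instance id and the unique number are chosen by the sink and not by the user, a parser like this can group files by writer after the fact, but it cannot be used to control the names.
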
Code references:
https://github.com/apache/impala/blob/master/be/src/exec/hdfs-table-sink.cc#L229-L245
https://github.com/apache/impala/blob/master/be/src/exec/hdfs-table-sink.cc#L346-L348

- Sailesh

On Sun, Dec 3, 2017 at 11:58 PM, sky <[email protected]> wrote:
> Hi all,
> What is the relationship between the name of the parquet data file in
> HDFS and each time insert? What is the definition format of the name of the
> data file? Can you customize the name of the corresponding data file for
> each insert?
