Hi,
wouldn't it make sense to number the seg1, seg2 files the way Hadoop numbers
its part files?

i.e.

seg0000001
seg0000002

etc.

Furthermore, it would make sense for me to be able to put a date / timestamp
part in the base path, so that, for example, the seg counter is reset every
day and the files are written to one directory per day.
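
To illustrate the layout I have in mind, here is a rough sketch in Python. The
function name segment_path, the base path and the 7-digit width are only
assumptions for illustration, not anything the current code provides.

```python
from datetime import date

def segment_path(base, day, counter, width=7):
    """Build '<base>/<YYYY-MM-DD>/seg<zero-padded counter>' (hypothetical layout)."""
    # One directory per day; the counter would start again at 1 in each new directory.
    return f"{base}/{day.isoformat()}/seg{counter:0{width}d}"

print(segment_path("/data/segments", date(2012, 3, 14), 1))
# prints /data/segments/2012-03-14/seg0000001
```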

On the Hadoop side, a MapReduce job can then merge the part files together if
they are too small, while preserving their order simply by sorting them
alphabetically.
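
A small sketch of why the zero padding matters for that sort (plain Python,
the helper name in_write_order is made up for the example): with padded names
the alphabetical order matches the numeric order, without padding it does not.

```python
def in_write_order(names):
    """Return the file names sorted alphabetically, as a merge job might do."""
    return sorted(names)

padded = ["seg0000010", "seg0000002", "seg0000001"]
unpadded = ["seg10", "seg2", "seg1"]

print(in_write_order(padded))    # ['seg0000001', 'seg0000002', 'seg0000010']
print(in_write_order(unpadded))  # ['seg1', 'seg10', 'seg2'], seg10 lands before seg2
```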

Regards

Martin
