On 1 Dec 2017, at 3:44, VinShar wrote:

Yes, this was my understanding too, but then I found that Spark's DataFrame does have a method that appends to Parquet (`df.write.parquet(destName, mode="append")`). Below is an article that throws some light on this. I was wondering if there is a way to achieve the same through NiFi.

http://aseigneurin.github.io/2017/03/14/incrementally-loaded-parquet-files.html

You should not believe all that bloggers write :)

In the blog they are writing to the `permit-inspections.parquet` **folder**. It is not a single Parquet file.

The parquet files are contained in the folder. The append mode you are referring to simply writes new parquet files in the folder, without touching the existing ones.

If they had used the `overwrite` option, the existing folder would have been emptied first.
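The folder-level semantics described above can be sketched in plain Python. This is a hypothetical mock, not Spark code: the function `write_parquet_like` and the fake file contents are invented purely to illustrate how `append` adds new part files while `overwrite` empties the folder first.

```python
# Hypothetical mock of the directory-level behaviour of
# df.write.parquet(dest, mode=...). Not actual Spark code.
import shutil
import tempfile
import uuid
from pathlib import Path

def write_parquet_like(dest, mode="append"):
    """Mimic the folder semantics of Spark's Parquet writer."""
    folder = Path(dest)
    if mode == "overwrite" and folder.exists():
        # "overwrite" empties the destination folder before writing.
        shutil.rmtree(folder)
    folder.mkdir(parents=True, exist_ok=True)
    # Spark writes new part files with unique names; with "append",
    # the files already in the folder are left untouched.
    part = folder / f"part-{uuid.uuid4().hex}.parquet"
    part.write_bytes(b"fake parquet data")
    return part

dest = Path(tempfile.mkdtemp()) / "permit-inspections.parquet"
write_parquet_like(dest, mode="append")
write_parquet_like(dest, mode="append")
# The folder now holds two part files: nothing was rewritten.
print(len(list(dest.glob("*.parquet"))))
```

So "append" here is append at the folder level: each run adds fresh part files, and an "overwrite" run would start the folder from scratch.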

Cheers,

Giovanni
