On 1 Dec 2017, at 3:44, VinShar wrote:
Yes, this was my understanding also, but then I found that Spark's
DataFrame does have a method which appends to Parquet
(`df.write.parquet(destName, mode="append")`). Below is an article that
throws some light on this. I was wondering if there is a way to achieve
the same through NiFi.
http://aseigneurin.github.io/2017/03/14/incrementally-loaded-parquet-files.html
You should not believe all that bloggers write :)
In the blog they are writing to the `permit-inspections.parquet`
**folder**. It’s not a parquet file.
The parquet files are contained in the folder. The append mode you are
referring to simply writes new parquet files in the folder, without
touching the existing ones.
If they had used the `overwrite` option, the existing folder would have
been emptied first.
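To make the folder semantics concrete, here is a minimal stdlib sketch that mimics only the *directory* behaviour described above; it is not Spark code, and the function and file names (`write_parquet_like`, `part-0000x`, `permit-inspections.parquet`) are illustrative assumptions. `append` drops a new part file next to the existing ones, while `overwrite` empties the folder first:

```python
import shutil
from pathlib import Path

def write_parquet_like(dest: str, part_name: str, mode: str = "append") -> None:
    """Illustrative sketch of df.write.parquet(dest, mode=...) folder semantics.

    Not Spark: this only shows how 'append' adds a new part file without
    touching existing ones, while 'overwrite' clears the folder first.
    """
    dest_dir = Path(dest)
    if mode == "overwrite" and dest_dir.exists():
        shutil.rmtree(dest_dir)  # existing folder is emptied before writing
    dest_dir.mkdir(parents=True, exist_ok=True)
    (dest_dir / part_name).write_text("dummy parquet bytes")

# First write creates the folder with one part file.
write_parquet_like("permit-inspections.parquet", "part-00000", mode="append")
# Append leaves part-00000 untouched and simply adds part-00001.
write_parquet_like("permit-inspections.parquet", "part-00001", mode="append")
print(sorted(p.name for p in Path("permit-inspections.parquet").iterdir()))
# ['part-00000', 'part-00001']
```

The `permit-inspections.parquet` "file" from the blog post is exactly such a folder of part files, which is why the append mode never rewrites earlier data.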
Cheers,
Giovanni