Hi Kaniska,

In order to use append mode with aggregations, you need to set an event-time watermark (using `withWatermark`). Otherwise, Spark doesn't know when to output an aggregation result as "final".
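As a rough sketch, the aggregation feeding your query could declare the watermark before the groupBy. The input Dataset `tagEvents`, the event-time column `eventTime`, the grouping column `tag`, and the window/delay durations below are all hypothetical placeholders, not taken from your code:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.streaming.ProcessingTime;
import org.apache.spark.sql.streaming.StreamingQuery;

// Assumed: a streaming Dataset<Row> "tagEvents" with a timestamp column
// "eventTime" and a grouping column "tag" (illustrative names only).
Dataset<Row> tagBasicAgg = tagEvents
    // Events arriving more than 10 minutes behind the max seen event time
    // are considered late and dropped; this bounds the aggregation state.
    .withWatermark("eventTime", "10 minutes")
    .groupBy(
        functions.window(tagEvents.col("eventTime"), "5 minutes"),
        tagEvents.col("tag"))
    .count();

// With the watermark declared, append mode becomes legal: each window's
// count is emitted once (as "final") after the watermark passes the end
// of that window, so the parquet sink can be used as in your snippet.
StreamingQuery streamingQry = tagBasicAgg.writeStream()
    .format("parquet")
    .trigger(ProcessingTime.create("10 seconds"))
    .queryName("tagAggSummary")
    .outputMode("append")
    .option("checkpointLocation", "/tmp/summary/checkpoints/")
    .option("path", "/data/summary/tags/")
    .start();
```

Note that results will only appear in the parquet files after the watermark has moved past a window, so with late-arriving data you may wait a window length plus the watermark delay before seeing output.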
Best,
Burak

On Mon, Jun 19, 2017 at 11:03 AM, kaniska Mandal <kaniska.man...@gmail.com> wrote:
> Hi,
>
> My goal is to ~
> (1) either chain streaming aggregations in a single query, OR
> (2) run multiple streaming aggregations and save the data in some meaningful
> format to execute low-latency / failsafe OLAP queries.
>
> So my first choice is the parquet format, but I failed to make it work!
>
> I am using spark-streaming_2.11-2.1.1
>
> I am facing the following error -
> org.apache.spark.sql.AnalysisException: Append output mode not supported
> when there are streaming aggregations on streaming DataFrames/DataSets;
>
> - for the following syntax
>
> StreamingQuery streamingQry = tagBasicAgg.writeStream()
>     .format("parquet")
>     .trigger(ProcessingTime.create("10 seconds"))
>     .queryName("tagAggSummary")
>     .outputMode("append")
>     .option("checkpointLocation", "/tmp/summary/checkpoints/")
>     .option("path", "/data/summary/tags/")
>     .start();
>
> But parquet doesn't support the 'complete' outputMode.
>
> So is parquet supported only for batch queries, NOT for streaming queries?
>
> - note that the console output mode is working fine!
>
> Any help will be much appreciated.
>
> Thanks,
> Kaniska