Hi,

I want to save a streaming aggregate to a file without using any window,
watermark, or groupBy. The aggregation is over the entire column.

df = spark.sql("select avg(col1) as aver from ds")


The challenge is as follows:

1) If I use outputMode = Append (the default, so it is not set explicitly
below), the query fails with "*Append output mode not supported
when there are streaming aggregations on streaming DataFrames/DataSets
without watermark*"

query2 = df \
    .writeStream \
    .format("parquet") \
    .option("path", "/home/aakashbasu/Downloads/Kafka_Testing/Temp_AvgStore/") \
    .option("checkpointLocation", "/home/aakashbasu/Downloads/Kafka_Testing/") \
    .trigger(processingTime='3 seconds') \
    .start()



2) If I use outputMode = Complete, the query fails with "*Data source
parquet does not support Complete output mode;*"

query2 = df \
    .writeStream \
    .outputMode("complete") \
    .format("parquet") \
    .option("path", "/home/aakashbasu/Downloads/Kafka_Testing/Temp_AvgStore/") \
    .option("checkpointLocation", "/home/aakashbasu/Downloads/Kafka_Testing/") \
    .trigger(processingTime='3 seconds') \
    .start()


What to do? How to go about it?

Thanks,
Aakash.
