Can you use partitioning (by day)? That would make it easier to drop
data older than x days outside the streaming job.
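A minimal sketch of what I mean, assuming a streaming DataFrame `df` with
an event-time column `ts`; the sink path, checkpoint location, and the
90-day cutoff are made-up placeholders:

    import org.apache.spark.sql.functions._

    // Derive a date column from the event time, then write the stream
    // partitioned by day so each day lands in its own directory.
    val query = df
      .withColumn("date", to_date(col("ts")))
      .writeStream
      .format("parquet")
      .option("path", "/data/events")
      .option("checkpointLocation", "/data/events_chk")
      .partitionBy("date")
      .start()

    // A separate cron/batch job (not the streaming query) can then drop
    // partition directories older than ~90 days, e.g.:
    //   hadoop fs -rm -r /data/events/date=2017-12-*

Since each day is a separate directory, deleting old data is a cheap
filesystem operation and the running query is unaffected.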

Sunil Parmar

On Wed, Mar 14, 2018 at 11:36 AM, Lian Jiang <jiangok2...@gmail.com> wrote:

> I have a Spark structured streaming job which dumps data into parquet
> files. To avoid the parquet data growing infinitely, I want to discard
> data more than 3 months old. Does Spark streaming support this? Or do I
> need to stop the streaming job, trim the parquet files, and restart the
> streaming job? Thanks for any hints.
>
