Thanks TD. Will it be available in pyspark too?

On 1 Dec 2016 19:55, "Tathagata Das" <tathagata.das1...@gmail.com> wrote:
> In the meantime, if you are interested, you can read the design doc in the
> corresponding JIRA - https://issues.apache.org/jira/browse/SPARK-18124
>
> On Thu, Dec 1, 2016 at 12:53 AM, Tathagata Das <
> tathagata.das1...@gmail.com> wrote:
>
>> That feature is coming in 2.1.0. We have added watermarking, which will
>> track the event time of the data and accordingly close old windows, output
>> their corresponding aggregates, and then drop their corresponding state.
>> But in that case, you will have to use append mode, and the aggregated
>> data of a particular window will be evicted only when the window is
>> closed. You will be able to control the threshold on how long to wait for
>> late, out-of-order data before closing a window.
>>
>> We will be updating the docs soon to explain this.
>>
>> On Tue, Nov 29, 2016 at 8:30 PM, Xinyu Zhang <wsz...@163.com> wrote:
>>
>>> Hi
>>>
>>> I want to use window operations. However, if I don't remove any data,
>>> the "complete" table will become larger and larger as time goes on. So I
>>> want to remove some outdated data in the complete table that I would never
>>> use.
>>> Is there any method to meet my requirement?
>>>
>>> Thanks!
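
For anyone following the thread, the watermarking behaviour described above (track event time, close old windows, output their aggregate, drop their state) can be illustrated with a small plain-Python sketch. This is not Spark code and not how Spark implements it; the class name `WatermarkedWindowCounter`, the tumbling-window length, and the lateness threshold are all hypothetical and chosen only for illustration. It also assumes that events arriving for an already closed window are simply ignored.

```python
from collections import defaultdict


class WatermarkedWindowCounter:
    """Illustrative only: counts events per tumbling window and evicts
    windows once the watermark has passed their end time."""

    def __init__(self, window=10, threshold=5):
        self.window = window          # hypothetical window length (seconds)
        self.threshold = threshold    # hypothetical wait for late data (seconds)
        self.max_event_time = None    # largest event time seen so far
        self.state = defaultdict(int) # window start -> running count

    def watermark(self):
        """Watermark = max event time seen so far minus the lateness threshold."""
        if self.max_event_time is None:
            return None
        return self.max_event_time - self.threshold

    def process_batch(self, event_times):
        """Fold one batch into the state and return the windows that closed.

        Returns a sorted list of (window_start, window_end, count) tuples,
        mimicking append-mode output: each window is emitted exactly once.
        """
        current = self.watermark()  # watermark derived from earlier batches
        for t in event_times:
            start = (t // self.window) * self.window
            end = start + self.window
            # Assumption of this sketch: data for a closed window is dropped.
            if current is not None and end <= current:
                continue
            self.state[start] += 1
            if self.max_event_time is None or t > self.max_event_time:
                self.max_event_time = t

        # Advance the watermark, then emit and evict every window behind it.
        new_watermark = self.watermark()
        closed = []
        if new_watermark is not None:
            for start in sorted(self.state):
                end = start + self.window
                if end <= new_watermark:
                    closed.append((start, end, self.state.pop(start)))
        return closed
```

With the default settings, feeding the batches `[1, 3, 12]` and then `[4, 16]` returns `[]` for the first call and `[(0, 10, 3)]` for the second: the late event at time 4 is still counted because the window 0-10 had not closed yet, and that window's state is dropped as soon as it is emitted, so the state stays bounded instead of growing forever.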