Re: Structured Streaming + initialState

2017-05-06 Thread Patrick McGloin
The initial state is stored in a Parquet file, which is effectively a static
Dataset.  I see there is a Jira open for full joins between streaming and
static Datasets in Structured Streaming (SPARK-20002).  So once that Jira is
completed, it should be possible to set the initial state through such a join.

For mapGroupsWithState it would be great if you could provide an
initialState Dataset with Key -> State initial values.

On 5 May 2017 at 23:49, Tathagata Das wrote:

> Can you explain how your initial state is stored? Is it in a file, or is it
> in a database?
> If it's in a database, then when you initialize the GroupState, you can
> fetch it from the database.
>
> On Fri, May 5, 2017 at 7:35 AM, Patrick McGloin wrote:
>
>> Hi all,
>>
>> With Spark Structured Streaming, is there a possibility to set an
>> "initial state" for a query?
>>
>> Using a join between a streaming Dataset and a static Dataset does not
>> support full joins.
>>
>> Using mapGroupsWithState to create a GroupState does not support an
>> initialState (as the Spark Streaming StateSpec did).
>>
>> Are there any plans to add support for initial states?  Or is there
>> already a way to do so?
>>
>> Best regards,
>> Patrick
>>
>
>


Re: Structured Streaming + initialState

2017-05-05 Thread Tathagata Das
Can you explain how your initial state is stored? Is it in a file, or is it in
a database?
If it's in a database, then when you initialize the GroupState, you can fetch
it from the database.
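
The suggestion above (seed the state from an external store the first time a
key is seen) can be sketched without Spark. This is only an illustration of
the pattern in plain Python, not Spark code: GroupStateSketch, update_total,
run_batch and load_initial are hypothetical names, and the running sum stands
in for whatever the real state function would compute.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


class GroupStateSketch:
    """Hypothetical stand-in for a per-key state holder (not Spark's class)."""

    def __init__(self) -> None:
        self.exists = False
        self._value = 0

    def get(self) -> int:
        if not self.exists:
            raise KeyError("state has not been set")
        return self._value

    def update(self, value: int) -> None:
        self._value = value
        self.exists = True


def update_total(
    key: str,
    events: Iterable[int],
    state: GroupStateSketch,
    load_initial: Callable[[str], int],
) -> Tuple[str, int]:
    """State function: seed the state on first sight of a key, then add events."""
    if not state.exists:
        # Assumption: load_initial looks the key up in the external store
        # (a database, or a map loaded from a file) and returns a default
        # for keys that have no stored initial value.
        state.update(load_initial(key))
    total = state.get() + sum(events)
    state.update(total)
    return key, total


def run_batch(
    batch: Iterable[Tuple[str, int]],
    states: Dict[str, GroupStateSketch],
    load_initial: Callable[[str], int],
) -> Dict[str, int]:
    """Group one micro-batch by key and apply the state function to each group."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in batch:
        grouped[key].append(value)
    results: Dict[str, int] = {}
    for key, events in grouped.items():
        state = states.setdefault(key, GroupStateSketch())
        _, results[key] = update_total(key, events, state, load_initial)
    return results
```

With this shape the external store is consulted once per key, on the first
batch that contains the key; later batches only touch the in-memory state.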

On Fri, May 5, 2017 at 7:35 AM, Patrick McGloin wrote:

> Hi all,
>
> With Spark Structured Streaming, is there a possibility to set an "initial
> state" for a query?
>
> Using a join between a streaming Dataset and a static Dataset does not
> support full joins.
>
> Using mapGroupsWithState to create a GroupState does not support an
> initialState (as the Spark Streaming StateSpec did).
>
> Are there any plans to add support for initial states?  Or is there
> already a way to do so?
>
> Best regards,
> Patrick
>


Structured Streaming + initialState

2017-05-05 Thread Patrick McGloin
Hi all,

With Spark Structured Streaming, is there a possibility to set an "initial
state" for a query?

Using a join between a streaming Dataset and a static Dataset does not
support full joins.

Using mapGroupsWithState to create a GroupState does not support an
initialState (as the Spark Streaming StateSpec did).

Are there any plans to add support for initial states?  Or is there already
a way to do so?

Best regards,
Patrick