Just checking whether anyone has any pointers for dynamically updating query state in Structured Streaming. Thanks.
On Thursday, February 8, 2018 2:58 PM, M Singh <mans2si...@yahoo.com.INVALID> wrote:

Hi Spark Experts:

I am trying to use a stateful UDF with Spark Structured Streaming that needs to update its state periodically.

Here is the scenario:

1. I have a UDF with a variable that has a default value (e.g., 1). This value is applied to a column (e.g., the variable is subtracted from the column value).
2. The variable is to be updated periodically and asynchronously (e.g., by reading a file every 5 minutes), and new rows should have the new value applied to the column value.

Spark natively supports broadcast variables, but I could not find a way to update the broadcast variables dynamically, or to rebroadcast them, so that the UDF's internal state can be updated while the Structured Streaming application is running.

I could read the variable from the file on each invocation of the UDF, but that will not scale, since each invocation would open, read, and close the file.

Please let me know if there is any documentation or example that supports this scenario.

Thanks
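
For illustration, here is a minimal sketch of one way the scenario above could be approached without rebroadcasting: keep the value in a small in-process holder that re-reads the file at most once per interval, so that most UDF invocations only hit the cached value. This is plain Python and not a Spark API. The names `RefreshingValue`, `load_offset`, and `make_adjuster`, the file format (a single integer), and the 5-minute interval are all assumptions made for the example.

```python
import threading
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class RefreshingValue(Generic[T]):
    """Hypothetical helper: caches a value and reloads it at most once per ttl_seconds."""

    def __init__(self, loader: Callable[[], T], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader          # function that produces a fresh value
        self._ttl = ttl_seconds        # minimum time between reloads
        self._clock = clock            # injectable clock, useful for testing
        self._lock = threading.Lock()  # guards the cached value across threads
        self._value: Optional[T] = None
        self._loaded_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        with self._lock:
            # Reload on first use, or once the cached value is older than the TTL.
            if self._loaded_at is None or now - self._loaded_at >= self._ttl:
                self._value = self._loader()
                self._loaded_at = now
            return self._value


def load_offset(path: str, default: int = 1) -> int:
    """Assumed file format: a single integer. Falls back to `default` if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return default


def make_adjuster(path: str, ttl_seconds: float = 300.0) -> Callable[[int], int]:
    """Build the per-row function: subtract the current offset from the column value."""
    offset = RefreshingValue(lambda: load_offset(path), ttl_seconds)

    def adjust(column_value: int) -> int:
        return column_value - offset.get()

    return adjust
```

With this shape, the file is opened only when the cached value has expired, not on every row. How the holder is wired into an actual UDF is left open here; since a `threading.Lock` cannot be pickled, the holder would presumably have to be created lazily on each executor instead of being captured in the UDF's closure, and each executor process would refresh on its own schedule.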