[
https://issues.apache.org/jira/browse/SPARK-17407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468372#comment-15468372
]
Seth Hendrickson commented on SPARK-17407:
------------------------------------------
[~chriddyp] The file stream source detects new data by file name. Appending
rows to a file that already exists therefore has no effect, by design. We can
discuss whether that design ought to change, but as far as I can see nothing
is "wrong" here. If you want to update a streaming CSV DataFrame, add new
files to the watched directory rather than appending to existing ones.
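To make the "add new files" workflow concrete, here is a minimal stdlib-only sketch of the pattern (no Spark required to run it): each batch of rows is written to a freshly named CSV file in the directory the stream watches, so the file source sees a new file name and picks the rows up. The directory layout, file-name scheme, and `add_batch` helper are illustrative assumptions, not part of Spark's API.

```python
# Sketch of the recommended pattern: since the file stream source only
# picks up *new* file names, each batch of rows goes into a brand-new
# file instead of being appended to an existing one.
# The helper name, file-name scheme, and header are hypothetical.
import csv
import os
import tempfile
import uuid


def add_batch(stream_dir, rows, header=("name", "value")):
    """Write one batch of rows as a new CSV file in stream_dir.

    Writing to a temp file first and then renaming makes the file
    appear atomically, so a streaming reader never sees a
    half-written file.
    """
    final_path = os.path.join(stream_dir, "batch-%s.csv" % uuid.uuid4().hex)
    fd, tmp_path = tempfile.mkstemp(dir=stream_dir, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    os.rename(tmp_path, final_path)  # atomic within one filesystem
    return final_path
```

On the Spark side you would then point a streaming reader at the directory (for example, `spark.readStream.option("header", True).schema(schema).csv(stream_dir)` in PySpark, with `schema` supplied by you); every call like `add_batch(stream_dir, [("a", 1)])` produces a file the next trigger will ingest.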
> Unable to update structured stream from CSV
> -------------------------------------------
>
> Key: SPARK-17407
> URL: https://issues.apache.org/jira/browse/SPARK-17407
> Project: Spark
> Issue Type: Question
> Components: PySpark
> Affects Versions: 2.0.0
> Environment: Mac OSX
> Spark 2.0.0
> Reporter: Chris Parmer
> Priority: Trivial
> Labels: beginner, newbie
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> I am creating a simple example of a Structured Stream from a CSV file with an
> in-memory output stream.
> When I add rows to the CSV file, my output stream does not update with the
> new data. From this example:
> https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4012078893478893/3202384642551446/5985939988045659/latest.html,
> I expected that subsequent queries on the same output stream would contain
> updated results.
> Here is a reproducible code example: https://plot.ly/~chris/17703
> Thanks for the help here!
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]