I couldn't find it anywhere on the web, so it doesn't look like it has been
contributed yet. Note, though, that the HDFS inotify APIs it would build on
are already available per https://issues.apache.org/jira/browse/HDFS-6634
(see the test case for an implementation guideline in Java:
https://github.com/apache/hadoop/blob/trunk/hado
Hi All,
I am using PySpark Streaming to ETL raw data files as they land on HDFS.
While researching this topic, I found a great presentation about Spark
and Spark Streaming at Uber
(http://www.slideshare.net/databricks/spark-meetup-at-uber), where they
mention this INotifyDStream that sounds very