Re: Writing custom Structured Streaming receiver

2018-06-05 Thread alz2
I'm implementing a simple Structured Streaming Source with the V2 API in Java. I've taken the Offset logic (regarding startOffset, endOffset, lastCommittedOffset, etc.) from the socket source and also your receivers. However, upon startup for some reason Spark says that the initial offset or -1,
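
For readers following the offset discussion above, here is a minimal sketch of the bookkeeping that startOffset, endOffset, and lastCommittedOffset imply. It is plain Java with no Spark dependency. The class OffsetBuffer and its methods are hypothetical names, not Spark API, and the convention that -1 means "no data received yet" is an assumption based on how the built-in socket source appears to initialize its offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: models only the offset bookkeeping.
final class OffsetBuffer {
    // Holds records with offsets lastCommittedOffset+1 .. currentOffset.
    private final List<String> buffer = new ArrayList<>();
    // Offset of the newest record; -1 is assumed to mean "nothing received yet".
    private long currentOffset = -1L;
    // Newest offset the engine has committed; earlier records are discarded.
    private long lastCommittedOffset = -1L;

    // Called by the receiving thread for every incoming record.
    synchronized void add(String record) {
        buffer.add(record);
        currentOffset++;
    }

    // Latest available offset, or -1 when nothing has arrived.
    synchronized long endOffset() {
        return currentOffset;
    }

    // Returns the records in the range (start, end]; start = -1 means "from the first record".
    synchronized List<String> getBatch(long start, long end) {
        if (start < lastCommittedOffset || end > currentOffset || start > end) {
            throw new IllegalArgumentException("Bad range (" + start + ", " + end + "]");
        }
        int from = (int) (start - lastCommittedOffset);
        int to = (int) (end - lastCommittedOffset);
        return new ArrayList<>(buffer.subList(from, to));
    }

    // Drops everything up to and including the given offset.
    synchronized void commit(long end) {
        if (end < lastCommittedOffset || end > currentOffset) {
            throw new IllegalArgumentException("Cannot commit offset " + end);
        }
        int drop = (int) (end - lastCommittedOffset);
        buffer.subList(0, drop).clear();
        lastCommittedOffset = end;
    }
}
```

In this sketch an offset of -1 at startup is not an error by itself; it only says the buffer is empty, so a batch starting at -1 begins with the first record.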

Re: Writing custom Structured Streaming receiver

2018-03-04 Thread Hien Luu
Finally got a toy version of a Structured Streaming DataSource V2 source working with Apache Spark 2.3. Tested locally and on Databricks Community Edition. Source code is here: https://github.com/hienluu/wikiedit-streaming -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

Re: Writing custom Structured Streaming receiver

2018-03-03 Thread Hien Luu
I finally got around to implementing a custom Structured Streaming receiver (source) to read Wikipedia edit events from the IRC server. It works fine locally as well as in spark-shell on my laptop. However, it failed with the following exception when running in Databricks Community Edition. It

Re: Writing custom Structured Streaming receiver

2017-11-28 Thread Hien Luu
Cool. Thanks nezhazheng. I will give it a shot.

Re: Writing custom Structured Streaming receiver

2017-11-20 Thread nezhazheng
Hi Hien, You can write your own Source or Sink through SPI (https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html). Below is an example that implements a Kafka 0.8 source: https://github.com/jerryshao/spark-kafka-0-8-sql

Re: Writing custom Structured Streaming receiver

2017-11-20 Thread Hien Luu
Hi TD, I looked at the DataStreamReader class and it looks like we can specify an FQCN as a source (provided that it implements trait Source). The DataSource.lookupDataSource function will try to load this FQCN during the creation of a DataSource object instance inside DataStreamReader.load(). Will
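
The lookup described above (resolve a fully qualified class name, check that it implements the expected contract, instantiate it) can be illustrated with plain Java reflection. This is only a sketch of the general mechanism, not Spark's actual code: ToySource, EchoSource, and ToySourceLoader are hypothetical names, and Spark's real Source trait has different methods.

```java
// Hypothetical stand-in for a source contract; not Spark's Source trait.
interface ToySource {
    String name();
}

// Hypothetical implementation that the loader below resolves by class name.
final class EchoSource implements ToySource {
    @Override
    public String name() {
        return "echo";
    }
}

// Hypothetical loader: resolves a class name, verifies the contract, instantiates it.
final class ToySourceLoader {
    static ToySource load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            // Reject classes that exist but do not implement the contract.
            if (!ToySource.class.isAssignableFrom(cls)) {
                throw new IllegalArgumentException(className + " does not implement ToySource");
            }
            // Requires an accessible no-argument constructor.
            return (ToySource) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Failed to load source: " + className, e);
        }
    }
}
```

With Spark itself, the class name would be passed to the reader, for example spark.readStream().format("com.example.MySource").load(), where com.example.MySource is a made-up name used here for illustration.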

Re: Writing custom Structured Streaming receiver

2017-11-01 Thread Tathagata Das
Structured Streaming source APIs are not yet public, so there isn't a guide. However, if you are adventurous enough, you can take a look at the source code in Spark. Source API: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala