Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
gestion to use dbtable with inline view > 2. parallelism - use numPartitions, lowerBound, upperBound to generate > number of partitions > > HTH > > > > On Wed, Jan 4, 2017 at 3:46 AM, Yuanzhe Yang wrote: > >> Hi Ayan, >> >> Yeah, I understand your proposal,
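The quoted advice (an inline view for `dbtable`, plus partitioning options for parallelism) could be sketched roughly as below. This is only an illustration under assumptions: the helper `jdbc_read_options`, the table name `events`, the column `id`, and the JDBC URL are all hypothetical and do not come from the thread. The option keys themselves (`url`, `dbtable`, `partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`) are the ones the Spark JDBC data source accepts. The sketch only builds the options and does not start Spark.

```python
def jdbc_read_options(url, table, column, lower, upper, num_partitions):
    """Build Spark JDBC reader options for one bounded slice of a table.

    Hypothetical helper: `table` and `column` are assumed to be trusted
    identifiers, and the bounds are forced to integers before being
    interpolated into the SQL text.
    """
    lower, upper = int(lower), int(upper)
    if lower >= upper:
        raise ValueError("lower bound must be smaller than upper bound")

    # Inline view: the WHERE clause restricts which rows are read at all.
    inline_view = (
        f"(SELECT * FROM {table} "
        f"WHERE {column} > {lower} AND {column} <= {upper}) AS t"
    )

    return {
        "url": url,
        "dbtable": inline_view,
        # The next four options only control how the read is split into
        # parallel queries; lowerBound/upperBound do not filter rows.
        "partitionColumn": column,
        "lowerBound": str(lower),
        "upperBound": str(upper),
        "numPartitions": str(int(num_partitions)),
    }


# With a SparkSession named `spark` (not created here), the options would
# be used as: spark.read.format("jdbc").options(**opts).load()
opts = jdbc_read_options("jdbc:postgresql://localhost/db", "events", "id", 0, 1000, 4)
```

Filtering in the inline view and partitioning are separate concerns here: the view decides which rows belong to the batch, while the bounds only set the stride Spark uses to divide that batch across connections.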

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
Essentially, you want to create a query like > > select * from table where INSERTED_ON > lowerBound and > INSERTED_ON > every time you run the job.... > > > > On Wed, Jan 4, 2017 at 2:13 AM, Yuanzhe Yang wrote: > >> Hi Ayan, >> >> Thanks a lot for your
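The idea of re-running a bounded query on every job run can be sketched with a stored "last seen" value. The snippet above is truncated, so this is only one plausible reading of it, not the poster's exact query. To keep the example runnable without a database server, it uses Python's built-in `sqlite3`; the `events` table, its columns, and the helper names are assumptions made for illustration, and `inserted_on` is stored as an integer timestamp.

```python
import sqlite3


def fetch_new_rows(conn, last_seen):
    """Return rows with inserted_on > last_seen, plus the advanced watermark."""
    rows = conn.execute(
        "SELECT id, payload, inserted_on FROM events "
        "WHERE inserted_on > ? ORDER BY inserted_on, id",
        (last_seen,),
    ).fetchall()
    # If nothing new arrived, keep the old watermark unchanged.
    new_watermark = rows[-1][2] if rows else last_seen
    return rows, new_watermark


def make_demo_db():
    """Create an in-memory, append-only table with two rows (demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, payload TEXT, inserted_on INTEGER)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [(1, "a", 100), (2, "b", 200)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    batch, watermark = fetch_new_rows(conn, 0)          # first run: everything
    conn.execute("INSERT INTO events VALUES (3, 'c', 300)")
    batch, watermark = fetch_new_rows(conn, watermark)  # second run: only id 3
    print(batch, watermark)
```

The watermark has to be persisted between runs for this to work across job restarts. A strict `>` comparison can also skip a row that arrives later with the same timestamp as the stored watermark, so a monotonically increasing key is a safer choice when one exists.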

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
k to grab data from DB. > > In Spark, you can use sqlContext.load function for JDBC and use > partitionColumn and numPartitions to define parallelism of connection. > > Best > Ayan > > On Tue, Jan 3, 2017 at 10:49 PM, Yuanzhe Yang wrote: > >> Hi Ayan, >> >> Thanks

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
>> You can try out *debezium* : https://github.com/debezium. it reads data >> from bin-logs, provides structure and stream into Kafka. >> >> Now Kafka can be your new source for streaming. >> >> On Tue, Jan 3, 2017 at 4:36 PM, Yuanzhe Yang wrote: >> >>>

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
can be your new source for streaming. > > On Tue, Jan 3, 2017 at 4:36 PM, Yuanzhe Yang wrote: > >> Hi Hongdi, >> >> Thanks a lot for your suggestion. The data is truly immutable and the >> table is append-only. But actually there are different databases involved,

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
g? >> >> On Fri, Dec 30, 2016 at 9:01 AM, Michael Armbrust > > wrote: >> >>> We don't support this yet, but I've opened this JIRA as it sounds >>> generally useful: https://issues.apache.org/jira/browse/SPARK-19031 >>> >>> In th

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
d this JIRA as it sounds >> generally useful: https://issues.apache.org/jira/browse/SPARK-19031 >> >> In the meantime you could try implementing your own Source, but that is >> pretty low level and is not yet a stable API. >> >> On Thu, Dec 29, 2016 at 4:05 AM

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
> > In the meantime you could try implementing your own Source, but that is > pretty low level and is not yet a stable API. > > On Thu, Dec 29, 2016 at 4:05 AM, "Yuanzhe Yang (杨远哲)" > wrote: > >> Hi all, >> >> Thanks a lot for your contributions to bri

[Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2016-12-29 Thread Yuanzhe Yang (杨远哲)
Hi all, Thanks a lot for your contributions to bring us new technologies. I don't want to waste your time, so before writing to you, I googled and checked Stack Overflow and the mailing list archive with the keywords "streaming" and "jdbc". But I was not able to find any solution to my use case. I hope I ca