Re: data source api v2 refactoring

2018-10-21 Thread JackyLee
I have pushed a patch for SQLStreaming, which resolves the problem just discussed. The Jira: https://issues.apache.org/jira/browse/SPARK-24630 The patch: https://github.com/apache/spark/pull/22575 SQLStreaming just defines the table API for StructStreaming, and the Table APIs for

RE: data source api v2 refactoring

2018-10-18 Thread Mendelson, Assaf
). Thanks, Assaf From: Wenchen Fan [mailto:cloud0...@gmail.com] Sent: Thursday, October 18, 2018 5:26 PM To: Reynold Xin Cc: Ryan Blue; Hyukjin Kwon; Spark dev list Subject: Re: data source api v2 refactoring

Re: data source api v2 refactoring

2018-10-18 Thread Wenchen Fan
" > *Cc: *Wenchen Fan, Hyukjin Kwon, > Spark Dev List > *Subject: *Re: data source api v2 refactoring > > > > Hi Jayesh, > > > > The existing sources haven't been ported to v2 yet. That is going to be > tricky because the existing sources implement behav

Re: data source api v2 refactoring

2018-09-19 Thread Thakrar, Jayesh
Thanks for the info Ryan – very helpful! From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Wednesday, September 19, 2018 at 3:17 PM To: "Thakrar, Jayesh" Cc: Wenchen Fan, Hyukjin Kwon, Spark Dev List Subject: Re: data source api v2 refactoring Hi Jayesh, The exis

Re: data source api v2 refactoring

2018-09-19 Thread Ryan Blue
rom: *Ryan Blue > *Reply-To: * > *Date: *Friday, September 7, 2018 at 2:19 PM > *To: *Wenchen Fan > *Cc: *Hyukjin Kwon , Spark Dev List < > dev@spark.apache.org> > *Subject: *Re: data source api v2 refactoring > > > > There are a few v2-related changes that we can w

Re: data source api v2 refactoring

2018-09-07 Thread Thakrar, Jayesh
To: Wenchen Fan Cc: Hyukjin Kwon , Spark Dev List Subject: Re: data source api v2 refactoring There are a few v2-related changes that we can work in parallel, at least for reviews: * SPARK-25006, #21978<https://github.com/apache/spark/pull/21978>: Add catalog to TableIdentifier - this propos

Re: data source api v2 refactoring

2018-09-07 Thread Ryan Blue
>> } >>>> >>>> Without WriteConfig, the API looks like >>>> trait Table { >>>> LogicalWrite newAppendWrite(); >>>> >>>> LogicalWrite newDeleteWrite(deleteExprs); >>>> } >>>> >>>> >>>> I

Re: data source api v2 refactoring

2018-09-07 Thread Wenchen Fan
ewDeleteWrite(deleteExprs); >>> } >>> >>> >>> It looks to me that the API is simpler without WriteConfig, what do you >>> think? >>> >>> Thanks, >>> Wenchen >>> >>> On Wed, Sep 5, 2018 at 4:24 AM Ry

Re: data source api v2 refactoring

2018-09-07 Thread Hyukjin Kwon
> trait Table { >> LogicalWrite newAppendWrite(); >> >> LogicalWrite newDeleteWrite(deleteExprs); >> } >> >> >> It looks to me that the API is simpler without WriteConfig, what do you >> think? >> >> Thanks, >> Wenchen

Re: data source api v2 refactoring

2018-09-06 Thread Ryan Blue
> >> Latest from Wenchen in case it was dropped. >> >> -- Forwarded message - >> From: Wenchen Fan >> Date: Mon, Sep 3, 2018 at 6:16 AM >> Subject: Re: data source api v2 refactoring >> To: >> Cc: Ryan Blue, Reynold Xin, <

Re: data source api v2 refactoring

2018-09-04 Thread Wenchen Fan
ase it was dropped. > > -- Forwarded message - > From: Wenchen Fan > Date: Mon, Sep 3, 2018 at 6:16 AM > Subject: Re: data source api v2 refactoring > To: > Cc: Ryan Blue , Reynold Xin , < > dev@spark.apache.org> > > > Hi Mridul, > >

Re: data source api v2 refactoring

2018-09-04 Thread Marcelo Vanzin
Same here, I don't see anything from Wenchen... just replies to him. On Sat, Sep 1, 2018 at 9:31 PM Mridul Muralidharan wrote: > > > Is it only me or are all others getting Wenchen’s mails ? (Obviously Ryan did > :-) ) > I did not see it in the mail thread I received or in archives ... [1] >

Re: data source api v2 refactoring

2018-09-01 Thread Mridul Muralidharan
Is it only me or are all others getting Wenchen’s mails? (Obviously Ryan did :-) ) I did not see it in the mail thread I received or in archives ... [1] Wondering which other senders were getting dropped (if yes). Regards Mridul [1]

Re: data source api v2 refactoring

2018-09-01 Thread Ryan Blue
Thanks for clarifying, Wenchen. I think that's what I expected. As for the abstraction, here's the way that I think about it: there are two important parts of a scan: the definition of what will be read, and task sets that actually perform the read. In batch, there's one definition of the scan

Re: data source api v2 refactoring

2018-08-31 Thread Jungtaek Lim
Nice suggestion, Reynold, and great news to see that Wenchen succeeded in prototyping! One thing I would like to make sure of is how continuous mode works with such an abstraction. Would continuous mode also be abstracted with Stream, with createScan providing an unbounded Scan? Thanks, Jungtaek Lim

Re: data source api v2 refactoring

2018-08-31 Thread Ryan Blue
Thanks, Reynold! I think your API sketch looks great. I appreciate having the Table level in the abstraction to plug into as well. I think this makes it clear what everything does, particularly having the Stream level that represents a configured (by ScanConfig) streaming read and can act as a