On Fri, Jun 24, 2016 at 7:05 PM, Dan Halperin <[email protected]> wrote:
> On Fri, Jun 24, 2016 at 2:03 PM, Raghu Angadi <[email protected]> wrote:
>
>> DSL is a pretty generic term..
>
> I agree and am not married to it. Neville?
>
>> The fact that scio uses Java SDK is an implementation detail.
>
> Reasonable, which is why I am also not pushing hard for '/java/scio' to be
> in the path.
>
>> I love the name scio. But I think sdks/scala might be most appropriate
>> and would make it a first class citizen for Beam.
>
> I am strongly against it being in the 'sdks/' top-level module -- it's not
> a Beam SDK. Unlike DSL, SDK is a very specific term in Beam.
>
>> Where would a future python sdk reside?
>
> The Python SDK is in the python-sdk branch on Apache already, and it lives
> in `sdks/python`. (And it is aiming to become a proper Beam SDK. ;)
> Now with a link: https://github.com/apache/incubator-beam/tree/python-sdk/sdks
>
> Thanks,
> Dan
>
>> On Fri, Jun 24, 2016 at 1:50 PM, Jean-Baptiste Onofré <[email protected]> wrote:
>>
>> > Agree for dsls/scio
>> >
>> > Regards
>> > JB
>> >
>> > On 06/24/2016 10:22 PM, Lukasz Cwik wrote:
>> >
>> >> +1 for dsls/scio for the already listed reasons
>> >>
>> >> On Fri, Jun 24, 2016 at 11:21 AM, Rafal Wojdyla <[email protected]> wrote:
>> >>
>> >>> Hello. When it comes to SDK vs DSL - I fully agree with Frances. About
>> >>> dsls/java/scio or dsls/scio - dsls/java/scio may cause confusion, scio is a
>> >>> scala DSL but lives under java directory (?) - that makes sense only once
>> >>> you get that scio is using java SDK under the hood. Thus, +1 to dsls/scio.
>> >>> - Rafal
>> >>>
>> >>> On Fri, Jun 24, 2016 at 2:01 PM, Kenneth Knowles <[email protected]> wrote:
>> >>>
>> >>>> My +1 goes to dsls/scio. It already has a cool name, so let's use it. And
>> >>>> there might be other Scala-based DSLs.
>> >>>>
>> >>>> On Fri, Jun 24, 2016 at 8:39 AM, Ismaël Mejía <[email protected]> wrote:
>> >>>>
>> >>>>> Hello everyone,
>> >>>>>
>> >>>>> Neville, thanks a lot for your contribution. Your work is amazing and I am
>> >>>>> really happy that this scala integration is finally happening.
>> >>>>> Congratulations to you and your team.
>> >>>>>
>> >>>>> I *strongly* disagree about the DSL classification for scio for one reason,
>> >>>>> if you go to the root of the term, Domain Specific Languages are about a
>> >>>>> domain, and the domain in this case is writing Beam pipelines, which is a
>> >>>>> really broad domain.
>> >>>>>
>> >>>>> I agree with Frances’ argument that scio is not an SDK e.g. it reuses the
>> >>>>> existing Beam java SDK. My proposition is that scio will be called the
>> >>>>> Scala API because in the end this is what it is. I think the confusion
>> >>>>> comes from the common definition of SDK which is normally an API + a
>> >>>>> Runtime. In this case scio will share the runtime with what we call the
>> >>>>> Beam Java SDK.
>> >>>>>
>> >>>>> One additional point of using the term API is that it sends the clear
>> >>>>> message that Beam has a Scala API too (which is good for visibility as JB
>> >>>>> mentioned).
>> >>>>>
>> >>>>> Regards,
>> >>>>> Ismaël
>> >>>>>
>> >>>>> On Fri, Jun 24, 2016 at 5:08 PM, Jean-Baptiste Onofré <[email protected]> wrote:
>> >>>>>
>> >>>>>> Hi Dan,
>> >>>>>>
>> >>>>>> fair enough.
>> >>>>>>
>> >>>>>> As I'm also working on new DSLs (XML, JSON), I already created the dsls
>> >>>>>> module.
>> >>>>>>
>> >>>>>> So, I would say dsls/scala.
>> >>>>>>
>> >>>>>> WDYT ?
>> >>>>>>
>> >>>>>> Regards
>> >>>>>> JB
>> >>>>>>
>> >>>>>> On 06/24/2016 05:07 PM, Dan Halperin wrote:
>> >>>>>>
>> >>>>>>> I don't think that sdks/scala is the right place -- scio is not a Beam
>> >>>>>>> Scala SDK; it wraps the existing Java SDK.
>> >>>>>>>
>> >>>>>>> Some options:
>> >>>>>>> * sdks/java/extensions (Scio builds on the Java SDK) -- mentally vetoed
>> >>>>>>>   since Scio isn't an extension for the Java SDK, but rather a wrapper
>> >>>>>>> * dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
>> >>>>>>> * dsls/scio (Scio is a Beam DSL that could eventually use multiple SDKs)
>> >>>>>>> * extensions/java/scio (Scio is an extension of Beam that uses the Java SDK)
>> >>>>>>> * extensions/scio (Scio is an extension of Beam that is not limited to one SDK)
>> >>>>>>>
>> >>>>>>> I lean towards either dsls/java/scio or extensions/java/scio, since I don't
>> >>>>>>> think there are plans for Scio to handle multiple different SDKs (in
>> >>>>>>> different languages). The question between these two is whether we think
>> >>>>>>> DSLs are "big enough" to be a top level concept.
>> >>>>>>>
>> >>>>>>> On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <[email protected]> wrote:
>> >>>>>>>
>> >>>>>>>> Good point about new Fn and the fact it's based on the Java SDK.
>> >>>>>>>>
>> >>>>>>>> It's just that in terms of "marketing", it's a good message to provide a
>> >>>>>>>> Scala SDK even if technically it's more a DSL.
>> >>>>>>>>
>> >>>>>>>> For instance, a valid "marketing" DSL would be a Java fluent DSL on top of
>> >>>>>>>> the Java SDK, or a declarative XML DSL.
>> >>>>>>>>
>> >>>>>>>> However, from a technical perspective, it can go into dsl module.
>> >>>>>>>>
>> >>>>>>>> My $0.02 ;)
>> >>>>>>>>
>> >>>>>>>> Regards
>> >>>>>>>> JB
>> >>>>>>>>
>> >>>>>>>> On 06/24/2016 06:51 AM, Frances Perry wrote:
>> >>>>>>>>
>> >>>>>>>>> +Rafal & Andrew again
>> >>>>>>>>>
>> >>>>>>>>> I am leaning DSL for two reasons: (1) scio uses the existing java execution
>> >>>>>>>>> environment (and won't have a language-specific fn harness of its own), and
>> >>>>>>>>> (2) it changes the abstractions that users interact with.
>> >>>>>>>>>
>> >>>>>>>>> I recently saw a scio repl demo from Reuven -- there's some really cool
>> >>>>>>>>> stuff in there. I'd love to dive into it a bit more and see what can be
>> >>>>>>>>> generalized beyond scio. The repl-like interactive graph construction is
>> >>>>>>>>> very similar to what we've seen with ipython, in that it doesn't always
>> >>>>>>>>> play nicely with the graph construction / graph execution distinction. I
>> >>>>>>>>> wonder what changes to Beam might more generally support this. The
>> >>>>>>>>> materialize stuff looks similar to some functionality in FlumeJava we used
>> >>>>>>>>> to support multi-segment pipelines with some shared intermediate
>> >>>>>>>>> PCollections.
>> >>>>>>>>>
>> >>>>>>>>> On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <[email protected]> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi Neville,
>> >>>>>>>>>>
>> >>>>>>>>>> thanks for the update !
>> >>>>>>>>>>
>> >>>>>>>>>> As it's another language support, and to clearly identify the purpose, I
>> >>>>>>>>>> would say sdks/scala.
>> >>>>>>>>>>
>> >>>>>>>>>> Regards
>> >>>>>>>>>> JB
>> >>>>>>>>>>
>> >>>>>>>>>> On 06/23/2016 11:56 PM, Neville Li wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> +folks in my team
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Thu, Jun 23, 2016 at 5:57 PM Neville Li <[email protected]> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi all,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I'm the co-author of Scio <https://github.com/spotify/scio> and am in
>> >>>>>>>>>>>> the process of moving code to Beam (BEAM-302
>> >>>>>>>>>>>> <https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
>> >>>>>>>>>>>> sdks/scala is the right place for this code or if something like dsls/scio
>> >>>>>>>>>>>> is a better choice? What do you think?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> A little background: Scio was built as a high-level Scala API for Google
>> >>>>>>>>>>>> Cloud Dataflow (now also Apache Beam) and is heavily influenced by Spark
>> >>>>>>>>>>>> and Scalding. It wraps around the Dataflow/Beam Java SDK while also
>> >>>>>>>>>>>> providing features comparable to other Scala data frameworks. We use Scio
>> >>>>>>>>>>>> on Dataflow for production extensively inside Spotify.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Cheers,
>> >>>>>>>>>>>> Neville
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>> Jean-Baptiste Onofré
>> >>>>>>>>>> [email protected]
>> >>>>>>>>>> http://blog.nanthrax.net
>> >>>>>>>>>> Talend - http://www.talend.com
