I'm also in favor of branding it a DSL rather than an SDK, mostly because it uses the Java SDK and because it does not (necessarily) follow/implement the Beam model, as the Java SDK does and as the Python SDK is apparently aiming to.

On Sat, 25 Jun 2016 at 10:04 Amit Sela <[email protected]> wrote:

> Just looked at some Scio examples - and saw Spark Scala code ;-)
>
> For me, this made some sense - Spark is written in Scala (let's call it Scala SDK ?) but it also provides Java API. New version has a unified API (Java-Scala interop.) So I see Scio in a similar way, It's Scala API because it's built on top of the Java SDK.
> Having said that, Scio could offer more than just Scala API over the Java SDK (i.e., repl) so in the lack of a native fit, I'd go with DSL. And to relate to the very valid notes people had about saying "Hi, we support Scala!", we can call it Scala API, even if it's under dsls/scio.
>
> So +1 for dsls/scio
>
> Thanks,
> Amit
>
> On Sat, Jun 25, 2016 at 5:06 AM Dan Halperin <[email protected]> wrote:
>
> > On Fri, Jun 24, 2016 at 7:05 PM, Dan Halperin <[email protected]> wrote:
> >
> > > On Fri, Jun 24, 2016 at 2:03 PM, Raghu Angadi <[email protected]> wrote:
> > >
> > >> DSL is a pretty generic term..
> > >
> > > I agree and am not married to it. Neville?
> > >
> > >> The fact that scio uses Java SDK is an implementation detail.
> > >
> > > Reasonable, which is why I am also not pushing hard for '/java/scio' to be in the path.
> > >
> > >> I love the name scio. But I think sdks/scala might be most appropriate and would make it a first class citizen for Beam.
> > >
> > > I am strongly against it being in the 'sdks/' top-level module -- it's not a Beam SDK. Unlike DSL, SDK is a very specific term in Beam.
> > >
> > >> Where would a future python sdk reside?
> > >
> > > The Python SDK is in the python-sdk branch on Apache already, and it lives in `sdks/python`. (And it is aiming to become a proper Beam SDK. ;)
> >
> > Now with a link:
> > https://github.com/apache/incubator-beam/tree/python-sdk/sdks
> >
> > > Thanks,
> > > Dan
> > >
> > >> On Fri, Jun 24, 2016 at 1:50 PM, Jean-Baptiste Onofré <[email protected]> wrote:
> > >>
> > >> > Agree for dsls/scio
> > >> >
> > >> > Regards
> > >> > JB
> > >> >
> > >> > On 06/24/2016 10:22 PM, Lukasz Cwik wrote:
> > >> >
> > >> >> +1 for dsls/scio for the already listed reasons
> > >> >>
> > >> >> On Fri, Jun 24, 2016 at 11:21 AM, Rafal Wojdyla <[email protected]> wrote:
> > >> >>
> > >> >>> Hello. When it comes to SDK vs DSL - I fully agree with Frances. About dsls/java/scio or dsls/scio - dsls/java/scio may cause confusion, scio is a scala DSL but lives under java directory (?) - that makes sense only once you get that scio is using java SDK under the hood. Thus, +1 to dsls/scio.
> > >> >>> - Rafal
> > >> >>>
> > >> >>> On Fri, Jun 24, 2016 at 2:01 PM, Kenneth Knowles <[email protected]> wrote:
> > >> >>>
> > >> >>>> My +1 goes to dsls/scio. It already has a cool name, so let's use it. And there might be other Scala-based DSLs.
> > >> >>>>
> > >> >>>> On Fri, Jun 24, 2016 at 8:39 AM, Ismaël Mejía <[email protected]> wrote:
> > >> >>>>
> > >> >>>>> Hello everyone,
> > >> >>>>>
> > >> >>>>> Neville, thanks a lot for your contribution. Your work is amazing and I am really happy that this scala integration is finally happening. Congratulations to you and your team.
> > >> >>>>>
> > >> >>>>> I *strongly* disagree about the DSL classification for scio for one reason, if you go to the root of the term, Domain Specific Languages are about a domain, and the domain in this case is writing Beam pipelines, which is a really broad domain.
> > >> >>>>>
> > >> >>>>> I agree with Frances’ argument that scio is not an SDK e.g. it reuses the existing Beam java SDK. My proposition is that scio will be called the Scala API because in the end this is what it is. I think the confusion comes from the common definition of SDK which is normally an API + a Runtime. In this case scio will share the runtime with what we call the Beam Java SDK.
> > >> >>>>>
> > >> >>>>> One additional point of using the term API is that it sends the clear message that Beam has a Scala API too (which is good for visibility as JB mentioned).
> > >> >>>>>
> > >> >>>>> Regards,
> > >> >>>>> Ismaël
> > >> >>>>>
> > >> >>>>> On Fri, Jun 24, 2016 at 5:08 PM, Jean-Baptiste Onofré <[email protected]> wrote:
> > >> >>>>>
> > >> >>>>>> Hi Dan,
> > >> >>>>>>
> > >> >>>>>> fair enough.
> > >> >>>>>>
> > >> >>>>>> As I'm also working on new DSLs (XML, JSON), I already created the dsls module.
> > >> >>>>>>
> > >> >>>>>> So, I would say dsls/scala.
> > >> >>>>>>
> > >> >>>>>> WDYT ?
> > >> >>>>>>
> > >> >>>>>> Regards
> > >> >>>>>> JB
> > >> >>>>>>
> > >> >>>>>> On 06/24/2016 05:07 PM, Dan Halperin wrote:
> > >> >>>>>>
> > >> >>>>>>> I don't think that sdks/scala is the right place -- scio is not a Beam Scala SDK; it wraps the existing Java SDK.
> > >> >>>>>>>
> > >> >>>>>>> Some options:
> > >> >>>>>>> * sdks/java/extensions (Scio builds on the Java SDK) -- mentally vetoed since Scio isn't an extension for the Java SDK, but rather a wrapper
> > >> >>>>>>> * dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
> > >> >>>>>>> * dsls/scio (Scio is a Beam DSL that could eventually use multiple SDKs)
> > >> >>>>>>> * extensions/java/scio (Scio is an extension of Beam that uses the Java SDK)
> > >> >>>>>>> * extensions/scio (Scio is an extension of Beam that is not limited to one SDK)
> > >> >>>>>>>
> > >> >>>>>>> I lean towards either dsls/java/scio or extensions/java/scio, since I don't think there are plans for Scio to handle multiple different SDKs (in different languages). The question between these two is whether we think DSLs are "big enough" to be a top level concept.
> > >> >>>>>>>
> > >> >>>>>>> On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <[email protected]> wrote:
> > >> >>>>>>>
> > >> >>>>>>>> Good point about new Fn and the fact it's based on the Java SDK.
> > >> >>>>>>>>
> > >> >>>>>>>> It's just that in term of "marketing", it's a good message to provide a Scala SDK even if technically it's more a DSL.
> > >> >>>>>>>>
> > >> >>>>>>>> For instance, a valid "marketing" DSL would be a Java fluent DSL on top of the Java SDK, or a declarative XML DSL.
> > >> >>>>>>>>
> > >> >>>>>>>> However, from a technical perspective, it can go into dsl module.
> > >> >>>>>>>>
> > >> >>>>>>>> My $0.02 ;)
> > >> >>>>>>>>
> > >> >>>>>>>> Regards
> > >> >>>>>>>> JB
> > >> >>>>>>>>
> > >> >>>>>>>> On 06/24/2016 06:51 AM, Frances Perry wrote:
> > >> >>>>>>>>
> > >> >>>>>>>>> +Rafal & Andrew again
> > >> >>>>>>>>>
> > >> >>>>>>>>> I am leaning DSL for two reasons: (1) scio uses the existing java execution environment (and won't have a language-specific fn harness of its own), and (2) it changes the abstractions that users interact with.
> > >> >>>>>>>>>
> > >> >>>>>>>>> I recently saw a scio repl demo from Reuven -- there's some really cool stuff in there. I'd love to dive into it a bit more and see what can be generalized beyond scio. The repl-like interactive graph construction is very similar to what we've seen with ipython, in that it doesn't always play nicely with the graph construction / graph execution distinction. I wonder what changes to Beam might more generally support this. The materialize stuff looks similar to some functionality in FlumeJava we used to support multi-segment pipelines with some shared intermediate PCollections.
> > >> >>>>>>>>>
> > >> >>>>>>>>> On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <[email protected]> wrote:
> > >> >>>>>>>>>
> > >> >>>>>>>>>> Hi Neville,
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> thanks for the update !
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> As it's another language support, and to clearly identify the purpose, I would say sdks/scala.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Regards
> > >> >>>>>>>>>> JB
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> On 06/23/2016 11:56 PM, Neville Li wrote:
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>> +folks in my team
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> On Thu, Jun 23, 2016 at 5:57 PM Neville Li <[email protected]> wrote:
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>>> Hi all,
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> I'm the co-author of Scio <https://github.com/spotify/scio> and am in the progress of moving code to Beam (BEAM-302 <https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if sdks/scala is the right place for this code or if something like dsls/scio is a better choice? What do you think?
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> A little background: Scio was built as a high-level Scala API for Google Cloud Dataflow (now also Apache Beam) and is heavily influenced by Spark and Scalding. It wraps around the Dataflow/Beam Java SDK while also providing features comparable to other Scala data frameworks. We use Scio on Dataflow for production extensively inside Spotify.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Cheers,
> > >> >>>>>>>>>>>> Neville
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> --
> > >> >>>>>>>>>> Jean-Baptiste Onofré
> > >> >>>>>>>>>> [email protected]
> > >> >>>>>>>>>> http://blog.nanthrax.net
> > >> >>>>>>>>>> Talend - http://www.talend.com
