My +1 goes to dsls/scio. It already has a cool name, so let's use it. And there might be other Scala-based DSLs.
On Fri, Jun 24, 2016 at 8:39 AM, Ismaël Mejía <[email protected]> wrote:
> Hello everyone,
>
> Neville, thanks a lot for your contribution. Your work is amazing and I am
> really happy that this scala integration is finally happening.
> Congratulations to you and your team.
>
> I *strongly* disagree about the DSL classification for scio for one reason,
> if you go to the root of the term, Domain Specific Languages are about a
> domain, and the domain in this case is writing Beam pipelines, which is a
> really broad domain.
>
> I agree with Frances’ argument that scio is not an SDK e.g. it reuses the
> existing Beam java SDK. My proposition is that scio will be called the
> Scala API because in the end this is what it is. I think the confusion
> comes from the common definition of SDK which is normally an API + a
> Runtime. In this case scio will share the runtime with what we call the
> Beam Java SDK.
>
> One additional point of using the term API is that it sends the clear
> message that Beam has a Scala API too (which is good for visibility as JB
> mentioned).
>
> Regards,
> Ismaël
>
>
> On Fri, Jun 24, 2016 at 5:08 PM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
> > Hi Dan,
> >
> > fair enough.
> >
> > As I'm also working on new DSLs (XML, JSON), I already created the dsls
> > module.
> >
> > So, I would say dsls/scala.
> >
> > WDYT ?
> >
> > Regards
> > JB
> >
> >
> > On 06/24/2016 05:07 PM, Dan Halperin wrote:
> >
> >> I don't think that sdks/scala is the right place -- scio is not a Beam
> >> Scala SDK; it wraps the existing Java SDK.
> >>
> >> Some options:
> >> * sdks/java/extensions (Scio builds on the Java SDK) -- mentally vetoed
> >> since Scio isn't an extension for the Java SDK, but rather a wrapper
> >>
> >> * dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
> >> * dsls/scio (Scio is a Beam DSL that could eventually use multiple SDKs)
> >> * extensions/java/scio (Scio is an extension of Beam that uses the Java
> >> SDK)
> >> * extensions/scio (Scio is an extension of Beam that is not limited to
> >> one SDK)
> >>
> >> I lean towards either dsls/java/scio or extensions/java/scio, since I
> >> don't think there are plans for Scio to handle multiple different SDKs
> >> (in different languages). The question between these two is whether we
> >> think DSLs are "big enough" to be a top level concept.
> >>
> >> On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <[email protected]>
> >> wrote:
> >>
> >>> Good point about new Fn and the fact it's based on the Java SDK.
> >>>
> >>> It's just that in terms of "marketing", it's a good message to provide a
> >>> Scala SDK even if technically it's more a DSL.
> >>>
> >>> For instance, a valid "marketing" DSL would be a Java fluent DSL on top
> >>> of the Java SDK, or a declarative XML DSL.
> >>>
> >>> However, from a technical perspective, it can go into dsl module.
> >>>
> >>> My $0.02 ;)
> >>>
> >>> Regards
> >>> JB
> >>>
> >>>
> >>> On 06/24/2016 06:51 AM, Frances Perry wrote:
> >>>
> >>>> +Rafal & Andrew again
> >>>>
> >>>> I am leaning DSL for two reasons: (1) scio uses the existing java
> >>>> execution environment (and won't have a language-specific fn harness
> >>>> of its own), and (2) it changes the abstractions that users interact
> >>>> with.
> >>>>
> >>>> I recently saw a scio repl demo from Reuven -- there's some really cool
> >>>> stuff in there. I'd love to dive into it a bit more and see what can be
> >>>> generalized beyond scio.
> >>>> The repl-like interactive graph construction is very similar to what
> >>>> we've seen with ipython, in that it doesn't always play nicely with the
> >>>> graph construction / graph execution distinction. I wonder what changes
> >>>> to Beam might more generally support this. The materialize stuff looks
> >>>> similar to some functionality in FlumeJava we used to support
> >>>> multi-segment pipelines with some shared intermediate PCollections.
> >>>>
> >>>> On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Hi Neville,
> >>>>>
> >>>>> thanks for the update !
> >>>>>
> >>>>> As it's another language support, and to clearly identify the purpose,
> >>>>> I would say sdks/scala.
> >>>>>
> >>>>> Regards
> >>>>> JB
> >>>>>
> >>>>>
> >>>>> On 06/23/2016 11:56 PM, Neville Li wrote:
> >>>>>
> >>>>>> +folks in my team
> >>>>>>
> >>>>>> On Thu, Jun 23, 2016 at 5:57 PM Neville Li <[email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I'm the co-author of Scio <https://github.com/spotify/scio> and am
> >>>>>>> in the process of moving code to Beam (BEAM-302
> >>>>>>> <https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
> >>>>>>> sdks/scala is the right place for this code or if something like
> >>>>>>> dsls/scio is a better choice? What do you think?
> >>>>>>>
> >>>>>>> A little background: Scio was built as a high-level Scala API for
> >>>>>>> Google Cloud Dataflow (now also Apache Beam) and is heavily
> >>>>>>> influenced by Spark and Scalding. It wraps around the Dataflow/Beam
> >>>>>>> Java SDK while also providing features comparable to other Scala
> >>>>>>> data frameworks. We use Scio on Dataflow for production extensively
> >>>>>>> inside Spotify.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Neville
> >>>>>>>
> >>>>> --
> >>>>> Jean-Baptiste Onofré
> >>>>> [email protected]
> >>>>> http://blog.nanthrax.net
> >>>>> Talend - http://www.talend.com
