Hello everyone,

Neville, thanks a lot for your contribution. Your work is amazing and I am
really happy that this Scala integration is finally happening.
Congratulations to you and your team.

I *strongly* disagree with the DSL classification for scio, for one reason:
if you go to the root of the term, Domain Specific Languages are about a
domain, and the domain in this case is writing Beam pipelines, which is a
really broad domain.

I agree with Frances’ argument that scio is not an SDK, since it reuses the
existing Beam Java SDK. My proposal is that scio be called the
Scala API, because in the end that is what it is. I think the confusion
comes from the common definition of an SDK, which is normally an API plus a
runtime. In this case scio will share the runtime with what we call the
Beam Java SDK.

One additional advantage of using the term API is that it sends the clear
message that Beam has a Scala API too (which is good for visibility, as JB
mentioned).

Regards,
Ismaël


On Fri, Jun 24, 2016 at 5:08 PM, Jean-Baptiste Onofré <[email protected]>
wrote:

> Hi Dan,
>
> fair enough.
>
> As I'm also working on new DSLs (XML, JSON), I already created the dsls
> module.
>
> So, I would say dsls/scala.
>
> WDYT ?
>
> Regards
> JB
>
>
> On 06/24/2016 05:07 PM, Dan Halperin wrote:
>
>> I don't think that sdks/scala is the right place -- scio is not a Beam
>> Scala SDK; it wraps the existing Java SDK.
>>
>> Some options:
>> * sdks/java/extensions  (Scio builds on the Java SDK) -- mentally vetoed
>> since Scio isn't an extension for the Java SDK, but rather a wrapper
>>
>> * dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
>> * dsls/scio  (Scio is a Beam DSL that could eventually use multiple SDKs)
>> * extensions/java/scio  (Scio is an extension of Beam that uses the Java
>> SDK)
>> * extensions/scio  (Scio is an extension of Beam that is not limited to
>> one
>> SDK)
>>
>> I lean towards either dsls/java/scio or extensions/java/scio, since I
>> don't
>> think there are plans for Scio to handle multiple different SDKs (in
>> different languages). The question between these two is whether we think
>> DSLs are "big enough" to be a top level concept.
>>
>> On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>> Good point about new Fn and the fact it's based on the Java SDK.
>>>
>>> It's just that in terms of "marketing", it's a good message to provide a
>>> Scala SDK even if technically it's more a DSL.
>>>
>>> For instance, a valid "marketing" DSL would be a Java fluent DSL on top
>>> of the Java SDK, or a declarative XML DSL.
>>>
>>> However, from a technical perspective, it can go into the dsls module.
>>>
>>> My $0.02 ;)
>>>
>>> Regards
>>> JB
>>>
>>>
>>> On 06/24/2016 06:51 AM, Frances Perry wrote:
>>>
>>> +Rafal & Andrew again
>>>>
>>>> I am leaning DSL for two reasons: (1) scio uses the existing java
>>>> execution
>>>> environment (and won't have a language-specific fn harness of its own),
>>>> and
>>>> (2) it changes the abstractions that users interact with.
>>>>
>>>> I recently saw a scio repl demo from Reuven -- there's some really cool
>>>> stuff in there. I'd love to dive into it a bit more and see what can be
>>>> generalized beyond scio. The repl-like interactive graph construction is
>>>> very similar to what we've seen with ipython, in that it doesn't always
>>>> play nicely with the graph construction / graph execution distinction. I
>>>> wonder what changes to Beam might more generally support this. The
>>>> materialize stuff looks similar to some functionality in FlumeJava we
>>>> used
>>>> to support multi-segment pipelines with some shared intermediate
>>>> PCollections.
>>>>
>>>> On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <[email protected]>
>>>> wrote:
>>>>
>>>> Hi Neville,
>>>>
>>>>>
>>>>> thanks for the update !
>>>>>
>>>>> As it's another language support, and to clearly identify the purpose,
>>>>> I
>>>>> would say sdks/scala.
>>>>>
>>>>> Regards
>>>>> JB
>>>>>
>>>>>
>>>>> On 06/23/2016 11:56 PM, Neville Li wrote:
>>>>>
>>>>> +folks in my team
>>>>>
>>>>>>
>>>>>> On Thu, Jun 23, 2016 at 5:57 PM Neville Li <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>>
>>>>>>> I'm the co-author of Scio <https://github.com/spotify/scio> and am
>>>>>>> in the process of moving code to Beam (BEAM-302
>>>>>>> <https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
>>>>>>> sdks/scala is the right place for this code or if something like
>>>>>>> dsls/scio is a better choice? What do you think?
>>>>>>>
>>>>>>> A little background: Scio was built as a high-level Scala API for
>>>>>>> Google
>>>>>>> Cloud Dataflow (now also Apache Beam) and is heavily influenced by
>>>>>>> Spark
>>>>>>> and Scalding. It wraps around the Dataflow/Beam Java SDK while also
>>>>>>> providing features comparable to other Scala data frameworks. We use
>>>>>>> Scio
>>>>>>> on Dataflow for production extensively inside Spotify.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Neville
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>
>>>>> Jean-Baptiste Onofré
>>>>> [email protected]
>>>>> http://blog.nanthrax.net
>>>>> Talend - http://www.talend.com
>>>>>
>>>>>
>>>>>
>>>> --
>>> Jean-Baptiste Onofré
>>> [email protected]
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>>
>>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
