I don't think that sdks/scala is the right place -- Scio is not a Beam
Scala SDK; it wraps the existing Java SDK.

Some options:
* sdks/java/extensions  (Scio builds on the Java SDK) -- mentally vetoed
since Scio isn't an extension for the Java SDK, but rather a wrapper

* dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
* dsls/scio  (Scio is a Beam DSL that could eventually use multiple SDKs)
* extensions/java/scio  (Scio is an extension of Beam that uses the Java
SDK)
* extensions/scio  (Scio is an extension of Beam that is not limited to one
SDK)

I lean towards either dsls/java/scio or extensions/java/scio, since I don't
think there are plans for Scio to handle multiple different SDKs (in
different languages). The question between these two is whether we think
DSLs are "big enough" to be a top level concept.

On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <[email protected]>
wrote:

> Good point about new Fn and the fact it's based on the Java SDK.
>
> It's just that in terms of "marketing", it's a good message to provide a
> Scala SDK even if technically it's more a DSL.
>
> For instance, a valid "marketing" DSL would be a Java fluent DSL on top of
> the Java SDK, or a declarative XML DSL.
>
> However, from a technical perspective, it can go into a dsl module.
>
> My $0.02 ;)
>
> Regards
> JB
>
>
> On 06/24/2016 06:51 AM, Frances Perry wrote:
>
>> +Rafal & Andrew again
>>
>> I am leaning DSL for two reasons: (1) scio uses the existing java
>> execution environment (and won't have a language-specific fn harness of
>> its own), and (2) it changes the abstractions that users interact with.
>>
>> I recently saw a scio repl demo from Reuven -- there's some really cool
>> stuff in there. I'd love to dive into it a bit more and see what can be
>> generalized beyond scio. The repl-like interactive graph construction is
>> very similar to what we've seen with ipython, in that it doesn't always
>> play nicely with the graph construction / graph execution distinction. I
>> wonder what changes to Beam might more generally support this. The
>> materialize stuff looks similar to some functionality in FlumeJava we used
>> to support multi-segment pipelines with some shared intermediate
>> PCollections.
>>
>> On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>>> Hi Neville,
>>>
>>> Thanks for the update!
>>>
>>> As it adds support for another language, and to clearly identify the
>>> purpose, I would say sdks/scala.
>>>
>>> Regards
>>> JB
>>>
>>>
>>> On 06/23/2016 11:56 PM, Neville Li wrote:
>>>
>>>> +folks in my team
>>>>
>>>> On Thu, Jun 23, 2016 at 5:57 PM Neville Li <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm the co-author of Scio <https://github.com/spotify/scio> and am in the
>>>>> process of moving code to Beam (BEAM-302
>>>>> <https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
>>>>> sdks/scala is the right place for this code or if something like dsls/scio
>>>>> is a better choice? What do you think?
>>>>>
>>>>> A little background: Scio was built as a high-level Scala API for Google
>>>>> Cloud Dataflow (now also Apache Beam) and is heavily influenced by Spark
>>>>> and Scalding. It wraps around the Dataflow/Beam Java SDK while also
>>>>> providing features comparable to other Scala data frameworks. We use Scio
>>>>> on Dataflow for production extensively inside Spotify.
>>>>>
>>>>> Cheers,
>>>>> Neville
>>>>>
>>>>>
>>>>>
>>> --
>>> Jean-Baptiste Onofré
>>> [email protected]
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>>
>>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
