Agreed on dsls/scio

Regards
JB

On 06/24/2016 10:22 PM, Lukasz Cwik wrote:
+1 for dsls/scio for the already listed reasons

On Fri, Jun 24, 2016 at 11:21 AM, Rafal Wojdyla <[email protected]> wrote:

Hello. When it comes to SDK vs DSL, I fully agree with Frances. As for
dsls/java/scio vs dsls/scio: dsls/java/scio may cause confusion, since scio
is a Scala DSL but would live under a java directory (?). That makes sense
only once you get that scio is using the Java SDK under the hood. Thus, +1
to dsls/scio.
- Rafal

On Fri, Jun 24, 2016 at 2:01 PM, Kenneth Knowles <[email protected]> wrote:

My +1 goes to dsls/scio. It already has a cool name, so let's use it. And
there might be other Scala-based DSLs.

On Fri, Jun 24, 2016 at 8:39 AM, Ismaël Mejía <[email protected]> wrote:

Hello everyone,

Neville, thanks a lot for your contribution. Your work is amazing and I am
really happy that this scala integration is finally happening.
Congratulations to you and your team.

I *strongly* disagree about the DSL classification for scio, for one
reason: if you go to the root of the term, Domain Specific Languages are
about a domain, and the domain in this case is writing Beam pipelines,
which is a really broad domain.

I agree with Frances’ argument that scio is not an SDK, since it reuses
the existing Beam Java SDK. My proposal is that scio be called the Scala
API, because in the end this is what it is. I think the confusion comes
from the common definition of an SDK, which is normally an API + a
runtime. In this case scio will share the runtime with what we call the
Beam Java SDK.

One additional point in favor of the term API is that it sends the clear
message that Beam has a Scala API too (which is good for visibility, as JB
mentioned).

Regards,
Ismaël


On Fri, Jun 24, 2016 at 5:08 PM, Jean-Baptiste Onofré <[email protected]>
wrote:

Hi Dan,

fair enough.

As I'm also working on new DSLs (XML, JSON), I already created the dsls
module.

So, I would say dsls/scala.

WDYT ?

Regards
JB


On 06/24/2016 05:07 PM, Dan Halperin wrote:

I don't think that sdks/scala is the right place -- scio is not a Beam
Scala SDK; it wraps the existing Java SDK.

Some options:
* sdks/java/extensions  (Scio builds on the Java SDK) -- mentally vetoed
  since Scio isn't an extension for the Java SDK, but rather a wrapper
* dsls/java/scio  (Scio is a Beam DSL that uses the Java SDK)
* dsls/scio  (Scio is a Beam DSL that could eventually use multiple SDKs)
* extensions/java/scio  (Scio is an extension of Beam that uses the Java
  SDK)
* extensions/scio  (Scio is an extension of Beam that is not limited to
  one SDK)

I lean towards either dsls/java/scio or extensions/java/scio, since I
don't think there are plans for Scio to handle multiple different SDKs (in
different languages). The question between these two is whether we think
DSLs are "big enough" to be a top level concept.

On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <[email protected]>
wrote:

Good point about new Fn and the fact it's based on the Java SDK.

It's just that in terms of "marketing", it's a good message to provide a
Scala SDK even if technically it's more a DSL.

For instance, a valid "marketing" DSL would be a Java fluent DSL on top of
the Java SDK, or a declarative XML DSL.

However, from a technical perspective, it can go into the dsls module.

My $0.02 ;)

Regards
JB


On 06/24/2016 06:51 AM, Frances Perry wrote:

+Rafal & Andrew again

I am leaning DSL for two reasons: (1) scio uses the existing java
execution environment (and won't have a language-specific fn harness of
its own), and (2) it changes the abstractions that users interact with.

I recently saw a scio repl demo from Reuven -- there's some really cool
stuff in there. I'd love to dive into it a bit more and see what can be
generalized beyond scio. The repl-like interactive graph construction is
very similar to what we've seen with ipython, in that it doesn't always
play nicely with the graph construction / graph execution distinction. I
wonder what changes to Beam might more generally support this. The
materialize stuff looks similar to some functionality in FlumeJava we used
to support multi-segment pipelines with some shared intermediate
PCollections.

On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <[email protected]>
wrote:

Hi Neville,


thanks for the update !

As it's another language support, and to clearly identify the purpose, I
would say sdks/scala.

Regards
JB


On 06/23/2016 11:56 PM, Neville Li wrote:

+folks in my team


On Thu, Jun 23, 2016 at 5:57 PM Neville Li <[email protected]> wrote:

Hi all,


I'm the co-author of Scio <https://github.com/spotify/scio> and am in the
process of moving code to Beam (BEAM-302
<https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
sdks/scala is the right place for this code or if something like dsls/scio
is a better choice? What do you think?

A little background: Scio was built as a high-level Scala API for Google
Cloud Dataflow (now also Apache Beam) and is heavily influenced by Spark
and Scalding. It wraps around the Dataflow/Beam Java SDK while also
providing features comparable to other Scala data frameworks. We use Scio
on Dataflow for production extensively inside Spotify.

Cheers,
Neville



--

Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com



