Good point about new Fn and the fact it's based on the Java SDK.

It's just that in terms of "marketing", it's a good message to provide a Scala SDK even if technically it's more of a DSL.

For instance, a valid "marketing" DSL would be a Java fluent DSL on top of the Java SDK, or a declarative XML DSL.

However, from a technical perspective, it can go into the dsl module.

My $0.02 ;)

Regards
JB

On 06/24/2016 06:51 AM, Frances Perry wrote:
+Rafal & Andrew again

I am leaning DSL for two reasons: (1) Scio uses the existing Java execution
environment (and won't have a language-specific fn harness of its own), and
(2) it changes the abstractions that users interact with.

I recently saw a Scio REPL demo from Reuven -- there's some really cool
stuff in there. I'd love to dive into it a bit more and see what can be
generalized beyond Scio. The REPL-like interactive graph construction is
very similar to what we've seen with IPython, in that it doesn't always
play nicely with the graph construction / graph execution distinction. I
wonder what changes to Beam might more generally support this. The
materialize stuff looks similar to some functionality in FlumeJava we used
to support multi-segment pipelines with some shared intermediate
PCollections.

On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <[email protected]>
wrote:

Hi Neville,

Thanks for the update!

As it adds support for another language, and to clearly identify the
purpose, I would say sdks/scala.

Regards
JB


On 06/23/2016 11:56 PM, Neville Li wrote:

+folks in my team

On Thu, Jun 23, 2016 at 5:57 PM Neville Li <[email protected]> wrote:

Hi all,

I'm the co-author of Scio <https://github.com/spotify/scio> and am in the
process of moving code to Beam (BEAM-302
<https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
sdks/scala is the right place for this code or if something like dsls/scio
is a better choice? What do you think?

A little background: Scio was built as a high-level Scala API for Google
Cloud Dataflow (now also Apache Beam) and is heavily influenced by Spark
and Scalding. It wraps around the Dataflow/Beam Java SDK while also
providing features comparable to other Scala data frameworks. We use Scio
on Dataflow for production extensively inside Spotify.

Cheers,
Neville



--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com


