Re: Euphoria Java 8 DSL - proposal

2018-01-02 Thread Jean-Baptiste Onofré

Great !

Thanks !
Regards
JB

On 01/03/2018 07:29 AM, David Morávek wrote:

Hello JB,

Perfect! I'm already on the Beam Slack workspace, I'll contact you once I get to 
the office.


Thanks!
D.

On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré wrote:


Hi David,

absolutely !! Let's move forward on the preparation steps.

Are you on Slack and/or hangout to plan this ?

Thanks,
Regards
JB

On 01/02/2018 05:35 PM, David Morávek wrote:

Hello JB,

can we help in any way to move things forward?

Thanks,
D.

On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré wrote:

     Thanks Jan,

     It makes sense.

     Let me take a look on the code to understand the "interaction".

     Regards
     JB


     On 12/18/2017 04:26 PM, Jan Lukavský wrote:

         Hi JB,

         basically you are not wrong. The project started about three or four
         years ago with a goal to unify batch and streaming processing into
         single portable, executor independent API. Because of that, it is
         currently "close" to Beam in this sense. But we don't see much added
         value keeping this as a separate project, with one of the key
         differences to be the API (not the model itself), so we would like to
         focus on translation from Euphoria API to Beam's SDK. That's why we
         would like to see it as a DSL, so that it would be possible to use
         Euphoria API with Beam's runners as much natively as possible.

         I hope I didn't make the subject even more unclear, if so, I'll be happy
         to explain anything in more detail. :-)

             Jan


         On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:

             Hi Jan,

             Thanks for your answers.

             However, they confused me ;)

             Regarding what you replied, Euphoria seems like a programming
             model/SDK "close" to Beam more than a DSL on top of an existing
             Beam SDK.

             Am I wrong ?

             Regards
             JB

             On 12/18/2017 03:44 PM, Jan Lukavský wrote:

                 Hi Ismael,

                 basically we adopted the Beam's design regarding partitioning
                 (https://github.com/seznam/euphoria/issues/160) and implemented
                 the sorting manually
                 (https://github.com/seznam/euphoria/issues/158). I'm not aware
                 of the time model differences (Euphoria supports ingestion and
                 event time, we don't support processing time by decision).
                 Regarding other differences (looking into Beam capability
                 matrix, I'd say that):

                    - we don't support stateful FlatMap (i.e. ParDo) for now
                 (https://github.com/seznam/euphoria/issues/192)

                    - we don't support side inputs (by decision now, but might be
                 reconsidered) and outputs
                 (https://github.com/seznam/euphoria/issues/124)

                    - we support complete event-time windows (non-merging,
                 merging, aligned, unaligned) and time control

                    - we don't support processing time by decision (might be
                 reconsidered if a valid use-case is found)

                    - we support window triggering based on both time and data,
                 including discarding and accumulating (without accumulating &
                 retracting)

                 All our executors (runners) - Flink, Spark and 

Re: Euphoria Java 8 DSL - proposal

2018-01-02 Thread David Morávek
Hello JB,

Perfect! I'm already on the Beam Slack workspace, I'll contact you once I
get to the office.

Thanks!
D.

On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré 
wrote:

> Hi David,
>
> absolutely !! Let's move forward on the preparation steps.
>
> Are you on Slack and/or hangout to plan this ?
>
> Thanks,
> Regards
> JB
>
> On 01/02/2018 05:35 PM, David Morávek wrote:
>
>> Hello JB,
>>
>> can we help in any way to move things forward?
>>
>> Thanks,
>> D.
>>
>> On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré wrote:
>>
>> Thanks Jan,
>>
>> It makes sense.
>>
>> Let me take a look on the code to understand the "interaction".
>>
>> Regards
>> JB
>>
>>
>> On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>>
>> Hi JB,
>>
>> basically you are not wrong. The project started about three or four
>> years ago with a goal to unify batch and streaming processing into
>> single portable, executor independent API. Because of that, it is
>> currently "close" to Beam in this sense. But we don't see much added
>> value keeping this as a separate project, with one of the key
>> differences to be the API (not the model itself), so we would like to
>> focus on translation from Euphoria API to Beam's SDK. That's why we
>> would like to see it as a DSL, so that it would be possible to use
>> Euphoria API with Beam's runners as much natively as possible.
>>
>> I hope I didn't make the subject even more unclear, if so, I'll be happy
>> to explain anything in more detail. :-)
>>
>> Jan
>>
>>
>> On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>>
>> Hi Jan,
>>
>> Thanks for your answers.
>>
>> However, they confused me ;)
>>
>> Regarding what you replied, Euphoria seems like a programming
>> model/SDK "close" to Beam more than a DSL on top of an existing Beam
>> SDK.
>>
>> Am I wrong ?
>>
>> Regards
>> JB
>>
>> On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>>
>> Hi Ismael,
>>
>> basically we adopted the Beam's design regarding partitioning
>> (https://github.com/seznam/euphoria/issues/160) and implemented
>> the sorting manually
>> (https://github.com/seznam/euphoria/issues/158). I'm not aware
>> of the time model differences (Euphoria supports ingestion and
>> event time, we don't support processing time by decision).
>> Regarding other differences (looking into Beam capability
>> matrix, I'd say that):
>>
>>    - we don't support stateful FlatMap (i.e. ParDo) for now
>> (https://github.com/seznam/euphoria/issues/192)
>>
>>    - we don't support side inputs (by decision now, but might be
>> reconsidered) and outputs
>> (https://github.com/seznam/euphoria/issues/124)
>>
>>    - we support complete event-time windows (non-merging,
>> merging, aligned, unaligned) and time control
>>
>>    - we don't support processing time by decision (might be
>> reconsidered if a valid use-case is found)
>>
>>    - we support window triggering based on both time and data,
>> including discarding and accumulating (without accumulating &
>> retracting)
>>
>> All our executors (runners) - Flink, Spark and Local - implement
>> the complete model, which we enforce using "operator test kit"
>> that all executors must pass. Spark executor supports bounded
>> sources only (for now). As David said, we currently don't have
>> serialization abstraction, so there is some work to be done in
>> that regard.
>>
>> Our intention is to completely supersede Euphoria, we would like
>> to consider possibility to use executors that would not rely on
>> Beam, but that is optional now and should be straightforward.
>>
>> We'd be happy to answer any more questions you might have and
>> thanks a lot!
>>
>> Best,
>>
>>Jan
>>
>>
>> On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>>
>> Hi,
>>
>> It is great to see 

Re: [INFO] Spark runner itests fail

2018-01-02 Thread Jean-Baptiste Onofré

By the way, the nightly build is not impacted, and not all PRs are affected. I'll keep you posted.

Regards
JB

On 01/02/2018 07:38 PM, Kenneth Knowles wrote:
Figure anything out? If it keeps happening, maybe we could gather knowledge 
about it on a JIRA.


Kenn

On Sun, Dec 31, 2017 at 11:18 PM, Jean-Baptiste Onofré wrote:


Hi,

The Spark runner itests fails on Jenkins for some pull requests.

I'm trying to reproduce locally (it seems to be random).

I keep you posted.

Regards
JB
-- 
Jean-Baptiste Onofré

jbono...@apache.org 
http://blog.nanthrax.net
Talend - http://www.talend.com




--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Testing, automation, and pipeline rollout in a CI/CD world

2018-01-02 Thread Jean-Baptiste Onofré

Hi Charles,

Maybe you can set up data sets and use the TestPipeline to validate (with
PAssert) that it works as expected in your pipeline.


The data sets can be stored somewhere (database or filesystem) and loaded in
tests (basically as we do in the Beam ITs).
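
A minimal sketch of the idea above, under stated assumptions. It is not Beam code: plain Java stands in for TestPipeline and PAssert so that it runs without any dependency. The "name,count" record format, the normalize function and the fixture values are all hypothetical; in a real test the fixture would be loaded from a file or database, and the comparison would be a PAssert check on the pipeline output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class FixtureCheck {

  // Hypothetical transform under test: parses "name,count" records and
  // ignores extra trailing fields that a newer producer might add.
  static String normalize(String line) {
    String[] fields = line.split(",");
    if (fields.length < 2) {
      return "INVALID";
    }
    try {
      int count = Integer.parseInt(fields[1].trim());
      return fields[0].trim().toLowerCase(Locale.ROOT) + ":" + count;
    } catch (NumberFormatException e) {
      return "INVALID";
    }
  }

  // Applies the transform to every record of a stored data set.
  static List<String> runOn(List<String> input) {
    List<String> out = new ArrayList<>();
    for (String line : input) {
      out.add(normalize(line));
    }
    return out;
  }

  // Order-insensitive comparison, similar in spirit to what an
  // "in any order" assertion on a pipeline output checks.
  static boolean sameElements(List<String> actual, List<String> expected) {
    List<String> a = new ArrayList<>(actual);
    List<String> e = new ArrayList<>(expected);
    Collections.sort(a);
    Collections.sort(e);
    return a.equals(e);
  }

  public static void main(String[] args) {
    // Stand-in for a stored data set; includes an old-format record,
    // a new-format record with an extra field, and a malformed one.
    List<String> fixture =
        Arrays.asList("Alice,3", "bob, 5", "carol,7,extra-field", "garbage");
    List<String> expected =
        Arrays.asList("INVALID", "carol:7", "bob:5", "alice:3");

    if (!sameElements(runOn(fixture), expected)) {
      throw new AssertionError("output does not match the stored expectation");
    }
    System.out.println("fixture check passed");
  }
}
```

Keeping one fixture per data format version would let the same check run against both old and new producer output before a rollout.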


Thoughts?

Regards
JB

On 01/03/2018 12:37 AM, Charles Allen wrote:

Hello Beam list!

We are looking at adopting some more advanced use cases with Beam code at its 
core including automated testing and data dependency tracking.


Specifically I'm interested in things like making sure data changes don't break 
pipelines, or things that depend on pipeline output, especially if the Beam code 
isn't managed by the same team that is producing the data or the systems that 
consume the Beam output.


This becomes more complex if you consider certain runners with non-zero 
replacement time doing a rolling or staged restart/upgrade/replacement that 
depend on data producers that ALSO have non-zero replacement time. Are there any 
best practices for Beam code management / data dependency management when the 
code in /master is not necessarily what is running live in your production 
systems? Is it all just "pretend all data is bad and try to be backwards 
compatible", or are there any Beam features that help with this?


Thanks,
Charles Allen


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [INFO] Spark runner itests fail

2018-01-02 Thread Jean-Baptiste Onofré

Hi Kenn,

I will create the Jira and the corresponding PR today.

Regards
JB

On 01/02/2018 07:38 PM, Kenneth Knowles wrote:
Figure anything out? If it keeps happening, maybe we could gather knowledge 
about it on a JIRA.


Kenn

On Sun, Dec 31, 2017 at 11:18 PM, Jean-Baptiste Onofré wrote:


Hi,

The Spark runner itests fails on Jenkins for some pull requests.

I'm trying to reproduce locally (it seems to be random).

I keep you posted.

Regards
JB
-- 
Jean-Baptiste Onofré

jbono...@apache.org 
http://blog.nanthrax.net
Talend - http://www.talend.com




--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Euphoria Java 8 DSL - proposal

2018-01-02 Thread Jean-Baptiste Onofré

Hi David,

absolutely !! Let's move forward on the preparation steps.

Are you on Slack and/or hangout to plan this ?

Thanks,
Regards
JB

On 01/02/2018 05:35 PM, David Morávek wrote:

Hello JB,

can we help in any way to move things forward?

Thanks,
D.

On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré wrote:


Thanks Jan,

It makes sense.

Let me take a look on the code to understand the "interaction".

Regards
JB


On 12/18/2017 04:26 PM, Jan Lukavský wrote:

Hi JB,

basically you are not wrong. The project started about three or four
years ago with a goal to unify batch and streaming processing into
single portable, executor independent API. Because of that, it is
currently "close" to Beam in this sense. But we don't see much added
value keeping this as a separate project, with one of the key
differences to be the API (not the model itself), so we would like to
focus on translation from Euphoria API to Beam's SDK. That's why we
would like to see it as a DSL, so that it would be possible to use
Euphoria API with Beam's runners as much natively as possible.

I hope I didn't make the subject even more unclear, if so, I'll be happy
to explain anything in more detail. :-)

    Jan


On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:

Hi Jan,

Thanks for your answers.

However, they confused me ;)

Regarding what you replied, Euphoria seems like a programming
model/SDK "close" to Beam more than a DSL on top of an existing Beam
SDK.

Am I wrong ?

Regards
JB

On 12/18/2017 03:44 PM, Jan Lukavský wrote:

Hi Ismael,

basically we adopted the Beam's design regarding partitioning
(https://github.com/seznam/euphoria/issues/160) and implemented
the sorting manually
(https://github.com/seznam/euphoria/issues/158). I'm not aware
of the time model differences (Euphoria supports ingestion and
event time, we don't support processing time by decision).
Regarding other differences (looking into Beam capability
matrix, I'd say that):

   - we don't support stateful FlatMap (i.e. ParDo) for now
(https://github.com/seznam/euphoria/issues/192)

   - we don't support side inputs (by decision now, but might be
reconsidered) and outputs
(https://github.com/seznam/euphoria/issues/124)

   - we support complete event-time windows (non-merging,
merging, aligned, unaligned) and time control

   - we don't support processing time by decision (might be
reconsidered if a valid use-case is found)

   - we support window triggering based on both time and data,
including discarding and accumulating (without accumulating &
retracting)

All our executors (runners) - Flink, Spark and Local - implement
the complete model, which we enforce using "operator test kit"
that all executors must pass. Spark executor supports bounded
sources only (for now). As David said, we currently don't have
serialization abstraction, so there is some work to be done in
that regard.

Our intention is to completely supersede Euphoria, we would like
to consider possibility to use executors that would not rely on
Beam, but that is optional now and should be straightforward.

We'd be happy to answer any more questions you might have and
thanks a lot!

Best,

   Jan


On 12/18/2017 03:19 PM, Ismaël Mejía wrote:

Hi,

It is great to see that you guys have achieved a maturity point to
propose this. Congratulations for your work and the idea to contribute
it into Beam.

I remember from a previous discussion with Jan about the model
mismatch between Euphoria and Beam, because of some design decisions
of both projects. I remember you guys had some issues with the way
Beam's sources do partitioning, as well as Beam's lack of

Testing, automation, and pipeline rollout in a CI/CD world

2018-01-02 Thread Charles Allen
Hello Beam list!

We are looking at adopting some more advanced use cases with Beam code at
its core including automated testing and data dependency tracking.

Specifically I'm interested in things like making sure data changes don't
break pipelines, or things that depend on pipeline output, especially if
the Beam code isn't managed by the same team that is producing the data or
the systems that consume the Beam output.

This becomes more complex if you consider certain runners with non-zero
replacement time doing a rolling or staged restart/upgrade/replacement that
depend on data producers that ALSO have non-zero replacement time. Are
there any best practices for Beam code management / data dependency
management when the code in /master is not necessarily what is running live
in your production systems? Is it all just "pretend all data is bad and try
to be backwards compatible", or are there any Beam features that help with
this?

Thanks,
Charles Allen


Re: Euphoria Java 8 DSL - proposal

2018-01-02 Thread Kenneth Knowles
+1 here. I already liked Euphoria, and I like the merger even more :-)

Kenn

On Tue, Jan 2, 2018 at 8:45 AM, Tyler Akidau  wrote:

> +1, I'm supportive of seeing this move forward. What remaining concrete
> concerns are there?
>
> -Tyler
>
>
> On Tue, Jan 2, 2018 at 8:35 AM David Morávek 
> wrote:
>
>> Hello JB,
>>
>> can we help in any way to move things forward?
>>
>> Thanks,
>> D.
>>
>> On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré 
>> wrote:
>>
>>> Thanks Jan,
>>>
>>> It makes sense.
>>>
>>> Let me take a look on the code to understand the "interaction".
>>>
>>> Regards
>>> JB
>>>
>>>
>>> On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>>>
 Hi JB,

 basically you are not wrong. The project started about three or four
 years ago with a goal to unify batch and streaming processing into single
 portable, executor independent API. Because of that, it is currently
 "close" to Beam in this sense. But we don't see much added value keeping
 this as a separate project, with one of the key differences to be the API
 (not the model itself), so we would like to focus on translation from
 Euphoria API to Beam's SDK. That's why we would like to see it as a DSL, so
 that it would be possible to use Euphoria API with Beam's runners as much
 natively as possible.

 I hope I didn't make the subject even more unclear, if so, I'll be
 happy to explain anything in more detail. :-)

Jan


 On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:

> Hi Jan,
>
> Thanks for your answers.
>
> However, they confused me ;)
>
> Regarding what you replied, Euphoria seems like a programming
> model/SDK "close" to Beam more than a DSL on top of an existing Beam SDK.
>
> Am I wrong ?
>
> Regards
> JB
>
> On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>
>> Hi Ismael,
>>
>> basically we adopted the Beam's design regarding partitioning (
>> https://github.com/seznam/euphoria/issues/160) and implemented the
>> sorting manually (https://github.com/seznam/euphoria/issues/158).
>> I'm not aware of the time model differences (Euphoria supports ingestion
>> and event time, we don't support processing time by decision). Regarding
>> other differences (looking into Beam capability matrix, I'd say that):
>>
>>   - we don't support stateful FlatMap (i.e. ParDo) for now (
>> https://github.com/seznam/euphoria/issues/192)
>>
>>   - we don't support side inputs (by decision now, but might be
>> reconsidered) and outputs (
>> https://github.com/seznam/euphoria/issues/124)
>>
>>   - we support complete event-time windows (non-merging, merging,
>> aligned, unaligned) and time control
>>
>>   - we don't support processing time by decision (might be
>> reconsidered if a valid use-case is found)
>>
>>   - we support window triggering based on both time and data,
>> including discarding and accumulating (without accumulating & retracting)
>>
>> All our executors (runners) - Flink, Spark and Local - implement the
>> complete model, which we enforce using "operator test kit" that all
>> executors must pass. Spark executor supports bounded sources only (for
>> now). As David said, we currently don't have serialization abstraction,
>> so there is some work to be done in that regard.
>>
>> Our intention is to completely supersede Euphoria, we would like to
>> consider possibility to use executors that would not rely on Beam, but
>> that is optional now and should be straightforward.
>>
>> We'd be happy to answer any more questions you might have and thanks
>> a lot!
>>
>> Best,
>>
>>   Jan
>>
>>
>> On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>>
>>> Hi,
>>>
>>> It is great to see that you guys have achieved a maturity point to
>>> propose this. Congratulations for your work and the idea to contribute
>>> it into Beam.
>>>
>>> I remember from a previous discussion with Jan about the model
>>> mismatch between Euphoria and Beam, because of some design decisions
>>> of both projects. I remember you guys had some issues with the way
>>> Beam's sources do partitioning, as well as Beam's lack of sorted data
>>> (on shuffle a la hadoop). Also if I remember well the 'time' model of
>>> Euphoria was simpler than Beam's. I talk about all of this because I
>>> am curious about what parts of the Euphoria model you guys had to
>>> sacrifice to support Beam, and what parts of Beam's model should still
>>> be integrated into Euphoria (and if there is a straightforward path to
>>> do it).
>>>
>>> If I understand well if this gets merged into Apache this means that
>>> Euphoria's 

Re: Happy new year

2018-01-02 Thread Tyler Akidau
+1 all around. 2017 was excellent, and we're in a great position for 2018
to be even better. Looking forward to it. :-)

On Tue, Jan 2, 2018 at 10:34 AM Kenneth Knowles  wrote:

> Happy new year! Very excited for 2018's possibilities. And nice work, JB.
>
> Kenn
>
> On Tue, Jan 2, 2018 at 6:42 AM, Ismaël Mejía  wrote:
>
>> Happy new year everyone !
>>
>> Agree 100% with Davor. 2017 was a good year for the project and it is
>> worth to thank everyone who helped to make the project better.
>> Now is the time to work to have a great project in 2018 too !
>>
>> Best wishes to all (and kudos to JB too for the top 5) !
>>
>> Ismaël
>>
>>
>>
>> On Mon, Jan 1, 2018 at 9:29 PM, David Sabater Dinter
>>  wrote:
>> > Happy New year all!
>> >
>> > On Mon, 1 Jan 2018 at 19:43, Davor Bonaci  wrote:
>> >>
>> >> Hi everyone --
>> >> As we begin the new year, I wanted to send the best wishes in 2018 to
>> >> everyone in the Beam community -- users, contributors and observers
>> >> alike!
>> >>
>> >> There's so much to be proud of in 2017; including graduation to a
>> >> top-level project and the availability of the first stable release.
>> >> Thanks to everyone for making this possible!
>> >>
>> >> Finally, I'd also like to pass along some fun facts compiled by others
>> >> [1]. Beam mailing lists had the 9th highest volume among all user@+dev@
>> >> lists. Our very own, Jean-Baptiste Onofre, has once again finished in
>> >> the top 5 committers across all projects in the Apache Software
>> >> Foundation. This year, JB finished as #3, with 2,142 commits, among
>> >> 6,504 committers. Congrats JB!
>> >>
>> >> Happy New Year -- and I hope to see you out and about in the next few
>> >> months!
>> >>
>> >> Davor
>> >>
>> >> [1] https://blogs.apache.org/foundation/entry/apache-in-2017-by-the
>> >>
>> >> On Mon, Jan 1, 2018 at 8:41 AM, Jesse Anderson <je...@bigdatainstitute.io>
>> >> wrote:
>> >>>
>> >>> Happy New Year!
>> >>>
>> >>>
>> >>> On Sun, Dec 31, 2017, 11:09 PM Jean-Baptiste Onofré 
>> >>> wrote:
>> 
>> >>>> Hi beamers,
>> >>>>
>> >>>> I wish you a great and happy new year !
>> >>>>
>> >>>> Regards
>> >>>> JB
>> >>>> --
>> >>>> Jean-Baptiste Onofré
>> >>>> jbono...@apache.org
>> >>>> http://blog.nanthrax.net
>> >>>> Talend - http://www.talend.com
>> >>
>> >>
>> >
>>
>
>


Re: [INFO] Spark runner itests fail

2018-01-02 Thread Kenneth Knowles
Figure anything out? If it keeps happening, maybe we could gather knowledge
about it on a JIRA.

Kenn

On Sun, Dec 31, 2017 at 11:18 PM, Jean-Baptiste Onofré 
wrote:

> Hi,
>
> The Spark runner itests fails on Jenkins for some pull requests.
>
> I'm trying to reproduce locally (it seems to be random).
>
> I keep you posted.
>
> Regards
> JB
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: Happy new year

2018-01-02 Thread Kenneth Knowles
Happy new year! Very excited for 2018's possibilities. And nice work, JB.

Kenn

On Tue, Jan 2, 2018 at 6:42 AM, Ismaël Mejía  wrote:

> Happy new year everyone !
>
> Agree 100% with Davor. 2017 was a good year for the project and it is
> worth to thank everyone who helped to make the project better.
> Now is the time to work to have a great project in 2018 too !
>
> Best wishes to all (and kudos to JB too for the top 5) !
>
> Ismaël
>
>
>
> On Mon, Jan 1, 2018 at 9:29 PM, David Sabater Dinter
>  wrote:
> > Happy New year all!
> >
> > On Mon, 1 Jan 2018 at 19:43, Davor Bonaci  wrote:
> >>
> >> Hi everyone --
> >> As we begin the new year, I wanted to send the best wishes in 2018 to
> >> everyone in the Beam community -- users, contributors and observers
> >> alike!
> >>
> >> There's so much to be proud of in 2017; including graduation to a
> >> top-level project and the availability of the first stable release.
> >> Thanks to everyone for making this possible!
> >>
> >> Finally, I'd also like to pass along some fun facts compiled by others
> >> [1]. Beam mailing lists had the 9th highest volume among all user@+dev@
> >> lists. Our very own, Jean-Baptiste Onofre, has once again finished in
> >> the top 5 committers across all projects in the Apache Software
> >> Foundation. This year, JB finished as #3, with 2,142 commits, among
> >> 6,504 committers. Congrats JB!
> >>
> >> Happy New Year -- and I hope to see you out and about in the next few
> >> months!
> >>
> >> Davor
> >>
> >> [1] https://blogs.apache.org/foundation/entry/apache-in-2017-by-the
> >>
> >> On Mon, Jan 1, 2018 at 8:41 AM, Jesse Anderson <je...@bigdatainstitute.io>
> >> wrote:
> >>>
> >>> Happy New Year!
> >>>
> >>>
> >>> On Sun, Dec 31, 2017, 11:09 PM Jean-Baptiste Onofré 
> >>> wrote:
> 
> >>>> Hi beamers,
> >>>>
> >>>> I wish you a great and happy new year !
> >>>>
> >>>> Regards
> >>>> JB
> >>>> --
> >>>> Jean-Baptiste Onofré
> >>>> jbono...@apache.org
> >>>> http://blog.nanthrax.net
> >>>> Talend - http://www.talend.com
> >>
> >>
> >
>


Re: Euphoria Java 8 DSL - proposal

2018-01-02 Thread Tyler Akidau
+1, I'm supportive of seeing this move forward. What remaining concrete
concerns are there?

-Tyler


On Tue, Jan 2, 2018 at 8:35 AM David Morávek 
wrote:

> Hello JB,
>
> can we help in any way to move things forward?
>
> Thanks,
> D.
>
> On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré 
> wrote:
>
>> Thanks Jan,
>>
>> It makes sense.
>>
>> Let me take a look on the code to understand the "interaction".
>>
>> Regards
>> JB
>>
>>
>> On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>>
>>> Hi JB,
>>>
>>> basically you are not wrong. The project started about three or four
>>> years ago with a goal to unify batch and streaming processing into single
>>> portable, executor independent API. Because of that, it is currently
>>> "close" to Beam in this sense. But we don't see much added value keeping
>>> this as a separate project, with one of the key differences to be the API
>>> (not the model itself), so we would like to focus on translation from
>>> Euphoria API to Beam's SDK. That's why we would like to see it as a DSL, so
>>> that it would be possible to use Euphoria API with Beam's runners as much
>>> natively as possible.
>>>
>>> I hope I didn't make the subject even more unclear, if so, I'll be happy
>>> to explain anything in more detail. :-)
>>>
>>>Jan
>>>
>>>
>>> On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>>>
 Hi Jan,

 Thanks for your answers.

 However, they confused me ;)

 Regarding what you replied, Euphoria seems like a programming model/SDK
 "close" to Beam more than a DSL on top of an existing Beam SDK.

 Am I wrong ?

 Regards
 JB

 On 12/18/2017 03:44 PM, Jan Lukavský wrote:

> Hi Ismael,
>
> basically we adopted the Beam's design regarding partitioning (
> https://github.com/seznam/euphoria/issues/160) and implemented the
> sorting manually (https://github.com/seznam/euphoria/issues/158). I'm
> not aware of the time model differences (Euphoria supports ingestion and
> event time, we don't support processing time by decision). Regarding other
> differences (looking into Beam capability matrix, I'd say that):
>
>   - we don't support stateful FlatMap (i.e. ParDo) for now (
> https://github.com/seznam/euphoria/issues/192)
>
>   - we don't support side inputs (by decision now, but might be
> reconsidered) and outputs (
> https://github.com/seznam/euphoria/issues/124)
>
>   - we support complete event-time windows (non-merging, merging,
> aligned, unaligned) and time control
>
>   - we don't support processing time by decision (might be
> reconsidered if a valid use-case is found)
>
>   - we support window triggering based on both time and data,
> including discarding and accumulating (without accumulating & retracting)
>
> All our executors (runners) - Flink, Spark and Local - implement the
> complete model, which we enforce using "operator test kit" that all
> executors must pass. Spark executor supports bounded sources only (for
> now). As David said, we currently don't have serialization abstraction, so
> there is some work to be done in that regard.
>
> Our intention is to completely supersede Euphoria, we would like to
> consider possibility to use executors that would not rely on Beam, but
> that is optional now and should be straightforward.
>
> We'd be happy to answer any more questions you might have and thanks a
> lot!
>
> Best,
>
>   Jan
>
>
> On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>
>> Hi,
>>
>> It is great to see that you guys have achieved a maturity point to
>> propose this. Congratulations for your work and the idea to contribute
>> it into Beam.
>>
>> I remember from a previous discussion with Jan about the model
>> mismatch between Euphoria and Beam, because of some design decisions
>> of both projects. I remember you guys had some issues with the way
>> Beam's sources do partitioning, as well as Beam's lack of sorted data
>> (on shuffle a la hadoop). Also if I remember well the 'time' model of
>> Euphoria was simpler than Beam's. I talk about all of this because I
>> am curious about what parts of the Euphoria model you guys had to
>> sacrifice to support Beam, and what parts of Beam's model should still
>> be integrated into Euphoria (and if there is a straightforward path to
>> do it).
>>
>> If I understand well if this gets merged into Apache this means that
>> Euphoria's current implementation would be superseded by this DSL? I
>> am curious because I would like to understand your level of investment
>> on supporting the future of this DSL.
>>
>> Thanks and congrats again !
>> Ismaël
>>
>> On Mon, Dec 18, 2017 at 10:12 AM, Jean-Baptiste Onofré <
>> 

Re: Euphoria Java 8 DSL - proposal

2018-01-02 Thread David Morávek
Hello JB,

can we help in any way to move things forward?

Thanks,
D.

On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré 
wrote:

> Thanks Jan,
>
> It makes sense.
>
> Let me take a look on the code to understand the "interaction".
>
> Regards
> JB
>
>
> On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>
>> Hi JB,
>>
>> basically you are not wrong. The project started about three or four
>> years ago with a goal to unify batch and streaming processing into single
>> portable, executor independent API. Because of that, it is currently
>> "close" to Beam in this sense. But we don't see much added value keeping
>> this as a separate project, with one of the key differences to be the API
>> (not the model itself), so we would like to focus on translation from
>> Euphoria API to Beam's SDK. That's why we would like to see it as a DSL, so
>> that it would be possible to use Euphoria API with Beam's runners as much
>> natively as possible.
>>
>> I hope I didn't make the subject even more unclear, if so, I'll be happy
>> to explain anything in more detail. :-)
>>
>>Jan
>>
>>
>> On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>>
>>> Hi Jan,
>>>
>>> Thanks for your answers.
>>>
>>> However, they confused me ;)
>>>
>>> Regarding what you replied, Euphoria seems like a programming model/SDK
>>> "close" to Beam more than a DSL on top of an existing Beam SDK.
>>>
>>> Am I wrong ?
>>>
>>> Regards
>>> JB
>>>
>>> On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>>>
 Hi Ismael,

 basically we adopted the Beam's design regarding partitioning (
 https://github.com/seznam/euphoria/issues/160) and implemented the
 sorting manually (https://github.com/seznam/euphoria/issues/158). I'm
 not aware of the time model differences (Euphoria supports ingestion and
 event time, we don't support processing time by decision). Regarding other
 differences (looking into Beam capability matrix, I'd say that):

   - we don't support stateful FlatMap (i.e. ParDo) for now (
 https://github.com/seznam/euphoria/issues/192)

   - we don't support side inputs (by decision now, but might be
 reconsidered) and outputs (
 https://github.com/seznam/euphoria/issues/124)

   - we support complete event-time windows (non-merging, merging,
 aligned, unaligned) and time control

   - we don't support processing time by decision (might be reconsidered
 if a valid use-case is found)

   - we support window triggering based on both time and data, including
 discarding and accumulating (without accumulating & retracting)

 All our executors (runners) - Flink, Spark and Local - implement the
 complete model, which we enforce using "operator test kit" that all
 executors must pass. Spark executor supports bounded sources only (for
 now). As David said, we currently don't have serialization abstraction, so
 there is some work to be done in that regard.

 Our intention is to completely supersede Euphoria, we would like to
 consider possibility to use executors that would not rely on Beam, but that
 is optional now and should be straightforward.

 We'd be happy to answer any more questions you might have and thanks a
 lot!

 Best,

   Jan


 On 12/18/2017 03:19 PM, Ismaël Mejía wrote:

> Hi,
>
> It is great to see that you guys have achieved a maturity point to
> propose this. Congratulations for your work and the idea to contribute
> it into Beam.
>
> I remember from a previous discussion with Jan about the model
> mismatch between Euphoria and Beam, because of some design decisions
> of both projects. I remember you guys had some issues with the way
> Beam's sources do partitioning, as well as Beam's lack of sorted data
> (on shuffle a la hadoop). Also if I remember well the 'time' model of
> Euphoria was simpler than Beam's. I talk about all of this because I
> am curious about what parts of the Euphoria model you guys had to
> sacrifice to support Beam, and what parts of Beam's model should still
> be integrated into Euphoria (and if there is a straightforward path to
> do it).
>
> If I understand well if this gets merged into Apache this means that
> Euphoria's current implementation would be superseded by this DSL? I
> am curious because I would like to understand your level of investment
> on supporting the future of this DSL.
>
> Thanks and congrats again !
> Ismaël
>
> On Mon, Dec 18, 2017 at 10:12 AM, Jean-Baptiste Onofré <
> j...@nanthrax.net> wrote:
>
>> Depending of the donation, you would need ICLA for each contributor, and
>> CCLA in addition of SGA.
>>
>> We can sync with Davor and I for the legal stuff.
>> However, I would wait a little bit just to have feedback from the whole
>> team and start a formal 

Re: Happy new year

2018-01-02 Thread Ismaël Mejía
Happy new year everyone !

Agree 100% with Davor. 2017 was a good year for the project and it is
worth to thank everyone who helped to make the project better.
Now is the time to work to have a great project in 2018 too !

Best wishes to all (and kudos to JB too for the top 5) !

Ismaël



On Mon, Jan 1, 2018 at 9:29 PM, David Sabater Dinter
 wrote:
> Happy New year all!
>
> On Mon, 1 Jan 2018 at 19:43, Davor Bonaci  wrote:
>>
>> Hi everyone --
>> As we begin the new year, I wanted to send the best wishes in 2018 to
>> everyone in the Beam community -- users, contributors and observers alike!
>>
>> There's so much to be proud of in 2017; including graduation to a
>> top-level project and the availability of the first stable release. Thanks
>> to everyone for making this possible!
>>
>> Finally, I'd also like to pass along some fun facts compiled by others
>> [1]. Beam mailing lists had the 9th highest volume among all user@+dev@
>> lists. Our very own, Jean-Baptiste Onofre, has once again finished in the
>> top 5 committers across all projects in the Apache Software Foundation. This
>> year, JB finished as #3, with 2,142 commits, among 6,504 committers.
>> Congrats JB!
>>
>> Happy New Year -- and I hope to see you out and about in the next few
>> months!
>>
>> Davor
>>
>> [1] https://blogs.apache.org/foundation/entry/apache-in-2017-by-the
>>
>> On Mon, Jan 1, 2018 at 8:41 AM, Jesse Anderson 
>> wrote:
>>>
>>> Happy New Year!
>>>
>>>
>>> On Sun, Dec 31, 2017, 11:09 PM Jean-Baptiste Onofré 
>>> wrote:

>>>> Hi beamers,
>>>>
>>>> I wish you a great and happy new year !
>>>>
>>>> Regards
>>>> JB
>>>> --
>>>> Jean-Baptiste Onofré
>>>> jbono...@apache.org
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>
>>
>