Re: Custom 2.20 failing on Dataflow: what am I doing wrong?

2020-02-16 Thread Alex Van Boxel
Yes, I'm running it manually, with the same parameters I use for production
Dataflow. I'm probably a bit ignorant about that, and I probably need to
provide my own worker.

Thanks for the hint... I'll dive into that.

 _/
_/ Alex Van Boxel


On Mon, Feb 17, 2020 at 8:16 AM Reuven Lax  wrote:

> Are you running things manually? This probably means you are using an
> out-of-date Dataflow worker. I believe that all tests on Jenkins will build
> the Dataflow worker from head to prevent exactly this problem.
>
> On Sun, Feb 16, 2020 at 11:10 PM Alex Van Boxel  wrote:
>
>> Digging further in the traces, it seems like a result of changes to the
>> model:
>>
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.beam.model.pipeline.v1.StandardWindowFns$SessionsPayload$Enum
>>
>> I see changes by Lukasz Cwik. Will this be a problem for the release?
>>
>>  _/
>> _/ Alex Van Boxel
>>
>>
>> On Sun, Feb 16, 2020 at 12:11 PM Alex Van Boxel  wrote:
>>
>>> Hey,
>>>
>>> I'm testing my own PRs against Dataflow; something I've done in the
>>> past with success seems to fail now. I get this error:
>>>
>>> java.lang.NoClassDefFoundError: Could not initialize class
>>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.construction.WindowingStrategyTranslation
>>>
>>> Am I doing something wrong?
>>>
>>>  _/
>>> _/ Alex Van Boxel
>>>
>>


Re: Custom 2.20 failing on Dataflow: what am I doing wrong?

2020-02-16 Thread Reuven Lax
Are you running things manually? This probably means you are using an
out-of-date Dataflow worker. I believe that all tests on Jenkins will build
the Dataflow worker from head to prevent exactly this problem.

On Sun, Feb 16, 2020 at 11:10 PM Alex Van Boxel  wrote:

> Digging further in the traces, it seems like a result of changes to the
> model:
>
> Caused by: java.lang.ClassNotFoundException:
> org.apache.beam.model.pipeline.v1.StandardWindowFns$SessionsPayload$Enum
>
> I see changes by Lukasz Cwik. Will this be a problem for the release?
>
>  _/
> _/ Alex Van Boxel
>
>
> On Sun, Feb 16, 2020 at 12:11 PM Alex Van Boxel  wrote:
>
>> Hey,
>>
>> I'm testing my own PRs against Dataflow; something I've done in the past
>> with success seems to fail now. I get this error:
>>
>> java.lang.NoClassDefFoundError: Could not initialize class
>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.construction.WindowingStrategyTranslation
>>
>> Am I doing something wrong?
>>
>>  _/
>> _/ Alex Van Boxel
>>
>


Re: Custom 2.20 failing on Dataflow: what am I doing wrong?

2020-02-16 Thread Alex Van Boxel
Digging further in the traces, it seems like a result of changes to the
model:

Caused by: java.lang.ClassNotFoundException:
org.apache.beam.model.pipeline.v1.StandardWindowFns$SessionsPayload$Enum

I see changes by Lukasz Cwik. Will this be a problem for the release?
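
A note on how the two traces relate: "NoClassDefFoundError: Could not
initialize class X" is what the JVM reports on later uses of a class whose
static initializer has already failed, so the ClassNotFoundException above is
the likely root cause. The following is a minimal Java sketch of that
mechanism only, not Beam code. StaticInitDemo, Translation and the missing
class name are hypothetical stand-ins.

```java
// Minimal sketch (hypothetical names, not Beam code): a class whose static
// initializer needs a class that is absent from the runtime classpath.
class StaticInitDemo {

    static class Translation {
        // Runs once, when the class is first initialized.
        static final Class<?> MODEL = load();

        private static Class<?> load() {
            try {
                // Hypothetical class name that does not exist at runtime.
                return Class.forName("example.missing.SessionsPayloadEnum");
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }

        static void touch() {}
    }

    // Touches Translation and reports which Throwable, if any, was raised.
    public static String attempt() {
        try {
            Translation.touch();
            return "ok";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // First use: the static initializer fails.
        System.out.println(attempt()); // ExceptionInInitializerError
        // Later uses: the class is marked unusable, "Could not initialize class".
        System.out.println(attempt()); // NoClassDefFoundError
    }
}
```

The first failure carries the real cause; every later access only shows the
shorter "Could not initialize class" message, which is why digging further
down the trace was needed here.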

 _/
_/ Alex Van Boxel


On Sun, Feb 16, 2020 at 12:11 PM Alex Van Boxel  wrote:

> Hey,
>
> I'm testing my own PRs against Dataflow; something I've done in the past
> with success seems to fail now. I get this error:
>
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.construction.WindowingStrategyTranslation
>
> Am I doing something wrong?
>
>  _/
> _/ Alex Van Boxel
>


Re: [REQUEST] Workshop at Data Day in Mexico City - March 26th

2020-02-16 Thread Austin Bennett
Ismael,

I'm also putting together training exercises for O'Reilly (ex:
https://conferences.oreilly.com/strata-data-ai/stai-ca/public/schedule/detail/83505),
and plan to offer online versions and extensions as well.  My criterion for
involvement with them has been the ability to reuse and share the materials
produced.

I am not a competent enough Spanish speaker (among other things, an area I
should retrain or study more!).  Happy to figure out ways to help/share the
things I am putting together.  No need to reinvent the wheel.

Cheers,
Austin



On Fri, Feb 14, 2020, 3:57 PM Ismaël Mejía  wrote:

> Hello,
>
> I am interested and I can do it in Spanish if that is easier for the
> audience.
>
> Regards,
> Ismaël
>
> On Sat, Feb 15, 2020 at 12:42 AM María Cruz  wrote:
>
>> Hello everyone,
>>
>> There is an opportunity to host an Apache Beam workshop at the next Data
>> Day in Mexico City *[1]*. The event will take place on Thursday, March
>> 26th. This is a great opportunity to expand the reach of Apache Beam in a
>> growing market like Latin America!
>>
>> If you are looking for more leadership opportunities, this is the perfect
>> opportunity for you. The workshop can be presented either in English or in
>> Spanish. If you are interested, Google will help you create the content for
>> the workshop, drawing on some available training resources, like Apache
>> Beam Katas *[2]* and previous workshops that have already been
>> developed.
>>
>> This workshop is an important contribution that will bring more diversity
>> to the project, making it more accessible and expanding its user and
>> potentially its contributor base as well. Google Open Source will cover the
>> cost of sponsorship. More information about Data Day below. Please contact
>> us if you have any questions or comments.
>>
>> ---
>>
>> About Data Day in Ciudad de México
>>
>>
>> - What: A one-day conference focused on data science in enterprise
>>   scenarios. The event has been organized since 2016; this will be the 6th
>>   edition. Data Day has become the leading event for Data Professionals in
>>   Mexico.
>> - When: March 26, 2020
>> - Where: Mexico City, Mexico.
>> - Who will you find there: business executives, data professionals, IT
>>   professionals.
>> - 2019 attendance: 440 participants (36% data scientist / analyst, 6%
>>   executive level C, 15% data manager, 43% developer or other)
>> - 2020 expected attendance: 450 participants.
>>
>> Looking forward to hearing from you all!
>>
>> Thanks,
>>
>> María
>>
>> [1] https://sg.com.mx/dataday/agenda/
>>
>> [2] https://beam.apache.org/blog/2019/05/30/beam-kata-release.html
>>
>>


Custom 2.20 failing on Dataflow: what am I doing wrong?

2020-02-16 Thread Alex Van Boxel
Hey,

I'm testing my own PRs against Dataflow; something I've done in the past
with success seems to fail now. I get this error:

java.lang.NoClassDefFoundError: Could not initialize class
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.construction.WindowingStrategyTranslation

Am I doing something wrong?

 _/
_/ Alex Van Boxel


Re: Please comment on draft comms strategy by Oct 16

2020-02-16 Thread Matthias Baetens
Hey Maria,

Great work!! This provides a nice overview of all the channels and how they
contribute to the overall project.

I will comment on the sections I am most familiar with:
- Website / blog: +1 for reducing the number of outlets to 1. The Beam
Summit website repository could use a clean-up and this can be part of that
effort.
- Workshops: Pablo & Austin have done some great work in this space, but
even more dedicated time & attention to this would greatly benefit the
project - and as Pablo raised this might be separate from project
transparency but directly contribute to increased usage.
-- For topics/format, it's split into how to use and how to contribute.
How to use I would split into a beginner training ("set-up and write your
first pipeline"), an advanced training ("use some of the more exotic Beam
features / optimise your pipeline") and office hours (solve specific
problems people have, which can feed back to the development team as well in
terms of missing features or potential improvements to the documentation).
Imho, classroom style with an intro to Beam and concepts / advanced topics,
combined with do-it-yourself where people actually do some coding and have
something to walk away with, is good for onboarding. A more fun and game-like
set-up can be done for users that already use Beam, where we can accommodate
discussions or have them solve a specific use-case in groups.
- +1 for increased distribution of the talks. We should actively engage the
speakers as well to share them on their personal channels, and surface the
talks on the website and Twitter handles. A curated list as a blog post
would also be a great addition.

I think a more consistent feedback mechanism and maybe idea box at events
might also benefit the community.

Cheers,
Matthias


On Wed, 15 Jan 2020 at 22:46, Austin Bennett 
wrote:

> Hi Kenn,
>
> We had a workshop on this very topic (how to contribute to Beam) at our
> Berlin Summit:  https://www.youtube.com/watch?v=PtPslSdAPcM There is
> certainly room for me (or anyone) to clean up and formalize that a bit
> more.  Though, the view count for it is relatively small, which points
> either to a lack of appetite or to it not being well publicized (I
> suspect both).
>
> Cheers,
> Austin
>
> On Thu, Jan 9, 2020 at 6:53 PM Kenneth Knowles  wrote:
>
>> Wow, this is great work. I looked at the graphical maps when you sent
>> them but forgot to reply on thread. They really distill a lot of
>> possibilities and help to think about the current state.
>>
>> These three action items seem good and doable. Thanks for highlighting
>> those. The only one that isn't obvious to me is "workshop on how to
>> contribute to Beam". Is there enough appetite / audience to make this a
>> workshop? What forms could this take? A live coding demonstration in a
>> normal talk slot at an OSS or data conference seems like a possibility.
>> Whatever we do, we should record and distribute for sure, because when
>> someone wants to contribute, they need to find the resources at that moment.
>>
>> Kenn
>>
>> On Wed, Jan 8, 2020 at 1:01 PM María Cruz  wrote:
>>
>>> Hi everyone,
>>> I'm writing to send an update about the communication strategy for Beam.
>>> In a nutshell, I have 3 proposed changes (copied from the md file here:
>>> https://github.com/macruzbar/beam/blob/master/Communication-strategy-DRAFT.md
>>> ).
>>>
>>> While all the channels are connected to a specific function in the
>>> short, medium, and long term, some areas have redundancies, and some other
>>> areas could use more exposure. In order to continue to grow the project,
>>> there are 3 proposals we need to focus on (click on the link on each
>>> section to read more):
>>>
>>> 1. Blog post categories, frequency and distribution. Reduce to one
>>> blogging space (on Beam Website). Incorporate 3 categories to the
>>> blog: Apache Beam summit, Apache Beam use cases, and Your journey as a
>>> contributor.
>>>
>>> 2. Develop more in-person and digital workshops. Two workshop types:
>>> how to use Beam, and how to contribute to Beam.
>>>
>>> 3. Increase distribution of tech talks.
>>>
>>>    - Embed tech talks in the Beam website, and the Beam Summit website,
>>>    - Share talks on @ApacheBeam Twitter handle
>>>    - Curate a list of talks by topic, and write blog posts to share
>>>      curated talks (1 blog every 3 months),
>>>    - Distribute copy via email to users@ and dev@ mailing lists.
>>>    - Create playlists on YouTube channel. Create one view for
>>>      subscribed users (featuring latest content), and one for non-subscribed
>>>      users

Re: Contributor permission for Beam Jira tickets

2020-02-16 Thread Ismaël Mejía
done

On Sun, Feb 16, 2020 at 9:02 AM Yu Watanabe  wrote:

> Hi,
>
> This is Yu Watanabe from Creationline.
> I would like to contribute to issues related to Elasticsearch and Kafka.
> Can someone add me as a contributor for Beam's Jira issue
> tracker? I would like to create/assign tickets for my work.
>
> My username in JIRA is "y-watanabe"
>
> Thanks,
> Yu
>
> --
> Yu Watanabe
>
> linkedin: www.linkedin.com/in/yuwatanabe1/
> twitter:   twitter.com/yuwtennis
>


Re: [DISCUSS] BIP reloaded

2020-02-16 Thread Alex Van Boxel
OK, I've reordered and tweaked it a bit based on your suggestions, but I'm
going to stop here. I'm spending far more time on this than the
implementation.

I did add an open issues section, though, to which people can add issues that
can be discussed and voted on. Does that make sense?

 _/
_/ Alex Van Boxel


On Mon, Feb 10, 2020 at 9:57 AM Jan Lukavský  wrote:

> Hi Alex,
>
> because it would be super cool to create a template from the BIP, I'd
> suggest a few minor changes:
>
>  - can we make motivation, current state, alternatives and implementation
> the same level headings?
>
>  - regarding the ordering - in my point of view it makes sense to first
> define problem (motivation + current state), then to elaborate on _all_
> options we have to solve the defined problem and then to make a choice
> (that would be motivation -> current state -> implementation options ->
> choice on an option). But I agree that once the section is called
> 'alternatives' (maybe even 'rejected alternatives') it makes more sense to
> have it _after_ the choice. But the naming might be just a matter of taste,
> so this might be sorted out later.
>
>  - a small note - because the BIP should ideally get people involved in
> the voting process, it should be as explanatory as possible - I don't feel
> I'm an expert on schemas, so a little more context and maybe an example of
> the "rejected alternatives" would help me see what they would look like,
> so that one can make a decision even when not working with schemas on
> a daily basis. Your explanation is probably well understood by people who
> are experts in the area, but might somewhat limit the audience.
>
> What do you think?
>
>  Jan
> On 2/9/20 9:19 PM, Alex Van Boxel wrote:
>
> a = motivation
> b => *added current state in Beam*
> c = alternatives
> d = implementation *(I prefer to define this before the alternatives)*
> e = *rest of document?*
>
>  _/
> _/ Alex Van Boxel
>
>
> On Sun, Feb 9, 2020 at 7:50 PM Jan Lukavský  wrote:
>
>> It's absolutely fine. :-) I think that the scope and quality of your
>> document suit the first BIP very well.
>>
>> What I would find generally useful is a general structure that would be
>> something like:
>>
>>  a) definition of the problem
>>
>>  b) explanation why current Beam options don't fit well for the problem
>> defined at a)
>>
>>  c) ideally exhaustive list of possible solutions
>>
>>  d) choice of an option from c) with justification of the choice
>>
>>  e) implementation notes specific to the choice in d)
>>
>> I find point d) the most essential, because it can be used as a basis
>> for a vote (that is, if the community agrees that the list of options is
>> exhaustive and that the chosen solution is the best one possible) on
>> promoting a BIP from proposed to accepted.
>>
>> Does that make sense in your case?
>>
>>  Jan
>> On 2/9/20 7:08 PM, Alex Van Boxel wrote:
>>
>> I'm sorry, I stole the number 1 from you. Feel free to give suggestions
>> on the form, so we can get a good template for further BIPs
>>
>>  _/
>> _/ Alex Van Boxel
>>
>>
>> On Sun, Feb 9, 2020 at 6:43 PM Jan Lukavský  wrote:
>>
>>> Hi Alex,
>>>
>>> this is cool! Thanks for pushing this topic forward!
>>>
>>> Jan
>>> On 2/9/20 6:36 PM, Alex Van Boxel wrote:
>>>
>>> BIP-1 is available here:
>>> https://cwiki.apache.org/confluence/display/BEAM/%5BBIP-1%5D+Beam+Schema+Options
>>>
>>>  _/
>>> _/ Alex Van Boxel
>>>
>>>
>>> On Sat, Feb 1, 2020 at 9:11 PM Kenneth Knowles  wrote:
>>>
>>>> Sounds great. If you scrape recent dev@ for proposals that are not yet
>>>> implemented, I think you will find some, and you could ask them to add as a
>>>> BIP if they are still interested.
>>>>
>>>> Kenn
>>>>
>>>> On Sat, Feb 1, 2020 at 1:11 AM Jan Lukavský  wrote:
>>>>
>>>>> Hi Kenn,
>>>>>
>>>>> yes, I can do that. I think that there should be at least one first
>>>>> BIP, I can try to set up one. But (as opposed to my previous proposal) I'll
>>>>> try to set up a fresh one, not the one of [BEAM-8550], because that one
>>>>> already has a PR and rebasing the PR on master for such a long period (and
>>>>> it is likely that final polishing of the BIP process will take several
>>>>> more months) starts to be costly. I have in mind two fresh candidates, so
>>>>> I'll pick one of them. I think that only setting up a cwiki would not start
>>>>> the process, we need a real-life example of a BIP included in that.
>>>>>
>>>>> Does that sound ok?
>>>>>
>>>>>  Jan
>>>>> On 2/1/20 5:55 AM, Kenneth Knowles wrote:
>>>>>
>>>>> These stages sound like a great starting point to me. Would you be the
>>>>> volunteer to set up a cwiki page for BIPs?
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Mon, Jan 20, 2020 at 3:30 AM Jan Lukavský  wrote:
>>>>>
>>>>>> I agree that we can take inspiration from other projects. Besides the
>>>>>> organizational part (what should be part of BIP, where to store it, how to
>>>>>> edit it and how to make it available to the whole community) - for which
>>>>>>

Contributor permission for Beam Jira tickets

2020-02-16 Thread Yu Watanabe
Hi,

This is Yu Watanabe from Creationline.
I would like to contribute to issues related to Elasticsearch and Kafka.
Can someone add me as a contributor for Beam's Jira issue
tracker? I would like to create/assign tickets for my work.

My username in JIRA is "y-watanabe"

Thanks,
Yu

-- 
Yu Watanabe

linkedin: www.linkedin.com/in/yuwatanabe1/
twitter:   twitter.com/yuwtennis