Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Kamil Wasilewski
Congrats, Michał!

On Tue, Jan 28, 2020 at 3:03 AM Udi Meiri  wrote:

> Congratulations Michał!
>
> On Mon, Jan 27, 2020 at 3:49 PM Chamikara Jayalath 
> wrote:
>
>> Congrats Michał!
>>
>> On Mon, Jan 27, 2020 at 2:59 PM Reza Rokni  wrote:
>>
>>> Congratulations buddy!
>>>
>>> On Tue, 28 Jan 2020, 06:52 Valentyn Tymofieiev, 
>>> wrote:
>>>
 Congratulations, Michał!

 On Mon, Jan 27, 2020 at 2:24 PM Austin Bennett <
 whatwouldausti...@gmail.com> wrote:

> Nice -- keep up the good work!
>
> On Mon, Jan 27, 2020 at 2:02 PM Mikhail Gryzykhin 
> wrote:
> >
> > Congratulations Michal!
> >
> > --Mikhail
> >
> > On Mon, Jan 27, 2020 at 1:01 PM Kyle Weaver 
> wrote:
> >>
> >> Congratulations Michał! Looking forward to your future
> contributions :)
> >>
> >> Thanks,
> >> Kyle
> >>
> >> On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada 
> wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Michał Walenia
> >>>
> >>> Michał has contributed to Beam in many ways, including the
> performance testing infrastructure, and has even spoken at events about
> Beam.
> >>>
> >>> In consideration of his contributions, the Beam PMC trusts him
> with the responsibilities of a Beam committer[1].
> >>>
> >>> Thanks for your contributions Michał!
> >>>
> >>> Pablo, on behalf of the Apache Beam PMC.
> >>>
> >>> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>



Re: Subclassing MapElements

2020-01-27 Thread Kenneth Knowles
It might be more trouble than it is worth, saving typing but adding
complexity. Especially since you've got @AutoValue and @AutoValue.Builder
to do all the heavy lifting anyhow (
https://beam.apache.org/contribute/ptransform-style-guide/#api).

Kenn
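
For reference, a rough sketch of the style-guide approach mentioned above: an
@AutoValue transform with a builder whose expand() composes MapElements rather
than subclassing it. This is only an illustration; the class and method names
(JsonToProtoMap, parseFn) are hypothetical, and the usual org.apache.beam.sdk,
AutoValue and protobuf imports are assumed.

@AutoValue
public abstract class JsonToProtoMap
    extends PTransform<PCollection<String>, PCollection<Message>> {

  // Configuration lives on the AutoValue class instead of a MapElements subclass.
  abstract SerializableFunction<String, Message> parseFn();

  public static Builder builder() {
    return new AutoValue_JsonToProtoMap.Builder();
  }

  @AutoValue.Builder
  public abstract static class Builder {
    public abstract Builder setParseFn(SerializableFunction<String, Message> parseFn);
    public abstract JsonToProtoMap build();
  }

  @Override
  public PCollection<Message> expand(PCollection<String> input) {
    // Delegate to MapElements by composition rather than inheritance.
    return input.apply(
        MapElements.into(TypeDescriptor.of(Message.class)).via(parseFn()));
  }
}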


Subclassing MapElements

2020-01-27 Thread jmac...@godaddy.com
Hi Beam Community,

Our team has a number of PTransforms that are basically wrappers around
MapElements; they give us a concise syntax when specifying pipelines that
leverage shared map stages. One example we are looking at currently is a
function that takes JSON and maps it into a ProtoBuf Message object, using the
proto's declarative type annotations to specify metadata about how to perform
the transform. Our proto logic is a bit custom, so it might not be suitable for
upstreaming. Our current implementations of these mapper PTransforms are pretty
simple 'expand' methods which perform the mapping and reduce code clutter in
our pipeline code. We recently found that we may want to leverage some of the
bells and whistles of MapElements when we use these PTransforms, such as
'exceptionsInto' etc. It seems like a clean way to do this would be to subclass
MapElements directly and just override the expand function, but MapElements has
a private constructor and builders etc. I'm wondering if there is an existing
way to do this, or whether it would be useful to make the MapElements
constructor protected? That feels like it would be a little funny, since
MapElements uses a builder etc. I'm curious if anyone has a way to do what we
are trying to do here, or to hear any other thoughts on this.

-Jason
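
For reference, a rough sketch (not an existing Beam class; the names below are
hypothetical) of the composition route: the wrapper's expand() builds the
MapElements internally and forwards exceptionsInto()/exceptionsVia(), so callers
still get MapElements' failure handling without subclassing it. The usual
org.apache.beam.sdk and protobuf imports are assumed.

public class JsonToProto
    extends PTransform<PCollection<String>,
                       WithFailures.Result<PCollection<Message>, String>> {

  @Override
  public WithFailures.Result<PCollection<Message>, String> expand(
      PCollection<String> input) {
    return input.apply(
        MapElements.into(TypeDescriptor.of(Message.class))
            .via((String json) -> parseJsonToProto(json))   // custom JSON -> proto mapping
            .exceptionsInto(TypeDescriptors.strings())
            .exceptionsVia(ee -> ee.element()));            // keep the offending record
  }

  private static Message parseJsonToProto(String json) {
    // Placeholder for the custom, annotation-driven JSON -> proto conversion.
    throw new UnsupportedOperationException("not implemented in this sketch");
  }
}

Callers would then read result.output() and result.failures() separately.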


Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Udi Meiri
Congratulations Michał!

On Mon, Jan 27, 2020 at 3:49 PM Chamikara Jayalath 
wrote:

> Congrats Michał!
>
> On Mon, Jan 27, 2020 at 2:59 PM Reza Rokni  wrote:
>
>> Congratulations buddy!
>>
>> On Tue, 28 Jan 2020, 06:52 Valentyn Tymofieiev, 
>> wrote:
>>
>>> Congratulations, Michał!
>>>
>>> On Mon, Jan 27, 2020 at 2:24 PM Austin Bennett <
>>> whatwouldausti...@gmail.com> wrote:
>>>
 Nice -- keep up the good work!

 On Mon, Jan 27, 2020 at 2:02 PM Mikhail Gryzykhin 
 wrote:
 >
 > Congratulations Michal!
 >
 > --Mikhail
 >
 > On Mon, Jan 27, 2020 at 1:01 PM Kyle Weaver 
 wrote:
 >>
 >> Congratulations Michał! Looking forward to your future contributions
 :)
 >>
 >> Thanks,
 >> Kyle
 >>
 >> On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada 
 wrote:
 >>>
 >>> Hi everyone,
 >>>
 >>> Please join me and the rest of the Beam PMC in welcoming a new
 committer: Michał Walenia
 >>>
 >>> Michał has contributed to Beam in many ways, including the
 performance testing infrastructure, and has even spoken at events about
 Beam.
 >>>
 >>> In consideration of his contributions, the Beam PMC trusts him with
 the responsibilities of a Beam committer[1].
 >>>
 >>> Thanks for your contributions Michał!
 >>>
 >>> Pablo, on behalf of the Apache Beam PMC.
 >>>
 >>> [1]
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>




Re: Apache Beam KafkaIO Transform "go" available?

2020-01-27 Thread Daisy Wu
Thank you so much for your prompt reply.
This really helped us make design decisions for our future projects.
I appreciate your help.

Daisy Wu 

On Monday, January 27, 2020, 10:10:40 AM PST, Robert Burke 
 wrote:  
 
 +cc dev@beam
Hello Daisy! There presently isn't a Kafka transform for Go.

The Go SDK is still experimental, largely due to the lack of scalable IO
support, which is why the Go SDK isn't represented in the built-in IO page.

There's presently no way for an SDK user to write a streaming source in the Go
SDK, since there's no mechanism for a DoFn to "self terminate" bundles, such as
to allow for scalability and windowing from streaming sources.

However, SplittableDoFns are on their way, and will eventually be the solution
for writing these.

At present, the Beam Go SDK IOs haven't been tested and vetted for production
use. Until the initial SplittableDoFn support is added to the Go SDK, batch
transforms cannot split and can't scale beyond a single worker thread. This
batch version should land in the next few months, and the streaming version a
few months after that, after which a Kafka IO can be developed.

I wish I had better news for you, but I can say progress is being made.
Robert Burke

On Sun, Jan 26, 2020 at 10:14 PM Daisy Wu  wrote:

Hi, Robert,
I found your name from the Apache Beam WIKI page. 
I am working on building a data ingestion pipeline using Apache Beam "go" SDK.

My pipeline is to consume data from a Kafka queue and persist the data to Google
Cloud Bigtable (and/or to another Kafka topic).

So far, I have not been able to find a Kafka IO Connector (also known as an
Apache I/O Transform) written in "go" (I was able to find a Java version, however).

Here's a link to the supported Apache Beam built-in I/O transforms:
https://beam.apache.org/documentation/io/built-in/

I am looking for the "go" equivalent of the following Java code:

pipeline.apply("kafka_deserialization", KafkaIO.<String, String>read()
    .withBootstrapServers(KAFKA_BROKER)
    .withTopic(KAFKA_TOPIC)
    .withConsumerConfigUpdates(CONSUMER_CONFIG)
    .withKeyDeserializer(StringDeserializer.class)
    .withValueDeserializer(StringDeserializer.class))
Do you have any information on the availability of KafkaIO Connector/Transform 
"go" SDK/library?

Any help or information would be much appreciated.

Thank you.


Daisy Wu

  

Re: [DISCUSS][PROPOSAL] Improvements to the Apache Beam website

2020-01-27 Thread Heejong Lee
On Mon, Jan 27, 2020 at 11:19 AM Aizhamal Nurmamat kyzy 
wrote:

> Hi Alexey,
>
> Answers are inline:
>
> Do we have any user demands for documentation translation into other
>> languages? I’m asking this because, in my experience, it’s quite tough work
>> to translate everything and it won’t be always up-to-date with the
>> mainstream docs in English.
>>
>
> We know of at least one user who has been trying to grow a Beam community
> in China and translate the documentation with the local community help:
> --> https://github.com/mybeam/Apache-Beam-/tree/master/website
> -->
> https://lists.apache.org/thread.html/6b7008affee7d70aa0ef13bce7d57455c85759b0af7e08582a086f53%40%3Cdev.beam.apache.org%3E
>
> This would hopefully unblock other contributors. For translations, the
> idea is that the source of truth is the english version, and we'll make
> sure it's visible on the header of translated pages, as well as dates for
> the latest updates.
>

+1

I think localized content helps non-English-speaking users a lot in sparking
interest, even if the content is somewhat out of date.


>
> Also, moving to another doc engine probably will require us to change a
>> format of mark-up language or not?. What are the other advantages of Docsy
>> over Jekyll?
>>
>
> We will have to make small tweaks to the Jekyll MD files, but as Brian
> pointed out in the old thread we can use some tools to automate the process:
> -->   https://gohugo.io/commands/hugo_import_jekyll/
>
> I’d also suggest to improve Beam site context search to be able to
>> differentiate search queries over user documentation and/or API references.
>>
> +1. Will add this as a work item.
>
>


Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Chamikara Jayalath
Congrats Michał!

On Mon, Jan 27, 2020 at 2:59 PM Reza Rokni  wrote:

> Congratulations buddy!
>
> On Tue, 28 Jan 2020, 06:52 Valentyn Tymofieiev, 
> wrote:
>
>> Congratulations, Michał!
>>
>> On Mon, Jan 27, 2020 at 2:24 PM Austin Bennett <
>> whatwouldausti...@gmail.com> wrote:
>>
>>> Nice -- keep up the good work!
>>>
>>> On Mon, Jan 27, 2020 at 2:02 PM Mikhail Gryzykhin 
>>> wrote:
>>> >
>>> > Congratulations Michal!
>>> >
>>> > --Mikhail
>>> >
>>> > On Mon, Jan 27, 2020 at 1:01 PM Kyle Weaver 
>>> wrote:
>>> >>
>>> >> Congratulations Michał! Looking forward to your future contributions
>>> :)
>>> >>
>>> >> Thanks,
>>> >> Kyle
>>> >>
>>> >> On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada 
>>> wrote:
>>> >>>
>>> >>> Hi everyone,
>>> >>>
>>> >>> Please join me and the rest of the Beam PMC in welcoming a new
>>> committer: Michał Walenia
>>> >>>
>>> >>> Michał has contributed to Beam in many ways, including the
>>> performance testing infrastructure, and has even spoken at events about
>>> Beam.
>>> >>>
>>> >>> In consideration of his contributions, the Beam PMC trusts him with
>>> the responsibilities of a Beam committer[1].
>>> >>>
>>> >>> Thanks for your contributions Michał!
>>> >>>
>>> >>> Pablo, on behalf of the Apache Beam PMC.
>>> >>>
>>> >>> [1]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>


Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Reza Rokni
Congratulations buddy!

On Tue, 28 Jan 2020, 06:52 Valentyn Tymofieiev,  wrote:

> Congratulations, Michał!
>
> On Mon, Jan 27, 2020 at 2:24 PM Austin Bennett <
> whatwouldausti...@gmail.com> wrote:
>
>> Nice -- keep up the good work!
>>
>> On Mon, Jan 27, 2020 at 2:02 PM Mikhail Gryzykhin 
>> wrote:
>> >
>> > Congratulations Michal!
>> >
>> > --Mikhail
>> >
>> > On Mon, Jan 27, 2020 at 1:01 PM Kyle Weaver 
>> wrote:
>> >>
>> >> Congratulations Michał! Looking forward to your future contributions :)
>> >>
>> >> Thanks,
>> >> Kyle
>> >>
>> >> On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada 
>> wrote:
>> >>>
>> >>> Hi everyone,
>> >>>
>> >>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Michał Walenia
>> >>>
>> >>> Michał has contributed to Beam in many ways, including the
>> performance testing infrastructure, and has even spoken at events about
>> Beam.
>> >>>
>> >>> In consideration of his contributions, the Beam PMC trusts him with
>> the responsibilities of a Beam committer[1].
>> >>>
>> >>> Thanks for your contributions Michał!
>> >>>
>> >>> Pablo, on behalf of the Apache Beam PMC.
>> >>>
>> >>> [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Valentyn Tymofieiev
Congratulations, Michał!

On Mon, Jan 27, 2020 at 2:24 PM Austin Bennett 
wrote:

> Nice -- keep up the good work!
>
> On Mon, Jan 27, 2020 at 2:02 PM Mikhail Gryzykhin 
> wrote:
> >
> > Congratulations Michal!
> >
> > --Mikhail
> >
> > On Mon, Jan 27, 2020 at 1:01 PM Kyle Weaver  wrote:
> >>
> >> Congratulations Michał! Looking forward to your future contributions :)
> >>
> >> Thanks,
> >> Kyle
> >>
> >> On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada 
> wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Michał Walenia
> >>>
> >>> Michał has contributed to Beam in many ways, including the performance
> testing infrastructure, and has even spoken at events about Beam.
> >>>
> >>> In consideration of his contributions, the Beam PMC trusts him with
> the responsibilities of a Beam committer[1].
> >>>
> >>> Thanks for your contributions Michał!
> >>>
> >>> Pablo, on behalf of the Apache Beam PMC.
> >>>
> >>> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>


Beam Meetup LA -- KICKOFF (10 March)

2020-01-27 Thread Austin Bennett
Come join the community kicking off in LA (in person) on 10 March:
https://www.meetup.com/Los-Angeles-Apache-Beam/events/268207085/


Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Austin Bennett
Nice -- keep up the good work!

On Mon, Jan 27, 2020 at 2:02 PM Mikhail Gryzykhin  wrote:
>
> Congratulations Michal!
>
> --Mikhail
>
> On Mon, Jan 27, 2020 at 1:01 PM Kyle Weaver  wrote:
>>
>> Congratulations Michał! Looking forward to your future contributions :)
>>
>> Thanks,
>> Kyle
>>
>> On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada  wrote:
>>>
>>> Hi everyone,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a new committer: 
>>> Michał Walenia
>>>
>>> Michał has contributed to Beam in many ways, including the performance 
>>> testing infrastructure, and has even spoken at events about Beam.
>>>
>>> In consideration of his contributions, the Beam PMC trusts him with the 
>>> responsibilities of a Beam committer[1].
>>>
>>> Thanks for your contributions Michał!
>>>
>>> Pablo, on behalf of the Apache Beam PMC.
>>>
>>> [1] 
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer


Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Mikhail Gryzykhin
Congratulations Michal!

--Mikhail

On Mon, Jan 27, 2020 at 1:01 PM Kyle Weaver  wrote:

> Congratulations Michał! Looking forward to your future contributions :)
>
> Thanks,
> Kyle
>
> On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada  wrote:
>
>> Hi everyone,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Michał Walenia
>>
>> Michał has contributed to Beam in many ways, including the performance
>> testing infrastructure, and has even spoken at events about Beam.
>>
>> In consideration of his contributions, the Beam PMC trusts him with the
>> responsibilities of a Beam committer[1].
>>
>> Thanks for your contributions Michał!
>>
>> Pablo, on behalf of the Apache Beam PMC.
>>
>> [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: Using very slow changing stream of data (KV) as side input

2020-01-27 Thread Mikhail Gryzykhin
Hi Mohil,

Please, take a look at.
https://beam.apache.org/documentation/patterns/side-inputs/#slowly-updating-global-window-side-inputs


Also, I have design doc out that handles similar case. I'm working on
prototyping it in python atm.
https://lists.apache.org/thread.html/r792fcf4b6adbce79ea1eb81592d29a3cee7aef768ba4615ac2d078ad%40%3Cdev.beam.apache.org%3E
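
For reference, a minimal sketch of the slowly-updating global-window side input
pattern from that page (the 5-minute refresh interval and the placeholder map
contents are illustrative assumptions; the usual org.apache.beam.sdk and
java.util imports are assumed):

PCollectionView<Map<String, String>> sideView =
    p.apply(GenerateSequence.from(0).withRate(1, Duration.standardMinutes(5)))
     .apply(ParDo.of(new DoFn<Long, Map<String, String>>() {
       @ProcessElement
       public void process(@Element Long tick, OutputReceiver<Map<String, String>> out) {
         // Re-read the slowly changing source here and emit the whole lookup map;
         // the singletonMap below is just a stand-in for that read.
         out.output(Collections.singletonMap("exampleKey", "exampleValue"));
       }
     }))
     .apply(Window.<Map<String, String>>into(new GlobalWindows())
         .triggering(Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
         .discardingFiredPanes())
     .apply(View.asSingleton());

The main pipeline can then read the refreshed map via c.sideInput(sideView)
inside a ParDo that declares .withSideInputs(sideView).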


Regards,
--Mikhail

On Mon, Jan 27, 2020 at 8:56 AM Mohil Khare  wrote:

> Hi,
> This is Mohil Khare from San Jose, California. I work in an early stage
> startup: Prosimo.
> We use Apache beam with gcp dataflow for all real time stats processing
> with Kafka and Pubsub as data source while elasticsearch and GCS as sinks.
>
> I am trying to solve the following use with sideinputs.
>
> INPUT:
> 1. We have a continuous stream of data coming from pubsub topicA. This
> data can be put in KV Pcollection and each data item can be uniquely
> identified with certain key.
> 2. We have a very slow changing stream of data coming from pubsub topicB
> i.e. you can say that stream of data comes for few mins on topicB followed
> by no activity for a long time period.   This stream of data can be again
> put in KV PCollection with same keys as above. NOTE: after long inactivity,
> it is possible that data comes for only certain keys.
>
> DESIRED OUTPUT/PROCESSING:
> 1. I want to use KV PCollection as sideinput to enrich data arriving in
> topicA. I think View.asMap can be a good choice for it.
> 2. After enriching data in topic A using sideinput data from topic B,
> write to GCS in a fixed window of 10 minutes
> 2.  Want to continue using above PCollectionView as sideinput as long as
> no new data arrives in topicB.
> 3. Whenever new data arrives in topicB, want to update PCollectionView Map
> only for set of Keys that arrived in new stream.
>
> My question is what should be the best approach to tackle this use case? I
> will really appreciate if someone can suggest some good solution.
>
> Thanks and Regards
> Mohil Khare
>
>
>
>
>


Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Kyle Weaver
Congratulations Michał! Looking forward to your future contributions :)

Thanks,
Kyle

On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada  wrote:

> Hi everyone,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Michał Walenia
>
> Michał has contributed to Beam in many ways, including the performance
> testing infrastructure, and has even spoken at events about Beam.
>
> In consideration of his contributions, the Beam PMC trusts him with the
> responsibilities of a Beam committer[1].
>
> Thanks for your contributions Michał!
>
> Pablo, on behalf of the Apache Beam PMC.
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>


[ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Pablo Estrada
Hi everyone,

Please join me and the rest of the Beam PMC in welcoming a new
committer: Michał Walenia

Michał has contributed to Beam in many ways, including the performance
testing infrastructure, and has even spoken at events about Beam.

In consideration of his contributions, the Beam PMC trusts him with the
responsibilities of a Beam committer[1].

Thanks for your contributions Michał!

Pablo, on behalf of the Apache Beam PMC.

[1]
https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer


Re: Go SplittableDoFn prototype and proposed changes

2020-01-27 Thread Daniel Oliveira
As a follow-up to the proposed changes from my first email, I've worked on
a doc with a more detailed changelist, including details still up for
discussion:
https://docs.google.com/document/d/1UeG5uNO00xCByGEZzDXk0m0LghX6HBWlMfRbMv_Xiyc/edit?usp=sharing

The doc is mostly full of my brainstorming on what the next version of the
user-facing Go SDF API will look like, so it's not too polished. But if
anyone's interested in this, I welcome any and all feedback!

On Mon, Jan 13, 2020 at 2:22 PM Luke Cwik  wrote:

> Thanks for the update and I agree with the points that you have made.
>
> On Fri, Jan 10, 2020 at 5:58 PM Robert Burke  wrote:
>
>> Thank you for sharing Daniel!
>>
>> Resolving SplittableDoFns for the Go SDK even just as far as initial
>> splitting will take the SDK that much closer to exiting its experimental
>> status.
>>
>> It's especially exciting seeing this work on Flink and on the Python
>> direct runner!
>>
>> On Fri, Jan 10, 2020, 5:36 PM Daniel Oliveira 
>> wrote:
>>
>>> Hey Beam devs,
>>>
>>> So several months ago I posted my Go SDF proposal and got a lot of good
>>> feedback (thread
>>> ,
>>> doc ). Since then I've been working
>>> on implementing it and I've got an initial prototype ready to show off! It
>>> works with initial splitting on Flink, and has a decently documented API.
>>> Also in the second part of the email I'll also be proposing changes to the
>>> original doc, based on my experience working on this prototype.
>>>
>>> To be clear, this is *not* ready to officially go into Beam yet; the
>>> API is still likely to go through changes. Rather, I'm showing this off to
>>> show that progress is being made on SDFs, and to provide some context to
>>> the changes I'll be proposing below.
>>>
>>> Here's a link to the repo and branch so you can download it, and a link
>>> to the changes specifically:
>>> Repo: https://github.com/youngoli/beam/tree/gosdf
>>> Changes:
>>> https://github.com/apache/beam/commit/28140ee3471d6cb80e74a16e6fd108cc380d4831
>>>
>>> If you give it a try and have any thoughts, please let me know! I'm open
>>> to any and all feedback.
>>>
>>> ==
>>>
>>> Proposed Changes
>>> Doc: https://s.apache.org/beam-go-sdf (Select "Version 1" from version
>>> history.)
>>>
>>> For anyone reading this who hasn't already read the doc above, I suggest
>>> reading it first, since I'll be referring to concepts from it.
>>>
>>> After working on the prototype I've changed my mind on the original
>>> decisions to go with an interface approach and a combined restriction +
>>> tracker. But I don't want to go all in and create another doc with a
>>> detailed proposal, so I've laid out a brief summary of the changes to get
>>> some initial feedback before I go ahead and start working on these changes
>>> in detail. Please let me know what you think!
>>>
>>> *1. Change from native Go interfaces to dynamic reflection-based API.*
>>>
>>> Instead of the native Go interfaces (SplittableDoFn, RProvider, and
>>> RTracker) described in the doc and implemented in the prototype, use the
>>> same dynamic approach that the Go SDK already uses for DoFns: Use the
>>> reflection system to examine the names and signatures of methods in the
>>> user's DoFn, RProvider, and RTracker.
>>>
>>> Original approach reasoning:
>>>
>>>- Simpler, so faster to implement and less bug-prone.
>>>- The extra burden on the user to keep types consistent is ok since
>>>most users of SDFs are more advanced
>>>
>>> Change reasoning:
>>>
>>>- In the prototype, I found interfaces to require too much extra
>>>boilerplate which added more complexity than expected. (Examples: 
>>> Constant
>>>casting,
>>>- More consistent API: Inconsistency between regular DoFns (dynamic)
>>>and SDF API (interfaces) was jarring and unintuitive when implementing 
>>> SDFs
>>>as a user.
>>>
>>> Implementation: Full details are up for discussion, but the goal is to
>>> make the RProvider and  RTracker interfaces dynamic, so we can replace all
>>> instances of interface{} in the methods with the actual element types
>>> (i.e. fake generics). Also uses of the RProvider and RTracker interfaces in
>>> signatures can be replaced with the implementations of those
>>> providers/trackers. This will require a good amount of additional work in
>>> the DoFn validation codebase and the code generator. Plus a fair amount of
>>> additional user code validation will be needed and more testing since the
>>> new code is more complex.
>>>
>>> *2. Seperate the restriction tracker and restriction.*
>>>
>>> Currently the API has the restriction combined with the tracker. In most
>>> other SDKs and within the SDF model, the two are usually separate concepts,
>>> and this change is to follow that approach and split the two.
>>>
>>> Original 

Re: GSOC announced!

2020-01-27 Thread Rui Wang
Hi Xinbin,

That sounds like enough to get started. You can get permission on JIRA and then
assign the issues that you want to work on to yourself.


-Rui

On Fri, Jan 24, 2020 at 1:22 PM Xinbin Huang  wrote:

> Hi Rui,
>
> Yes, I would like to contribute to Apache Beam, but I don't have a
> specific topic of interest in mind.
>
> I have reviewed some of the issues on JIRA, and would like to work on some
> of them. I have read through the contributing page
> https://beam.apache.org/contribute/ and it gives me an idea about the
> desired workflow. Besides that, are there any other sources I should refer
> to?
>
> I will open a separate email to get permission on JIRA.
>
> Cheers.
> Bin
>
> On Wed, Jan 15, 2020 at 1:16 PM Rui Wang  wrote:
>
>> Hi Xinbin,
>>
>> I assume you want to contribute to Apache Beam while you are less
>> experienced, thus you want to seek for some mentorship?
>>
>> This topic was discussed before. I don't think we decided to build a
>> formal mentorship program for Beam. Instead, would you share your interest
>> first and then probably we could ask if there are people that know the
>> topic who can actually mentor?
>>
>>
>> -Rui
>>
>> On Wed, Jan 15, 2020 at 9:30 AM Xinbin Huang 
>> wrote:
>>
>>> Hi community,
>>>
>>> I am pretty new to the Apache Beam community and want to contribute to
>>> the project. I think GSoC is a great opportunity for people to learn and
>>> contribute, but I am not eligible for it because I am not a student. That
>>> being said, would there be opportunities for non-students to participate in
>>> this, or other opportunities suitable for less experienced people who want
>>> to contribute?
>>>
>>> Thanks!
>>> Bin
>>>
>>> On Wed, Jan 15, 2020 at 8:52 AM Ismaël Mejía  wrote:
>>>
 Thanks for bringing this info. +1 on the Nexmark + Python + Portability
 project.
 Let's sync on that one Pablo. I am interested on co-mentoring it.


 On Tue, Jan 14, 2020 at 7:55 PM Rui Wang  wrote:

> Great! I will try to propose something for BeamSQL.
>
>
> -Rui
>
> On Tue, Jan 14, 2020 at 10:40 AM Pablo Estrada 
> wrote:
>
>> Hello everyone,
>>
>> As with every year, the Google Summer of Code has been announced[1],
>> so we can start preparing for it if anyone is interested. It's early in 
>> the
>> process for now, but it's good to prepare early : )
>>
>> Here are the ASF mentor guidelines[2]. For now, the thing to do is to
>> file JIRA issues for your projects, and apply the labels "mentor", 
>> "gsoc",
>> "gsoc2020".
>>
>> When the time comes, the next steps are to join the
>> ment...@community.apache.org list, and request the PMC for approval
>> of a project.
>>
>> My current plan is to have these projects, though these are subject
>> to change:
>> - Build Nexmark pipelines for Python SDK (Ismael FYI)
>> - Azure Blobstore File System for Java & Python
>>
>> I'll try to keep the dev@ list updated with other steps of the
>> process.
>> Thanks!
>> -P.
>>
>> [1] https://summerofcode.withgoogle.com/
>> [2]
>> https://community.apache.org/gsoc.html#prospective-asf-mentors-read-this
>>
>


Re: [DISCUSS][PROPOSAL] Improvements to the Apache Beam website

2020-01-27 Thread Aizhamal Nurmamat kyzy
Hi Alexey,

Answers are inline:

Do we have any user demands for documentation translation into other
> languages? I’m asking this because, in my experience, it’s quite tough work
> to translate everything and it won’t be always up-to-date with the
> mainstream docs in English.
>

We know of at least one user who has been trying to grow a Beam community
in China and translate the documentation with the local community help:
--> https://github.com/mybeam/Apache-Beam-/tree/master/website
-->
https://lists.apache.org/thread.html/6b7008affee7d70aa0ef13bce7d57455c85759b0af7e08582a086f53%40%3Cdev.beam.apache.org%3E

This would hopefully unblock other contributors. For translations, the idea
is that the source of truth is the English version, and we'll make sure that
this, along with the dates of the latest updates, is visible in the header of
translated pages.

Also, moving to another doc engine probably will require us to change a
> format of mark-up language or not?. What are the other advantages of Docsy
> over Jekyll?
>

We will have to make small tweaks to the Jekyll MD files, but as Brian
pointed out in the old thread we can use some tools to automate the process:
-->   https://gohugo.io/commands/hugo_import_jekyll/

I’d also suggest to improve Beam site context search to be able to
> differentiate search queries over user documentation and/or API references.
>
+1. Will add this as a work item.


Re: [DISCUSS] Autoformat python code with Black

2020-01-27 Thread Robert Bradshaw
Thanks. I commented on the PR. I think if we're going this route we
should add a pre-commit, plus instructions on how to run the tool
(similar to spotless).

On Mon, Jan 27, 2020 at 10:00 AM Udi Meiri  wrote:
>
> I've done a pass on the PR on code I'm familiar with.
> Please make a pass and add your suggestions on the PR.
>
> On Fri, Jan 24, 2020 at 7:15 AM Ismaël Mejía  wrote:
>>
>> Java build fails on any unformatted code so python probably should be like 
>> that.
>> We have to ensure however that it fails early on that.
>> As Robert said time to debate the knobs :)
>>
>> On Fri, Jan 24, 2020 at 3:19 PM Kamil Wasilewski 
>>  wrote:
>>>
>>> PR is ready: https://github.com/apache/beam/pull/10684. Please share your 
>>> comments ;-) I've managed to reduce the impact a bit:
>>> 501 files changed, 18245 insertions(+), 19495 deletions(-)
>>>
>>> We still need to consider how to enforce the usage of autoformatter. 
>>> Pre-commit sounds like a nice addition, but it still needs to be installed 
>>> manually by a developer. On the other hand, Jenkins precommit job that 
>>> fails if any unformatted code is detected looks like too strict. What do 
>>> you think?
>>>
>>> On Thu, Jan 23, 2020 at 8:37 PM Robert Bradshaw  wrote:

 Thanks! Now we get to debate what knobs to twiddle :-P

 FYI, I did a simple run (just pushed to
 https://github.com/apache/beam/compare/master...robertwb:yapf) to see
 the impact. The diff is

 $ git diff --stat master
 ...
  547 files changed, 22118 insertions(+), 21129 deletions(-)

 For reference

 $ find sdks/python/apache_beam -name '*.py' | xargs wc
 ...
 200424  612002 7431637 total

 which means a little over 10% of lines get touched. I think there are
 some options, such as SPLIT_ALL_TOP_LEVEL_COMMA_SEPARATED_VALUES and
 COALESCE_BRACKETS, that will conform more to the style we are already
 (mostly) following.


 On Thu, Jan 23, 2020 at 1:59 AM Kamil Wasilewski
  wrote:
 >
 > Thank you Michał for creating the ticket. I have some free time and I'd 
 > like to volunteer myself for this task.
 > Indeed, it looks like there's consensus for `yapf`, so I'll try `yapf` 
 > first.
 >
 > Best,
 > Kamil
 >
 >
 > On Thu, Jan 23, 2020 at 10:37 AM Michał Walenia 
 >  wrote:
 >>
 >> Hi all,
 >> I created a JIRA issue for this and summarized the available tools
 >>
 >> https://issues.apache.org/jira/browse/BEAM-9175
 >>
 >> Cheers,
 >> Michal
 >>
 >> On Thu, Jan 23, 2020 at 1:49 AM Udi Meiri  wrote:
 >>>
 >>> Sorry, backing off on this due to time constraints.
 >>>
 >>> On Wed, Jan 22, 2020 at 3:39 PM Udi Meiri  wrote:
 
  It sounds like there's a consensus for yapf. I volunteer to take this 
  on
 
  On Wed, Jan 22, 2020, 10:31 Udi Meiri  wrote:
 >
 > +1 to autoformatting
 >
 > On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik  wrote:
 >>
 >> +1 to autoformatters. Also the Beam Java SDK went through a one 
 >> time pass to apply the spotless formatting.
 >>
 >> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay  
 >> wrote:
 >>>
 >>> +1 to autoformatters and yapf. It appears to be a well maintained 
 >>> project. I do support making a one time pass to apply formatting 
 >>> the whole code base.
 >>>
 >>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova  
 >>> wrote:
 >
 >
 > It'd be good if there was a way to only apply to violating (or at
 > least changed) lines.
 
 
  I assumed the first thing we’d do is convert all of the code in 
  one go, since it’s a very safe operation. Did you have something 
  else in mind?
 
  -chad
 
 
 
 
 >
 >
 > On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova 
 >  wrote:
 > >
 > > +1 to autoformatting
 > >
 > > Let me add some nuance to that.
 > >
 > > The way I see it there are 2 varieties of formatters:  those 
 > > which take the original formatting into consideration 
 > > (autopep8) and those which disregard it (yapf, black).
 > >
 > > I much prefer yapf to black, because you have plenty of 
 > > options to tweak with yapf (enough to make the output a pretty 
 > > close match to the current Beam style), and you can mark areas 
 > > to preserve the original formatting, which could be very 
 > > useful with Pipeline building with pipe operators.  Please 
 > > don't pick black.
 

Re: New contributor

2020-01-27 Thread Ahmet Altay
Done. Added xbhuang to the list of contributors.

On Mon, Jan 27, 2020 at 10:45 AM Xinbin Huang  wrote:

> Hi
>
> This is Xinbin Huang. Can someone add me as a contributor for Beam's Jira
> issue tracker? I would like to create/assign tickets for my work.
>
> My Jira username is xbhuang .
>
> Thanks
> Bin
>
>>


Re: New contributor

2020-01-27 Thread Xinbin Huang
Hi

This is Xinbin Huang. Can someone add me as a contributor for Beam's Jira
issue tracker? I would like to create/assign tickets for my work.

My Jira username is xbhuang .

Thanks
Bin

>


Re: [DISCUSS][PROPOSAL] Improvements to the Apache Beam website

2020-01-27 Thread Alexey Romanenko
HI Aizhamal,

Thank you for working on documentation improvements! This should be very helpful,
even though the Beam site is in quite good shape now (though it lacks updates on
new features and revisions of the old ones).

> On 27 Jan 2020, at 08:33, Aizhamal Nurmamat kyzy  wrote:
> 
> First, is to enable internationalization and localization of the Apache Beam 
> website to increase the reach of the project. We can do this by migrating the 
> current website from Jekyll to Docsy [2]. Docsy supports internationalization 
> out-of-the-box. Other projects have been very successful enabling platforms 
> that allow their users to contribute a translation of the documentation and 
> make the project accessible to non-english speakers. Examples are Kubernetes, 
> Tensorflow and Apache Airflow.

Do we have any user demands for documentation translation into other languages?
I'm asking this because, in my experience, it's quite tough work to translate
everything, and it won't always be up to date with the mainstream docs in
English.

Also, will moving to another doc engine require us to change the mark-up
format? What are the other advantages of Docsy over Jekyll?

> Second, is to work on the content of the website to make the user onboarding 
> experience easier. This includes updating tutorials and quickstarts, 
> deprecating outdated or irrelevant sets of documentation, and creating new 
> documentation that is currently lacking. From the conversations with both new 
> and experienced users, it is clear that there are a number of new features 
> (Schemas, State and Timers, new Python IOs, etc) that are not documented. 
> Documentation for existing features can also be improved. We also plan to add 
> new pages with useful knowledge about Beam (its Ecosystem, past workshops, 
> talks and upcoming meetups) that will provide Beam users additional support 
> in their journey towards adoption.

This would be an excellent improvement, because many new features suffer
from not being very visible to users. I'd be happy to help with that.

> 
> Third, rework the knowledge architecture to be more intuitive and improve 
> website functionality for better and faster user onboarding. Include relevant 
> links in the Apache Beam docs to additional contributor resources on Beam 
> Cwiki to make new contributor experience better. 

I'd also suggest improving the Beam site's contextual search so that it can
differentiate search queries over the user documentation and/or the API references.




Re: Apache Beam KafkaIO Transform "go" available?

2020-01-27 Thread Robert Burke
+cc dev@beam
Hello Daisy!
There presently isn't a Kafka transform for Go.

The Go SDK is still experimental, largely due to the lack of scalable IO
support, which is why the Go SDK isn't represented in the built-in IO page.

There's presently no way for an SDK user to write a Streaming source in the
Go SDK, since there's no mechanism for a DoFn to "self terminate" bundles,
such as to allow for scalability and windowing from streaming sources.

However, SplittableDoFns are on their way, and will eventually be the
solution for writing these.

At present, the Beam Go SDK IOs haven't been tested and vetted for
production use. Until the initial SplittableDoFn support is added to the Go
SDK, batch transforms cannot split and can't scale beyond a single worker
thread. This batch version should land in the next few months, and the
streaming version a few months after that, after which a Kafka IO can
be developed.

I wish I had better news for you, but I can say progress is being made.

Robert Burke


On Sun, Jan 26, 2020 at 10:14 PM Daisy Wu  wrote:

> Hi, Robert,
>
> I found your name from the Apache Beam WIKI page.
>
> I am working on building a data ingestion pipeline using Apache Beam "go"
> SDK.
>
> My pipeline is to consume data from Kafka queue and persist the data to
> Google Cloud Bigtable (and/or to another Kafka topic).
>
> So far, I have not been able to find a Kafka IO Connector (also known as
> Apache I/O Transform) written in "go" (I was able to find a java version,
> however).
>
> Here's a link to the supported Apache Beam built-in I/O transforms:
> https://beam.apache.org/documentation/io/built-in/
>
> I am looking for the "go" equivalent of the following Java code:
>
> pipeline.apply("kafka_deserialization", KafkaIO.<String, String>read()
>   .withBootstrapServers(KAFKA_BROKER)
>   .withTopic(KAFKA_TOPIC)
>   .withConsumerConfigUpdates(CONSUMER_CONFIG)
>   .withKeyDeserializer(StringDeserializer.class)
>   .withValueDeserializer(StringDeserializer.class))
>
> Do you have any information on the availability of KafkaIO
> Connector/Transform "go" SDK/library?
>
> Any help or information would be much appreciated.
>
> Thank you.
>
>
> Daisy Wu
>


Re: [DISCUSS] Autoformat python code with Black

2020-01-27 Thread Udi Meiri
I've done a pass on the PR on code I'm familiar with.
Please make a pass and add your suggestions on the PR.

On Fri, Jan 24, 2020 at 7:15 AM Ismaël Mejía  wrote:

> Java build fails on any unformatted code so python probably should be like
> that.
> We have to ensure however that it fails early on that.
> As Robert said time to debate the knobs :)
>
> On Fri, Jan 24, 2020 at 3:19 PM Kamil Wasilewski <
> kamil.wasilew...@polidea.com> wrote:
>
>> PR is ready: https://github.com/apache/beam/pull/10684. Please share
>> your comments ;-) I've managed to reduce the impact a bit:
>> 501 files changed, 18245 insertions(+), 19495 deletions(-)
>>
>> We still need to consider how to enforce the usage of autoformatter.
>> Pre-commit sounds like a nice addition, but it still needs to be installed
>> manually by a developer. On the other hand, Jenkins precommit job that
>> fails if any unformatted code is detected looks like too strict. What do
>> you think?
>>
>> On Thu, Jan 23, 2020 at 8:37 PM Robert Bradshaw 
>> wrote:
>>
>>> Thanks! Now we get to debate what knobs to twiddle :-P
>>>
>>> FYI, I did a simple run (just pushed to
>>> https://github.com/apache/beam/compare/master...robertwb:yapf) to see
>>> the impact. The diff is
>>>
>>> $ git diff --stat master
>>> ...
>>>  547 files changed, 22118 insertions(+), 21129 deletions(-)
>>>
>>> For reference
>>>
>>> $ find sdks/python/apache_beam -name '*.py' | xargs wc
>>> ...
>>> 200424  612002 7431637 total
>>>
>>> which means a little over 10% of lines get touched. I think there are
>>> some options, such as SPLIT_ALL_TOP_LEVEL_COMMA_SEPARATED_VALUES and
>>> COALESCE_BRACKETS, that will conform more to the style we are already
>>> (mostly) following.
>>>
>>>
>>> On Thu, Jan 23, 2020 at 1:59 AM Kamil Wasilewski
>>>  wrote:
>>> >
>>> > Thank you Michał for creating the ticket. I have some free time and
>>> I'd like to volunteer myself for this task.
>>> > Indeed, it looks like there's consensus for `yapf`, so I'll try `yapf`
>>> first.
>>> >
>>> > Best,
>>> > Kamil
>>> >
>>> >
>>> > On Thu, Jan 23, 2020 at 10:37 AM Michał Walenia <
>>> michal.wale...@polidea.com> wrote:
>>> >>
>>> >> Hi all,
>>> >> I created a JIRA issue for this and summarized the available tools
>>> >>
>>> >> https://issues.apache.org/jira/browse/BEAM-9175
>>> >>
>>> >> Cheers,
>>> >> Michal
>>> >>
>>> >> On Thu, Jan 23, 2020 at 1:49 AM Udi Meiri  wrote:
>>> >>>
>>> >>> Sorry, backing off on this due to time constraints.
>>> >>>
>>> >>> On Wed, Jan 22, 2020 at 3:39 PM Udi Meiri  wrote:
>>> 
>>>  It sounds like there's a consensus for yapf. I volunteer to take
>>> this on
>>> 
>>>  On Wed, Jan 22, 2020, 10:31 Udi Meiri  wrote:
>>> >
>>> > +1 to autoformatting
>>> >
>>> > On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik 
>>> wrote:
>>> >>
>>> >> +1 to autoformatters. Also the Beam Java SDK went through a one
>>> time pass to apply the spotless formatting.
>>> >>
>>> >> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay 
>>> wrote:
>>> >>>
>>> >>> +1 to autoformatters and yapf. It appears to be a well
>>> maintained project. I do support making a one time pass to apply formatting
>>> the whole code base.
>>> >>>
>>> >>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova 
>>> wrote:
>>> >
>>> >
>>> > It'd be good if there was a way to only apply to violating (or
>>> at
>>> > least changed) lines.
>>> 
>>> 
>>>  I assumed the first thing we’d do is convert all of the code in
>>> one go, since it’s a very safe operation. Did you have something else in
>>> mind?
>>> 
>>>  -chad
>>> 
>>> 
>>> 
>>> 
>>> >
>>> >
>>> > On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova <
>>> chad...@gmail.com> wrote:
>>> > >
>>> > > +1 to autoformatting
>>> > >
>>> > > Let me add some nuance to that.
>>> > >
>>> > > The way I see it there are 2 varieties of formatters:  those
>>> which take the original formatting into consideration (autopep8) and those
>>> which disregard it (yapf, black).
>>> > >
>>> > > I much prefer yapf to black, because you have plenty of
>>> options to tweak with yapf (enough to make the output a pretty close match
>>> to the current Beam style), and you can mark areas to preserve the original
>>> formatting, which could be very useful with Pipeline building with pipe
>>> operators.  Please don't pick black.
>>> > >
>>> > > autopep8 is more along the lines of spotless in Java -- it
>>> only corrects code that breaks the project's style rules.  The big problem
>>> with Beam's current style is that it is so esoteric that autopep8 can't
>>> enforce it -- and I'm not just talking about 2-spaces, which I don't really
>>> have a problem with -- the problem is the use of either 2 or 4 spaces
>>> depending on context (expression start vs hanging 

Using very slow changing stream of data (KV) as side input

2020-01-27 Thread Mohil Khare
Hi,
This is Mohil Khare from San Jose, California. I work in an early stage
startup: Prosimo.
We use Apache beam with gcp dataflow for all real time stats processing
with Kafka and Pubsub as data source while elasticsearch and GCS as sinks.

I am trying to solve the following use case with side inputs.

INPUT:
1. We have a continuous stream of data coming from pubsub topicA. This data
can be put in a KV PCollection, and each data item can be uniquely identified
by a certain key.
2. We have a very slowly changing stream of data coming from pubsub topicB,
i.e. data arrives on topicB for a few minutes, followed by no activity for a
long time period. This stream of data can again be put in a KV PCollection
with the same keys as above. NOTE: after a long period of inactivity, it is
possible that data arrives for only certain keys.

DESIRED OUTPUT/PROCESSING:
1. I want to use the KV PCollection as a side input to enrich data arriving on
topicA. I think View.asMap can be a good choice for it.
2. After enriching data from topicA using the side input data from topicB,
write to GCS in a fixed window of 10 minutes.
3. Want to continue using the above PCollectionView as a side input as long as
no new data arrives on topicB.
4. Whenever new data arrives on topicB, want to update the PCollectionView map
only for the set of keys that arrived in the new stream.

My question is: what would be the best approach to tackle this use case? I
would really appreciate it if someone could suggest a good solution.

Thanks and Regards
Mohil Khare
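
For illustration, enriching the topicA stream with such a map-valued side input
might look roughly like this. It is only a sketch: enrichmentView is assumed to
be a PCollectionView<Map<String, String>> built from the topicB stream (e.g. via
View.asMap() or the slowly-updating side input pattern), and topicA is assumed
to be a PCollection<KV<String, String>>.

PCollection<KV<String, String>> enriched =
    topicA.apply(
        "EnrichWithTopicB",
        ParDo.of(new DoFn<KV<String, String>, KV<String, String>>() {
              @ProcessElement
              public void process(ProcessContext c) {
                // Look up the enrichment data for this element's key.
                Map<String, String> lookup = c.sideInput(enrichmentView);
                String extra = lookup.getOrDefault(c.element().getKey(), "");
                c.output(KV.of(c.element().getKey(), c.element().getValue() + extra));
              }
            })
            .withSideInputs(enrichmentView));

The enriched collection can then be windowed into fixed 10-minute windows and
written to GCS as described above.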


Pipeline with PAssert on flink issue

2020-01-27 Thread Paweł Pasterz
Hello all!

Did any of you encounter a similar problem with PAssert on Flink (set up as a
standalone instance, not an embedded one)? For a simple test:

PCollection<Integer> res = pipeline.apply("Generate 5", Create.of(5));

PAssert.thatSingleton(res).isEqualTo(5);

only ~1 of 10 attempts (a rough estimate) ends up with success; for the
rest I get

java.lang.AssertionError: Expected 1 successful assertions, but found 0.

   Expected: is <1L>

but: was <0L>

   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)

   at
org.apache.beam.sdk.testing.TestPipeline.verifyPAssertsSucceeded(TestPipeline.java:540)

   at
org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:351)

   at
org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)

…

A short primer on PAsserts:

They work by performing an assertion on all elements of a PCollection. If
the assertion is correct, a Counter metric holding the number of successful
PAsserts is incremented; if it fails, an analogous counter of failures is
increased. The correctness of a pipeline is then checked by the TestPipeline
class, which counts the PAsserts in the pipeline definition and compares
their number with the success counter.

In this case the success counter is never saved in the first place, which is
why the assertion error reads "expected 1, was 0".

I checked the metrics reporting step by step in the debugger and didn't
find a trace of the counter being saved; there were no metrics accumulators
reported back from Flink.

What is interesting is that with ‘--streaming=true’ everything works fine.

Then I ran PAssertTest on a local Flink instance (slightly modifying the
ValidatesRunner test):

   - I created my own interface:

     public interface ValidatesRunner_custom extends NeedsRunner {}

   - Replaced all occurrences of ValidatesRunner.class with
     ValidatesRunner_custom.class in PAssertTest, just to run only those tests.

   - Modified flink_runner.gradle a bit:

     L[188]
     def pipelineOptions = JsonOutput.toJson(
         ["--runner=TestFlinkRunner",
          "--flinkMaster=localhost:8081",
          "--streaming=${config.streaming}",
          "--parallelism=1",
         ])

     L[202]
     includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner_custom'
     excludeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'

   - Invoked with:

     ./gradlew validatesRunner -p runners/flink/1.9 --stacktrace --info

The result was the same.

This might be a clue: our runner validation sets up an embedded Flink
instance using FlinkMiniClusterEntryPoint, which uses Flink's MiniCluster.
On the other hand, if I launch a cluster according to the docs
https://ci.apache.org/projects/flink/flink-docs-release-1.9/getting-started/tutorials/local_setup.html#start-a-local-flink-cluster
it uses StandaloneSessionClusterEntrypoint to start the instance. I tried to
find something that might cause the difference in behavior but, well, I am far
from being familiar with the Flink code. And, of course, it could be a false lead.

Let me know what you think.

Thanks

Pawel


Re: Jenkins jobs not running for my PR 10438

2020-01-27 Thread Rehman Murad Ali
Thanks, Ismaël.

*Rehman Murad Ali*
Software Engineer
Mobile: +92 3452076766
Skype: rehman.muradali


On Mon, Jan 27, 2020 at 6:18 PM Ismaël Mejía  wrote:

> done
>
> On Mon, Jan 27, 2020 at 2:13 PM Rehman Murad Ali <
> rehman.murad...@venturedive.com> wrote:
>
>> Hi,
>>
>> I appreciate if somebody can run jobs for this PR:
>>
>> https://github.com/apache/beam/pull/10627
>>
>> *Thanks & Regards*
>>
>>
>>
>> *Rehman Murad Ali*
>> Software Engineer
>> Mobile: +92 3452076766
>> Skype: rehman.muradali
>>
>>
>> On Fri, Jan 24, 2020 at 3:53 AM Rui Wang  wrote:
>>
>>> Done
>>>
>>>
>>> -Rui
>>>
>>> On Thu, Jan 23, 2020 at 2:45 PM Tomo Suzuki  wrote:
>>>
 Hi Beam Comitters,

 Can somebody trigger 2 failed checks below for
 https://github.com/apache/beam/pull/10674 ?
 Run Java PreCommit
 Run JavaPortabilityApi PreCommit


 On Thu, Jan 23, 2020 at 14:32 Tomo Suzuki  wrote:

> Hi Rui,
> (Thank you for quick response)
>
> Would you run one command per one comment? I don't think Jenkins
> recognizes multiple at once.
>
> On Thu, Jan 23, 2020 at 2:25 PM Rui Wang  wrote:
> >
> > Done
> >
> > On Thu, Jan 23, 2020 at 11:20 AM Tomo Suzuki 
> wrote:
> >>
> >> Hi Beam Committers,
> >>
> >> I appreciate if you can run precommit checks for
> >> https://github.com/apache/beam/pull/10674
> >> plus the following 6 extra commands:
> >>
> >> Run Java PostCommit
> >> Run Java HadoopFormatIO Performance Test
> >> Run BigQueryIO Streaming Performance Test Java
> >> Run Dataflow ValidatesRunner
> >> Run Spark ValidatesRunner
> >> Run SQL Postcommit
> >>
> >> Regards,
> >> Tomo
>
>
>
> --
> Regards,
> Tomo
>



Re: Jenkins jobs not running for my PR 10438

2020-01-27 Thread Ismaël Mejía
done

On Mon, Jan 27, 2020 at 2:13 PM Rehman Murad Ali <
rehman.murad...@venturedive.com> wrote:

> Hi,
>
> I appreciate if somebody can run jobs for this PR:
>
> https://github.com/apache/beam/pull/10627
>
> *Thanks & Regards*
>
>
>
> *Rehman Murad Ali*
> Software Engineer
> Mobile: +92 3452076766
> Skype: rehman.muradali
>
>
> On Fri, Jan 24, 2020 at 3:53 AM Rui Wang  wrote:
>
>> Done
>>
>>
>> -Rui
>>
>> On Thu, Jan 23, 2020 at 2:45 PM Tomo Suzuki  wrote:
>>
>>> Hi Beam Comitters,
>>>
>>> Can somebody trigger 2 failed checks below for
>>> https://github.com/apache/beam/pull/10674 ?
>>> Run Java PreCommit
>>> Run JavaPortabilityApi PreCommit
>>>
>>>
>>> On Thu, Jan 23, 2020 at 14:32 Tomo Suzuki  wrote:
>>>
 Hi Rui,
 (Thank you for quick response)

 Would you run one command per one comment? I don't think Jenkins
 recognizes multiple at once.

 On Thu, Jan 23, 2020 at 2:25 PM Rui Wang  wrote:
 >
 > Done
 >
 > On Thu, Jan 23, 2020 at 11:20 AM Tomo Suzuki 
 wrote:
 >>
 >> Hi Beam Committers,
 >>
 >> I appreciate if you can run precommit checks for
 >> https://github.com/apache/beam/pull/10674
 >> plus the following 6 extra commands:
 >>
 >> Run Java PostCommit
 >> Run Java HadoopFormatIO Performance Test
 >> Run BigQueryIO Streaming Performance Test Java
 >> Run Dataflow ValidatesRunner
 >> Run Spark ValidatesRunner
 >> Run SQL Postcommit
 >>
 >> Regards,
 >> Tomo



 --
 Regards,
 Tomo

>>>


Re: Jenkins jobs not running for my PR 10438

2020-01-27 Thread Rehman Murad Ali
Hi,

I appreciate if somebody can run jobs for this PR:

https://github.com/apache/beam/pull/10627

*Thanks & Regards*



*Rehman Murad Ali*
Software Engineer
Mobile: +92 3452076766
Skype: rehman.muradali


On Fri, Jan 24, 2020 at 3:53 AM Rui Wang  wrote:

> Done
>
>
> -Rui
>
> On Thu, Jan 23, 2020 at 2:45 PM Tomo Suzuki  wrote:
>
>> Hi Beam Comitters,
>>
>> Can somebody trigger 2 failed checks below for
>> https://github.com/apache/beam/pull/10674 ?
>> Run Java PreCommit
>> Run JavaPortabilityApi PreCommit
>>
>>
>> On Thu, Jan 23, 2020 at 14:32 Tomo Suzuki  wrote:
>>
>>> Hi Rui,
>>> (Thank you for quick response)
>>>
>>> Would you run one command per one comment? I don't think Jenkins
>>> recognizes multiple at once.
>>>
>>> On Thu, Jan 23, 2020 at 2:25 PM Rui Wang  wrote:
>>> >
>>> > Done
>>> >
>>> > On Thu, Jan 23, 2020 at 11:20 AM Tomo Suzuki 
>>> wrote:
>>> >>
>>> >> Hi Beam Committers,
>>> >>
>>> >> I appreciate if you can run precommit checks for
>>> >> https://github.com/apache/beam/pull/10674
>>> >> plus the following 6 extra commands:
>>> >>
>>> >> Run Java PostCommit
>>> >> Run Java HadoopFormatIO Performance Test
>>> >> Run BigQueryIO Streaming Performance Test Java
>>> >> Run Dataflow ValidatesRunner
>>> >> Run Spark ValidatesRunner
>>> >> Run SQL Postcommit
>>> >>
>>> >> Regards,
>>> >> Tomo
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Tomo
>>>
>>


Beam Dependency Check Report (2020-01-27)

2020-01-27 Thread Apache Jenkins Server

High Priority Dependency Updates Of Beam Python SDK:


Dependency Name | Current Version | Latest Version | Release Date Of The Current Used Version | Release Date Of The Latest Release | JIRA Issue
cachetools | 3.1.1 | 4.0.0 | 2019-12-23 | 2019-12-23 | BEAM-9017
google-cloud-bigquery | 1.17.1 | 1.23.1 | 2019-09-23 | 2019-12-23 | BEAM-5537
google-cloud-datastore | 1.7.4 | 1.10.0 | 2019-05-27 | 2019-10-21 | BEAM-8443
httplib2 | 0.12.0 | 0.17.0 | 2018-12-10 | 2020-01-27 | BEAM-9018
mock | 2.0.0 | 3.0.5 | 2019-05-20 | 2019-05-20 | BEAM-7369
oauth2client | 3.0.0 | 4.1.3 | 2018-12-10 | 2018-12-10 | BEAM-6089
PyHamcrest | 1.10.1 | 2.0.0 | 2020-01-20 | 2020-01-20 | BEAM-9155
pytest | 4.6.9 | 5.3.4 | 2020-01-06 | 2020-01-27 | BEAM-8606
Sphinx | 1.8.5 | 2.3.1 | 2019-05-20 | 2019-12-23 | BEAM-7370
tenacity | 5.1.5 | 6.0.0 | 2019-11-11 | 2019-11-11 | BEAM-8607

High Priority Dependency Updates Of Beam Java SDK:


Dependency Name | Current Version | Latest Version | Release Date Of The Current Used Version | Release Date Of The Latest Release | JIRA Issue
com.alibaba:fastjson | 1.2.49 | 1.2.62 | 2018-08-04 | 2019-10-07 | BEAM-8632
com.datastax.cassandra:cassandra-driver-core | 3.8.0 | 4.0.0 | 2019-10-29 | 2019-03-18 | BEAM-8674
com.esotericsoftware:kryo | 4.0.2 | 5.0.0-RC4 | 2018-03-20 | 2019-04-14 | BEAM-5809
com.esotericsoftware.kryo:kryo | 2.21 | 2.24.0 | 2013-02-27 | 2014-05-04 | BEAM-5574
com.github.ben-manes.versions:com.github.ben-manes.versions.gradle.plugin | 0.20.0 | 0.27.0 | 2019-02-11 | 2019-10-21 | BEAM-6645
com.github.luben:zstd-jni | 1.3.8-3 | 1.4.4-7 | 2019-01-29 | 2020-01-24 | BEAM-9194
com.github.spotbugs:spotbugs | 3.1.12 | 4.0.0-RC1 | 2019-03-01 | 2020-01-20 | BEAM-7792
com.github.spotbugs:spotbugs-annotations | 3.1.12 | 4.0.0-RC1 | 2019-03-01 | 2020-01-20 | BEAM-6951
com.google.api.grpc:grpc-google-cloud-datacatalog-v1beta1 | 0.27.0-alpha | 0.32.0 | 2019-10-03 | 2020-01-23 | BEAM-8853
com.google.api.grpc:grpc-google-common-protos | 1.12.0 | 1.17.0 | 2018-06-29 | 2019-10-04 | BEAM-8633
com.google.api.grpc:proto-google-cloud-bigtable-v2 | 0.44.0 | 1.9.1 | 2019-01-23 | 2020-01-10 | BEAM-8679
com.google.api.grpc:proto-google-cloud-datacatalog-v1beta1 | 0.27.0-alpha | 0.32.0 | 2019-10-03 | 2020-01-23 | BEAM-8854
com.google.api.grpc:proto-google-cloud-datastore-v1 | 0.44.0 | 0.85.0 | 2019-01-23 | 2019-12-05 | BEAM-8680
com.google.api.grpc:proto-google-cloud-spanner-admin-database-v1 | 1.6.0 | 1.49.0 | 2019-01-23 | 2020-01-16 | BEAM-8682
com.google.api.grpc:proto-google-common-protos | 1.12.0 | 1.17.0 | 2018-06-29 | 2019-10-04 | BEAM-6899
com.google.apis:google-api-services-bigquery | v2-rev20181221-1.28.0 | v2-rev20191211-1.30.3 | 2019-01-17 | 2020-01-14 | BEAM-8684
com.google.apis:google-api-services-clouddebugger | v2-rev20181114-1.28.0 | v2-rev20191003-1.30.3 | 2019-01-17 | 2019-10-19 | BEAM-8750
com.google.apis:google-api-services-cloudresourcemanager | v1-rev20181015-1.28.0 | v2-rev20191206-1.30.3 | 2019-01-17 | 2019-12-17 | BEAM-8751
com.google.apis:google-api-services-dataflow | v1b3-rev20190927-1.28.0 | v1beta3-rev12-1.20.0 | 2019-10-11 | 2015-04-29 | BEAM-8752
com.google.apis:google-api-services-pubsub | v1-rev2019-1.28.0 | v1-rev20191203-1.30.3 | 2019-11-26 | 2019-12-18 | BEAM-8753
com.google.apis:google-api-services-storage | v1-rev20181109-1.28.0 | v1-rev20191011-1.30.3 | 2019-01-18 | 2019-10-30 | BEAM-8754
com.google.cloud:google-cloud-bigquery | 1.28.0 | 1.104.0 | 2018-04-27 | 2020-01-23 | BEAM-8687
com.google.cloud:google-cloud-bigquerystorage | 0.79.0-alpha | 0.120.1-beta | 2019-01-23 | 2020-01-08 | BEAM-8755
com.google.cloud:google-cloud-core | 1.61.0 | 1.92.2 | 2019-01-23 | 2020-01-09 | BEAM-8756
com.google.cloud:google-cloud-core-grpc | 1.61.0 | 1.92.2 | 2019-01-23 | 2020-01-09 | BEAM-8757
com.google.cloud:google-cloud-spanner | 1.6.0 | 1.49.0 | 2019-01-23 | 2020-01-16 | BEAM-8758

Contribution to Apache Beam

2020-01-27 Thread Pasan Kamburugamuwa
Hi,
  I am Pasan Kamburugamuwa, a 4th-year student at the Sri Lanka
Institute of Information Technology. I would like to contribute to Apache Beam
on the topic of:
  1) Azure Blobstore File System for Java & Python
   Could you please help me put together a strong proposal and build a good
bond with the Apache community?

Thank you
Pasan Kamburugamuwa