Beam Schemas: current status

2018-08-28 Thread Reuven Lax
I wanted to send a quick note to the community about the current status of schema-aware PCollections in Beam. As some might remember, we had a good discussion last year about the design of these schemas, involving many folks from different parts of the community. I sent a summary earlier this year

Re: Should we allow ValidatesRunner tests to have access to file systems?

2018-08-28 Thread Lukasz Cwik
I also agree about not having external dependencies in ValidatesRunner tests. One suggestion would have been to use attempted metrics, but there is currently no easy, runner-agnostic way to access runner metrics from within a DoFn. This is likely a place for improvement since: *

Re: jira search in chrome omnibox

2018-08-28 Thread Valentyn Tymofieiev
Thanks for sharing. I have also found the following custom search query for PRs useful: https://github.com/apache/beam/pulls?q=is%3Apr%20%s Sample usage: type 'pr', space, type: 'author:tvalentyn'. You could also incorporate 'author:' into the query:
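A minimal sketch of how such a custom search template expands, assuming the browser simply substitutes the URL-encoded search terms for the `%s` placeholder. The helper name `expand` is hypothetical, and Chrome's exact encoding of the terms may differ from `urllib.parse.quote`; the two templates are the ones quoted in this thread.

```python
from urllib.parse import quote

# Search templates quoted in the thread; %s marks where the typed terms go.
PR_SEARCH = "https://github.com/apache/beam/pulls?q=is%3Apr%20%s"
JIRA_SEARCH = "https://issues.apache.org/jira/QuickSearch.jspa?searchString=%s"


def expand(template: str, terms: str) -> str:
    """Hypothetical helper: fill the %s placeholder with URL-encoded terms.

    str.replace is used instead of Python's % formatting because the
    templates already contain literal escapes such as %3A and %20.
    """
    return template.replace("%s", quote(terms, safe=""))


if __name__ == "__main__":
    # Typing 'pr', space, 'author:tvalentyn' in the omnibox:
    print(expand(PR_SEARCH, "author:tvalentyn"))
    # Typing 'j', space, 'BEAM-4696' in the omnibox:
    print(expand(JIRA_SEARCH, "BEAM-4696"))
```

The colon in `author:tvalentyn` is percent-encoded here, which GitHub decodes back into the same query; a browser may leave it unencoded.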

Re: jira search in chrome omnibox

2018-08-28 Thread Daniel Oliveira
This seems pretty useful. Thanks Udi! On Mon, Aug 27, 2018 at 3:54 PM Udi Meiri wrote: > In case you want to quickly look up JIRA tickets, e.g., typing 'j', space, > 'BEAM-4696'. > Search URL: > https://issues.apache.org/jira/QuickSearch.jspa?searchString=%s > >

Re: [DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Chamikara Jayalath
On Tue, Aug 28, 2018 at 12:05 PM Thomas Weise wrote: > I think there is an invalid assumption being made in this discussion, > which is that most projects comply with semantic versioning. The reality in > the open source big data space is unfortunately quite different. Ismaël has > well

Re: An example of Integration test case

2018-08-28 Thread Rakesh Kumar
Thank you Robin for your quick response. On Tue, Aug 28, 2018 at 1:25 PM Robin Qiu wrote: > Hi Rakesh, > > A python integration test example can be found here: > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_it_test.py > > Best, > Robin > > On Tue, Aug

Re: An example of Integration test case

2018-08-28 Thread Robin Qiu
Hi Rakesh, A python integration test example can be found here: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_it_test.py Best, Robin On Tue, Aug 28, 2018 at 1:10 PM Rakesh Kumar wrote: > Hi, > > I am writing my streaming application using Python SDK. I

An example of Integration test case

2018-08-28 Thread Rakesh Kumar
Hi, I am writing my streaming application using the Python SDK. I also want to write integration test cases. Do we have any good examples of integration tests that I can refer to? Thank you, Rakesh -- Rakesh Kumar Software Engineer 510-761-1364 |

Re: [DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Thomas Weise
I think there is an invalid assumption being made in this discussion, which is that most projects comply with semantic versioning. The reality in the open source big data space is unfortunately quite different. Ismaël has well characterized the situation and HBase isn't an exception. Another

Re: Design Proposal: Beam-Site Automation Reliability

2018-08-28 Thread Udi Meiri
FYI, we are about to add a new branch to apache/beam, named 'asf-site', which will contain generated website sources. On Thu, Jun 7, 2018 at 10:18 AM Jason Kuster wrote: > Sounds good; I'm really excited about these changes Scott. Thanks for > taking this on! > > On Tue, Jun 5, 2018 at 4:00 PM

Re: [DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Raghu Angadi
Thanks for the IO versioning summary. KafkaIO's policy of 'let the user decide exact version at runtime' has been quite useful so far. How feasible is that for other connectors? Also, KafkaIO does not limit itself to minimum features available across all the supported versions. Some of the

Re: [DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Chamikara Jayalath
Constraints on existing dependencies are a valid concern and we do not have a good solution for this currently. One way to handle this is simply to close automatically created JIRAs with a comment, after which the tool will not try to create further JIRAs for the same dependency. But we should

Re: [DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Andrew Pilloud
The Beam SQL module faces similar problems: several of our dependencies are constrained by maintaining compatibility with versions used by Calcite. We've written tests to detect some of these incompatibilities. Could we add integration tests for these major Hadoop distros that ensure we maintain

Re: [DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Chamikara Jayalath
Thanks Tim for raising this, and thanks JB and Ismaël for all the great points. I agree that a one-size-fits-all solution will not work when it comes to dependencies. Based on past examples, clearly there are many cases where we should proceed with caution and upgrade dependencies with care. That

Re: [DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Ismaël Mejía
I think we should refine the strategy on dependencies discussed recently. Sorry to come late with this (I did not closely follow the previous discussion), but the current approach is clearly not in line with the industry reality (at least not for IO connectors + Hadoop + Spark/Flink use). A

Re: Publishing release artifacts to custom artifactory

2018-08-28 Thread Alexey Romanenko
Thomas, thanks, looks great. Do you think we have to add this command to “Contribution Guide”? Lukasz, yes, "-Poffline-repository" can be omitted in this case. I don’t remember why I added this =) > On 24 Aug 2018, at 22:15, Thomas Weise wrote: > > Alexey, publishing to custom repo with

Re: [DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Jean-Baptiste Onofré
Hi Tim, regarding the IO, a while ago (during the project's incubation), we discussed how to deal with different versions of the backend API and dependencies. I proposed to have a release cycle per IO and have a subproject per IO version, like for instance: sdks/java/io/elasticsearch-5

[DISCUSS] Versioning, Hadoop related dependencies and enterprise users

2018-08-28 Thread Tim Robertson
Hi folks, I'd like to revisit the discussion around our versioning policy specifically for the Hadoop ecosystem and make sure we are aware of the implications. As an example our policy today would have us on HBase 2.1 and I have reminders to address this. However, currently the versions of

Re: JB's back

2018-08-28 Thread Mikhail Gryzykhin
Welcome back JB! On Mon, Aug 27, 2018 at 11:10 PM Reuven Lax wrote: > Welcome back! > > On Mon, Aug 27, 2018 at 10:44 PM Jean-Baptiste Onofré > wrote: > >> Hi guys, >> >> Maybe you saw that I took some days off last week. I landed back last >> night, so, just time to unstack my e-mails and I'm

Re: JB's back

2018-08-28 Thread Reuven Lax
Welcome back! On Mon, Aug 27, 2018 at 10:44 PM Jean-Baptiste Onofré wrote: > Hi guys, > > Maybe you saw that I took some days off last week. I landed back last > night, so, just time to unstack my e-mails and I'm back ;) > > Regards > JB > -- > Jean-Baptiste Onofré > jbono...@apache.org >