Re: How does param desiredBundleSizeBytes of BoundedSource#split get determined at runtime?

2017-07-10 Thread Ivan
I think this answers my question for the Flink runner: https://github.com/apache/beam/blob/019d3002b0e2a7db9c5c2e84a0a95fad60f16422/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/SourceInputFormat.java The desiredBundleSizeBytes is determined by source.getEstima
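
The snippet above is cut off, so the exact expression the runner uses is not shown here. As a minimal sketch only, assuming a runner derives the desired bundle size from an estimated total size and a requested number of splits, the arithmetic could look like this. The class name, method name, and parameters are hypothetical and are not taken from Beam or the Flink runner.

```java
public class BundleSizeSketch {
    // Hypothetical helper (not Beam API): divides an estimated total size
    // by the requested number of splits to get a per-bundle target size.
    static long desiredBundleSizeBytes(long estimatedSizeBytes, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        // Clamp to at least 1 byte so a tiny source never yields a zero-size target.
        return Math.max(1L, estimatedSizeBytes / numSplits);
    }

    public static void main(String[] args) {
        // Illustrative numbers only: a 1000-byte estimate split 4 ways.
        System.out.println(desiredBundleSizeBytes(1000L, 4));
    }
}
```

Under this assumption, a custom source would only need a reasonable size estimate for the value passed to split to be meaningful.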

How does param desiredBundleSizeBytes of BoundedSource#split get determined at runtime?

2017-07-10 Thread Ivan
Hi, we are trying to build a custom BoundedSource based on a gRPC call. In class BoundedSource we have the method below: public abstract java.util.List

Re: [PROPOSAL] Connectors for memcache and Couchbase

2017-07-10 Thread Eugene Kirpichov
I think Madhusudan's proposal does not involve reading the whole contents of the memcached cluster - it's applied to a PCollection of keys. So I'd suggest calling it MemcachedIO.lookup() rather than MemcachedIO.read(). And it will not involve the questions of splitting - however, it *will* involve

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-07-10 Thread Robert Bradshaw
Sorry, just saw https://github.com/apache/beam/pull/2211 On Mon, Jul 10, 2017 at 5:37 PM, Robert Bradshaw wrote: > Any progress on this? > > On Thu, Mar 9, 2017 at 1:43 AM, Etienne Chauchot wrote: >> Hi all, >> >> We had a discussion with Kenn yesterday about point 1 below, I would like >> to n

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-07-10 Thread Robert Bradshaw
Any progress on this? On Thu, Mar 9, 2017 at 1:43 AM, Etienne Chauchot wrote: > Hi all, > > We had a discussion with Kenn yesterday about point 1 below, I would like > to note it here on the ML: > > Using new method timer.set() instead of timer.setForNowPlus() makes the > timer fire at the right

Re: [DISCUSS] Apache Beam 2.1.0 release next week ?

2017-07-10 Thread Kenneth Knowles
Have we heard anything about the remaining issues on https://s.apache.org/beam-2.1.0-burndown? Can we move them all to the following release? On Mon, Jul 10, 2017 at 1:22 PM, Jean-Baptiste Onofré wrote: > Hi all, > > all cherry-pick PRs have been merged on the release-2.1.0 branch. > > I'm launc

Re: [DISCUSS] Apache Beam 2.1.0 release next week ?

2017-07-10 Thread Jean-Baptiste Onofré
Hi all, all cherry-pick PRs have been merged on the release-2.1.0 branch. I'm launching a couple of builds and tests. I will cut RC1 just after. Stay tuned for the vote e-mail! ;) Regards JB On 07/06/2017 06:43 AM, Jean-Baptiste Onofré wrote: No problem, just define the fix version in Jir

Re: MergeBot is here!

2017-07-10 Thread Jason Kuster
(quick update re #2 above): ~4 minutes after I reopened the ticket, it's fixed. https://github.com/apache/infrastructure-puppet/commit/709944291da5e8aea711cb8578f0594deb45e222 updates the website to the correct address. Infra is once again the best. On Mon, Jul 10, 2017 at 12:38 PM, Jason Kuster

Re: MergeBot is here!

2017-07-10 Thread Jason Kuster
Glad to hear everyone's pretty happy about it! Have a couple answers for your questions. Ted: I believe the MFA stuff (two-factor auth on github) is necessary for getting the additional features on GitHub (reviewer, etc), but may not be necessary for MergeBot. I'll check in with Infra and get back

Re: BEAM-934 - Jira permission and pull request

2017-07-10 Thread Apache Enthu
Thanks Kenn. Thanks, Almas On 11 Jul 2017 00:05, "Kenneth Knowles" wrote: > I've added you as a Contributor, which is the role you will need to assign > issues. > > On Mon, Jul 10, 2017 at 11:12 AM, Apache Enthu > wrote: > > > Hi could you please add me (eralmas7) as committer please? > > > >

Re: BEAM-934 - Jira permission and pull request

2017-07-10 Thread Kenneth Knowles
I've added you as a Contributor, which is the role you will need to assign issues. On Mon, Jul 10, 2017 at 11:12 AM, Apache Enthu wrote: > Hi could you please add me (eralmas7) as committer please? > > Thanks, > Almas > > On Mon, Jul 10, 2017 at 8:57 AM, Kenneth Knowles > wrote: > > > Just a ti

Re: BEAM-934 - Jira permission and pull request

2017-07-10 Thread Apache Enthu
Hi, could you please add me (eralmas7) as a committer? Thanks, Almas On Mon, Jul 10, 2017 at 8:57 AM, Kenneth Knowles wrote: > Just a tiny correction - I think the JIRA role "contributor" for the Beam > can take JIRAs without a committer assigning to them. But definitely you > _must_ have t

Re: BEAM-933 - Not reproduceable

2017-07-10 Thread Apache Enthu
Thanks. Let's wait for Jenkins to report the same and we can review it again. On Mon, Jul 10, 2017 at 10:57 PM, Ted Yu wrote: > Looks like (some of) the findbugs warnings were not addressed. > > e.g. in TopWikipediaSessions.java : > > if (Math.abs(c.element().hashCode()) <= > I

Re: MergeBot is here!

2017-07-10 Thread Mark Liu
+1 Awesome work! Thank you Jason!!! Mark On Mon, Jul 10, 2017 at 10:05 AM, Robert Bradshaw < rober...@google.com.invalid> wrote: > +1, this is great! I'll second Ismaël's list requests, especially 1 and 3. > > On Mon, Jul 10, 2017 at 2:09 AM, Ismaël Mejía wrote: > > Excellent!, Automation of s

Re: BEAM-933 - Not reproduceable

2017-07-10 Thread Apache Enthu
Thanks Kenn. I have created a pull request for it and am waiting for Jenkins to rebuild. I have also created the JIRA https://issues.apache.org/jira/browse/BEAM-2578 for the issue mentioned below, and have fixed that as well. Thanks, Almas On Mon, Jul 10, 2017 at 10:41 PM, Kenneth Knowles wrote: > I believe

Re: BEAM-933 - Not reproduceable

2017-07-10 Thread Apache Enthu
Any idea for flaky builds? https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/12966/console 2017-07-10T16:57:01.907 [ERROR] Failed to execute goal on project beam-sdks-java-io-hadoop-jdk1.8-tests: Could not resolve dependencies for project org.apache.beam:beam-sdks-java-io-hadoop-jdk

Re: BEAM-933 - Not reproduceable

2017-07-10 Thread Ted Yu
Looks like (some of) the findbugs warnings were not addressed. e.g. in TopWikipediaSessions.java : if (Math.abs(c.element().hashCode()) <= Integer.MAX_VALUE * samplingThreshold) { See https://stackoverflow.com/questions/23416264/bad-attempt-to-compute-absolute-value for why fin
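
The warning quoted above concerns applying Math.abs to a hash code: Math.abs(Integer.MIN_VALUE) returns Integer.MIN_VALUE, which is still negative, so the comparison can pass for the wrong reason. The sketch below shows the pitfall and one common workaround, masking off the sign bit. The class and method names are hypothetical, and this is not necessarily the fix that was applied in TopWikipediaSessions.java.

```java
public class AbsHashSketch {
    // Hypothetical helper: clears the sign bit so the result is always
    // in the range [0, Integer.MAX_VALUE], including for Integer.MIN_VALUE.
    static int nonNegativeHash(int hash) {
        return hash & Integer.MAX_VALUE;
    }

    // Hypothetical sampling predicate modeled on the quoted condition,
    // using the masked hash instead of Math.abs(hash).
    static boolean sampled(int hash, double samplingThreshold) {
        return nonNegativeHash(hash) <= Integer.MAX_VALUE * samplingThreshold;
    }

    public static void main(String[] args) {
        // Math.abs overflows for the single value Integer.MIN_VALUE.
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0);
        // The masked variant stays non-negative for every int.
        System.out.println(nonNegativeHash(Integer.MIN_VALUE) >= 0);
    }
}
```

Masking is not numerically identical to an absolute value (for example, -1 maps to Integer.MAX_VALUE rather than 1), but for sampling on a hash only the non-negative range matters.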

Re: BEAM-933 - Not reproduceable

2017-07-10 Thread Kenneth Knowles
I believe you can create a JIRA without any special permissions. Here's a direct link that I think will work: https://issues.apache.org/jira/secure/CreateIssue!default.jspa Kenn On Mon, Jul 10, 2017 at 10:10 AM, Kenneth Knowles wrote: > Well, if it is not reproducible then could you issue a pul

Re: MergeBot is here!

2017-07-10 Thread Robert Bradshaw
+1, this is great! I'll second Ismaël's list requests, especially 1 and 3. On Mon, Jul 10, 2017 at 2:09 AM, Ismaël Mejía wrote: > Excellent!, Automation of such repetitive (and error-prone) tasks is > strongly welcomed. > > Thanks for making this happen Jason! > > Some comments: > > 1. I suppose

Re: BEAM-933 - Not reproduceable

2017-07-10 Thread Kenneth Knowles
Well, if it is not reproducible then could you issue a pull request deleting that bit of the pom.xml? That would resolve the issue, too. Kenn On Mon, Jul 10, 2017 at 10:01 AM, Apache Enthu wrote: > Thanks Kenneth. Unfortunately I'm still unable to reproduce the issue. Did > anyone have a chance

Re: BEAM-933 - Not reproduceable

2017-07-10 Thread Apache Enthu
Thanks Kenneth. Unfortunately I'm still unable to reproduce the issue. Did anyone have a chance to look at the other issue that I raised in my mail? Unfortunately, as I am not a committer, I am assuming I won't be entitled to create a JIRA for it. [INFO] --- maven-checkstyle-plugin:2.17:che

Mixed-Language Pipelines

2017-07-10 Thread Thomas Groh
Hey everyone; I've been working on a design for implementing multi-language pipelines within the Beam SDKs (also known as mix-and-match). This kind of pipeline lets us reuse transforms written in one language in any other language that supports the Runner API and the Fn API. Letting us write a tra

Re: [PROPOSAL] Connectors for memcache and Couchbase

2017-07-10 Thread Lukasz Cwik
Splitting on slabs should allow you to split at a finer grain than per server, since each server itself maintains this information. If you take a look at the memcached protocol, you can see that lru_crawler supports a metadump command which will enumerate all the keys for a set of given slabs or f

Re: [PROPOSAL] Connectors for memcache and Couchbase

2017-07-10 Thread Ismaël Mejía
Hello, thanks Lukasz for bringing up some of these subjects. I have briefly discussed this with the guys working on it; they are the same team who did HCatalogIO (Hive). We analyzed the different libraries that allow developing this integration from Java and decided that the most complete implementat

Re: MergeBot is here!

2017-07-10 Thread Ismaël Mejía
Excellent! Automation of such repetitive (and error-prone) tasks is very welcome. Thanks for making this happen Jason! Some comments: 1. I suppose the code of mergebot is now part of Apache Infra, no? Do you know exactly where the code is hosted? And what is the procedure in case somebody

Jenkins build is back to normal : beam_Release_NightlySnapshot #473

2017-07-10 Thread Apache Jenkins Server
See