Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2019-04-02 Thread Lukasz Cwik
I was able to update the failing Watch transform in https://github.com/apache/beam/pull/8146 and this has now been merged. On Mon, Mar 18, 2019 at 10:32 AM Lukasz Cwik wrote: > Thanks Kenn, based upon the error message there was a small amount of code > that I missed when updating the code.

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2019-03-18 Thread Lukasz Cwik
Thanks Kenn, based upon the error message there was a small amount of code that I missed when updating the code. I'll attempt to fix this in the next few days. On Mon, Jan 14, 2019 at 7:26 PM Kenneth Knowles wrote: > I wanted to use this thread to ping that the change to the user-facing API >

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2019-01-14 Thread Kenneth Knowles
I wanted to use this thread to flag that the change to the user-facing API to wrap RestrictionTracker broke the Watch transform, which has been sickbayed for a long time. It would be helpful for experts to weigh in on https://issues.apache.org/jira/browse/BEAM-6352 about how the

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-12-05 Thread Lukasz Cwik
Based upon the current Java SDK API, I was able to implement Runner initiated checkpointing that the Java SDK honors within PR https://github.com/apache/beam/pull/7200. This is an exciting first step to a splitting implementation, feel free to take a look and comment. I have added two basic

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-30 Thread Robert Bradshaw
On Fri, Nov 30, 2018 at 10:14 PM Lukasz Cwik wrote: > > On Fri, Nov 30, 2018 at 1:02 PM Robert Bradshaw wrote: >> >> On Fri, Nov 30, 2018 at 6:38 PM Lukasz Cwik wrote: >> > >> > Sorry, for some reason I thought I had answered these. >> >> No problem, thanks for your patience :). >> >> > On Fri,

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-30 Thread Lukasz Cwik
On Fri, Nov 30, 2018 at 1:02 PM Robert Bradshaw wrote: > On Fri, Nov 30, 2018 at 6:38 PM Lukasz Cwik wrote: > > > > Sorry, for some reason I thought I had answered these. > > No problem, thanks for your patience :). > > > On Fri, Nov 30, 2018 at 2:20 AM Robert Bradshaw > wrote: > >> > >> I

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-30 Thread Robert Bradshaw
On Fri, Nov 30, 2018 at 6:38 PM Lukasz Cwik wrote: > > Sorry, for some reason I thought I had answered these. No problem, thanks for your patience :). > On Fri, Nov 30, 2018 at 2:20 AM Robert Bradshaw wrote: >> >> I still have outstanding questions (above) about >> >> 1) Why we need arbitrary

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-30 Thread Lukasz Cwik
Note I have merged the PR but will continue to iterate based upon the feedback provided in this thread as it has been quite useful. On Fri, Nov 30, 2018 at 9:37 AM Lukasz Cwik wrote: > Sorry, for some reason I thought I had answered these. > > On Fri, Nov 30, 2018 at 2:20 AM Robert Bradshaw >

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-30 Thread Lukasz Cwik
Sorry, for some reason I thought I had answered these. On Fri, Nov 30, 2018 at 2:20 AM Robert Bradshaw wrote: > I still have outstanding questions (above) about > > 1) Why we need arbitrary precision for backlog, instead of just using > a (much simpler) double. > Double lacks the precision for
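The general precision trade-off behind question 1 can be illustrated with a minimal sketch. It shows only a well-known property of IEEE 754 doubles (a 53-bit significand) next to exact BigDecimal arithmetic; it is not taken from the thread or from the Beam code base, and the class and method names (PrecisionSketch, survivesDouble, exactIncrement) are hypothetical.

```java
import java.math.BigDecimal;

// Hypothetical illustration, not Beam code: compares double and BigDecimal
// when representing large integral quantities.
public class PrecisionSketch {

    // True if the long value is unchanged after a round trip through double.
    static boolean survivesDouble(long value) {
        return (long) (double) value == value;
    }

    // Adds one to the value using exact decimal arithmetic.
    static BigDecimal exactIncrement(long value) {
        return BigDecimal.valueOf(value).add(BigDecimal.ONE);
    }

    public static void main(String[] args) {
        long limit = 1L << 53; // 2^53: above this, not every integer fits in a double

        System.out.println(survivesDouble(limit));     // exactly representable
        System.out.println(survivesDouble(limit + 1)); // rounded to a neighbouring value
        System.out.println(exactIncrement(limit));     // exact, no rounding
    }
}
```

Whether quantities of this size occur in practice depends on what a given source reports as backlog; the sketch only shows where a double stops being exact.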

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-30 Thread Robert Bradshaw
I still have outstanding questions (above) about 1) Why we need arbitrary precision for backlog, instead of just using a (much simpler) double. 2) Whether it's worth passing backlog back to split requests, rather than (again) a double representing "portion of current remaining" which may change

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-27 Thread Lukasz Cwik
I updated the PR addressing the last of Scott's comments and also migrated to use an integral fraction as Robert had recommended by using approach A for the proto representation and BigDecimal within the Java SDK: A: // Represents a non-negative decimal number: unscaled_value * 10^(-scale) message
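As a rough sketch of how the representation named in option A (unscaled_value * 10^(-scale)) lines up with BigDecimal, the standard BigDecimal(BigInteger, int) constructor takes exactly that pair. The class and method names below (DecimalSketch, fromParts) are hypothetical and not part of the PR; only the two field names come from the message above.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration, not Beam code: maps the (unscaled_value, scale)
// pair from option A onto java.math.BigDecimal and back.
public class DecimalSketch {

    // Builds unscaledValue * 10^(-scale); rejects negatives because the
    // quoted comment describes a non-negative decimal number.
    static BigDecimal fromParts(BigInteger unscaledValue, int scale) {
        if (unscaledValue.signum() < 0) {
            throw new IllegalArgumentException("unscaled value must be non-negative");
        }
        return new BigDecimal(unscaledValue, scale);
    }

    public static void main(String[] args) {
        BigDecimal value = fromParts(new BigInteger("12345"), 2);
        System.out.println(value); // prints 123.45

        // The two parts can be read back for serialization.
        System.out.println(value.unscaledValue() + " x 10^-" + value.scale());
    }
}
```

Because BigDecimal already stores an unscaled BigInteger and an int scale, the conversion in either direction needs no arithmetic, which is presumably part of the appeal of this pairing.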

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-26 Thread Lukasz Cwik
On Mon, Nov 26, 2018 at 9:09 AM Ismaël Mejía wrote: > > Bundle finalization is unrelated to backlogs but is needed since there > is a class of data stores which need acknowledgement that says I have > successfully received your data and am now responsible for it such as > acking a message from a

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-20 Thread Robert Bradshaw
On Tue, Nov 20, 2018 at 7:10 PM Lukasz Cwik wrote: > I'll perform the swap for a fraction because as I try to map more of the > spaces to an arbitrary byte[] I naturally first map the space onto natural > numbers before mapping to a byte[]. > > Any preference between these options: > A: > //

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-20 Thread Lukasz Cwik
Ismael, I looked at the API around ByteKeyRangeTracker and OffsetRangeTracker and figured out that the classes are named as such because they are trackers for the OffsetRange and ByteKeyRange classes. Some options are to: 1) Copy the ByteKeyRange and call it ByteKeyRestriction and similarly copy

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-20 Thread Lukasz Cwik
I'll perform the swap for a fraction because as I try to map more of the spaces to an arbitrary byte[] I naturally first map the space onto natural numbers before mapping to a byte[]. Any preference between these options: A: // Represents a non-negative decimal number: unscaled_value *

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-20 Thread Robert Bradshaw
I'm still trying to wrap my head around what is meant by backlog here, as it's different than what I've seen in previous discussions. Generally, the backlog represented a measure of the known but undone part of a restriction. This is useful for a runner to understand in some manner what progress

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-19 Thread Lukasz Cwik
I also addressed a bunch of PR comments which clarified the contract/expectations as described in my previous e-mail and the splitting/backlog reporting/bundle finalization docs. On Mon, Nov 19, 2018 at 3:19 PM Lukasz Cwik wrote: > > > On Mon, Nov 19, 2018 at 3:06 PM Lukasz Cwik wrote: > >>

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-19 Thread Lukasz Cwik
On Mon, Nov 19, 2018 at 3:06 PM Lukasz Cwik wrote: > Sorry for the late reply. > > On Thu, Nov 15, 2018 at 8:53 AM Ismaël Mejía wrote: > >> Some late comments, and my apologies in advance if some questions look silly, >> but the last documents were a lot of info that I have not yet fully >> digested. >>

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-19 Thread Lukasz Cwik
Sorry for the late reply. On Thu, Nov 15, 2018 at 8:53 AM Ismaël Mejía wrote: > Some late comments, and my apologies in advance if some questions look silly, > but the last documents were a lot of info that I have not yet fully > digested. > > I have some questions about the ‘new’ Backlog concept

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-15 Thread Ismaël Mejía
Some late comments, and my apologies in advance if some questions look silly, but the last documents were a lot of info that I have not yet fully digested. I have some questions about the ‘new’ Backlog concept following a quick look at the PR https://github.com/apache/beam/pull/6969/files 1. Is the

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-07 Thread Lukasz Cwik
On Wed, Nov 7, 2018 at 8:33 AM Robert Bradshaw wrote: > I think that not returning the user's specific subclass should be fine. > Does the removal of markDone imply that the consumer always knows a > "final" key to claim on any given restriction? > Yes, each restriction needs to support claiming

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-07 Thread Robert Bradshaw
I think that not returning the user's specific subclass should be fine. Does the removal of markDone imply that the consumer always knows a "final" key to claim on any given restriction? On Wed, Nov 7, 2018 at 1:45 AM Lukasz Cwik wrote: > > I have started to work on how to change the user-facing

[DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-06 Thread Lukasz Cwik
I have started to work on how to change the user-facing API within the Java SDK to support splitting/checkpointing[1], backlog reporting[2] and bundle finalization[3]. I have this PR[4] which contains minimal interface/type definitions to convey how the API surface would change with these 4