Konstantin,

Not a problem. Thanks for pointing me in the right direction.
David

On Tue, Mar 22, 2016 at 5:17 PM, Konstantin Knauf <konstantin.kn...@tngtech.com> wrote:

> Hi David,
>
> Interesting use case. I think this can be nicely done with a comap. Let
> me know if you run into problems; unfortunately, I am not aware of any
> open source examples.
>
> Cheers,
>
> Konstantin
>
> On 22.03.2016 21:07, David Brelloch wrote:
> > Konstantin,
> >
> > For now the jobs will largely just involve incrementing or decrementing
> > based on the JSON message coming in. We will probably look at adding
> > windowing later, but for now that isn't a huge priority.
> >
> > As an example of what we are looking to do, let's say the following
> > 3 messages were read from the Kafka topic:
> >
> > {"customerid": 1, "event": "addAdmin"}
> > {"customerid": 1, "event": "addAdmin"}
> > {"customerid": 1, "event": "deleteAdmin"}
> >
> > If the customer with id 1 had said they care about that type of
> > message, we would expect to be tracking the number of admins and to
> > notify them that they currently have 2. The events are obviously much
> > more complicated than that and they are not uniform, but that is the
> > general overview.
> >
> > I will take a look at using the comap operator. Do you know of any
> > examples where it is doing something similar? Quickly looking, I am not
> > seeing it used anywhere outside of tests, where it is largely just
> > unifying the data coming in.
> >
> > I think accumulators will at least be a reasonable starting place for
> > us, so thank you for pointing me in that direction.
> >
> > Thanks for your help!
> >
> > David
> >
> > On Tue, Mar 22, 2016 at 3:27 PM, Konstantin Knauf
> > <konstantin.kn...@tngtech.com> wrote:
> >
> > Hi David,
> >
> > I have no idea how many parallel jobs are possible in Flink, but
> > generally speaking I do not think this approach will scale, because you
> > will always have only one job manager for coordination. But there is
> > definitely someone on the list who can tell you more about this.
> >
> > Regarding your 2nd question: could you go into some more detail on what
> > the jobs will do? Without knowing any details, I think a control Kafka
> > topic which contains the "job creation/cancellation requests" of the
> > users, in combination with a comap operator, is the better solution here.
> > You could keep the currently active "jobs" as state in the comap and
> > emit one record of the original stream per active user-job, together
> > with some indicator on how to process it based on the request. What are
> > your concerns with respect to insight into the process? I think with
> > some nice accumulators you could get a good idea of what is going on; on
> > the other hand, if I think about monitoring 1000s of jobs, I am actually
> > not so sure ;)
> >
> > Cheers,
> >
> > Konstantin
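[A rough, untested sketch of the control-topic-plus-comap idea described above, using Flink's DataStream API (connect plus a CoFlatMapFunction). All class names, fields, and wiring here (CustomerEvent, JobRequest, TaggedEvent, DynamicAlertingSketch.wire) are invented for illustration and are not from this thread.]

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class DynamicAlertingSketch {

    // Hypothetical POJOs: one event from the main Kafka topic and one
    // "job" create/cancel request from the control topic.
    public static class CustomerEvent {
        public long customerId;
        public String event;
    }

    public static class JobRequest {
        public long customerId;
        public String jobId;
        public boolean create;       // true = create the job, false = cancel it
        public String trackedEvent;  // event type this job cares about, e.g. "addAdmin"
    }

    public static class TaggedEvent {
        public String jobId;
        public CustomerEvent event;
    }

    public static DataStream<TaggedEvent> wire(DataStream<CustomerEvent> events,
                                               DataStream<JobRequest> control) {
        return events
            .keyBy(e -> e.customerId)
            .connect(control.keyBy(r -> r.customerId))
            .flatMap(new CoFlatMapFunction<CustomerEvent, JobRequest, TaggedEvent>() {

                // Active jobs seen by this parallel instance, keyed by
                // "customerId/jobId". A plain map keeps the sketch short.
                private final Map<String, JobRequest> activeJobs = new HashMap<>();

                @Override
                public void flatMap1(CustomerEvent event, Collector<TaggedEvent> out) {
                    // Emit the event once per matching active job, tagged with
                    // the job it should be processed for.
                    for (JobRequest job : activeJobs.values()) {
                        if (job.customerId == event.customerId
                                && job.trackedEvent.equals(event.event)) {
                            TaggedEvent tagged = new TaggedEvent();
                            tagged.jobId = job.jobId;
                            tagged.event = event;
                            out.collect(tagged);
                        }
                    }
                }

                @Override
                public void flatMap2(JobRequest request, Collector<TaggedEvent> out) {
                    // Control stream: register or cancel a user job.
                    String key = request.customerId + "/" + request.jobId;
                    if (request.create) {
                        activeJobs.put(key, request);
                    } else {
                        activeJobs.remove(key);
                    }
                }
            });
    }
}

[In the sketch the active jobs live in a plain HashMap per parallel instance; a real job would keep them in Flink's keyed state so they survive restarts, and would deserialize the JSON from Kafka into CustomerEvent objects before this operator.]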
> > On 22.03.2016 19:16, David Brelloch wrote:
> > > Hi all,
> > >
> > > We are currently evaluating Flink for processing Kafka messages and are
> > > running into some issues. The basic problem we are trying to solve is
> > > allowing our end users to dynamically create jobs to alert based on the
> > > messages coming from Kafka. At launch we figure we need to support at
> > > least 15,000 jobs (3000 customers with 5 jobs each). I have the example
> > > Kafka job running and it is working great. The questions I have are:
> > >
> > > 1. On my local machine (admittedly much less powerful than we would be
> > >    using in production) things fall apart once I get to around 75 jobs.
> > >    Can Flink handle a situation like this, where we are looking at
> > >    thousands of jobs?
> > > 2. Is this approach even the right way to go? Is there a different
> > >    approach that would make more sense? Everything will be listening to
> > >    the same Kafka topic, so the other thought we had was to have 1 job
> > >    that processed everything and was configured by a separate control
> > >    Kafka topic. The concern we had there was that we would almost
> > >    completely lose insight into what was going on if there was a
> > >    slowdown.
> > > 3. The current approach we are using for creating dynamic jobs is
> > >    building a common jar and then starting it with the configuration
> > >    data for the individual job. Does this sound reasonable?
> > >
> > > If any of these questions are answered elsewhere, I apologize. I
> > > couldn't find any of this being discussed elsewhere.
> > >
> > > Thanks for your help.
> > >
> > > David
> >
> > --
> > Konstantin Knauf * konstantin.kn...@tngtech.com * +49-174-3413182
> > TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> > Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
> > Sitz: Unterföhring * Amtsgericht München * HRB 135082
>
> --
> Konstantin Knauf * konstantin.kn...@tngtech.com * +49-174-3413182
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
> Sitz: Unterföhring * Amtsgericht München * HRB 135082
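[A similarly rough, untested sketch of the counting in David's addAdmin/deleteAdmin example (two addAdmin and one deleteAdmin for customer 1 leaving a count of 2), with a Flink accumulator registered as one simple way to keep some visibility into a single large job. All names here are again invented for illustration.]

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.accumulators.LongCounter;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.util.Collector;

public class AdminCountSketch {

    // Hypothetical event POJO matching the JSON messages in the example.
    public static class CustomerEvent {
        public long customerId;
        public String event;
    }

    // Running admin count for one customer, e.g. (1, 2) after the three
    // example messages.
    public static class AdminCount {
        public long customerId;
        public long count;
        public AdminCount() {}
        public AdminCount(long customerId, long count) {
            this.customerId = customerId;
            this.count = count;
        }
    }

    public static DataStream<AdminCount> count(DataStream<CustomerEvent> events) {
        return events
            .keyBy(e -> e.customerId)
            .flatMap(new RichFlatMapFunction<CustomerEvent, AdminCount>() {

                // Count per customer; again, real code would use keyed state.
                private final Map<Long, Long> counts = new HashMap<>();

                // Accumulator so the number of processed events shows up in
                // the job's accumulator view.
                private final LongCounter processed = new LongCounter();

                @Override
                public void open(Configuration parameters) {
                    getRuntimeContext().addAccumulator("events-processed", processed);
                }

                @Override
                public void flatMap(CustomerEvent event, Collector<AdminCount> out) {
                    processed.add(1L);
                    long current = counts.getOrDefault(event.customerId, 0L);
                    if ("addAdmin".equals(event.event)) {
                        current++;
                    } else if ("deleteAdmin".equals(event.event)) {
                        current--;
                    } else {
                        return; // not an event this particular count cares about
                    }
                    counts.put(event.customerId, current);
                    out.collect(new AdminCount(event.customerId, current));
                }
            });
    }
}

[Whether accumulators alone give enough insight once thousands of user-level "jobs" share one Flink job is exactly the open question raised in the thread.]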