Hi Ian,

On 30/03/17 17:26, "Ian Boston" <[email protected] on behalf of
[email protected]> wrote:
>Hi Stefan,
>
>On 30 March 2017 at 15:55, Stefan Egli <[email protected]> wrote:
>
>> Hi Ian,
>>
>> On 30/03/17 16:37, "Ian Boston" <[email protected] on behalf of
>> [email protected]> wrote:
>> >Why Kafka ?
>> >That doesn't make sense given all the prior discussions about not
>> >binding to
>> >proprietary third party stacks, unless this is being implemented
>> >because
>> >you have been told to support a specific use case by a manager.
>>
>> It's requirements driven, yes.
>>
>
>For the sake of the community they should be shared if they are to be
>implemented under the Sling project.
>If the requirements can't be shared, then perhaps it's better off
>implemented somewhere else.
>We don't want to get accused of not being a good Apache project ?

The requirement itself is to run jobs via Kafka explicitly - ie the target
deployment 'Kafka' is given. How that should be done is open. The fact that
this should be done in an event/api-compatible way is not a requirement,
but rather a technical decision point and something that IIUC would be
ideal (as it would mean the least migration effort). But maybe there will
be a technical reason to revisit this.

It can of course be implemented elsewhere, but the assumption/hope was
that it might be useful for others too.

>> That doesn't mean it can't be implemented in an extensible way. As
>> mentioned in the Jira I could see that this could use the MoM API and
>> hook Kafka in that way - I have yet to test how that would work.
>>
>> The main difference between this approach and mom/jobs/core is that
>> this aims to be compatible with event/api.
>
>That is what mom/jobs/core was trying to do.
>Sling Events API is semantically incompatible with distributed messaging,
>which is why it failed to be 100% compatible. ymmv
>
>If you must have 100% compatibility, then don't use the MoM API. I don't
>know how you will really do "distributed" and "messaging" given that
>parts of the Sling Events API need to query a centralised store.
>IIUC the recommended
>way of doing this in Kafka is to stream the events into something that
>can be queried, and query that (eg MongoDB, Cassandra etc). The MoM API
>supports that mechanism already, with a listener on the management
>topics. Apparently you can query Kafka directly, but it's messy and
>requires that your data model become specific to Kafka. Not recommended.

Maybe I don't see the full extent of this yet. IIUC, then, the first main
issue is JobManager.findJobs. And wrt that I was assuming to implement it
exactly as you wrote: a service that listens to the job traffic, extracts
status information and provides it in a way that can be queried (storing
it somewhere central). I agree that querying Kafka directly is not ideal.
Btw, even storing it somewhere has its issues, as the status information
will always potentially be slightly delayed.

When you say it's 'semantically incompatible' - is there something else
you see other than findJobs?

Thanks,
Cheers,
Stefan

>
>Best Regards
>Ian
>
>> Cheers,
>> Stefan
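The queryable-view idea Stefan describes (a service that listens to the job traffic and materialises status information for findJobs-style queries, instead of querying Kafka directly) could be sketched roughly as below. This is a minimal, assumption-laden sketch: the names (JobStatusView, JobEvent, onEvent, findJobs) are illustrative and are not the actual Sling Jobs or Kafka APIs, and a real version would feed onEvent from a Kafka consumer poll loop rather than direct calls.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Sketch only: maintains a queryable view of job states, fed by a stream
// of status events. Class/method names are hypothetical, not Sling API.
class JobStatusView {
    enum State { QUEUED, ACTIVE, SUCCEEDED, FAILED }

    // One status event as it would arrive from the job topic.
    static final class JobEvent {
        final String jobId;
        final String topic;
        final State state;
        final long offset; // stream position; higher offset = newer event
        JobEvent(String jobId, String topic, State state, long offset) {
            this.jobId = jobId; this.topic = topic;
            this.state = state; this.offset = offset;
        }
    }

    private static final class Entry {
        final String topic; final State state; final long offset;
        Entry(String topic, State state, long offset) {
            this.topic = topic; this.state = state; this.offset = offset;
        }
    }

    private final Map<String, Entry> byJobId = new ConcurrentHashMap<>();

    // Called for every event consumed from the job topic(s).
    // Keeps only the newest state per job id, so replaying the stream
    // (e.g. after a restart) is idempotent.
    void onEvent(JobEvent e) {
        byJobId.merge(e.jobId, new Entry(e.topic, e.state, e.offset),
                (old, neu) -> neu.offset >= old.offset ? neu : old);
    }

    // findJobs-style query against the materialised view, never against
    // the message broker itself. Note: the view can lag the live state.
    List<String> findJobs(String topic, State state) {
        return byJobId.entrySet().stream()
                .filter(en -> en.getValue().topic.equals(topic)
                        && en.getValue().state == state)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        JobStatusView view = new JobStatusView();
        view.onEvent(new JobEvent("job-1", "sling/media", State.QUEUED, 0));
        view.onEvent(new JobEvent("job-2", "sling/media", State.QUEUED, 1));
        view.onEvent(new JobEvent("job-1", "sling/media", State.ACTIVE, 2));
        // a replayed stale event must not overwrite the newer state
        view.onEvent(new JobEvent("job-1", "sling/media", State.QUEUED, 0));
        System.out.println(view.findJobs("sling/media", State.QUEUED));
        System.out.println(view.findJobs("sling/media", State.ACTIVE));
    }
}
```

The merge on the stream offset is what makes replay safe, but, as noted in the mail, the view is inherently a little behind the live job state; a central store (MongoDB, Cassandra etc., as Ian suggests) would replace the in-memory map for anything beyond a single consumer.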
