I'm happy to help facilitate/coordinate wherever I can. The Storm community has been very busy prepping our 1.0 release (which includes a ton of new features and may benefit Eagle), but the dust should settle soon.
Please let me know if there's anything I can do to help.

-Taylor

> On Jan 19, 2016, at 2:15 AM, Henry Saputra <[email protected]> wrote:
>
> I think we could start with a list of what Apache Eagle needs from
> topology lifecycle management, then cross-post to the dev@ lists of both
> Eagle and Storm to see whether there is enough interest from Storm to
> work on it or accept contributions from Eagle.
>
> - Henry
>
> On Fri, Jan 15, 2016 at 4:06 PM, Zhang, Edward (GDI Hadoop) <
> [email protected]> wrote:
>
>> I had a short discussion with Henry about this. We probably need to
>> discuss a more graceful way to tackle the problem of whether Eagle does
>> this or Storm does this.
>> Today we know that Storm also provides topology view/statistics features
>> but does not have a topology lifecycle management UI.
>> But can we ask the Storm team to build this topology lifecycle UI?
>> Right now, can we make our implementation more pluggable and extensible
>> so that later on, if Storm has this feature, we can use it directly (at
>> least at the API level)?
>>
>> Do you want to communicate with the Storm community about this, or
>> perhaps an Eagle committer can build this feature and contribute it back
>> to Storm?
>>
>> What do you think?
>>
>> Thanks
>> Edward
>>
>>
>>
>>> On 1/11/16, 14:45, "Edward Zhang" <[email protected]> wrote:
>>>
>>> Thanks for the valuable comments, you are right in the high-level
>>> observations :-)
>>>
>>> (I participated in some offline discussion on this proposal.) I think
>>> this proposal is based on the requirement that Eagle not only monitors
>>> security events from Hadoop but also monitors security events from other
>>> data sources, for example Cassandra, and even more requirements come
>>> from Hadoop native metric monitoring.
>>> In the last 2 months, when we wanted to onboard new, diverse data
>>> sources (for example MongoDB metrics, Hadoop native metrics, etc.), we
>>> found it is impossible for users (mostly operations teams) to understand
>>> Storm topologies or write code for very simple metrics ingestion/alert
>>> rules. From a monitoring perspective, users usually want metric/log
>>> onboarding to be as simple as a turn-key operation.
>>>
>>> It is possible for Eagle developers to develop applications for each
>>> data source, but it might be better for Eagle to support logs/metrics
>>> with some general schema. Users would just need some configuration to
>>> get a new data source flowing into Eagle and create policies on the fly
>>> against the streaming data.
>>> Maybe I am wrong, but Eagle looks like an application to the Storm
>>> framework, while Eagle would be a framework to monitoring applications,
>>> e.g. security or other data activity.
>>>
>>> The point that Eagle should separate application and framework is very
>>> correct. In the Eagle source code, the application code and framework
>>> code have been separated from the beginning. For features like policy
>>> restore after machine failure, DSL, aggregation, etc., I think the Eagle
>>> team should look at contributing back to Storm if people agree that is
>>> what a streaming framework should have.
>>>
>>> We also found that Eagle monitoring uses a streaming framework, but the
>>> streaming framework is not customized for monitoring. The gap between
>>> the monitoring platform and the streaming framework has to be filled to
>>> make sure monitoring is reliable. For example, compared to real-time
>>> streaming analytics, monitoring does not want to have any false alerts
>>> or missing alerts, especially for security events, which requires more
>>> processing semantics than popular streaming frameworks provide. We will
>>> actively explore help from streaming projects.
>>>
>>> Thanks
>>>
>>> Edward
>>>
>>>> On Mon, Jan 11, 2016 at 10:55 AM, Julian Hyde <[email protected]> wrote:
>>>>
>>>> Can I make a high-level observation? (And, although I'm a mentor of
>>>> this project, I'm not speaking as a mentor, just someone who has built
>>>> various database and streaming systems over the years.)
>>>>
>>>> I've noticed that Eagle is taking on several problems that,
>>>> architecturally speaking, should be part of the underlying streaming
>>>> system: this topology manager, and also a DSL for declarative streaming
>>>> queries, and making sure that streaming queries continue where they
>>>> left off, even in the presence of failures of individual
>>>> stream-processing nodes.
>>>>
>>>> These are very hard distributed systems problems, and they are
>>>> horizontal problems that have nothing to do with Eagle's problem domain
>>>> (security). It would be analogous to an application that is selling
>>>> concert tickets deciding to develop HBase as part of the application.
>>>>
>>>> If the Eagle community wants to solve these problems, that's awesome,
>>>> and you should go for it. Apache projects are great at pulling in
>>>> people with diverse skills and when they gather momentum they can build
>>>> some amazing technology. But I think Eagle should consider putting more
>>>> architectural separation between your stream management technology and
>>>> your application.
>>>> You could do that by building separate modules (and testing them
>>>> independently). Or you could contribute the functionality you need to
>>>> the underlying system (e.g. Storm/Nimbus). Your project will run much
>>>> more smoothly if you call out the hard problems you are trying to
>>>> solve.
>>>>
>>>> And, as a side benefit, projects that have nothing to do with Hadoop
>>>> security will be able to use (and test, and bug-fix) the technology you
>>>> are developing.
>>>>
>>>> Julian
>>>>
>>>>
>>>>> On Jan 11, 2016, at 8:57 AM, Hao Chen <[email protected]> wrote:
>>>>>
>>>>> Currently Eagle requires the user to manually manage topologies
>>>>> completely independently of Eagle components, which does not give a
>>>>> smooth end-to-end user and management experience across on-boarding
>>>>> data sources, starting topologies, defining policies, and monitoring
>>>>> policy and execution status. So what do you think about managing
>>>>> everything in a single place, dynamically managing the topology
>>>>> lifecycle (starting/stopping/status/monitoring) as well as policy
>>>>> creation/modification/monitoring, all in the Eagle UI only? That way
>>>>> the user doesn't need to touch Storm anymore except to specify where
>>>>> Nimbus is when setting up Eagle.
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Hao
>>
>>
