Simon,

There are several reasons to decouple model execution from Storm:
- Reliability: it's much easier to handle a failed service than a failed bolt. You can also troubleshoot without having to bring down the topology.
- Complexity: you decouple the model logic from the Storm logic and can manage it independently of Storm.
- Portability: you can swap out the model internals (switch from Spark to Flink, etc.), and as long as you maintain the interface you are good to go.
- Consistency: since we want to expose our models the same way we expose threat intel, it makes sense to expose them as a service.

In our vision for Metron we want to make it easy to adopt and share models. I think well-defined interfaces and programmatic ways of handling deployment, lifecycle management, and scoring via well-defined REST interfaces will make this task easier. With respect to PMML, I personally have not had much luck with it in production. I would prefer models as POJOs.

Thanks,
James

04.07.2016, 16:07, "Simon Ball" <[email protected]>:
> Since the models' parameters and execution algorithm are likely to be small,
> why not have the model store push the model changes and scoring directly to
> the bolts and execute within Storm? This negates the overhead of a REST call
> to the model server, and the need for discovery of the model server in
> ZooKeeper.
>
> Something like the way Ranger policies are updated/cached in plugins would
> seem to make sense, so that we're distributing the model execution directly
> into the enrichment pipeline rather than collecting it in a central service.
>
> This would work with simple models on single events, but may struggle with
> correlation-based models. However, those could be handled in Storm by pushing
> into a windowing Trident topology or something of the sort, or even with a
> parallel Spark Streaming job using the same method of distributing models.
>
> The real challenge here would be stateful online models, which seem like a
> minority case that could be handled by a shared state store such as HBase.
>
> You still keep the ability to run different languages and platforms, but
> wrap managing the parallelism in Storm bolts rather than YARN containers.
>
> We could also consider basing the model protocol on a common model language
> like PMML, though that is likely to be highly limiting.
>
> Simon
>
>> On 4 Jul 2016, at 22:35, Casey Stella <[email protected]> wrote:
>>
>> This is great! I'll capture any requirements that anyone wants to
>> contribute and ensure that the proposed architecture accommodates them. I
>> think we should focus on a minimal set of requirements and an architecture
>> that does not preclude a larger set. I have found that the best driver of
>> requirements is installed users. :)
>>
>> For instance, I think a lot of questions, such as how often to update a
>> model, should be addressed in the architecture by the ability to update a
>> model manually; as long as we have the ability to update, people can
>> choose when and where to do it (i.e. time-based or some other trigger).
>> That being said, we don't want to cause too much effort for the user if we
>> can avoid it with features.
>>
>> In terms of the questions laid out, here are the constraints from the
>> proposed architecture as I see them. It'd be great to get a sense of
>> whether these constraints are too onerous or where they're not opinionated
>> enough:
>>
>> - Model versioning and retention
>>   - We do have the ability to update models, but the training and the
>>     decision of when to update a model are left up to the user. We may
>>     want to think deeply about when and where automated model updates
>>     can fit.
>>   - Also, retention is currently manual. It might be an easy win to set
>>     up policies around when to sunset models (after newer versions are
>>     added, for instance).
>> - Model access controls management
>>   - The architecture proposes no constraints around this.
>>     As it stands now, models are held in HDFS, so they would inherit the
>>     same security capabilities from that (user/group permissions plus
>>     Ranger, etc.).
>> - Requirements around concept drift
>>   - I'd love to hear user requirements around how we could automatically
>>     address concept drift. The architecture as proposed lets the user
>>     decide when to update models.
>> - Requirements around model output
>>   - The architecture as it stands just mandates a JSON map as input and a
>>     JSON map as output, so it's up to the model what it wants to pass back.
>>   - It's also up to the model to document its own output.
>> - Any model audit and logging requirements
>>   - The architecture proposes no constraints around this. I'd love to see
>>     community guidance here. As it stands, we just log using the same
>>     mechanism as any YARN application.
>> - What model metrics need to be exposed
>>   - The architecture proposes no constraints around this. I'd love to see
>>     community guidance here.
>> - Requirements around failure modes
>>   - We briefly touch on this in the document, but it is probably not
>>     complete. Service endpoint failure will result in blacklisting from
>>     the Storm bolt's perspective, and node failure should result in a new
>>     container being started by the YARN application master. Beyond that,
>>     the architecture isn't explicit.
>>
>>> On Mon, Jul 4, 2016 at 1:49 PM, James Sirota <[email protected]> wrote:
>>>
>>> I left a comment on the JIRA. I think your design is promising. One
>>> other thing I would suggest is for us to crowd-source requirements around
>>> model management.
>>> Specifically:
>>>
>>> Model versioning and retention
>>> Model access controls management
>>> Requirements around concept drift
>>> Requirements around model output
>>> Any model audit and logging requirements
>>> What model metrics need to be exposed
>>> Requirements around failure modes
>>>
>>> 03.07.2016, 14:00, "Casey Stella" <[email protected]>:
>>>> Hi all,
>>>>
>>>> I think we are at the point where we should try to tackle model as a
>>>> service for Metron. As such, I created a JIRA and proposed an
>>>> architecture for accomplishing this within Metron.
>>>>
>>>> My inclination is to be data science language/library agnostic and to
>>>> provide a general-purpose REST infrastructure for managing and serving
>>>> models trained on historical data captured from Metron. The assumption
>>>> is that we are within the Hadoop ecosystem, so:
>>>>
>>>> - Models stored on HDFS
>>>> - REST model services resource-managed via YARN
>>>> - REST model services discovered via ZooKeeper
>>>>
>>>> I would really appreciate community comment on the JIRA (
>>>> https://issues.apache.org/jira/browse/METRON-265). The proposed
>>>> architecture is attached as a document to that JIRA.
>>>>
>>>> I look forward to feedback!
>>>>
>>>> Best,
>>>>
>>>> Casey
>>>
>>> -------------------
>>> Thank you,
>>>
>>> James Sirota
>>> PPMC - Apache Metron (Incubating)
>>> jsirota AT apache DOT org

-------------------
Thank you,

James Sirota
PPMC - Apache Metron (Incubating)
jsirota AT apache DOT org
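For concreteness, the "JSON map in, JSON map out" contract Casey describes, combined with James's preference for models as POJOs, could look roughly like the sketch below. All names here (ScoringModel, ThresholdModel, the is_alert field) are hypothetical illustrations for the discussion, not actual Metron APIs:

```java
import java.util.HashMap;
import java.util.Map;

// A minimal sketch of the proposed scoring contract: the caller hands the
// model a JSON-style map and gets a JSON-style map back, so any model that
// implements this POJO interface can be served behind the REST endpoint.
interface ScoringModel {
    Map<String, Object> score(Map<String, Object> message);
}

// Example model: flags any message whose numeric "length" field exceeds a
// fixed threshold. The model alone decides what keys it emits, matching the
// "output is up to the model" constraint in the thread.
class ThresholdModel implements ScoringModel {
    private final double threshold;

    ThresholdModel(double threshold) {
        this.threshold = threshold;
    }

    @Override
    public Map<String, Object> score(Map<String, Object> message) {
        double value = ((Number) message.getOrDefault("length", 0)).doubleValue();
        Map<String, Object> result = new HashMap<>();
        result.put("is_alert", value > threshold);
        return result;
    }
}

public class ScoringSketch {
    public static void main(String[] args) {
        ScoringModel model = new ThresholdModel(1024.0);
        Map<String, Object> message = new HashMap<>();
        message.put("length", 4096);
        System.out.println(model.score(message)); // {is_alert=true}
    }
}
```

Because the interface only passes maps, swapping the model internals (Spark, Flink, a plain POJO) leaves callers untouched, which is exactly the portability argument made at the top of the thread.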
