On November 23, 2017 10:04 am, Shuquan Huang wrote:
I think a standard openstack python project is fine. Before the project creation, is there anything I can help with now?
It seems like the first task is to design a stable interface between log-gearman-client and log-classify. To be honest, I haven't fully grasped the current crm script yet, so I'm not sure what needs to be covered.

For example, should this change https://review.openstack.org/522399 be part of the library? If not, then the interface would likely be something like:

  logclassify.logstash.process(logstash_event)

Otherwise, I think it's fair to clearly define the log-gearman-worker requirements and build a filter interface like so:

  filter = logclassify.logstash.Filter(filename, ...)
  for line in log_lines:
      error_pr = filter.process_line(line)

The library interface should at least address:
1/ Usage outside of the log-gearman-client context, such as integration tests and a simple command line
2/ Being able to swap out the crm114 implementation with other models

Then we can define the internal logic of the crm114 model so that a generic model class could be designed within the logclassify library.

With that in place, we'll be able to run integration tests to verify that the library correctly detects anomalies. Speaking of which, I think it's important to curate a dataset of success/failure logs with the expected anomalies to be found. Those will be super useful to prevent regressions when trying out new settings or models. How to store and manage the dataset remains to be defined too.

To give you an idea, fwiw, you can find my original dataset here:

  git clone https://softwarefactory-project.io/r/logreduce-tests

Cheers,
-Tristan
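To make the filter idea above a bit more concrete, here is a minimal sketch of what such an interface could look like. Every name in it (Model, KeywordModel, Filter, find_anomalies) is hypothetical and only for illustration; none of this is the existing log-classify or crm114 code, and the keyword matcher is just a toy stand-in so the example runs without crm114.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Optional, Tuple


class Model(ABC):
    """Hypothetical base class for any classifier (crm114 or otherwise)."""

    @abstractmethod
    def score(self, line: str) -> float:
        """Return a value in [0, 1]; higher means more likely an anomaly."""


class KeywordModel(Model):
    """Toy stand-in for crm114, present only to keep the sketch runnable."""

    def __init__(self, keywords: Iterable[str] = ("error", "traceback", "failed")):
        self.keywords = tuple(k.lower() for k in keywords)

    def score(self, line: str) -> float:
        lowered = line.lower()
        return 1.0 if any(k in lowered for k in self.keywords) else 0.0


class Filter:
    """Hypothetical per-file filter, mirroring Filter(filename, ...) above."""

    def __init__(self, filename: str, model: Optional[Model] = None):
        self.filename = filename
        # The model is injected, so another implementation can be swapped in.
        self.model = model if model is not None else KeywordModel()

    def process_line(self, line: str) -> float:
        return self.model.score(line)


def find_anomalies(
    filename: str,
    log_lines: Iterable[str],
    threshold: float = 0.5,
    model: Optional[Model] = None,
) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs scoring at or above the threshold."""
    flt = Filter(filename, model)
    return [
        (number, line)
        for number, line in enumerate(log_lines, 1)
        if flt.process_line(line) >= threshold
    ]


if __name__ == "__main__":
    sample = ["Starting service", "ERROR: connection refused", "done"]
    for number, line in find_anomalies("console.log", sample):
        print(number, line)
```

Because the model is passed in rather than hard-coded, the same find_anomalies helper could back both the gearman worker and a small command line, and a curated dataset could be replayed through it to compare models.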
On 22/11/2017, 4:30 PM, "Tristan Cacqueray" <[email protected] on behalf of [email protected]> wrote:

On November 21, 2017 5:48 pm, Clark Boylan wrote:
> On Tue, Nov 21, 2017, at 09:17 AM, Tristan Cacqueray wrote:
>>>
> snip
>
>> Actually the rfc is this thread :-)
>>
>> Though I forgot to mention the first steps that could use comments before
>> we move on:
>> * create the openstack-infra/log-classify project,
>> * import the log-classify.crm script,
>> * wrap the script with a more user friendly interface, and
>> * modify the puppet-log_processor to use that new project instead
>
> This sounds like a great place to start. Considering the interest
> already forming around this I would say go ahead and create the project
> and start with the import process so that people have a concrete place
> to start working on this. I am sure it will evolve from there, but
> getting started is often the most difficult step.
>
> Related to the last step we have temporarily disabled CRM classification
> in the log processor pipeline because we treat the whole file path as a
> unique file to classify which ended up filling our workers' disks with
> classification files. I think one of the things we will want to address
> early on is using the basename rather than the whole path to
> significantly reduce the total number of data files on disk. This way we
> can get it running in the log processor pipeline again for proper
> production feedback of changes that are happening.
>
> Once again let me know if I can help with anything (happy to review new
> project creation changes for example).
>

Excellent, project creation is proposed here:
https://review.openstack.org/#/q/topic:log-classify

I'm open to suggestion regarding the name and structure of the project.
Otherwise I'll create a standard openstack python project with:
* logclassify.logstash module to interface with the script using the
  design of the log-gearman-client.py (e.g. a process(event)).
* logclassify.cmd module to use the script standalone.

And then write a first test and implementation of that basename base
datafiles improvement.

If that works ok, then a follow-up change will modify the
log-gearman-client to import logclassify instead of running the script
directly.

> Thank you for getting this started,
> Clark
>

Thanks for the quick feedback!
-Tristan

_______________________________________________
OpenStack-Infra mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
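As a rough illustration of the basename idea in the quoted message above: keying the classification data file on the basename instead of the whole path means every job's console.html shares one data file. The helper name and the example paths below are made up for the sketch and do not come from the log processor.

```python
import posixpath


def datafile_key(log_path: str) -> str:
    """Hypothetical helper: name the data file after the log's basename."""
    return posixpath.basename(log_path.rstrip("/"))


if __name__ == "__main__":
    # Illustrative paths only; real log paths will differ.
    paths = [
        "logs/99/522399/1/check/job-a/abc123/console.html",
        "logs/99/522399/2/check/job-b/def456/console.html",
        "logs/99/522399/2/check/job-b/def456/logs/syslog.txt",
    ]
    # Whole-path keys give one data file per log; basename keys collapse
    # the two console.html entries into a single data file.
    print(len(set(paths)), "keys by path")
    print(len({datafile_key(p) for p in paths}), "keys by basename")
```

The trade-off is that logs sharing a basename but coming from very different jobs would be classified against the same data file, which may or may not be what we want.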
