Also, start without NiFi. Start with some captured messages in a disk file accessible from the Metron server. Then use a Kafka tool such as https://github.com/apache/metron/blob/master/metron-docker/compose/kafkazk/bin/produce-data.sh to pump that data into Kafka, into a topic for Metron, as though it came from one of Metron's sensors.
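If you'd rather script that pumping step yourself than use produce-data.sh, the idea is simply "read the file line by line and publish each line to a sensor topic". A minimal sketch, assuming the third-party kafka-python client and a broker on localhost:6667; the topic name "bro" and the file name are examples, not fixed Metron names:

```python
def read_messages(path):
    """Yield one captured message per non-empty line of a disk file."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                yield line

def pump(path, topic, brokers="localhost:6667"):
    # kafka-python is a third-party client (pip install kafka-python);
    # Metron only cares that the messages land on the sensor's topic.
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=brokers)
    for msg in read_messages(path):
        producer.send(topic, msg.encode("utf-8"))
    producer.flush()

if __name__ == "__main__":
    # e.g. pump("captured_bro.log", "bro")
    pass
```

Once the messages are on the topic, Metron's parser topology for that sensor picks them up exactly as if NiFi had delivered them.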
From: Matt Foley <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, June 8, 2017 at 3:46 PM
To: "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
Cc: Otto Fowler <[email protected]>
Subject: Re: Metron current version and Docker

Otto, thanks for the pointer!

Simone, in that article Casey shows us how to do the manual install on Ubuntu, so we're good. Also, since this procedure does not use Docker in the build, you can go ahead and do the build on the server VM and not have to worry about moving files around from your Mac to the server. Did you read the article, and do you have any problems following it?

For your other questions, please see in-line below.

From: Otto Fowler <[email protected]>
Date: Thursday, June 8, 2017 at 1:25 PM
To: "[email protected]" <[email protected]>, Matt Foley <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Metron current version and Docker

https://community.hortonworks.com/content/kbentry/88843/manually-installing-apache-metron-on-ubuntu-1404.html

On June 8, 2017 at 15:46:09, [email protected] wrote:

Hello Matt, thanks for your email. Yes, I said that I would use an Ubuntu VM to install Metron. I'm old school and not so familiar with Docker.

---[Matt] The article Otto referred us to shows that you can use an Ubuntu VM.

Moreover, it seems that my CPU (MacBook) does not support virtualization.

---[Matt] Well, I guess this is unimportant, since following the article's instructions you can use your VM to do the build. But the fact is, on any Mac less than, say, 5 years old, VirtualBox, Vagrant, and Docker should all run fine; you just need to install those software packages. All modern Intel x86_64 CPUs work with these virtualization tools.

On the other hand, I don't know whether a VM with CentOS 7 on my machine could run Docker.
I mean due to the same problem of virtualization on the host CPU.

---[Matt] In my suggestion in the previous email, I was giving you a way to NOT NEED Docker, if you could use CentOS 7. However, we now have a similar way to avoid Docker using Ubuntu, so go ahead and use Ubuntu since you're more comfortable with it.

Let's assume that a CentOS 7 VM supports Docker. I'm not familiar with that distribution (I have used several but never CentOS). Do you have rpm packages for the tools needed for Metron?

---[Matt] See the previous item.

I would use the 0.4.0 version, but after several days I'm a bit frustrated because I still don't completely understand the toolchain needed to run Metron.

---[Matt] The "Ambari manual install" procedure avoids the use of Docker, Ansible, and Vagrant, and therefore greatly simplifies the toolchain.

I also asked on the dev mailing list. My idea was to install NiFi as a probe to capture network packets and feed them into Metron.

---[Matt] One of the challenges of using open source is that each component tends to be a world unto itself. This is a Metron list, not a NiFi list. But you got some good responses in previous emails, the summary of which is: use the Kafka message bus to transfer captured items from NiFi into Metron. Both NiFi and Metron use Kafka natively.

Then, I still haven't understood where I should deploy the ML model into Metron to run it as a service.

---[Matt] Please refer to the architecture diagram at https://github.com/apache/metron/blob/master/metron-analytics/metron-maas-service/README.md . The block labeled "YARN" shows the model, encapsulated in its "REST Model Service", running separately from the rest of Metron. Metron provides infrastructure to provision the YARN container, deploy the model in it, and monitor the health of the service; this is documented in the maas_service.sh and maas_deploy.sh sections of the same web page, immediately below the diagram. But you must understand that the model isn't running "in" Metron!
It is running alongside Metron. The diagram shows the model service running in the same cluster, but it doesn't have to be; it just has to be accessible over the network. Metron accesses the model service as a separate entity via REST interface calls, whether it is running locally or remotely.

The next section of that web page, titled "Stellar Integration", specifies how Metron makes outcalls to the model's REST interface. First it gets the model service's URL from ZooKeeper configuration (which was set up by the model service deployment tools) using the Stellar call named MAAS_GET_ENDPOINT; then it can apply the model (pass a set of arguments and get back a score or other result) using the Stellar call named MAAS_MODEL_APPLY. Does that make sense?

Have you gone through the Example (Mock DGA Model Service) in the same web page? If not, you need to work through it. It will clarify many things for you about how all the moving parts fit together. Once you get Metron running on your Ubuntu VM, please actually do the installation of this example model service.

And finally, how to modify the ML model to include it in a RESTful app.

---[Matt] This is a key question. The answer, for better or worse, is mostly documented by example. Casey tried to give you help in his previous emails. I'm not an expert in this stuff, but my understanding (mostly from another discussion with Casey a couple of months ago) is that a working Python ML model can easily be turned into a "model service" using the Python Flask micro-service framework. This literally takes < 50 lines of code, mostly boilerplate. As an example, Casey pointed at the patch that added a Flask-based REST interface to the example Mock DGA model (used in the above-referenced documentation): https://gist.github.com/cestella/8dd83031b8898a732b6a5a60fce1b616

Hopefully this helps. If you have other questions, I suggest you use this known-good code as the starting point.
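To make that "< 50 lines, mostly boilerplate" claim concrete: Casey's gist uses Flask, but the same shape fits in the Python standard library. This is a hedged sketch only, not the gist's actual code; the /apply path, the "host" parameter, and the "is_malicious" response key follow the Mock DGA example, while the scoring rule is a made-up stand-in for a real model:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

def score(domain: str) -> float:
    # Made-up stand-in for a trained DGA model: treat vowel-poor
    # domains as machine-generated. Swap in your real model here.
    vowels = sum(c in "aeiou" for c in domain)
    return 1.0 if vowels / max(len(domain), 1) < 0.25 else 0.0

class ModelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # MaaS-style apply call: GET /apply?host=<domain>
        query = parse_qs(urlparse(self.path).query)
        host = query.get("host", [""])[0]
        verdict = "malicious" if score(host) > 0.5 else "legit"
        body = json.dumps({"is_malicious": verdict}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # suppress per-request logging

def serve(port=0):
    """Start the model service on an ephemeral port; return the server."""
    server = HTTPServer(("127.0.0.1", port), ModelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

On the Metron side, the Stellar Integration section shows the call that hits such an endpoint, something like MAAS_MODEL_APPLY(MAAS_GET_ENDPOINT('dga'), {'host' : domain_without_subdomains}); the endpoint URL itself comes from ZooKeeper, so the service can move without touching the enrichment config.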
Thoroughly understand the Mock DGA example, and you'll have a good start on writing your own.

I'm sorry for the long list of questions/issues. I know it is not very elegant on a mailing list, but I'm a step from giving up on Metron, even though I know that would be a mistake.

---[Matt] Unfortunately, you will have to invest in learning Metron as a stand-alone system before you learn to add MaaS to it. I suggest you start with a stripped-down version: set up Metron without MaaS first, establish that you can pump messages through it, then add MaaS using the Mock DGA model as an example.

Good luck,
--Matt

Thanks again if you could give some indications.
Simone

On June 8, 2017 at 20:40, Matt Foley <[email protected]> wrote:

Hi Simone,
If I recall your previous email correctly, you said you want to use an Ubuntu VM. Can you use CentOS 6 or 7 instead? The reason I ask is that for CentOS there is an "Ambari manual install" procedure, which does not require Docker, Vagrant, or Ansible on the server. In this scenario you just install Docker on your development machine (I use a Mac), build the Metron RPMs and Ambari MPack there, scp them to the server, and proceed with the Ambari install. This is in fact my main lab test method.

But with Ubuntu, I'm not aware of a documented procedure for Ambari manual install, only install with Ansible playbooks combined with a Docker-based build. You'd have to figure out for yourself how to generate an 'apt' package, etc., instead. Perhaps other community members with more Ubuntu experience could assist here.

Regarding 0.4.0 vs 0.3.1: at this point, 0.3.1 is several months old, from Feb 23. There have been about 140 commits since then, including both bug fixes and feature development. However, I don't think there have been major changes in the MaaS feature.
Cheers,
--Matt

From: "[email protected]" <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, June 8, 2017 at 3:19 AM
To: "[email protected]" <[email protected]>
Subject: Metron current version and Docker

Dear All,
I'm a newbie with Metron, and at the moment I'm just figuring out how to install it so I can perform some tests. Currently, I would like to start by installing Metron in a single VM. I don't know what the differences are between 0.3.1 and 0.4.0. Unfortunately for me, my CPU does not support virtualization, which means I cannot use Docker. The only workaround I found is to use AWS directly, but for me, having never used Metron, that could be too big a step... So the question is: do I lose much if I start with Metron 0.3.1 in a single VM without Docker?

Best regards,
Simone
