Thanks Jon, I will look into the links you provided first thing tomorrow morning.
I installed NiFi and have it running on my node1. I suspected it would be part of the overall solution. I suspect that as sources increase I would consider installing it on the other VMs as well?

F

-------- Original message --------
From: "[email protected]" <[email protected]>
Date: 2017-09-06 11:05 PM (GMT-05:00)
To: [email protected]
Subject: Re: Clearing of data to start over

Can you clarify what issues you're having with bro? I would be happy to help get that working.

Re: Kafka brokers, you can easily add or remove these after the initial install in Ambari. See this for more details - https://community.hortonworks.com/questions/617/is-it-possible-to-add-another-kafka-broker-to-my-c.html

Adding another VM to a cluster is pretty straightforward - more details: https://docs.hortonworks.com/HDPDocuments/Ambari-1.6.1.0/bk_Monitoring_Hadoop_Book/content/monitor-chap2-4b_2x.html

As long as you can get the data onto the right Kafka topic, you should be good to go. I would suggest looking into NiFi, Logstash, rsyslog, etc.

Jon

On Wed, Sep 6, 2017 at 11:01 PM Frank Horsfall <[email protected]> wrote:

I'm on a roll with questions. I'm curious to see if I can relieve processing pressure by adding a new VM. Would you know how I would go about it?

Also, I would like to pull data from sources instead of having the sources push data to my site. Have you come across this scenario?

F

-------- Original message --------
From: Frank Horsfall <[email protected]>
Date: 2017-09-06 10:51 PM (GMT-05:00)
To: [email protected]
Subject: Re: Clearing of data to start over

Also, Laurens, you recommended making 3 Kafka brokers, but the install wizard would not let me. As a result, my node1 is currently the only broker. Would this cause a bottleneck? If so, is there a method to install and configure the 2 additional brokers post initial install?

kindest regards,
Frank

-------- Original message --------
From: Frank Horsfall <[email protected]>
Date: 2017-09-06 10:38 PM (GMT-05:00)
To: [email protected]
Subject: Re: Clearing of data to start over

Thanks Laurens and Nick. I want to let the queues run overnight to give us some possible insights into heap sizes etc. I currently have 3 VMs configured, each with 8 cores, 500 gigs of disk capacity, and 30 gigs of memory. Elasticsearch has been configured with a 10 gig Xmx, and I've set the Storm worker childopts to 7 gigs for now so it takes a while to max out and generate heap errors.

I deleted approx. 6 million events and shut off the data-generating apps. The idea is to see how much will be processed overnight.

One thing that has me puzzled is why my bro app isn't emitting events. I double-checked my config against what's recommended, but nothing is coming through. A mystery. lol

Also, I kept some notes during the whole process and want to share them if you are interested. Let me know.

Frank
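On the bro question, a quick first check is to watch whether anything is landing on the bro Kafka topic at all. This is only a sketch, assuming the HDP default Kafka install path, ZooKeeper reachable at node1:2181, and Metron's default topic name "bro":

  # assumptions: HDP default kafka-broker path, ZooKeeper on node1:2181, topic "bro"
  /usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --zookeeper node1:2181 --topic bro --from-beginning

If nothing shows up there, the gap is between the bro plugin and Kafka rather than further down in Storm or Elasticsearch.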
-------- Original message --------
From: Laurens Vets <[email protected]>
Date: 2017-09-06 6:17 PM (GMT-05:00)
To: [email protected]
Cc: Frank Horsfall <[email protected]>
Subject: Re: Clearing of data to start over

Hi Frank,

If all your queues (Kafka/Storm) are empty, the following should work:

- Deleting your Elasticsearch indices:
  curl -X DELETE 'http://localhost:9200/snort_index_*'
  curl -X DELETE 'http://localhost:9200/yaf_index_*'
  etc...

- Deleting your Hadoop data:
  Become the hdfs user:
  sudo su - hdfs
  Show what's been indexed in Hadoop:
  hdfs dfs -ls /apps/metron/indexing/indexed/
  The output should probably show the following:
  /apps/metron/indexing/indexed/error
  /apps/metron/indexing/indexed/snort
  /apps/metron/indexing/indexed/yaf
  ...
  You can remove these with:
  hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/error/
  hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/snort/
  Or the individual files with:
  hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/error/FILENAME

On 2017-09-06 13:59, Frank Horsfall wrote:

Hello all,

I have installed a 3-node system using the bare-metal CentOS 7 guideline:
https://cwiki.apache.org/confluence/display/METRON/Metron+0.4.0+with+HDP+2.5+bare-metal+install+on+Centos+7+with+MariaDB+for+Metron+REST

It has taken me a while to get all components working properly, and I left the yaf, bro, and snort apps running, so quite a lot of data has been generated. Currently, I have almost 18 million events identified in Kibana: 16+ million are yaf based, 2+ million are snort, and 190 events are my new squid telemetry. :) It looks like it still has a while to go before it catches up to the current day. I recently shut down the apps.

My questions are:

1. Is there a way to wipe all my data and indices clean so that I may now begin with a fresh dataset?
2. Is there a way to configure yaf so that its data is meaningful? It is currently creating what looks to be test data.
3. I have commented out the test snort rule, but it is still generating the odd record, which once again looks like test data. Can this be stopped as well?

Kindest regards,
Frank

--
Jon
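To confirm the cleanup took effect, it can help to list what is left on both sides before restarting the sensors. A minimal check, assuming Elasticsearch is on localhost:9200 as in the curl calls above:

  # list remaining Elasticsearch indices with their document counts and sizes
  curl -X GET 'http://localhost:9200/_cat/indices?v'

  # run as the hdfs user; an empty listing means the indexed HDFS data is gone as well
  hdfs dfs -ls /apps/metron/indexing/indexed/

Once both come back clean, re-enabling yaf, bro, and snort should start a fresh dataset.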
