Hey everyone,

I've hit a few sticking points while trying to create some custom
solutions using the Metron platform and could use some guidance.

1) My custom enrichment config is not writing to Elasticsearch, or may be
configured improperly.

My extractor config:
{
  "config" : {
    "columns" : {
         "ip" : 0,
         "host" : 1
    },
     "indicator_column" : "ip",
     "type" : "hostname",
     "separator" : ","
  },
  "extractor" : "CSV"
}

My enrichment config:
{
  "zkQuorum" : "node1:2181",
  "sensorToFieldList" : {
     "bro" : {
       "type" : "ENRICHMENT",
       "fieldToEnrichmentTypes" : {
         "ip_src_addr" : ["hostname"],
         "ip_dst_addr" : ["hostname"]
         }
      }
   }
}

A sample of the data I'm uploading:
0.0.0.0, "IGMP"
10.113.145.135, "GLAZER"
10.113.145.137, "GLAZER"
10.113.145.138, "GLAZER"

I'm uploading to ZooKeeper using the following command:
/usr/metron/0.2.1BETA/bin/flatfile_loader.sh -n hostname_enrichment_config.json -i hostname_ref.csv -t enrichment -c hosts -e hostname_extractor_config.json

2) We eventually want to parse this data as a live stream, but the parser
errors out when I try sending data in. Here is the parser config:
{
  "parserClassName" : "org.apache.metron.parsers.csv.CSVParser",
  "writerClassName" : "org.apache.metron.writer.hbase.SimpleHbaseEnrichmentWriter",
  "sensorTopic":"hostname",
  "parserConfig":
  {
    "shew.table" : "enrichment",
    "shew.cf" : "hosts",
    "shew.keyColumns" : "ip",
    "shew.enrichmentType" : "hostname",
    "columns" : {
      "ip" : 0,
      "host" : 1
    }
  }
}

3) We will be moving from replay to using kafka-python for sending data
captures. I am able to send raw bytes to a new topic, but when I try using
the JSON serializer via KafkaProducer, my program exits without error and
no data is sent.
Here is the section of the Python code I'm having trouble with:

from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers='50.253.243.17:6667',
    value_serializer=lambda m: json.dumps(m).encode('ascii'),
    api_version=(0, 9))

for _ in range(100):
    producer.send('pcap', {'key': 'value'})
    producer.flush()
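(As a sanity check, the serializer lambda by itself does produce valid bytes when run in isolation, so the issue seems to be somewhere in the producer/broker handshake rather than in the serialization step:)

```python
import json

# Same value_serializer as in the KafkaProducer above, checked standalone
serialize = lambda m: json.dumps(m).encode('ascii')
print(serialize({'key': 'value'}))  # b'{"key": "value"}'
```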

If anyone could point me in the right direction, that would be great! I'm
not sure whether the first two problems are related to indexing, or whether
I need to create a bolt in Storm to pass the data along.

Regards,

Tyler Moore
Software Engineer
Flyball Labs
