Github user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/419
  
    Testing instructions beyond the normal smoke test (i.e., letting data flow through to the indices and checking them).
    
    ## Preliminaries
    
    Since we will use the squid topology to pass data through in a controlled way, we need to install squid and generate one data point:
    * `yum install -y squid`
    * `service squid start`
    * `squidclient http://www.yahoo.com`
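
    For reference, the `squidclient` call appends one entry to `/var/log/squid/access.log`. In squid's native log format it will look roughly like the following (timestamp, latency, IPs, and sizes will differ):
    ```
    1481143984.393    537 127.0.0.1 TCP_MISS/301 448 GET http://www.yahoo.com/ - DIRECT/98.139.183.24 -
    ```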
    
    Also, set an environment variable to indicate `METRON_HOME`:
    * `export METRON_HOME=/usr/metron/0.3.0` 
    
    ## Free Up Space on the virtual machine
    
    First, let's free up some headroom on the virtual machine. If you are running this on a multinode cluster, you would not have to do this.
    * Kill monit via `service monit stop`
    * Kill tcpreplay via `for i in $(ps -ef | grep tcpreplay | awk '{print $2}');do kill -9 $i;done`
    * Kill existing parser topologies via 
       * `storm kill snort`
       * `storm kill bro`
    * Kill flume via `for i in $(ps -ef | grep flume | awk '{print $2}');do kill -9 $i;done`
    * Kill yaf via `for i in $(ps -ef | grep yaf | awk '{print $2}');do kill -9 $i;done`
    * Kill bro via `for i in $(ps -ef | grep bro | awk '{print $2}');do kill -9 $i;done`
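
    If you want to verify that everything is actually down before proceeding, something like the following should suffice (`storm list` is the stock Storm CLI listing command):
    ```
    storm list                            # snort and bro should no longer be listed as ACTIVE
    ps -ef | egrep 'tcpreplay|flume|yaf'  # should show nothing but this grep itself
    ```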
    
    ## Deploy the squid parser
    * Create the squid kafka topic: `/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --create --topic squid --partitions 1 --replication-factor 1`
    * Start via `$METRON_HOME/bin/start_parser_topology.sh -k node1:6667 -z node1:2181 -s squid`
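
    Before sending data, it is worth confirming that the topology and topic actually came up; something like:
    ```
    storm list    # the "squid" topology should show status ACTIVE
    /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --list | grep squid
    ```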
    
    ### Test Case 0: Base Case Test
    * Delete any squid indices that currently exist via `curl -XDELETE "http://localhost:9200/squid*"`
    * Send 1 data point through and ensure that it lands in the index:
      * `cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
      * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `1`
    * Validate that the Storm UI for the indexing topology indicates a warning in the console for both the "hdfsIndexingBolt" and "indexingBolt", to the effect of `java.lang.Exception: WARNING: Default and (likely) unoptimized writer config used for hdfs writer and sensor squid` and `java.lang.Exception: WARNING: Default and (likely) unoptimized writer config used for elasticsearch writer and sensor squid` respectively.
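
    That warning simply means no per-sensor indexing config exists in zookeeper yet, so both writers fall back to their built-in defaults. Judging by the observed behavior (the single message flows straight through), the defaults are equivalent to an explicit config along these lines (a sketch, not pulled from the code):
    ```
    {
      "hdfs" : {
        "index": "squid",
        "batchSize": 1,
        "enabled" : true
      },
      "elasticsearch" : {
        "index": "squid",
        "batchSize": 1,
        "enabled" : true
      }
    }
    ```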
    
    ### Test Case 1: Adjusting batch sizes independently
    * Delete any squid indices that currently exist via `curl -XDELETE "http://localhost:9200/squid*"`
    * Create a file at `$METRON_HOME/config/zookeeper/indexing/squid.json` with the following contents:
    ```
    {
      "hdfs" : {
        "index": "squid",
        "batchSize": 1,
        "enabled" : true
      },
      "elasticsearch" : {
        "index": "squid",
        "batchSize": 5,
        "enabled" : true
      }
    }
    ```
    * Push the configs via `$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper -z node1:2181`
    * Send 4 data points through (fewer than the elasticsearch batch size of 5, so the elasticsearch writer should still be buffering while the HDFS writer flushes every message) and ensure:
      * `cat /var/log/squid/access.log /var/log/squid/access.log /var/log/squid/access.log /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
      * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `0`
      * `hadoop fs -cat /apps/metron/indexing/indexed/squid/enrichment-null* | wc -l` should yield `4`
    * Send a final data point through (bringing the elasticsearch buffer to its batch size of 5 and triggering a flush) and ensure:
      * `cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
      * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `5`
      * `hadoop fs -cat /apps/metron/indexing/indexed/squid/enrichment-null* | wc -l` should yield `5`
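
    To summarize the batching arithmetic, and a way to poke at the HDFS side directly if the counts disagree:
    ```
    # hdfs batchSize=1, elasticsearch batchSize=5:
    #   after messages 1-4: ES index holds 0 (still buffering), HDFS holds 1,2,3,4 (flushed per message)
    #   after message 5:    ES buffer reaches 5 and flushes, so both ES and HDFS hold 5
    hadoop fs -ls /apps/metron/indexing/indexed/squid/   # inspect the HDFS output files directly
    ```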
     
    ### Test Case 2: Turn off HDFS writer
    * Delete any squid indices that currently exist via `curl -XDELETE "http://localhost:9200/squid*"`
    * Edit the file at `$METRON_HOME/config/zookeeper/indexing/squid.json` to have the following contents:
    ```
    {
      "hdfs" : {
        "index": "squid",
        "batchSize": 1,
        "enabled" : false 
      },
      "elasticsearch" : {
        "index": "squid",
        "batchSize": 1,
        "enabled" : true
      }
    }
    ```
    * Push the updated configs via `$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper -z node1:2181`
    * Send 1 data point through and ensure:
      * `cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
      * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `1`
      * `hadoop fs -cat /apps/metron/indexing/indexed/squid/enrichment-null* | wc -l` should yield `0`
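
    To confirm the toggle took effect in zookeeper (and not just in the local file), the same dump command used in Test Case 3 below works here too:
    ```
    $METRON_HOME/bin/zk_load_configs.sh -m DUMP -z node1:2181
    # the squid indexing entry should show "enabled" : false under "hdfs"
    ```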
    
    ### Test Case 3: Stellar Management Functions
    * Launch the Stellar shell (e.g. via `$METRON_HOME/bin/stellar -z node1:2181`) and execute the following:
    ```
    Stellar, Go!
    Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
    {es.clustername=metron, es.ip=node1, es.port=9300, es.date.format=yyyy.MM.dd.HH}
    [Stellar]>>> # Grab the indexing config
    [Stellar]>>> squid_config := CONFIG_GET('INDEXING', 'squid', true)
    [Stellar]>>>
    [Stellar]>>> # Update the index and batch size
    [Stellar]>>> squid_config := INDEXING_SET_BATCH(INDEXING_SET_INDEX(squid_config, 'hdfs', 'squid'), 'hdfs', 2)
    [Stellar]>>> # Push the config to zookeeper
    [Stellar]>>> CONFIG_PUT('INDEXING', squid_config, 'squid')
    [Stellar]>>> # Grab the updated config from zookeeper
    [Stellar]>>> CONFIG_GET('INDEXING', 'squid')
    {
      "hdfs" : {
        "index" : "squid",
        "batchSize" : 2,
        "enabled" : false
      },
      "elasticsearch" : {
        "index" : "squid",
        "batchSize" : 1,
        "enabled" : true
      }
    }
    ```
    * Confirm that the dump command `$METRON_HOME/bin/zk_load_configs.sh -m DUMP -z node1:2181` shows the updated squid indexing config with the hdfs batch size of `2`
    * Now pull the configs locally via `$METRON_HOME/bin/zk_load_configs.sh -m PULL -z node1:2181 -o $METRON_HOME/config/zookeeper -f`
    * Check that the "hdfs" config at `$METRON_HOME/config/zookeeper/indexing/squid.json` is indeed:
    ```
    {
      "index" : "squid",
      "batchSize" : 2,
      "enabled" : false
    }
    ```
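
    Finally, an optional teardown sketch to put the VM back roughly where it started (assuming monit manages the stock sensors on this image):
    ```
    storm kill squid       # remove the test parser topology
    service monit start    # let monit bring the stock sensors back up
    ```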


