[jira] [Commented] (METRON-641) Fix Kibana install file

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839394#comment-15839394
 ] 

ASF GitHub Bot commented on METRON-641:
---

Github user mattf-horton commented on the issue:

https://github.com/apache/incubator-metron/pull/405
  
Let's do this, but we also need to fix the other three places in this file 
where '{}' is used with the format method.
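
For context, the error quoted in the issue is the message Python 2.6 raises for auto-numbered `{}` fields in `str.format`; explicit indices such as `{0}` are accepted by 2.6 and by every later version. A minimal sketch of the difference follows. The `kibana_home` value is a made-up placeholder and is not taken from `kibana_master.py`:

```python
# Hypothetical value for illustration; the real script builds its own paths.
kibana_home = "/usr/share"

# Explicit positional index: accepted by Python 2.6 and every later version.
explicit = "{0}/kibana".format(kibana_home)

# Auto-numbered field: fine on Python 2.7+/3.1+, but Python 2.6 raises
# "ValueError: zero length field name in format".
try:
    implicit = "{}/kibana".format(kibana_home)
except ValueError:
    implicit = None  # only reached on Python 2.6

print(explicit)
```

Using the explicit index everywhere keeps the script independent of which Python version the Ambari agent runs.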


> Fix Kibana install file
> ---
>
>                 Key: METRON-641
>                 URL: https://issues.apache.org/jira/browse/METRON-641
>             Project: Metron
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Dima Kovalyov
>            Priority: Minor
>
> Kibana installed as part of the Metron mpack within Ambari will fail during 
> startup with the following error:
> {code}
> ValueError: zero length field name in format
> {code}
> We can fix it with:
> {code}
> sed -i 's@{}/kibana@{0}/kibana@g' /incubator-metron/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/KIBANA/4.5.1/package/scripts/kibana_master.py
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (METRON-354) Java OOMs are seen with single node quick-dev deployment

2017-01-25 Thread Jon Zeolla (JIRA)

 [ 
https://issues.apache.org/jira/browse/METRON-354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Zeolla reassigned METRON-354:
-

Assignee: Jon Zeolla

> Java OOMs are seen with single node quick-dev deployment
> 
>
>                 Key: METRON-354
>                 URL: https://issues.apache.org/jira/browse/METRON-354
>             Project: Metron
>          Issue Type: Bug
>    Affects Versions: 0.2.1BETA
>         Environment: Quick dev setup installation using vagrant from HEAD
>            Reporter: Anand Subramanian
>            Assignee: Jon Zeolla
>            Priority: Critical
>              Labels: metronqe
>         Attachments: messages, messages-20160725, output-platform-info.rtf
>
>
> This is a single node quick-dev deployment from HEAD using vagrant on a 
> Macbook Pro. The platform-info output is attached.
> After startup, I noticed multiple java process OOMs due to lack of swap 
> space. Here’s one pasted below for reference and the full /var/log/messages 
> files are attached (see messages-20160725 for the OOM errors). 
> I did not observe the OOM with earlier single node vagrant deployments. 
> Note that I have not added additional topologies. 
> Is this possibly due to the new changes that were introduced related to 
> separating out indexing? Please let us know. Also, I was considering 
> increasing the # of CPUs and RAM on the vagrant VM and retrying the single 
> node setup. Please advise if it is good to try that route. 
> {code}
> Jul 25 10:41:28 node1 kernel: java invoked oom-killer: gfp_mask=0x280da, 
> order=0, oom_adj=0, oom_score_adj=0
> Jul 25 10:41:28 node1 kernel: java cpuset=/ mems_allowed=0
> Jul 25 10:41:28 node1 kernel: Pid: 27968, comm: java Not tainted 
> 2.6.32-573.22.1.el6.x86_64 #1
> Jul 25 10:41:28 node1 kernel: Call Trace:
> Jul 25 10:41:28 node1 kernel: [] ? 
> cpuset_print_task_mems_allowed+0x91/0xb0
> Jul 25 10:41:28 node1 kernel: [] ? dump_header+0x90/0x1b0
> Jul 25 10:41:28 node1 kernel: [] ? 
> security_real_capable_noaudit+0x3c/0x70
> Jul 25 10:41:28 node1 kernel: [] ? 
> oom_kill_process+0x82/0x2a0
> Jul 25 10:41:28 node1 kernel: [] ? 
> select_bad_process+0xe1/0x120
> Jul 25 10:41:28 node1 kernel: [] ? out_of_memory+0x220/0x3c0
> Jul 25 10:41:28 node1 kernel: [] ? 
> __alloc_pages_nodemask+0x93c/0x950
> Jul 25 10:41:28 node1 kernel: [] ? 
> wake_bit_function+0x0/0x50
> Jul 25 10:41:28 node1 kernel: [] ? 
> alloc_pages_vma+0x9a/0x150
> Jul 25 10:41:28 node1 kernel: [] ? 
> handle_pte_fault+0x73d/0xb20
> Jul 25 10:41:28 node1 kernel: [] ? 
> page_remove_rmap+0x54/0xa0
> Jul 25 10:41:28 node1 kernel: [] ? release_pages+0x178/0x250
> Jul 25 10:41:28 node1 kernel: [] ? 
> handle_mm_fault+0x299/0x3d0
> Jul 25 10:41:28 node1 kernel: [] ? 
> __do_page_fault+0x146/0x500
> Jul 25 10:41:28 node1 kernel: [] ? thread_return+0x4e/0x7d0
> Jul 25 10:41:28 node1 kernel: [] ? do_page_fault+0x3e/0xa0
> Jul 25 10:41:28 node1 kernel: [] ? page_fault+0x25/0x30
> Jul 25 10:41:28 node1 kernel: Mem-Info:
> Jul 25 10:41:28 node1 kernel: Node 0 DMA per-cpu:
> Jul 25 10:41:28 node1 kernel: CPU0: hi:0, btch:   1 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU1: hi:0, btch:   1 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU2: hi:0, btch:   1 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU3: hi:0, btch:   1 usd:   0
> Jul 25 10:41:28 node1 kernel: Node 0 DMA32 per-cpu:
> Jul 25 10:41:28 node1 kernel: CPU0: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU1: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU2: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU3: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: Node 0 Normal per-cpu:
> Jul 25 10:41:28 node1 kernel: CPU0: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU1: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU2: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU3: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: active_anon:1652718 inactive_anon:271396 
> isolated_anon:0
> Jul 25 10:41:28 node1 kernel: active_file:180 inactive_file:250 
> isolated_file:0
> Jul 25 10:41:28 node1 kernel: unevictable:0 dirty:78 writeback:194 unstable:0
> Jul 25 10:41:28 node1 kernel: free:25337 slab_reclaimable:6085 
> slab_unreclaimable:20565
> Jul 25 10:41:28 node1 kernel: mapped:2265 shmem:758 pagetables:10191 bounce:0
> Jul 25 10:41:28 node1 kernel: Node 0 DMA free:15724kB min:124kB low:152kB 
> high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB 
> inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
> present:15336kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
> slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB 
> unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 

[jira] [Commented] (METRON-672) SolrIndexingIntegrationTest fails intermittently

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838947#comment-15838947
 ] 

ASF GitHub Bot commented on METRON-672:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/424#discussion_r97915042
  
--- Diff: 
metron-platform/metron-enrichment/src/test/java/org/apache/metron/enrichment/integration/components/ConfigUploadComponent.java
 ---
@@ -38,6 +39,7 @@
   private String enrichmentConfigsPath;
   private String indexingConfigsPath;
   private String profilerConfigPath;
+  private Optional<Function<ConfigUploadComponent, Void>> postStartCallback = Optional.empty();
--- End diff --

Could this just use Consumer instead of Function? Since the second type 
parameter is Void, it seems like the Function is just being a Consumer anyway


> SolrIndexingIntegrationTest fails intermittently
> 
>
>                 Key: METRON-672
>                 URL: https://issues.apache.org/jira/browse/METRON-672
>             Project: Metron
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Justin Leet
>            Assignee: Casey Stella
>
> Adapted from a dev list conversation
> h4. Initial Error in Travis
> Jon noted this in the Travis builds
> {code}
> `test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest):
> Took too long to complete: 150582 > 15`, more details below:
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 166.167 sec <<< FAILURE!
> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> Time elapsed: 166.071 sec  <<< ERROR!
> {code}
> h4. Additional Notes
> Casey was able to reproduce this locally (but not in his IDE). A couple of 
> details are in the dev list excerpt.
> Fixing this should ideally include adding more detailed logging to hopefully 
> avoid these issues in the future.  As a note, unfortunately in this case, 
> Casey notes that logging seems to make this issue rarer.  Still, logging to 
> be able to understand the flow (and tuning logging level as appropriate) 
> would help resolve issues in the future.
> h4. Dev List Excerpt
> Per Casey:
> {quote}
> Ok, so now I'm concerned that this isn't a fluke.  Here's an excerpt from
> the failing logs on travis for my PR with substantially longer timeouts (
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/194575474/log.txt)
> {code}
> Running org.apache.metron.solr.integration.SolrIndexingIntegrationTest
> 0 vs 10 vs 0
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 317.056 sec <<< FAILURE!
> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> Time elapsed: 316.949 sec  <<< ERROR!
> 

[jira] [Commented] (METRON-672) SolrIndexingIntegrationTest fails intermittently

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838948#comment-15838948
 ] 

ASF GitHub Bot commented on METRON-672:
---

Github user justinleet commented on the issue:

https://github.com/apache/incubator-metron/pull/424
  
Thanks for taking the effort to dig into this.  Great work.  Other than a 
couple minor comments, I'm very happy with this.


> SolrIndexingIntegrationTest fails intermittently
> 
>
>                 Key: METRON-672
>                 URL: https://issues.apache.org/jira/browse/METRON-672
>             Project: Metron
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Justin Leet
>            Assignee: Casey Stella
>
> Adapted from a dev list conversation
> h4. Initial Error in Travis
> Jon noted this in the Travis builds
> {code}
> `test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest):
> Took too long to complete: 150582 > 15`, more details below:
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 166.167 sec <<< FAILURE!
> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> Time elapsed: 166.071 sec  <<< ERROR!
> {code}
> h4. Additional Notes
> Casey was able to reproduce this locally (but not in his IDE). A couple of 
> details are in the dev list excerpt.
> Fixing this should ideally include adding more detailed logging to hopefully 
> avoid these issues in the future.  As a note, unfortunately in this case, 
> Casey notes that logging seems to make this issue rarer.  Still, logging to 
> be able to understand the flow (and tuning logging level as appropriate) 
> would help resolve issues in the future.
> h4. Dev List Excerpt
> Per Casey:
> {quote}
> Ok, so now I'm concerned that this isn't a fluke.  Here's an excerpt from
> the failing logs on travis for my PR with substantially longer timeouts (
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/194575474/log.txt)
> {code}
> Running org.apache.metron.solr.integration.SolrIndexingIntegrationTest
> 0 vs 10 vs 0
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 317.056 sec <<< FAILURE!
> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> Time elapsed: 316.949 sec  <<< ERROR!
> java.lang.RuntimeException: Took too long to complete: 300783 > 30
> at 
> org.apache.metron.integration.ComponentRunner.process(ComponentRunner.java:131)
> at 
> org.apache.metron.indexing.integration.IndexingIntegrationTest.test(IndexingIntegrationTest.java:173)
> {code}
> I'm getting the impression that this isn't the timeout and we have a
> mystery on our hands.  Each of those lines "10 vs 10 vs 6" happen 15
> seconds apart.  That line means that it read 10 entries from kafka, 10
> 

[jira] [Commented] (METRON-672) SolrIndexingIntegrationTest fails intermittently

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838944#comment-15838944
 ] 

ASF GitHub Bot commented on METRON-672:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/424#discussion_r97914890
  
--- Diff: 
metron-platform/metron-indexing/src/test/java/org/apache/metron/indexing/integration/IndexingIntegrationTest.java
 ---
@@ -184,6 +203,26 @@ public void test() throws Exception {
 }
   }
 
+  private void waitForIndex(String zookeeperQuorum) throws Exception {
+    try(CuratorFramework client = getClient(zookeeperQuorum)) {
+      client.start();
+      byte[] bytes = null;
+      do {
+        try {
+          bytes = ConfigurationsUtils.readSensorIndexingConfigBytesFromZookeeper(testSensorType, client);
+          Thread.sleep(1000);
+        }
+        catch(KeeperException.NoNodeException nne) {
+          //kindly ignore because the path might not exist just yet.
+        }
+      }
+      while(bytes == null || bytes.length == 0);
+      return;
--- End diff --

Drop the return, since it's a void method.



[jira] [Commented] (METRON-672) SolrIndexingIntegrationTest fails intermittently

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838943#comment-15838943
 ] 

ASF GitHub Bot commented on METRON-672:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/424#discussion_r97914854
  
--- Diff: 
metron-platform/metron-indexing/src/test/java/org/apache/metron/indexing/integration/IndexingIntegrationTest.java
 ---
@@ -139,11 +146,22 @@ public void test() throws Exception {
   inputDocs.add(m);
 
 }
+    final AtomicBoolean isLoaded = new AtomicBoolean(false);
     ConfigUploadComponent configUploadComponent = new ConfigUploadComponent()
             .withTopologyProperties(topologyProperties)
             .withGlobalConfigsPath(TestConstants.SAMPLE_CONFIG_PATH)
             .withEnrichmentConfigsPath(TestConstants.SAMPLE_CONFIG_PATH)
             .withIndexingConfigsPath(TestConstants.SAMPLE_CONFIG_PATH)
+            .withPostStartCallback(component -> {
+              try {
+                waitForIndex(component.getTopologyProperties().getProperty(ZKServerComponent.ZOOKEEPER_PROPERTY));
+              } catch (Exception e) {
+                e.printStackTrace();
+              }
+              isLoaded.set(true);
+              return null;
+              }
+            );
     ;
--- End diff --

Can you kill the extra semicolon?



[jira] [Commented] (METRON-664) Make the index configuration per-writer with enabled/disabled

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838922#comment-15838922
 ] 

ASF GitHub Bot commented on METRON-664:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-metron/pull/419


> Make the index configuration per-writer with enabled/disabled
> -
>
>                 Key: METRON-664
>                 URL: https://issues.apache.org/jira/browse/METRON-664
>             Project: Metron
>          Issue Type: Sub-task
>            Reporter: Casey Stella
>






[jira] [Commented] (METRON-672) SolrIndexingIntegrationTest fails intermittently

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838876#comment-15838876
 ] 

ASF GitHub Bot commented on METRON-672:
---

Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/424
  
To test this locally, because the failure is extremely sporadic there (about 
1 out of every 50 times I run the test), I did the following:
* Built and installed the project in Maven: `mvn -DskipTests clean install`
* Ran the `metron-solr` project integration tests in a loop for at least 2 
hours, ensuring that they don't fail: 
`echo "" > /tmp/output;while [ $(cat /tmp/output | grep "vs 6" | wc -l) -lt 1 ];do mvn install >& /tmp/output;done`



[jira] [Commented] (METRON-672) SolrIndexingIntegrationTest fails intermittently

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838867#comment-15838867
 ] 

ASF GitHub Bot commented on METRON-672:
---

GitHub user cestella opened a pull request:

https://github.com/apache/incubator-metron/pull/424

METRON-672: SolrIndexingIntegrationTest fails intermittently

This failure is due to a change in default behavior when indexing was split 
off into a separate configuration file.  In particular, the default batch size 
was changed from `5` to `1`.  This, by itself, is not a problem, but the 
`IndexingIntegrationTest` (base class for the Solr and Elasticsearch 
integration tests):
* submits the configs
* starts the indexing topology
* writes the input data

The input data may be written before the topology or the configuration has 
fully loaded, especially if the machine running the tests is under load (as 
with Travis).  As a result, the first record may be processed with the default 
batch size (of 1) and written out immediately because the indexing configs 
haven't loaded into ZooKeeper yet.  In that circumstance, the configs 
eventually load and the batch size is set to `5`.  Meanwhile, we've written 10 
records to Kafka and are expecting 10 in return, but because the first was 
written out already, followed by the next 5, the remaining 4 are left pending 
in the `BulkMessageWriterBolt`.

So, the failure scenario is as follows:
* Message 1 is received and the indexing config hasn't loaded yet, so the 
batch size is 1 and it immediately gets written out
* Message 2 - 5 are each received and the indexing config has loaded, so 
the batch size is 5 and it queues up
* Message 6 is received and the batch writes out
* Messages 7 - 10 are received, but never make a full batch, so we time out 
waiting for them to write out

The fix is to ensure that we don't write out messages to kafka until the 
configs are loaded, which is what this PR does.
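
The arithmetic of the failure scenario can be checked with a small 
simulation.  This is only a sketch: `BatchRaceSketch` and its parameters are 
hypothetical names and it does not touch the real `BulkMessageWriterBolt`.  It 
assumes the batch size switches from the default to the configured value 
after a given number of messages.

```java
class BatchRaceSketch {
    /**
     * Counts how many of `total` messages are flushed when the batch size
     * switches from `defaultBatch` to `configuredBatch` once `loadedAfter`
     * messages have already arrived.
     */
    static int written(int total, int defaultBatch, int configuredBatch, int loadedAfter) {
        int written = 0;
        int pending = 0;
        for (int msg = 1; msg <= total; msg++) {
            // Messages seen before the config loads use the default batch size.
            int batchSize = (msg <= loadedAfter) ? defaultBatch : configuredBatch;
            pending++;
            if (pending >= batchSize) {
                written += pending;   // flush the whole batch
                pending = 0;
            }
        }
        return written;               // whatever is left in `pending` is never flushed
    }

    public static void main(String[] args) {
        // Config loads after message 1: 1 + 5 written, 4 stuck.
        System.out.println(written(10, 1, 5, 1)); // prints 6
        // Config loaded before any message: two full batches.
        System.out.println(written(10, 1, 5, 0)); // prints 10
    }
}
```

The first call reproduces the `10 vs 10 vs 6` pattern from the Travis log; 
the second corresponds to the behavior after the fix, where nothing is 
written until the configs are in place.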

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cestella/incubator-metron METRON-672

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/424.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #424






> SolrIndexingIntegrationTest fails intermittently
> 
>
> Key: METRON-672
> URL: https://issues.apache.org/jira/browse/METRON-672
> Project: Metron
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Justin Leet
>Assignee: Casey Stella
>
> Adapted from a dev list conversation
> h4. Initial Error in Travis
> Jon noted this in the Travis builds
> {code}
> `test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest):
> Took too long to complete: 150582 > 15`, more details below:
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 166.167 sec <<< FAILURE!
> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> Time elapsed: 166.071 sec  <<< ERROR!
> {code}
> h4. Additional Notes
> Casey was able to reproduce this locally (but not in his IDE). Couple details 
> in the dev list excerpt.
> Fixing this should ideally include adding more detailed logging to hopefully 
> avoid these issues in the future.  As a note, unfortunately in this case, 
> Casey notes that logging seems to make this issue rarer.  Still, logging to 
> be able to understand the flow (and tuning logging level as appropriate) 
> would help resolve issues in the future.
> h4. Dev List Excerpt
> Per Casey:
> {quote}
> Ok, so now I'm concerned that this isn't a fluke.  Here's an excerpt from
> the failing logs on travis for my PR with substantially longer timeouts (
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/194575474/log.txt)
> {code}
> Running org.apache.metron.solr.integration.SolrIndexingIntegrationTest
> 0 vs 10 vs 0
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> 

[jira] [Assigned] (METRON-672) SolrIndexingIntegrationTest fails intermittently

2017-01-25 Thread Casey Stella (JIRA)

 [ 
https://issues.apache.org/jira/browse/METRON-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casey Stella reassigned METRON-672:
---

Assignee: Casey Stella

> SolrIndexingIntegrationTest fails intermittently
> 
>
> Key: METRON-672
> URL: https://issues.apache.org/jira/browse/METRON-672
> Project: Metron
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Justin Leet
>Assignee: Casey Stella
>
> Adapted from a dev list conversation
> h4. Initial Error in Travis
> Jon noted this in the Travis builds
> {code}
> `test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest):
> Took too long to complete: 150582 > 15`, more details below:
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 166.167 sec <<< FAILURE!
> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> Time elapsed: 166.071 sec  <<< ERROR!
> {code}
> h4. Additional Notes
> Casey was able to reproduce this locally (but not in his IDE). Couple details 
> in the dev list excerpt.
> Fixing this should ideally include adding more detailed logging to hopefully 
> avoid these issues in the future.  As a note, unfortunately in this case, 
> Casey notes that logging seems to make this issue rarer.  Still, logging to 
> be able to understand the flow (and tuning logging level as appropriate) 
> would help resolve issues in the future.
> h4. Dev List Excerpt
> Per Casey:
> {quote}
> Ok, so now I'm concerned that this isn't a fluke.  Here's an excerpt from
> the failing logs on travis for my PR with substantially longer timeouts (
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/194575474/log.txt)
> {code}
> Running org.apache.metron.solr.integration.SolrIndexingIntegrationTest
> 0 vs 10 vs 0
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed 
> target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 317.056 sec <<< FAILURE!
> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> Time elapsed: 316.949 sec  <<< ERROR!
> java.lang.RuntimeException: Took too long to complete: 300783 > 30
> at 
> org.apache.metron.integration.ComponentRunner.process(ComponentRunner.java:131)
> at 
> org.apache.metron.indexing.integration.IndexingIntegrationTest.test(IndexingIntegrationTest.java:173)
> {code}
> I'm getting the impression that this isn't the timeout and we have a
> mystery on our hands.  Each of those lines "10 vs 10 vs 6" happen 15
> seconds apart.  That line means that it read 10 entries from kafka, 10
> entries from the indexed data and 6 entries from HDFS.  It's that 6
> entries that is the problem.   Also of note, this does not seem to
> happen to me locally AND it's not consistent on Travis.  Given all
> that I'd say that it's a problem with the 

[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838733#comment-15838733
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97895278
  
--- Diff: LICENSE ---
@@ -210,6 +210,12 @@ This product bundles some test examples from the Stix 
project (metron-platform/m
 
 This product bundles wait-for-it.sh, which is available under a "MIT 
Software License" license.  For details, see 
https://github.com/vishnubob/wait-for-it
 

+
+
--- End diff --

Updated with another license.  The testing DB used is under a share-alike 
3.0 (not 4.0) license, so I explicitly called it out and added a link to the 
GitHub repo.


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL-on-HBase library, or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform.





[jira] [Commented] (METRON-608) Mpack to install a single-node test cluster

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838721#comment-15838721
 ] 

ASF GitHub Bot commented on METRON-608:
---

Github user mattf-horton commented on the issue:

https://github.com/apache/incubator-metron/pull/408
  
@dlyle65535, I'll be happy to share my WIP.  It will take me a few hours 
to dust it off, then I'll post it as a PR referring to 
[METRON-609](https://issues.apache.org/jira/browse/METRON-609).


> Mpack to install a single-node test cluster
> ---
>
> Key: METRON-608
> URL: https://issues.apache.org/jira/browse/METRON-608
> Project: Metron
>  Issue Type: Improvement
>Affects Versions: 0.3.0
> Environment: Linux, Ambari installation
>Reporter: Matt Foley
>Assignee: Matt Foley
> Fix For: Next + 1
>
>
> The current Mpack for Ambari install of Metron fails to correctly install 
> Elasticsearch if restricted to a single-node cluster.  Yet a single-node 
> install of Elasticsearch is certainly feasible, as shown by our quick-dev 
> environment.
> This is a short-term fix by providing a completely separate Mpack just for 
> the single-node scenario.  I'm also opening METRON-609 to enhance the 
> existing Mpack to handle the single-node and small-number-of-nodes scenario, 
> but that one will require much deeper testing and is likely to take a while 
> to complete.





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838593#comment-15838593
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97879417
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctions.java
 ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.stellar;
+
+import org.apache.log4j.Logger;
+import org.apache.metron.common.dsl.Context;
+import org.apache.metron.common.dsl.ParseException;
+import org.apache.metron.common.dsl.Stellar;
+import org.apache.metron.common.dsl.StellarFunction;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+public class GeoEnrichmentFunctions {
+  private static final Logger LOG = 
Logger.getLogger(GeoEnrichmentFunctions.class);
+
+  @Stellar(name="GET"
+  ,namespace="GEO"
+  ,description="Look up an IPV4 address and returns geographic 
information about it"
+  ,params = {
+  "ip - The IPV4 address to lookup" +
+  "fields - Optional list of GeoIP fields to grab. 
Options are locID, country, city, postalCode, dmaCode, latitude, longitude, 
location_point"
+}
+  ,returns = "If a Single field is requested a string of the 
field, If multiple fields a map of string of the fields, and null otherwise"
+  )
--- End diff --

Sure.  Will do.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838582#comment-15838582
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97878462
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctions.java
 ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.stellar;
+
+import org.apache.log4j.Logger;
+import org.apache.metron.common.dsl.Context;
+import org.apache.metron.common.dsl.ParseException;
+import org.apache.metron.common.dsl.Stellar;
+import org.apache.metron.common.dsl.StellarFunction;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+public class GeoEnrichmentFunctions {
+  private static final Logger LOG = 
Logger.getLogger(GeoEnrichmentFunctions.class);
+
+  @Stellar(name="GET"
+  ,namespace="GEO"
+  ,description="Look up an IPV4 address and returns geographic 
information about it"
+  ,params = {
+  "ip - The IPV4 address to lookup" +
+  "fields - Optional list of GeoIP fields to grab. 
Options are locID, country, city, postalCode, dmaCode, latitude, longitude, 
location_point"
+}
+  ,returns = "If a Single field is requested a string of the 
field, If multiple fields a map of string of the fields, and null otherwise"
+  )
--- End diff --

Do we want to kick off a discuss thread for this?  Seems like we might as 
well start a discussion, rather than potentially having it buried here.  And 
the discussion is useful even before everyone's set with this PR.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838577#comment-15838577
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97877931
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/adapters/geo/GeoLiteDatabase.java
 ---
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.adapters.geo;
+
+import com.maxmind.db.CHMCache;
+import com.maxmind.geoip2.DatabaseReader;
+import com.maxmind.geoip2.exception.GeoIp2Exception;
+import com.maxmind.geoip2.model.CityResponse;
+import com.maxmind.geoip2.record.City;
+import com.maxmind.geoip2.record.Country;
+import com.maxmind.geoip2.record.Location;
+import com.maxmind.geoip2.record.Postal;
+import org.apache.commons.validator.routines.InetAddressValidator;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+import java.util.zip.GZIPInputStream;
+
+public enum GeoLiteDatabase {
+  INSTANCE;
+
+  protected static final Logger LOG = 
LoggerFactory.getLogger(GeoLiteDatabase.class);
+  public static final String GEO_HDFS_FILE = "geo.hdfs.file";
+  public static final String GEO_HDFS_FILE_DEFAULT = 
"/apps/metron/geo/default/GeoLite2-City.mmdb.gz";
+
+  private static ReentrantReadWriteLock lock = new 
ReentrantReadWriteLock();
+  private static final Lock readLock = lock.readLock();
+  private static final Lock writeLock = lock.writeLock();
+  private static InetAddressValidator ipvalidator = new 
InetAddressValidator();
+  private static volatile String hdfsLoc = GEO_HDFS_FILE_DEFAULT;
+  private static DatabaseReader reader = null;
+
+  public synchronized void updateIfNecessary(Map 
globalConfig) {
+// Reload database if necessary (file changes on HDFS)
+LOG.trace("[Metron] Determining if GeoIpDatabase update required");
+String hdfsFile = GEO_HDFS_FILE_DEFAULT;
+if (globalConfig != null) {
+  hdfsFile = (String) globalConfig.getOrDefault(GEO_HDFS_FILE, 
GEO_HDFS_FILE_DEFAULT);
+}
+
+// Always update if we don't have a DatabaseReader
+if (reader == null || !hdfsLoc.equals(hdfsFile)) {
--- End diff --

`updateIfNecessary` is actually synchronized, so it's already locked.  
Updates are rare and this only runs when the global config is updated, so the 
impact will be very low.
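
As a rough illustration of that point, here is a minimal sketch of a 
synchronized "reload only when the location changes" check.  The class 
`ReloadSketch` and its reload counter are hypothetical; only the config key 
and default path are taken from the diff above, and a plain string stands in 
for the `DatabaseReader`.

```java
import java.util.HashMap;
import java.util.Map;

class ReloadSketch {
    static final String GEO_HDFS_FILE = "geo.hdfs.file";
    static final String GEO_HDFS_FILE_DEFAULT = "/apps/metron/geo/default/GeoLite2-City.mmdb.gz";

    private String loadedLoc = null; // stands in for the DatabaseReader
    private int reloads = 0;         // counts how often a reload actually happened

    // synchronized: only one thread at a time can decide whether to reload.
    synchronized void updateIfNecessary(Map<String, Object> globalConfig) {
        String wanted = GEO_HDFS_FILE_DEFAULT;
        if (globalConfig != null) {
            wanted = (String) globalConfig.getOrDefault(GEO_HDFS_FILE, GEO_HDFS_FILE_DEFAULT);
        }
        // Reload if nothing is loaded yet or the configured location changed.
        if (loadedLoc == null || !loadedLoc.equals(wanted)) {
            loadedLoc = wanted;
            reloads++;
        }
    }

    synchronized int reloads() {
        return reloads;
    }

    public static void main(String[] args) {
        ReloadSketch sketch = new ReloadSketch();
        sketch.updateIfNecessary(null);          // first call loads the default
        sketch.updateIfNecessary(null);          // same location, no reload
        Map<String, Object> config = new HashMap<>();
        config.put(GEO_HDFS_FILE, "/tmp/other.mmdb.gz"); // illustrative path
        sketch.updateIfNecessary(config);        // location changed, reload
        System.out.println(sketch.reloads());    // prints 2
    }
}
```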




[jira] [Commented] (METRON-668) Remove the "tickUpdate" profile config and make the "init" phase not reset variables

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838563#comment-15838563
 ] 

ASF GitHub Bot commented on METRON-668:
---

Github user cestella closed the pull request at:

https://github.com/apache/incubator-metron/pull/420


> Remove the "tickUpdate" profile config and make the "init" phase not reset 
> variables
> 
>
> Key: METRON-668
> URL: https://issues.apache.org/jira/browse/METRON-668
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>Assignee: Casey Stella
>
> Originally, during the MAD outlier work, I conceived of a need for a 
> new callback in the profile configuration, "tickUpdate", that ran at the tick 
> and had the variables from the tick that just completed available to it.  
> This was done so that I could merge the state accumulated in the tick with a 
> lookback window of state for MAD.  The problem is that the "init" phase 
> happens after this and blows away the changes made in "tickUpdate", so it 
> never worked as intended.
> It occurs to me that what we really want is not two separate config 
> phases, but only one, "init", and to not reset the variables on the tick for 
> the profile.  You can, of course, choose to update them by overwriting them 
> in the "init" phase *or* you can choose to use them as part of your init.
> For context, this would make the example for MAD:
> {code:javascript}
> {
>   "profiles": [
> {
>   "profile": "sketchy_mad",
>   "foreach": "'global'",
>   "onlyif": "true",
>   "init" : {
> "s": "OUTLIER_MAD_STATE_MERGE(PROFILE_GET('sketchy_mad',
> 'global', 5, 'MINUTES'))"
>},
>   "tickUpdate": {
> "s": "OUTLIER_MAD_STATE_MERGE(PROFILE_GET('sketchy_mad',
> 'global', 5, 'MINUTES'), s)"
> },
>   "update": {
> "s": "OUTLIER_MAD_ADD(s, value)"
> },
>   "result": "s"
> }
>   ]
> }
> {code}
> is functionally equivalent to
> {code:javascript}
> {
>   "profiles": [
> {
>   "profile": "sketchy_mad",
>   "foreach": "'global'",
>   "onlyif": "true",
>   "init" : {
> "s": "OUTLIER_MAD_STATE_MERGE(PROFILE_GET('sketchy_mad',
> 'global', 5, 'MINUTES'))"
>},
>   "update": {
> "s": "OUTLIER_MAD_ADD(s, value)"
> },
>   "result": "s"
> }
>   ]
> }
> {code}
> This resets the MAD state to the last 5 minute window.  If we did NOT reset 
> the state and instead kept accumulating it (provided we did not clear the 
> variables on init), we could do the following:
> {code:javascript}
> {
>   "profiles": [
> {
>   "profile": "sketchy_mad",
>   "foreach": "'global'",
>   "onlyif": "true",
>   "init" : {
> "s": "if exists(s) then s else 
> OUTLIER_MAD_STATE_MERGE(PROFILE_GET('sketchy_mad',
> 'global', 5, 'MINUTES'))"
>},
>   "update": {
> "s": "OUTLIER_MAD_ADD(s, value)"
> },
>   "result": "s"
> }
>   ]
> }
> {code}
> s would get initialized sensibly and then always accumulate as long as the 
> topology continued (rather than having a fixed lookback).
> In short, making init not reset the variables shouldn't cause any harm and 
> should provide another set of use-cases for the profiler.  Also, tickUpdate 
> has no function whatsoever and should be removed because it gets overwritten 
> by init directly after being called.
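
The difference between clearing and keeping variables at init, as described 
in the issue above, can be sketched with a toy model.  This is not profiler 
code: `ProfileTickSketch` is a hypothetical name, a plain integer counter 
stands in for the MAD state, and `putIfAbsent` plays the role of 
`if exists(s) then s else ...`.

```java
import java.util.HashMap;
import java.util.Map;

class ProfileTickSketch {
    /** Runs `ticks` periods with `perTick` updates each and returns the final value of "s". */
    static int run(int ticks, int perTick, boolean resetOnInit) {
        Map<String, Integer> vars = new HashMap<>();
        for (int t = 0; t < ticks; t++) {
            if (resetOnInit) {
                vars.clear();                   // variables are wiped before init runs
            }
            vars.putIfAbsent("s", 0);           // init: keep s if it exists, else start fresh
            for (int i = 0; i < perTick; i++) {
                vars.merge("s", 1, Integer::sum); // update: add one observation to s
            }
        }
        return vars.get("s");
    }

    public static void main(String[] args) {
        System.out.println(run(3, 4, true));  // prints 4: only the last tick survives
        System.out.println(run(3, 4, false)); // prints 12: state accumulates across ticks
    }
}
```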





[jira] [Commented] (METRON-668) Remove the "tickUpdate" profile config and make the "init" phase not reset variables

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838564#comment-15838564
 ] 

ASF GitHub Bot commented on METRON-668:
---

GitHub user cestella reopened a pull request:

https://github.com/apache/incubator-metron/pull/420

METRON-668: Remove the "tickUpdate" profile config and make the "init" 
phase not reset variables

Please see description at 
[METRON-668](https://issues.apache.org/jira/browse/METRON-668) for a full 
description.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cestella/incubator-metron METRON-668

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/420.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #420


commit 7cb5c60462448b6c35a9d1def58489903a649834
Author: cstella 
Date:   2017-01-19T21:58:27Z

METRON-668: Remove the 'tickUpdate' profile config and make the 'init' 
phase not reset variables

commit ad94ed4916605b1416c62463fc2de96f76c85f9f
Author: cstella 
Date:   2017-01-19T22:00:47Z

Fixed docs

commit a99c013a3e850a26720ad9a7bfede226660e18ed
Author: cstella 
Date:   2017-01-23T19:07:24Z

Updating unit tests to function properly.

commit 78a472fda1e65c0679089127d149582f861c82a7
Author: cstella 
Date:   2017-01-23T20:56:20Z

TEMPORARY UPDATE TO SEE SOMETHING, DO NOT MERGE YET

commit 2d4aa56324f86cc5e3b73d486a2fde008ec211b8
Author: cstella 
Date:   2017-01-23T21:24:32Z

updating one more time to indicate writes vs flush

commit b78e31fef6244aba98b369553125c8c667ff0a35
Author: cstella 
Date:   2017-01-23T21:27:02Z

updating again.

commit 1cd6fa49b7d950e7433733c0ff683d42079dbca7
Author: cstella 
Date:   2017-01-25T20:01:35Z

better logging.

commit 80a47ba0c402e70f0020447aff69e5feddef1dce
Author: cstella 
Date:   2017-01-25T20:39:23Z

whoops, taht's a bad idea.





[jira] [Commented] (METRON-674) Archived Telemetry Filenames in HDFS Contain 'Null'

2017-01-25 Thread Justin Leet (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838562#comment-15838562
 ] 

Justin Leet commented on METRON-674:


http://storm.apache.org/releases/1.0.1/storm-hdfs.html

Per the docs, the format is
{code}
{prefix}{componentId}-{taskId}-{rotationNum}-{timestamp}{extension}
{code}
Given a file like: 
enrichment-null-0-0-1484752251563.json

The null field is componentId.  I'd have to dig into why it's actually null 
though.
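
One way a literal `null` can end up in such a name: if the pieces are joined 
by plain string concatenation and `componentId` was never set, Java renders 
the null reference as the text "null".  A minimal sketch, assuming the name 
is built that way; the helper below is hypothetical and is not storm-hdfs 
code, and the `enrichment-` prefix is inferred from the filenames in this 
issue.

```java
class HdfsFileNameSketch {
    /** Joins the pieces in the order given by the documented format string. */
    static String fileName(String prefix, String componentId, int taskId,
                           int rotationNum, long timestamp, String extension) {
        // A null String operand is converted to "null" by string concatenation.
        return prefix + componentId + "-" + taskId + "-" + rotationNum + "-" + timestamp + extension;
    }

    public static void main(String[] args) {
        // componentId deliberately left null to mimic the observed filename
        System.out.println(fileName("enrichment-", null, 0, 0, 1484752251563L, ".json"));
        // prints enrichment-null-0-0-1484752251563.json
    }
}
```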

> Archived Telemetry Filenames in HDFS Contain 'Null'
> ---
>
> Key: METRON-674
> URL: https://issues.apache.org/jira/browse/METRON-674
> Project: Metron
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Nick Allen
>Priority: Minor
>
> When running "Quick Dev", I have noticed that all of the archived telemetry 
> files in HDFS contain null in the name.
> {code}
> [root@node1 0.3.0]# hdfs dfs -ls /apps/metron/indexing/indexed/*
> Found 2 items
> -rw-r--r--   1 storm hadoop 644753 2017-01-19 17:47 
> /apps/metron/indexing/indexed/bro/enrichment-null-0-0-1484847868551.json
> -rw-r--r--   1 storm hadoop   14107767 2017-01-19 18:46 
> /apps/metron/indexing/indexed/bro/enrichment-null-0-0-1484848728527.json
> Found 5 items
> -rwxrwxrwx   1 storm hadoop 205699 2017-01-16 21:57 
> /apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484603710250.json
> -rwxrwxrwx   1 storm hadoop5773871 2017-01-17 14:34 
> /apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484603925156.json
> -rwxrwxrwx   1 storm hadoop 253870 2017-01-17 13:43 
> /apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484660437793.json
> -rwxrwxrwx   1 storm hadoop   24023035 2017-01-17 19:45 
> /apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484660672723.json
> -rwxrwxrwx   1 storm hadoop  2063857 2017-01-17 19:02 
> /apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484679265343.json
> Found 147 items
> -rwxrwxrwx   1 storm hadoop   18199681 2017-01-18 16:35 
> /apps/metron/indexing/indexed/yaf/enrichment-null-0-0-1484752251563.json
> -rwxrwxrwx   1 storm hadoop 216895 2017-01-19 17:47 
> /apps/metron/indexing/indexed/yaf/enrichment-null-0-0-1484846918122.json
> {code}





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838548#comment-15838548
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on the issue:

https://github.com/apache/incubator-metron/pull/421
  
Updated to replace `EnrichmentAdapter.initializeAdapter()` with 
`EnrichmentAdapter.initializeAdapter(Map config)`.  Also added 
`updateAdapter(Map config)`.  Right now the config is only used by 
GeoAdapter, but any adapter can use the config it is passed as needed.  
All implementing classes have the method signatures updated, but the others 
simply ignore the config param.

This removes the direct dependency of `GenericEnrichmentBolt` on 
`GeoLiteDatabase`.  The bolt simply calls init and update and lets each 
adapter decide what to do (if anything).  Update is called during 
`reloadCallback` when the global config changes.

I'll want to spin it up again to further validate, but this addresses 
@nickwallen's catch that topologies not using geo still needed geo data to 
exist, and @dlyle65535's concern about `GenericEnrichmentBolt` needing to know 
about geo for everything.  Now errors should only occur if a topology that 
uses geo data can't find it.
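
A minimal sketch of that shape, under stated assumptions: the `Adapter` 
interface below is a trimmed-down stand-in for `EnrichmentAdapter`, the 
`Map<String, Object>` type parameters are a guess, and both adapter classes 
and the `/tmp` path are hypothetical.  Only the `geo.hdfs.file` key and its 
default value come from the `GeoLiteDatabase` diff in this PR.

```java
import java.util.HashMap;
import java.util.Map;

final class AdapterConfigSketch {
  // Hypothetical stand-in for EnrichmentAdapter; the real interface has more methods.
  interface Adapter {
    boolean initializeAdapter(Map<String, Object> config);
    void updateAdapter(Map<String, Object> config);
  }

  // Reads its file location from the config, falling back to a default.
  static final class GeoLikeAdapter implements Adapter {
    static final String KEY = "geo.hdfs.file";
    static final String DEFAULT = "/apps/metron/geo/default/GeoLite2-City.mmdb.gz";
    String hdfsFile;

    @Override
    public boolean initializeAdapter(Map<String, Object> config) {
      updateAdapter(config);
      return true;
    }

    @Override
    public void updateAdapter(Map<String, Object> config) {
      hdfsFile = (config == null) ? DEFAULT : (String) config.getOrDefault(KEY, DEFAULT);
    }
  }

  // Accepts the config to satisfy the interface, then ignores it.
  static final class ConfigIgnoringAdapter implements Adapter {
    @Override public boolean initializeAdapter(Map<String, Object> config) { return true; }
    @Override public void updateAdapter(Map<String, Object> config) { }
  }

  public static void main(String[] args) {
    Map<String, Object> global = new HashMap<>();
    Adapter[] adapters = { new GeoLikeAdapter(), new ConfigIgnoringAdapter() };

    // The caller knows nothing about geo: it only delegates init and update.
    for (Adapter a : adapters) {
      a.initializeAdapter(global);
    }
    global.put(GeoLikeAdapter.KEY, "/tmp/GeoLite2-City.mmdb.gz"); // hypothetical path
    for (Adapter a : adapters) {
      a.updateAdapter(global);
    }
    System.out.println(((GeoLikeAdapter) adapters[0]).hdfsFile);
  }
}
```

The point of the design is that the calling code never names a geo class; 
whether a config change matters is left to each adapter.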


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Created] (METRON-674) Archived Telemetry Filenames in HDFS Contain 'Null'

2017-01-25 Thread Nick Allen (JIRA)
Nick Allen created METRON-674:
-

 Summary: Archived Telemetry Filenames in HDFS Contain 'Null'
 Key: METRON-674
 URL: https://issues.apache.org/jira/browse/METRON-674
 Project: Metron
  Issue Type: Bug
Affects Versions: 0.3.0
Reporter: Nick Allen
Priority: Minor


When running "Quick Dev", I have noticed that all of the archived telemetry 
files in HDFS contain null in the name.

{code}
[root@node1 0.3.0]# hdfs dfs -ls /apps/metron/indexing/indexed/*
Found 2 items
-rw-r--r--   1 storm hadoop 644753 2017-01-19 17:47 
/apps/metron/indexing/indexed/bro/enrichment-null-0-0-1484847868551.json
-rw-r--r--   1 storm hadoop   14107767 2017-01-19 18:46 
/apps/metron/indexing/indexed/bro/enrichment-null-0-0-1484848728527.json
Found 5 items
-rwxrwxrwx   1 storm hadoop 205699 2017-01-16 21:57 
/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484603710250.json
-rwxrwxrwx   1 storm hadoop  5773871 2017-01-17 14:34 
/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484603925156.json
-rwxrwxrwx   1 storm hadoop 253870 2017-01-17 13:43 
/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484660437793.json
-rwxrwxrwx   1 storm hadoop   24023035 2017-01-17 19:45 
/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484660672723.json
-rwxrwxrwx   1 storm hadoop  2063857 2017-01-17 19:02 
/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1484679265343.json
Found 147 items
-rwxrwxrwx   1 storm hadoop   18199681 2017-01-18 16:35 
/apps/metron/indexing/indexed/yaf/enrichment-null-0-0-1484752251563.json
-rwxrwxrwx   1 storm hadoop 216895 2017-01-19 17:47 
/apps/metron/indexing/indexed/yaf/enrichment-null-0-0-1484846918122.json
{code}





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838512#comment-15838512
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97868280
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/adapters/geo/GeoLiteDatabase.java
 ---
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.adapters.geo;
+
+import com.maxmind.db.CHMCache;
+import com.maxmind.geoip2.DatabaseReader;
+import com.maxmind.geoip2.exception.GeoIp2Exception;
+import com.maxmind.geoip2.model.CityResponse;
+import com.maxmind.geoip2.record.City;
+import com.maxmind.geoip2.record.Country;
+import com.maxmind.geoip2.record.Location;
+import com.maxmind.geoip2.record.Postal;
+import org.apache.commons.validator.routines.InetAddressValidator;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+import java.util.zip.GZIPInputStream;
+
+public enum GeoLiteDatabase {
+  INSTANCE;
+
+  protected static final Logger LOG = 
LoggerFactory.getLogger(GeoLiteDatabase.class);
+  public static final String GEO_HDFS_FILE = "geo.hdfs.file";
+  public static final String GEO_HDFS_FILE_DEFAULT = 
"/apps/metron/geo/default/GeoLite2-City.mmdb.gz";
+
+  private static ReentrantReadWriteLock lock = new 
ReentrantReadWriteLock();
+  private static final Lock readLock = lock.readLock();
+  private static final Lock writeLock = lock.writeLock();
+  private static InetAddressValidator ipvalidator = new 
InetAddressValidator();
+  private static volatile String hdfsLoc = GEO_HDFS_FILE_DEFAULT;
+  private static DatabaseReader reader = null;
+
+  public synchronized void updateIfNecessary(Map 
globalConfig) {
+// Reload database if necessary (file changes on HDFS)
+LOG.trace("[Metron] Determining if GeoIpDatabase update required");
+String hdfsFile = GEO_HDFS_FILE_DEFAULT;
+if (globalConfig != null) {
+  hdfsFile = (String) globalConfig.getOrDefault(GEO_HDFS_FILE, 
GEO_HDFS_FILE_DEFAULT);
+}
+
+// Always update if we don't have a DatabaseReader
+if (reader == null || !hdfsLoc.equals(hdfsFile)) {
--- End diff --

Does this null check need to be protected by a lock?  I am really not sure.
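
For what it's worth, the method in the diff is `synchronized`, so two callers 
of that one method cannot interleave; whether other readers of `reader` are 
covered depends on code not shown here.  One conservative pattern, sketched 
below with a hypothetical `GuardedReload` class (not the PR's code), is to do 
the check while holding the write lock, so nothing can change the state 
between the check and the assignment.  The second path in `main` is made up.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of a check-then-act guarded by a write lock.
final class GuardedReload {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private String loadedFrom; // null until the first load; guarded by lock
  private int reloads;       // guarded by lock

  void updateIfNecessary(String requested) {
    lock.writeLock().lock();
    try {
      // Both the null check and the assignment happen under the write lock,
      // so no reader or writer can observe or change a half-updated state.
      if (loadedFrom == null || !loadedFrom.equals(requested)) {
        loadedFrom = requested;
        reloads++;
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  String loadedFrom() {
    lock.readLock().lock();
    try {
      return loadedFrom;
    } finally {
      lock.readLock().unlock();
    }
  }

  int reloads() {
    lock.readLock().lock();
    try {
      return reloads;
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    GuardedReload g = new GuardedReload();
    g.updateIfNecessary("/apps/metron/geo/default/GeoLite2-City.mmdb.gz");
    g.updateIfNecessary("/apps/metron/geo/default/GeoLite2-City.mmdb.gz"); // no-op
    g.updateIfNecessary("/tmp/other.mmdb.gz"); // hypothetical second location
    System.out.println("reloads = " + g.reloads());
  }
}
```

The trade-off is that every call takes the write lock, even when nothing 
changes, which briefly blocks readers.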




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838281#comment-15838281
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97842488
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctions.java
 ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.stellar;
+
+import org.apache.log4j.Logger;
+import org.apache.metron.common.dsl.Context;
+import org.apache.metron.common.dsl.ParseException;
+import org.apache.metron.common.dsl.Stellar;
+import org.apache.metron.common.dsl.StellarFunction;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+public class GeoEnrichmentFunctions {
+  private static final Logger LOG = 
Logger.getLogger(GeoEnrichmentFunctions.class);
+
+  @Stellar(name="GET"
+  ,namespace="GEO"
+  ,description="Look up an IPV4 address and returns geographic 
information about it"
+  ,params = {
+  "ip - The IPV4 address to lookup" +
+  "fields - Optional list of GeoIP fields to grab. 
Options are locID, country, city, postalCode, dmaCode, latitude, longitude, 
location_point"
+}
+  ,returns = "If a Single field is requested a string of the 
field, If multiple fields a map of string of the fields, and null otherwise"
+  )
+  public static class GeoGet implements StellarFunction {
+boolean initialized = false;
+
+@Override
+public Object apply(List args, Context context) throws 
ParseException {
+  if(!initialized) {
+return null;
+  }
+  if(args.size() > 2) {
+return null;
--- End diff --

Now throwing an IllegalArgumentException and adjusted the test.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838246#comment-15838246
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user dlyle65535 commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97839650
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

I think that's where I'm at with it. How about we let @justinleet's mentor 
thread confirm/deny that it's okay to keep it and then handle it as a directed 
effort after?




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838249#comment-15838249
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user cestella commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97839783
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

sounds good :)




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838239#comment-15838239
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user dlyle65535 commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97838057
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

But so we're clear: with the changelog in place you can interrogate the rpm 
directly to find out what's changed with `rpm -q --changelog`. We'd lose that 
capability. I'm still good with omitting it, but I wanted to make sure everyone 
was aware of that consequence.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838243#comment-15838243
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user cestella commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97838709
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

Hmm, you're right @dlyle65535, we do lose some capabilities by not having 
the changelog in there.  You know what would be cool: generating the spec 
files as part of Maven's `generate-sources` phase and filling in the changelog 
with the information from git.  Anyway, best of both worlds kinda thing, but 
not effort that should be done as part of this PR.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838233#comment-15838233
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97837151
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

I sent out the email before I saw this, but I'm +1 on just dropping it.  
I'll pull it in this PR, and whatever answer we get from the mentors will just 
be for our info.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838231#comment-15838231
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user cestella commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97836741
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

+1 to that @dlyle65535 




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838230#comment-15838230
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user dlyle65535 commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97836561
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

I'd just yank that section or leave it blank if it's required. No point in 
figuring out how to keep it if we're agreed we don't need it.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838189#comment-15838189
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user cestella commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97832200
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

It is true that Apache strongly recommends against (or maybe outright forbids) 
things like author tags in source code (see 
[here](https://mail-archives.apache.org/mod_mbox/www-community/200306.mbox/%3c20030609234538.ga22...@lyra.org%3E)
 for a discussion); the reasoning was mostly that author tags are not 
accurately representative.

For this situation, however, this is a changelog, so it doesn't have the 
problem of accurate representation that author tags do.  That being said, it is 
redundant information because such information is stored in git.

I'd recommend doing a dev list discussion with a subject that starts with 
[MENTORS] on this to see what the ASF wants.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838165#comment-15838165
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user dlyle65535 commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97830250
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/GenericEnrichmentBolt.java
 ---
@@ -149,9 +154,10 @@ public JSONObject load(CacheKey key) throws Exception {
 cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
 .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
 .build(loader);
+
GeoLiteDatabase.INSTANCE.update((String)getConfigurations().getGlobalConfig().get(GeoLiteDatabase.GEO_HDFS_FILE));
 boolean success = adapter.initializeAdapter();
--- End diff --

@justinleet - I think that's a great solution, thanks!




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838162#comment-15838162
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97829967
  
--- Diff: metron-platform/metron-data-management/README.md ---
@@ -250,3 +250,18 @@ The parameters for the utility are as follows:
 | -l | --log4j | No | The log4j properties file to load |
 | -n | --enrichment_config | No | The JSON document describing the enrichments to configure.  Unlike other loaders, this is run first if specified. |
 
+### GeoLite2 Loader
+
+The shell script `$METRON_HOME/bin/geo_enrichment_load.sh` will retrieve 
MaxMind GeoLite2 data and load data into HDFS, and update the configuration.
+
+THIS SCRIPT WILL NOT UPDATE AMBARI'S GLOBAL.JSON, JUST THE ZK CONFIGS.  
CHANGES WILL GO INTO EFFECT, BUT WILL NOT PERSIST PAST AN AMBARI RESTART UNTIL 
UPDATED THERE.
+
--- End diff --

This gets into the whole "How do we manage configs?" discussion.  
Unfortunately, it's in a really awkward spot.  I might be able to add a service 
action to sorta take care of it, but it still probably does the same end around 
of Ambari's management, except it's going through the UI. I don't know that 
there's a good solution to this until we unify our config management, which is 
the real answer here.

Given that I don't think there was originally support for updating the db, 
I'm inclined to clean it up as part of unifying config management.  It's ugly 
and I don't like it, but I can't come up with a good, short-term alternative 
that isn't ugly for other reasons anyway.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838150#comment-15838150
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97828446
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

I'm not even sure who we should talk to about how it should be handled.  
I'm more than happy to adjust the changelog as needed, but I need to know how 
to change it first.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838129#comment-15838129
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97826728
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/adapters/geo/GeoLiteDatabase.java
 ---
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.adapters.geo;
+
+import com.maxmind.db.CHMCache;
+import com.maxmind.geoip2.DatabaseReader;
+import com.maxmind.geoip2.exception.GeoIp2Exception;
+import com.maxmind.geoip2.model.CityResponse;
+import com.maxmind.geoip2.record.City;
+import com.maxmind.geoip2.record.Country;
+import com.maxmind.geoip2.record.Location;
+import com.maxmind.geoip2.record.Postal;
+import org.apache.commons.validator.routines.InetAddressValidator;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+import java.util.zip.GZIPInputStream;
+
+public enum GeoLiteDatabase {
+  INSTANCE;
--- End diff --

It's primarily because the database file could be used and accessed by 
multiple threads of execution, depending on Storm's parallelism. If multiple 
tasks are running, they could potentially start stomping on each other's data.

The burden shouldn't be bad, because it primarily consists of reading.  The 
alternative is having to manage potentially every thread pulling down its own 
version of the file and loading it while avoiding tripping up other threads.

Having said that, I am definitely open to other ideas if we have good 
alternatives.
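
For illustration, the single-instance-plus-lock approach described above might look roughly like the sketch below. This is not the code from the PR: `SharedLookup` and its `Map`-backed store are hypothetical stand-ins for the real database reader, and only the enum singleton and the read/write lock mirror what the diff shows.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a shared lookup database; not the Metron class.
enum SharedLookup {
  INSTANCE;

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Lock readLock = lock.readLock();
  private final Lock writeLock = lock.writeLock();
  private Map<String, String> data = new HashMap<>();

  // Many threads may read concurrently; they only wait while a swap is in progress.
  public Optional<String> get(String key) {
    readLock.lock();
    try {
      return Optional.ofNullable(data.get(key));
    } finally {
      readLock.unlock();
    }
  }

  // The writer builds the new copy first, then swaps it in under the write lock.
  public void update(Map<String, String> fresh) {
    Map<String, String> copy = new HashMap<>(fresh);
    writeLock.lock();
    try {
      data = copy;
    } finally {
      writeLock.unlock();
    }
  }
}
```

Because the expensive work (building the copy) happens before the write lock is taken, readers are blocked only for the duration of the reference swap.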


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL-on-HBase library, or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform.





[jira] [Commented] (METRON-600) Fix Metron Website

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838121#comment-15838121
 ] 

ASF GitHub Bot commented on METRON-600:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-metron/pull/399


> Fix Metron Website
> --
>
> Key: METRON-600
> URL: https://issues.apache.org/jira/browse/METRON-600
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Ryan Merriman
>
> h3. Issue 1
> Podling web sites MUST include a clear disclaimer on their website and in all 
> documentation (including releases) stating that they are in incubation. 
> Podlings SHOULD use the following text for all disclaimers (replace the 
> underlined phrases as appropriate):
> Apache Podling-Name is an effort undergoing incubation at The Apache Software 
> Foundation (ASF), sponsored by the name of Apache TLP sponsor. Incubation is 
> required of all newly accepted projects until a further review indicates that 
> the infrastructure, communications, and decision making process have 
> stabilized in a manner consistent with other successful ASF projects. While 
> incubation status is not necessarily a reflection of the completeness or 
> stability of the code, it does indicate that the project has yet to be fully 
> endorsed by the ASF.
> h3. Issue 2
> Podling websites SHOULD contain the Apache Incubator Project logo as a sign of 
> affiliation.
> Apache Project Web Sites typically include several standard pages. Each page 
> is formatted with a navigation bar on the left and a project standard header 
> that includes the Incubator graphic.
> [We need to make the logo more prominent and move it towards the top of the 
> page rather than having it at the bottom as we do now]
> h3. Issue 3
> The sources for every podling site should be maintained in the 
> podling's site SVN or git directory
> h3. Issue 4
> Previous statement of the problem:
> bq. A downloads page needs to be created with links per release.  The link to 
> the artifact needs to be using the mirror site for apache.  For example, the 
> 0.3.0 release would be 
> http://www.apache.org/dyn/closer.lua/incubator/metron/0.3.0/apache-metron-0.3.0-incubating.tar.gz.
>   The MD5, SHA and Signature can be from the apache release site.  Look at 
> the storm page as an example: http://storm.apache.org/downloads.html  (Note 
> https://maven.apache.org/download.cgi is a better example because it allows 
> changing the mirror if needed.)
> A separate Downloads page is needed, rather than the current button, in order 
> to satisfy these three requirements of 
> http://www.apache.org/dev/release-download-pages.html :
> * All links to the downloadable distribution artifacts MUST NOT reference the 
> main Apache dist web site (dist.apache.org). Instead, they should use the 
> standard scripting mechanisms to distribute the load between the mirror sites 
> (see http://www.apache.org/dyn/closer.cgi/ ).
> * All links to checksums, detached signatures and public keys MUST reference 
> the main Apache dist web site (dist.apache.org), and use https (SSL).
> * The site SHOULD provide clear and easy links to the public keys, checksums 
> and detached signatures from the download release page, and include a 
> reminder text with links to more information for users.
> Detailed discussion and examples are provided in [Comment 
> 15723335|https://issues.apache.org/jira/browse/METRON-600?focusedCommentId=15723335]
>  below.
> h3. Issue 5
> [Let's try to conform as much as possible to the following suggested template]
> Project Home Page: the primary entry point to the site; contains project 
> description, news, invitation to join the project.
> [We have this, great]
> License Page: usually, the Apache License 2.0
> [We don't have this, we should probably put it under the about page]
> Downloads: many projects in incubation will release code, and this page 
> describes them and has links to the download pages that redirect to Apache 
> Mirror sites.
> [We have this, great]
> Documentation: this page describes the project documentation, including 
> javadoc for Java projects; guides, tutorials, and links to external 
> documentation.
> [We should probably just link to the wiki so we don't have to maintain this 
> in two places]
> Committers: a list of current committers on the project.
> [We need to update this from our status page that can be found here.  Need to 
> make sure both are consistent.
> http://incubator.apache.org/projects/metron.html
> ]
> Mailing Lists: there are several mailing lists that the community might be 
> interested in, and this page contains mailto: links that allow easy 
> subscription (and unsubscription) to any of them.
> [We should probably put this under our community page and also link to the 
> apache status 

[jira] [Commented] (METRON-600) Fix Metron Website

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838114#comment-15838114
 ] 

ASF GitHub Bot commented on METRON-600:
---

Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/399
  
+1 by inspection



[jira] [Commented] (METRON-666) Fix javadoc doclint errors

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838110#comment-15838110
 ] 

ASF GitHub Bot commented on METRON-666:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-metron/pull/418


> Fix javadoc doclint errors
> --
>
> Key: METRON-666
> URL: https://issues.apache.org/jira/browse/METRON-666
> Project: Metron
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Matt Foley
>Assignee: Matt Foley
>
> Java 8 includes "doclint" as part of javadoc.  As a result, running javadoc 
> on the current code base has fatal errors, mostly (not all) related to use of 
> "<p/>" (not allowed ever), or unmatched "</p>" (not preceded by a matching 
> "<p>").  It is, however, happy with unmatched "<p>", so that's the thing to 
> use for paragraph separators.  Put it on the same line as the next line of 
> text to avoid a warning about "empty <p>".
> There are other errors fixed here too.
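
As a rough illustration of the comment style the issue recommends, assuming the paragraph tag in question is `<p>`: the sketch below uses an unclosed `<p>` placed on the same line as the text that follows it. `PortRange` is a hypothetical class invented for this example and is not part of Metron.

```java
// Hypothetical helper class, used only to show the comment style.
final class PortRange {
  private PortRange() {}

  /**
   * Checks whether a port number is in the valid TCP/UDP range.
   * <p>The paragraph tag is left unclosed and shares a line with the text
   * that follows it, so there is neither an unmatched closing tag nor
   * an empty paragraph for doclint to complain about.
   *
   * @param port the port number to check
   * @return true if the port is between 0 and 65535 inclusive
   */
  static boolean isValid(int port) {
    return port >= 0 && port <= 65535;
  }
}
```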





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838105#comment-15838105
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97824391
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/GenericEnrichmentBolt.java
 ---
@@ -149,9 +154,10 @@ public JSONObject load(CacheKey key) throws Exception {
 cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
 .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
 .build(loader);
+GeoLiteDatabase.INSTANCE.update((String)getConfigurations().getGlobalConfig().get(GeoLiteDatabase.GEO_HDFS_FILE));
 boolean success = adapter.initializeAdapter();
--- End diff --

You're right that if we adjust the enrichment adapter to accept 
configuration values, this can be pushed to the GeoAdapter, where I agree it 
makes more sense.

Are there concerns over changing the interface method directly, or would we 
prefer to give `EnrichmentAdapter` an `initializeAdapter(Map config)` with a 
default implementation that calls `initializeAdapter()` with no args?  The 
second option makes it a bit ugly that `GeoAdapter` still has the original 
`initializeAdapter()`, but it has the benefit of keeping the interface backwards 
compatible for anybody with custom adapters (is this a thing that people have?).
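
To make the second option concrete, here is a minimal sketch of a default-method overload. The `Adapter` interface and both implementing classes are simplified, hypothetical stand-ins rather than Metron's actual `EnrichmentAdapter` or `GeoAdapter`, and the `Map<String, Object>` parameter type is an assumption; only the `geo.hdfs.file` key comes from the diff.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical, simplified stand-in for an adapter interface.
interface Adapter {
  // Original no-arg method that existing custom adapters already implement.
  boolean initializeAdapter();

  // New overload; the default delegates to the no-arg method,
  // so existing implementations keep compiling unchanged.
  default boolean initializeAdapter(Map<String, Object> config) {
    return initializeAdapter();
  }
}

// An older adapter that knows nothing about configuration.
class LegacyAdapter implements Adapter {
  @Override
  public boolean initializeAdapter() {
    return true;
  }
}

// A config-aware adapter overrides the new overload and keeps the old one as a fallback.
class ConfiguredAdapter implements Adapter {
  String dbPath;

  @Override
  public boolean initializeAdapter() {
    return initializeAdapter(Collections.emptyMap());
  }

  @Override
  public boolean initializeAdapter(Map<String, Object> config) {
    Object path = config.get("geo.hdfs.file");
    dbPath = path == null ? null : path.toString();
    // Initialization fails when no database location was configured.
    return dbPath != null;
  }
}
```

The cost noted above is visible in `ConfiguredAdapter`: it still has to carry the no-arg method even though only the configured overload is meaningful for it.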

And you are correct that there is an issue with initializing a missing DB 
(Great catch!). If we make this change, I believe the only time you'd have an 
issue is if you set up a geo enrichment and didn't actually set up the DB, 
which is an entirely reasonable case to raise an error for.

@dlyle65535 Would this address your concerns about having the init in the 
GenericEnrichmentBolt in a satisfactory manner?




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838084#comment-15838084
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user cestella commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97821590
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctions.java
 ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.stellar;
+
+import org.apache.log4j.Logger;
+import org.apache.metron.common.dsl.Context;
+import org.apache.metron.common.dsl.ParseException;
+import org.apache.metron.common.dsl.Stellar;
+import org.apache.metron.common.dsl.StellarFunction;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+public class GeoEnrichmentFunctions {
+  private static final Logger LOG = Logger.getLogger(GeoEnrichmentFunctions.class);
+
+  @Stellar(name="GET"
+  ,namespace="GEO"
+  ,description="Look up an IPV4 address and returns geographic information about it"
+  ,params = {
+  "ip - The IPV4 address to lookup" +
+  "fields - Optional list of GeoIP fields to grab. Options are locID, country, city, postalCode, dmaCode, latitude, longitude, location_point"
+}
+  ,returns = "If a Single field is requested a string of the field, If multiple fields a map of string of the fields, and null otherwise"
+  )
+  public static class GeoGet implements StellarFunction {
+boolean initialized = false;
+
+@Override
+public Object apply(List args, Context context) throws ParseException {
+  if(!initialized) {
+return null;
+  }
+  if(args.size() > 2) {
+return null;
--- End diff --

So it is true that we tend not to throw exceptions on some errors in 
Stellar functions, for reasons that may or may not be valid.  The original 
intent was to not stop topologies because an error has happened in a specific 
function.  I happen to think that this isn't correct reasoning anymore, but 
that's a discussion for another thread.

In this specific case, where the user has passed in too many arguments, I 
think you could make a reasonable argument for a few different semantics:
* do nothing, passing in too many arguments does no harm
* throw an exception as Nick suggests
* return null

I actually think I agree with Nick in this case: I'd rather err on the side 
of caution and throw an exception here so we don't have people using functions 
wrong and not realizing it.
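
A minimal sketch of the "throw an exception" semantics, for comparison with the `return null` in the diff. `LookupFunction` is a hypothetical function, not Metron's `GeoGet`, and `IllegalArgumentException` is just one plausible choice of exception type.

```java
import java.util.List;

// Hypothetical function; it only illustrates the argument-count check.
final class LookupFunction {
  private LookupFunction() {}

  static Object apply(List<Object> args) {
    // Fail loudly on misuse instead of silently returning null.
    if (args.isEmpty() || args.size() > 2) {
      throw new IllegalArgumentException(
          "Expected 1 or 2 arguments (ip, optional fields) but got " + args.size());
    }
    Object ip = args.get(0);
    // A null input is treated as "no data" rather than as a usage error.
    return ip == null ? null : "looked up " + ip;
  }
}
```

With this shape, a null result always means "nothing found for this input", while a wrong call signature surfaces immediately to whoever wrote the expression.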




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838058#comment-15838058
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97819095
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/adapters/geo/GeoLiteDatabase.java
 ---
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.adapters.geo;
+
+import com.maxmind.db.CHMCache;
+import com.maxmind.geoip2.DatabaseReader;
+import com.maxmind.geoip2.exception.GeoIp2Exception;
+import com.maxmind.geoip2.model.CityResponse;
+import com.maxmind.geoip2.record.City;
+import com.maxmind.geoip2.record.Country;
+import com.maxmind.geoip2.record.Location;
+import com.maxmind.geoip2.record.Postal;
+import org.apache.commons.validator.routines.InetAddressValidator;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+import java.util.zip.GZIPInputStream;
+
+public enum GeoLiteDatabase {
+  INSTANCE;
+
+  protected static final Logger LOG = LoggerFactory.getLogger(GeoLiteDatabase.class);
+  public static final String GEO_HDFS_FILE = "geo.hdfs.file";
+  public static final String GEO_HDFS_FILE_DEFAULT = "/apps/metron/geo/default/GeoLite2-City.mmdb.gz";
+
+  private static ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
+  private static final Lock readLock = lock.readLock();
+  private static final Lock writeLock = lock.writeLock();
+  private static InetAddressValidator ipvalidator = new InetAddressValidator();
+  private static volatile String hdfsLoc = GEO_HDFS_FILE_DEFAULT;
+  private static DatabaseReader reader = null;
+
+  public synchronized void updateIfNecessary(Map globalConfig) {
+// Reload database if necessary (file changes on HDFS)
+LOG.trace("[Metron] Determining if GeoIpDatabase update required");
+String hdfsFile = GEO_HDFS_FILE_DEFAULT;
+if (globalConfig != null) {
+  hdfsFile = (String) globalConfig.getOrDefault(GEO_HDFS_FILE, GEO_HDFS_FILE_DEFAULT);
+}
+
+// Always update if we don't have a DatabaseReader
+if (reader == null || !hdfsLoc.equals(hdfsFile)) {
+  // Update
+  hdfsLoc = hdfsFile;
+  update(hdfsFile);
+} else {
+  LOG.trace("[Metron] Update to GeoIpDatabase unnecessary");
+}
+  }
+
+  @SuppressWarnings("unchecked")
+  public void update(String hdfsFile) {
+// If nothing is set (or it's been unset, use the defaults)
+if (hdfsFile == null || hdfsFile.isEmpty()) {
+  LOG.debug("[Metron] Using default for {}: {}", GEO_HDFS_FILE, GEO_HDFS_FILE_DEFAULT);
+  hdfsFile = GEO_HDFS_FILE_DEFAULT;
+}
+
+FileSystem fs;
+try {
+  fs = FileSystem.get(new Configuration());
+} catch (IOException e) {
+  LOG.error("[Metron] Unable to retrieve get HDFS FileSystem");
+  throw new IllegalStateException("[Metron] Unable to get HDFS FileSystem");
+}
+
+try (GZIPInputStream gis = new GZIPInputStream(fs.open(new Path(hdfsFile)))) {
+  writeLock.lock();
+  LOG.info("[Metron] Update to GeoIP data started with {}", hdfsFile);
+  // InputStream based DatabaseReaders are always in memory.
+  DatabaseReader newReader = new 

[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838045#comment-15838045
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user cestella commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97817739
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctions.java
 ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.stellar;
+
+import org.apache.log4j.Logger;
+import org.apache.metron.common.dsl.Context;
+import org.apache.metron.common.dsl.ParseException;
+import org.apache.metron.common.dsl.Stellar;
+import org.apache.metron.common.dsl.StellarFunction;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+public class GeoEnrichmentFunctions {
+  private static final Logger LOG = Logger.getLogger(GeoEnrichmentFunctions.class);
+
+  @Stellar(name="GET"
+  ,namespace="GEO"
+  ,description="Look up an IPV4 address and returns geographic information about it"
+  ,params = {
+  "ip - The IPV4 address to lookup" +
+  "fields - Optional list of GeoIP fields to grab. Options are locID, country, city, postalCode, dmaCode, latitude, longitude, location_point"
+}
+  ,returns = "If a Single field is requested a string of the field, If multiple fields a map of string of the fields, and null otherwise"
+  )
--- End diff --

Outside of the scope of this PR, I agree, but I just wanted to chime in and 
agree with the sentiment.  Now that we have geo IP enrichment, honestly, 
stellar enrichments aren't missing any features that the other enrichments 
have.  Worthy of a discussion, for sure.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838044#comment-15838044
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user cestella commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97818256
  
--- Diff: metron-platform/metron-data-management/README.md ---
@@ -250,3 +250,18 @@ The parameters for the utility are as follows:
 | -l | --log4j | No | The log4j properties file to load |
 | -n | --enrichment_config | No | The JSON document describing the enrichments to configure.  Unlike other loaders, this is run first if specified. |
 
+### GeoLite2 Loader
+
+The shell script `$METRON_HOME/bin/geo_enrichment_load.sh` will retrieve MaxMind GeoLite2 data and load data into HDFS, and update the configuration.
+
+THIS SCRIPT WILL NOT UPDATE AMBARI'S GLOBAL.JSON, JUST THE ZK CONFIGS.  CHANGES WILL GO INTO EFFECT, BUT WILL NOT PERSIST PAST AN AMBARI RESTART UNTIL UPDATED THERE.
+
--- End diff --

It's annoying, for sure.  It's a persistent problem and one of the reasons 
for that discuss thread about how to handle configurations.  I believe the 
conclusion arrived at there is that, in the future state, we should push changes 
through Ambari, which will then update ZooKeeper, thereby avoiding this 
unfortunate state.




[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838042#comment-15838042
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97818189
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctions.java
 ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.stellar;
+
+import org.apache.log4j.Logger;
+import org.apache.metron.common.dsl.Context;
+import org.apache.metron.common.dsl.ParseException;
+import org.apache.metron.common.dsl.Stellar;
+import org.apache.metron.common.dsl.StellarFunction;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+public class GeoEnrichmentFunctions {
+  private static final Logger LOG = Logger.getLogger(GeoEnrichmentFunctions.class);
+
+  @Stellar(name="GET"
+  ,namespace="GEO"
+  ,description="Look up an IPV4 address and returns geographic information about it"
+  ,params = {
+  "ip - The IPV4 address to lookup" +
+  "fields - Optional list of GeoIP fields to grab. Options are locID, country, city, postalCode, dmaCode, latitude, longitude, location_point"
+}
+  ,returns = "If a Single field is requested a string of the field, If multiple fields a map of string of the fields, and null otherwise"
+  )
+  public static class GeoGet implements StellarFunction {
+boolean initialized = false;
+
+@Override
+public Object apply(List args, Context context) throws ParseException {
+  if(!initialized) {
+return null;
+  }
+  if(args.size() > 2) {
+return null;
--- End diff --

Is there a well-defined expectation for what happens in cases like this?  I 
don't think I've seen anything, and on reflection I'm pretty sure I just based 
that on what other functions did.

@cestella Can you shed some light on what perceptions/expectations exist 
for this case?  Or shame my reading comprehension by pointing me to where we 
have them documented, if we do?


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838040#comment-15838040
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user cestella commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97817234
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/adapters/geo/GeoLiteDatabase.java
 ---
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.adapters.geo;
+
+import com.maxmind.db.CHMCache;
+import com.maxmind.geoip2.DatabaseReader;
+import com.maxmind.geoip2.exception.GeoIp2Exception;
+import com.maxmind.geoip2.model.CityResponse;
+import com.maxmind.geoip2.record.City;
+import com.maxmind.geoip2.record.Country;
+import com.maxmind.geoip2.record.Location;
+import com.maxmind.geoip2.record.Postal;
+import org.apache.commons.validator.routines.InetAddressValidator;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+import java.util.zip.GZIPInputStream;
+
+public enum GeoLiteDatabase {
+  INSTANCE;
--- End diff --

So, I'll defend the singleton use:
* We can reuse the same database across multiple storm tasks on the same 
storm worker
* Even without a singleton, you'd have to deal with the locking in the 
stellar function since those are effectively static.  i.e. You're reading a 
database that could be being updated/replaced.

I'll let Justin respond with his reasons, but those are the ones that came 
to my head.
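
The two points above can be sketched roughly as follows. This is only an illustration, assuming a hypothetical `SharedDatabase` enum holding a plain map; it is not the PR's `GeoLiteDatabase`. Lookups take the read lock so many tasks in one worker can read concurrently, while a replacement takes the write lock so no reader sees a half-swapped database.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a shared lookup database; not the PR's GeoLiteDatabase.
enum SharedDatabase {
  INSTANCE;

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Lock readLock = lock.readLock();
  private final Lock writeLock = lock.writeLock();
  private Map<String, String> data = Collections.emptyMap();

  // Swap in a new copy of the data; waits until in-flight reads have finished.
  public void update(Map<String, String> newData) {
    Map<String, String> copy = new HashMap<>(newData);
    writeLock.lock();
    try {
      data = copy;
    } finally {
      writeLock.unlock();
    }
  }

  // Any number of readers may hold the read lock at the same time.
  public Optional<String> get(String key) {
    readLock.lock();
    try {
      return Optional.ofNullable(data.get(key));
    } finally {
      readLock.unlock();
    }
  }
}
```

Because the enum constant is created once per JVM, every caller in the same worker process shares the one in-memory copy.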


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838036#comment-15838036
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97816667
  
--- Diff: metron-deployment/roles/snort/files/snort.conf ---
@@ -586,7 +586,6 @@ include $RULE_PATH/community.rules
 # include $RULE_PATH/malware-tools.rules
 # include $RULE_PATH/misc.rules
 # include $RULE_PATH/multimedia.rules
-# include $RULE_PATH/mysql.rules
--- End diff --

Good catch. I'll add it back in.


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838013#comment-15838013
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97812946
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctions.java
 ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.stellar;
+
+import org.apache.log4j.Logger;
+import org.apache.metron.common.dsl.Context;
+import org.apache.metron.common.dsl.ParseException;
+import org.apache.metron.common.dsl.Stellar;
+import org.apache.metron.common.dsl.StellarFunction;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+public class GeoEnrichmentFunctions {
+  private static final Logger LOG = 
Logger.getLogger(GeoEnrichmentFunctions.class);
+
+  @Stellar(name="GET"
+  ,namespace="GEO"
+  ,description="Look up an IPV4 address and returns geographic 
information about it"
+  ,params = {
+  "ip - The IPV4 address to lookup" +
+  "fields - Optional list of GeoIP fields to grab. 
Options are locID, country, city, postalCode, dmaCode, latitude, longitude, 
location_point"
+}
+  ,returns = "If a Single field is requested a string of the 
field, If multiple fields a map of string of the fields, and null otherwise"
+  )
--- End diff --

Seems useful.  Would we ever deprecate the current way of doing enrichment 
in favor of just using Stellar and a function like this?  May take some more 
work, but that might simplify things.

(Comment for future state; obviously outside the scope of your PR)


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838018#comment-15838018
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97800043
  
--- Diff: metron-platform/metron-data-management/README.md ---
@@ -250,3 +250,18 @@ The parameters for the utility are as follows:
 | -l | --log4j | No   | The log4j properties file to load |
 | -n | --enrichment_config | No   | The JSON document describing the enrichments to configure.  Unlike other loaders, this is run first if specified. |
 
+### GeoLite2 Loader
+
+The shell script `$METRON_HOME/bin/geo_enrichment_load.sh` will retrieve 
MaxMind GeoLite2 data and load data into HDFS, and update the configuration.
+
+THIS SCRIPT WILL NOT UPDATE AMBARI'S GLOBAL.JSON, JUST THE ZK CONFIGS.  
CHANGES WILL GO INTO EFFECT, BUT WILL NOT PERSIST PAST AN AMBARI RESTART UNTIL 
UPDATED THERE.
+
--- End diff --

This seems like something that could cause a really annoying problem for a 
user. Is there no simple action we can take in this PR to stop Ambari from 
clobbering things?

Obviously, deploying/updating the geo database in an Ambari "service 
action" would be ideal. Would be nice to have one way to do it that does not 
break on restart.

Or maybe we need to accept this for now and once we have Ansible 
deployments using the MPack, we can implement an Ambari "service action" for 
this?


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838014#comment-15838014
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97813156
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/adapters/geo/GeoLiteDatabase.java
 ---
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.adapters.geo;
+
+import com.maxmind.db.CHMCache;
+import com.maxmind.geoip2.DatabaseReader;
+import com.maxmind.geoip2.exception.GeoIp2Exception;
+import com.maxmind.geoip2.model.CityResponse;
+import com.maxmind.geoip2.record.City;
+import com.maxmind.geoip2.record.Country;
+import com.maxmind.geoip2.record.Location;
+import com.maxmind.geoip2.record.Postal;
+import org.apache.commons.validator.routines.InetAddressValidator;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+import java.util.zip.GZIPInputStream;
+
+public enum GeoLiteDatabase {
+  INSTANCE;
--- End diff --

Why did you go with a singleton?  I'm sure you thought through this and 
there are good reasons, so just want to understand.  I just don't like the 
extra burden that using a singleton brings on (like the extra locking).


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838016#comment-15838016
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97810219
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/adapters/geo/GeoLiteDatabase.java
 ---
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.adapters.geo;
+
+import com.maxmind.db.CHMCache;
+import com.maxmind.geoip2.DatabaseReader;
+import com.maxmind.geoip2.exception.GeoIp2Exception;
+import com.maxmind.geoip2.model.CityResponse;
+import com.maxmind.geoip2.record.City;
+import com.maxmind.geoip2.record.Country;
+import com.maxmind.geoip2.record.Location;
+import com.maxmind.geoip2.record.Postal;
+import org.apache.commons.validator.routines.InetAddressValidator;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+import java.util.zip.GZIPInputStream;
+
+public enum GeoLiteDatabase {
+  INSTANCE;
+
+  protected static final Logger LOG = 
LoggerFactory.getLogger(GeoLiteDatabase.class);
+  public static final String GEO_HDFS_FILE = "geo.hdfs.file";
+  public static final String GEO_HDFS_FILE_DEFAULT = 
"/apps/metron/geo/default/GeoLite2-City.mmdb.gz";
+
+  private static ReentrantReadWriteLock lock = new 
ReentrantReadWriteLock();
+  private static final Lock readLock = lock.readLock();
+  private static final Lock writeLock = lock.writeLock();
+  private static InetAddressValidator ipvalidator = new 
InetAddressValidator();
+  private static volatile String hdfsLoc = GEO_HDFS_FILE_DEFAULT;
+  private static DatabaseReader reader = null;
+
+  public synchronized void updateIfNecessary(Map 
globalConfig) {
+// Reload database if necessary (file changes on HDFS)
+LOG.trace("[Metron] Determining if GeoIpDatabase update required");
+String hdfsFile = GEO_HDFS_FILE_DEFAULT;
+if (globalConfig != null) {
+  hdfsFile = (String) globalConfig.getOrDefault(GEO_HDFS_FILE, 
GEO_HDFS_FILE_DEFAULT);
+}
+
+// Always update if we don't have a DatabaseReader
+if (reader == null || !hdfsLoc.equals(hdfsFile)) {
+  // Update
+  hdfsLoc = hdfsFile;
+  update(hdfsFile);
+} else {
+  LOG.trace("[Metron] Update to GeoIpDatabase unnecessary");
+}
+  }
+
+  @SuppressWarnings("unchecked")
+  public void update(String hdfsFile) {
+// If nothing is set (or it's been unset, use the defaults)
+if (hdfsFile == null || hdfsFile.isEmpty()) {
+  LOG.debug("[Metron] Using default for {}: {}", GEO_HDFS_FILE, 
GEO_HDFS_FILE_DEFAULT);
+  hdfsFile = GEO_HDFS_FILE_DEFAULT;
+}
+
+FileSystem fs;
+try {
+  fs = FileSystem.get(new Configuration());
+} catch (IOException e) {
+  LOG.error("[Metron] Unable to retrieve get HDFS FileSystem");
+  throw new IllegalStateException("[Metron] Unable to get HDFS 
FileSystem");
+}
+
+try (GZIPInputStream gis = new GZIPInputStream(fs.open(new 
Path(hdfsFile)))) {
+  writeLock.lock();
+  LOG.info("[Metron] Update to GeoIP data started with {}", hdfsFile);
+  // InputStream based DatabaseReaders are always in memory.
+  DatabaseReader newReader = new 

[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838011#comment-15838011
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97796850
  
--- Diff: metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec 
---
@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
 # 
~~
 
 %changelog
+* Thu Jan 19 2017 Justin Leet  - 0.3.0
--- End diff --

Unrelated to this PR, but I thought it was un-Apache to put individuals' 
names and emails in the source code?


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838017#comment-15838017
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97805521
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctions.java
 ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.stellar;
+
+import org.apache.log4j.Logger;
+import org.apache.metron.common.dsl.Context;
+import org.apache.metron.common.dsl.ParseException;
+import org.apache.metron.common.dsl.Stellar;
+import org.apache.metron.common.dsl.StellarFunction;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+public class GeoEnrichmentFunctions {
+  private static final Logger LOG = 
Logger.getLogger(GeoEnrichmentFunctions.class);
+
+  @Stellar(name="GET"
+  ,namespace="GEO"
+  ,description="Look up an IPV4 address and returns geographic 
information about it"
+  ,params = {
+  "ip - The IPV4 address to lookup" +
+  "fields - Optional list of GeoIP fields to grab. 
Options are locID, country, city, postalCode, dmaCode, latitude, longitude, 
location_point"
+}
+  ,returns = "If a Single field is requested a string of the 
field, If multiple fields a map of string of the fields, and null otherwise"
+  )
+  public static class GeoGet implements StellarFunction {
+boolean initialized = false;
+
+@Override
+public Object apply(List args, Context context) throws 
ParseException {
+  if(!initialized) {
+return null;
+  }
+  if(args.size() > 2) {
+return null;
--- End diff --

Should we throw an exception?  If a user misuses the function, then let's 
tell them.
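
As a rough sketch of what "telling them" could look like: the class below is a hypothetical, self-contained illustration that throws a plain `IllegalArgumentException`. It does not use Metron's `StellarFunction` or `ParseException`, and the function name in the message assumes the usual NAMESPACE_NAME form.

```java
import java.util.List;

// Hypothetical illustration only; it does not implement Metron's StellarFunction.
final class ArgCheckSketch {
  private ArgCheckSketch() {}

  // Accepts an IP plus an optional list of fields, i.e. one or two arguments.
  static Object apply(List<?> args) {
    if (args == null || args.isEmpty() || args.size() > 2) {
      int n = (args == null) ? 0 : args.size();
      // Fail loudly instead of returning null, so misuse is visible to the caller.
      throw new IllegalArgumentException(
          "GEO_GET expects 1 or 2 arguments (ip, [fields]) but got " + n);
    }
    return args.get(0); // placeholder for the real lookup
  }
}
```

The trade-off is that a bad call now stops evaluation rather than silently yielding null, which may or may not match what other functions do.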


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838015#comment-15838015
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97798436
  
--- Diff: metron-deployment/roles/snort/files/snort.conf ---
@@ -586,7 +586,6 @@ include $RULE_PATH/community.rules
 # include $RULE_PATH/malware-tools.rules
 # include $RULE_PATH/misc.rules
 # include $RULE_PATH/multimedia.rules
-# include $RULE_PATH/mysql.rules
--- End diff --

I don't think that we actually want to remove the MySQL rules from Snort.  
These rules help Snort create alerts based on MySQL databases that may be 
running in the protected network.


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838012#comment-15838012
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97813018
  
--- Diff: 
metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/GenericEnrichmentBolt.java
 ---
@@ -149,9 +154,10 @@ public JSONObject load(CacheKey key) throws Exception {
 cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
 .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
 .build(loader);
+
GeoLiteDatabase.INSTANCE.update((String)getConfigurations().getGlobalConfig().get(GeoLiteDatabase.GEO_HDFS_FILE));
 boolean success = adapter.initializeAdapter();
--- End diff --

Does this mean that we will attempt to load the geo database into memory in 
every `GenericEnrichmentBolt`, even ones not doing geo-enrichment?  Maybe that 
is one reason we have a singleton for the GeoLiteDatabase: to avoid repeated 
initialization?

Along the same lines, if I choose not to do geo-enrichment, do I still need 
to have a valid MaxMind file in HDFS?  Or would that cause all 
GenericEnrichmentBolts to fail to initialize?

Would it make sense to update the EnrichmentAdapter interface to accept 
configuration values, so that this initialization occurs in the GeoAdapter 
where it makes more sense?  Then only the bolts doing geo-enrichment attempt to 
initialize the geo data?
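
A minimal sketch of that last idea, under the assumption of a hypothetical `ConfigurableAdapter` interface (Metron's actual `EnrichmentAdapter` differs) and a counter standing in for the real database load. Only the `geo.hdfs.file` key comes from the diff above.

```java
import java.util.Map;

// Hypothetical interface; not Metron's EnrichmentAdapter.
interface ConfigurableAdapter {
  boolean initializeAdapter(Map<String, ?> config);
}

// An adapter that needs no external data and never touches the geo database.
final class NoOpAdapter implements ConfigurableAdapter {
  @Override
  public boolean initializeAdapter(Map<String, ?> config) {
    return true;
  }
}

// Only this adapter triggers the (simulated) geo database load.
final class GeoAdapterSketch implements ConfigurableAdapter {
  static int loads = 0; // counts simulated database loads

  @Override
  public boolean initializeAdapter(Map<String, ?> config) {
    Object path = config.get("geo.hdfs.file");
    if (path == null) {
      return false; // nothing configured, so only this adapter reports failure
    }
    loads++; // stand-in for reading the file at the configured path
    return true;
  }
}
```

With this shape, a bolt wrapping `NoOpAdapter` initializes without any geo file present, and only the geo adapter depends on the configured path.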




> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-608) Mpack to install a single-node test cluster

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837895#comment-15837895
 ] 

ASF GitHub Bot commented on METRON-608:
---

Github user dlyle65535 commented on the issue:

https://github.com/apache/incubator-metron/pull/408
  
@mattf-horton - I need an MPack that works for 1-N nodes in order to 
complete [METRON-671](https://issues.apache.org/jira/browse/METRON-671). I 
suspect I'm working through stuff you've already fixed. Would it make sense to 
push out a PR with the unified branch and let me test it? Then I could 
finish METRON-671 on top of your changes.


> Mpack to install a single-node test cluster
> ---
>
> Key: METRON-608
> URL: https://issues.apache.org/jira/browse/METRON-608
> Project: Metron
>  Issue Type: Improvement
>Affects Versions: 0.3.0
> Environment: Linux, Ambari installation
>Reporter: Matt Foley
>Assignee: Matt Foley
> Fix For: Next + 1
>
>
> The current Mpack for Ambari install of Metron fails to correctly install 
> Elasticsearch if restricted to a single-node cluster.  Yet a single-node 
> install of Elasticsearch is certainly feasible, as shown by our quick-dev 
> environment.
> This is a short-term fix by providing a completely separate Mpack just for 
> the single-node scenario.  I'm also opening METRON-609 to enhance the 
> existing Mpack to handle the single-node and small-number-of-nodes scenario, 
> but that one will require much deeper testing and is likely to take a while 
> to complete.





[jira] [Commented] (METRON-270) Add Zeppelin to the platform

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837894#comment-15837894
 ] 

ASF GitHub Bot commented on METRON-270:
---

GitHub user justinleet opened a pull request:

https://github.com/apache/incubator-metron/pull/423

METRON-270: Add Zeppelin to the platform

Adds Zeppelin to the Ambari Management Pack portion.  Adding to the Ansible 
/ quick dev portion is happening a bit in parallel with Nick's work on 
METRON-346.

This ticket existed before the mpack, so if we're not comfortable having 
this just cover the mpack I'd prefer to split this ticket rather than force 
them together.  This isn't packaging up any notebooks, so there's no deviation 
in functionality until notebooks are added.

Essentially, this ties into metron-indexing (because we rely on data 
generated there).  I'm open to adjusting this if anybody feels strongly about 
it, but I think it's currently the best place.  Zeppelin notebook JSON files 
are loaded from metron-platform/metron-indexing/src/main/config/zeppelin/.  
These files can be placed into subdirs for organization if desired (e.g. 
zeppelin/bro/bro.json) and must end with .json.
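
The discovery rule described here (any depth of subdirectory, name must end with .json) could be sketched as below. This is an illustration only: the `NotebookFinder` class is hypothetical, and the PR's own import action may well be written differently and in another language.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not code from the PR.
final class NotebookFinder {
  private NotebookFinder() {}

  // Returns every regular file under root, including subdirectories,
  // whose file name ends with ".json".
  static List<Path> findNotebooks(Path root) throws IOException {
    try (Stream<Path> paths = Files.walk(root)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".json"))
          .sorted()
          .collect(Collectors.toList());
    }
  }
}
```

A misnamed file such as `bro.txt` is simply skipped, which matches the "must end with .json" requirement.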

A custom action is added to the mpack to import these notebooks.  This 
action is available regardless of whether or not Metron itself is running. 
Zeppelin configuration is autopopulated by the management pack.

Zeppelin allows for duplicate notebook names (they'll be given differing 
IDs). I didn't implement a way to track installed notebooks, but this is 
potentially a good future feature (to allow us to delete all installed 
notebooks, etc.).  Once the management pack installs the notebooks, they're 
treated as belonging to Zeppelin entirely and can be managed there.

For testing, I created a few notebooks (valid, invalid, misnamed and in 
subdirs), updated the spec files, recreated the RPMs, and ran this up on a 
pretty constrained local cluster. I was able to see them and run sections of 
them as appropriate.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/justinleet/incubator-metron zeppelin_dashboard

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/423.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #423


commit e5b392cc73d49fa4d9a3420df556e480c8f918af
Author: justinjleet 
Date:   2017-01-23T13:36:24Z

Working initially

commit 1cafd1d7c03d9fc25d7d100ddb369bc43bec4aad
Author: justinjleet 
Date:   2017-01-24T20:46:24Z

Building directory for loading notebook files

commit 58f2ed354c86d66d75a850d97a3af38ad6fc4a53
Author: justinjleet 
Date:   2017-01-25T14:57:11Z

Updating READMEs

commit c7e3958d6c99d723d9010d804a10a38ffbd053e9
Author: justinjleet 
Date:   2017-01-25T15:11:10Z

Removing extraneous spec change from testing, and updating README




> Add Zeppelin to the platform
> 
>
> Key: METRON-270
> URL: https://issues.apache.org/jira/browse/METRON-270
> Project: Metron
>  Issue Type: New Feature
>Reporter: James Sirota
>Assignee: Justin Leet
>  Labels: METRON_ML
>
> I propose adding Zeppelin to the platform to aid in interactive dashboarding 
> and data visualizations 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (METRON-673) Unable to connect to a Secured Kafka cluster

2017-01-25 Thread Bas van de Lustgraaf (JIRA)

 [ 
https://issues.apache.org/jira/browse/METRON-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bas van de Lustgraaf updated METRON-673:

Description: 
metron-parser is unable to connect to a Kerberized Kafka cluster for consuming 
and producing purposes.

The initial error from the storm worker.log that indicated it was not working:

{noformat}
2017-01-13 15:31:39.793 o.a.s.k.PartitionManager [INFO] Read partition 
information from: /suricata/partition_0  --> null
{noformat}

This is a known error and can be solved by following the instructions on the 
following page: 
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_storm-component-guide/content/storm-kafka-kerb.html
(In this case the lack of spoutConfig.securityProtocol is the problem.)

Also take into account that HDP 2.5.0.0 has an issue with the KafkaSpout reading 
from a secured Kafka cluster (attached: Hortonworks Technical Alert: Storm 
kafkaspout to secure Kafka issue in HDP 2.5.0).

After changing the pom.xml of the incubator-project, to package the 
metron-parsers JAR with the storm-kafka dependency version 1.0.1.2.5.3.0-37, I 
tried to deploy the metron-parsers topology (attached:  
extra_kafka_spout_config.json).

{noformat}
storm jar metron-parsers-0.3.0-uber.jar 
org.apache.metron.parsers.topology.ParserTopologyCLI -k kn00:6667 -z kn00:2181 
-s suricata -esc extra_kafka_spout_config.json
{noformat}

To set the spoutConfig.securityProtocol=PLAINTEXTSASL we used the 
metron-parsers -esc parameter to pass the securityProtocol setting to the 
topology. Unfortunately, Metron only allows a predefined list of parameters 
that can be passed to the KafkaSpout. This approach resulted in an 
IllegalArgumentException (source: storm jar).

{noformat}
java.lang.IllegalArgumentException: Configuration keys for spout config must be 
one of: 
retryDelayMaxMs,retryDelayMultiplier,retryInitialDelayMs,stateUpdateIntervalMs,bufferSizeBytes,fetchMaxWait,fetchSizeBytes,maxOffsetBehind,metricsTimeBucketSizeInSecs,socketTimeoutMs
at 
org.apache.metron.common.spout.kafka.SpoutConfigOptions.coerceMap(SpoutConfigOptions.java:68)
at 
org.apache.metron.parsers.topology.ParserTopologyCLI.readSpoutConfig(ParserTopologyCLI.java:340)
at 
org.apache.metron.parsers.topology.ParserTopologyCLI.main(ParserTopologyCLI.java:291)
Caused by: java.lang.IllegalArgumentException: No enum constant 
org.apache.metron.common.spout.kafka.SpoutConfigOptions.securityProtocol
at java.lang.Enum.valueOf(Enum.java:238)
at 
org.apache.metron.common.spout.kafka.SpoutConfigOptions.valueOf(SpoutConfigOptions.java:28)
at 
org.apache.metron.common.spout.kafka.SpoutConfigOptions.coerceMap(SpoutConfigOptions.java:64)
... 2 more
{noformat}
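
The rejection above comes from a whitelist check on the spout configuration 
keys. Metron implements this as the Java enum SpoutConfigOptions. The Python 
below is only a simplified model of that behaviour: the key list is copied 
from the exception message, while the function name check_spout_config and the 
exact error text are hypothetical.

```python
# Keys copied from the exception message; everything else is a simplified model.
ALLOWED_SPOUT_KEYS = frozenset([
    "retryDelayMaxMs", "retryDelayMultiplier", "retryInitialDelayMs",
    "stateUpdateIntervalMs", "bufferSizeBytes", "fetchMaxWait",
    "fetchSizeBytes", "maxOffsetBehind", "metricsTimeBucketSizeInSecs",
    "socketTimeoutMs",
])


def check_spout_config(extra_config, allowed=ALLOWED_SPOUT_KEYS):
    """Return extra_config unchanged, or raise ValueError on an unknown key."""
    unknown = sorted(set(extra_config) - set(allowed))
    if unknown:
        raise ValueError(
            "Configuration keys for spout config must be one of: "
            + ",".join(sorted(allowed))
            + " (got: " + ",".join(unknown) + ")"
        )
    return extra_config
```

In this model, passing {"securityProtocol": "PLAINTEXTSASL"} raises, and 
extending the allowed set with "securityProtocol" makes the same input pass, 
which mirrors adding a new constant to the enum.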

I have solved this by changing the Metron code, as shown below (or see the 
git_diff.txt).

{noformat}
vi /metron-platform/metron-common/src/main/java/org/apache/metron/common/spout/kafka/SpoutConfigOptions.java
### add line to enum SpoutConfigOptions
securityProtocol( (config, val) -> config.securityProtocol = convertVal(val, String.class)),
{noformat}

Also change the Kafka producer to make it aware of the SASL_PLAINTEXT option.

{noformat}
vi /metron-platform/metron-writer/src/main/java/org/apache/metron/writer/kafka/KafkaWriter.java
### add line to class KafkaWriter
producerConfig.put("security.protocol", "SASL_PLAINTEXT");
{noformat}

The problem with my version is that the change for the producer is hard-coded. 
Ideally, this would be a parameter set when starting the parser topology, as 
with the consumer.
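
A minimal sketch of that idea, in Python for brevity (Metron's KafkaWriter is 
Java): the protocol is passed in and only set when supplied. The function name 
build_producer_config is hypothetical. "bootstrap.servers" and 
"security.protocol" are standard Kafka producer property names.

```python
def build_producer_config(broker_list, security_protocol=None):
    """Build a Kafka producer property map.

    security_protocol is optional, so unsecured clusters keep the
    client default instead of a hard-coded SASL_PLAINTEXT.
    """
    config = {"bootstrap.servers": broker_list}
    if security_protocol:
        config["security.protocol"] = security_protocol
    return config
```

With this shape, a Kerberized deployment would pass "SASL_PLAINTEXT" at 
topology start, and other deployments would omit the argument.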

This may also prevent you from running the enrichment or indexing parts of Metron.


[jira] [Commented] (METRON-666) Fix javadoc doclint errors

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837737#comment-15837737
 ] 

ASF GitHub Bot commented on METRON-666:
---

Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/418
  
+1



> Fix javadoc doclint errors
> --
>
> Key: METRON-666
> URL: https://issues.apache.org/jira/browse/METRON-666
> Project: Metron
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Matt Foley
>Assignee: Matt Foley
>
> Java 8 includes "doclint" as part of javadocs.  As a result, running javadoc 
> on current code base has fatal errors, mostly (not all) related to use of 
> "<p/>" (not allowed ever), or unmatched "</p>" (not preceded by a matching 
> "<p>").  It is, however, happy with unmatched "<p>", so that's the thing to 
> use for paragraph separators.  Put it on the same line as the next line of 
> text to avoid a warning about "empty <p>".
> There are other errors fixed here too.





[jira] [Commented] (METRON-666) Fix javadoc doclint errors

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837718#comment-15837718
 ] 

ASF GitHub Bot commented on METRON-666:
---

Github user justinleet commented on the issue:

https://github.com/apache/incubator-metron/pull/418
  
I'm +1 on this, and definitely glad to see it.


> Fix javadoc doclint errors
> --
>
> Key: METRON-666
> URL: https://issues.apache.org/jira/browse/METRON-666
> Project: Metron
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Matt Foley
>Assignee: Matt Foley
>
> Java 8 includes "doclint" as part of javadocs.  As a result, running javadoc 
> on current code base has fatal errors, mostly (not all) related to use of 
> "<p/>" (not allowed ever), or unmatched "</p>" (not preceded by a matching 
> "<p>").  It is, however, happy with unmatched "<p>", so that's the thing to 
> use for paragraph separators.  Put it on the same line as the next line of 
> text to avoid a warning about "empty <p>".
> There are other errors fixed here too.





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837706#comment-15837706
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on the issue:

https://github.com/apache/incubator-metron/pull/421
  
Forgot to comment on it, but the GEO_GET call was updated a bit ago per my 
comment about lists and fields. That change should also be reviewed; any 
feedback is welcome.


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837696#comment-15837696
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on the issue:

https://github.com/apache/incubator-metron/pull/421
  
As a note to anyone coming in late, at least one comment (David's) is still 
relevant but hidden behind a collapsed outdated diff. The line he commented on 
needed to be deleted, but the point he brought up is still in active 
conversation.


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837656#comment-15837656
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97769348
  
--- Diff: 
metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-env.xml
 ---
@@ -225,6 +163,8 @@ indexing.executors=0
 kafka.zk={{ zookeeper_quorum }}
 kafka.broker={{ kafka_brokers }}
 kafka.start=WHERE_I_LEFT_OFF
+# GEO #
+geo.hdfs.file={{ geoip_hdfs_file }}
--- End diff --

Shouldn't be there at all, deleted it.


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-283) Migrate Geo Enrichment outside of MySQL

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837653#comment-15837653
 ] 

ASF GitHub Bot commented on METRON-283:
---

Github user justinleet commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/421#discussion_r97769238
  
--- Diff: LICENSE ---
@@ -210,6 +210,12 @@ This product bundles some test examples from the Stix 
project (metron-platform/m
 
 This product bundles wait-for-it.sh, which is available under a "MIT 
Software License" license.  For details, see 
https://github.com/vishnubob/wait-for-it
 

+
+
--- End diff --

Added header with license similar to the MIT one above.


> Migrate Geo Enrichment outside of MySQL
> ---
>
> Key: METRON-283
> URL: https://issues.apache.org/jira/browse/METRON-283
> Project: Metron
>  Issue Type: Improvement
>Reporter: James Sirota
>Assignee: Justin Leet
>Priority: Minor
>
> We need to migrate our enrichment SQL store from MySQL to Phoenix or some 
> other SQL on Hbase library.  Or alternatively come up with a way to do this 
> without using SQL.  This way we don't have a dependency on MySQL and there is 
> one less thing that we need to install on our platform 





[jira] [Commented] (METRON-354) Java OOMs are seen with single node quick-dev deployment

2017-01-25 Thread Anand Subramanian (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837582#comment-15837582
 ] 

Anand Subramanian commented on METRON-354:
--

[~zeo...@gmail.com] - sorry about the delay in responding. I spun up a quick 
dev recently and haven't seen this issue after running for a couple of hours. 
This indeed could be due to the various housekeeping activities that have 
happened since this ticket was logged. 

Please feel free to go ahead and resolve the issue as not reproducible.

> Java OOMs are seen with single node quick-dev deployment
> 
>
> Key: METRON-354
> URL: https://issues.apache.org/jira/browse/METRON-354
> Project: Metron
>  Issue Type: Bug
>Affects Versions: 0.2.1BETA
> Environment: Quick dev setup installation using vagrant from HEAD
>Reporter: Anand Subramanian
>Priority: Critical
>  Labels: metronqe
> Attachments: messages, messages-20160725, output-platform-info.rtf
>
>
> This is a single node quick-dev deployment from HEAD using vagrant on a 
> Macbook Pro. The platform-info output is attached.
> After startup, I noticed multiple Java process OOMs due to lack of swap 
> space. Here’s one pasted below for reference and the full /var/log/messages 
> files are attached (see messages-20160725 for the OOM errors). 
> I did not observe the OOM with earlier single node vagrant deployments. 
> Note that I have not added additional topologies. 
> Is this possibly due to the new changes that were introduced related to 
> separating out indexing? Please let us know. Also, I was considering 
> increasing the # of CPUs and RAM on the vagrant and retrying the single node 
> setup. Please advise if it is good to try that route. 
> {code}
> Jul 25 10:41:28 node1 kernel: java invoked oom-killer: gfp_mask=0x280da, 
> order=0, oom_adj=0, oom_score_adj=0
> Jul 25 10:41:28 node1 kernel: java cpuset=/ mems_allowed=0
> Jul 25 10:41:28 node1 kernel: Pid: 27968, comm: java Not tainted 
> 2.6.32-573.22.1.el6.x86_64 #1
> Jul 25 10:41:28 node1 kernel: Call Trace:
> Jul 25 10:41:28 node1 kernel: [] ? 
> cpuset_print_task_mems_allowed+0x91/0xb0
> Jul 25 10:41:28 node1 kernel: [] ? dump_header+0x90/0x1b0
> Jul 25 10:41:28 node1 kernel: [] ? 
> security_real_capable_noaudit+0x3c/0x70
> Jul 25 10:41:28 node1 kernel: [] ? 
> oom_kill_process+0x82/0x2a0
> Jul 25 10:41:28 node1 kernel: [] ? 
> select_bad_process+0xe1/0x120
> Jul 25 10:41:28 node1 kernel: [] ? out_of_memory+0x220/0x3c0
> Jul 25 10:41:28 node1 kernel: [] ? 
> __alloc_pages_nodemask+0x93c/0x950
> Jul 25 10:41:28 node1 kernel: [] ? 
> wake_bit_function+0x0/0x50
> Jul 25 10:41:28 node1 kernel: [] ? 
> alloc_pages_vma+0x9a/0x150
> Jul 25 10:41:28 node1 kernel: [] ? 
> handle_pte_fault+0x73d/0xb20
> Jul 25 10:41:28 node1 kernel: [] ? 
> page_remove_rmap+0x54/0xa0
> Jul 25 10:41:28 node1 kernel: [] ? release_pages+0x178/0x250
> Jul 25 10:41:28 node1 kernel: [] ? 
> handle_mm_fault+0x299/0x3d0
> Jul 25 10:41:28 node1 kernel: [] ? 
> __do_page_fault+0x146/0x500
> Jul 25 10:41:28 node1 kernel: [] ? thread_return+0x4e/0x7d0
> Jul 25 10:41:28 node1 kernel: [] ? do_page_fault+0x3e/0xa0
> Jul 25 10:41:28 node1 kernel: [] ? page_fault+0x25/0x30
> Jul 25 10:41:28 node1 kernel: Mem-Info:
> Jul 25 10:41:28 node1 kernel: Node 0 DMA per-cpu:
> Jul 25 10:41:28 node1 kernel: CPU0: hi:0, btch:   1 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU1: hi:0, btch:   1 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU2: hi:0, btch:   1 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU3: hi:0, btch:   1 usd:   0
> Jul 25 10:41:28 node1 kernel: Node 0 DMA32 per-cpu:
> Jul 25 10:41:28 node1 kernel: CPU0: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU1: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU2: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU3: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: Node 0 Normal per-cpu:
> Jul 25 10:41:28 node1 kernel: CPU0: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU1: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU2: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: CPU3: hi:  186, btch:  31 usd:   0
> Jul 25 10:41:28 node1 kernel: active_anon:1652718 inactive_anon:271396 
> isolated_anon:0
> Jul 25 10:41:28 node1 kernel: active_file:180 inactive_file:250 
> isolated_file:0
> Jul 25 10:41:28 node1 kernel: unevictable:0 dirty:78 writeback:194 unstable:0
> Jul 25 10:41:28 node1 kernel: free:25337 slab_reclaimable:6085 
> slab_unreclaimable:20565
> Jul 25 10:41:28 node1 kernel: mapped:2265 shmem:758 pagetables:10191 bounce:0
> Jul 25 10:41:28 node1 kernel: Node 0 DMA free:15724kB min:124kB low:152kB 
> high:184kB active_anon:0kB 

[jira] [Commented] (METRON-666) Fix javadoc doclint errors

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837367#comment-15837367
 ] 

ASF GitHub Bot commented on METRON-666:
---

Github user mattf-horton commented on the issue:

https://github.com/apache/incubator-metron/pull/418
  
Do we need any changes before committing these modest fixes to javadoc 
comments?  I'd like to experiment with adding javadocs to the doc site.  Thanks.


> Fix javadoc doclint errors
> --
>
> Key: METRON-666
> URL: https://issues.apache.org/jira/browse/METRON-666
> Project: Metron
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Matt Foley
>Assignee: Matt Foley
>
> Java 8 includes "doclint" as part of javadocs.  As a result, running javadoc 
> on current code base has fatal errors, mostly (not all) related to use of 
> "<p/>" (not allowed ever), or unmatched "</p>" (not preceded by a matching 
> "<p>").  It is, however, happy with unmatched "<p>", so that's the thing to 
> use for paragraph separators.  Put it on the same line as the next line of 
> text to avoid a warning about "empty <p>".
> There are other errors fixed here too.





[jira] [Commented] (METRON-608) Mpack to install a single-node test cluster

2017-01-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837360#comment-15837360
 ] 

ASF GitHub Bot commented on METRON-608:
---

Github user mattf-horton commented on the issue:

https://github.com/apache/incubator-metron/pull/408
  
@JonZeolla , I removed the obsolete comments about MySQL installation in 
Mpack.  Thanks for the suggestion.


> Mpack to install a single-node test cluster
> ---
>
> Key: METRON-608
> URL: https://issues.apache.org/jira/browse/METRON-608
> Project: Metron
>  Issue Type: Improvement
>Affects Versions: 0.3.0
> Environment: Linux, Ambari installation
>Reporter: Matt Foley
>Assignee: Matt Foley
> Fix For: Next + 1
>
>
> The current Mpack for Ambari install of Metron fails to correctly install 
> Elasticsearch if restricted to a single-node cluster.  Yet a single-node 
> install of Elasticsearch is certainly feasible, as shown by our quick-dev 
> environment.
> This is a short-term fix by providing a completely separate Mpack just for 
> the single-node scenario.  I'm also opening METRON-609 to enhance the 
> existing Mpack to handle the single-node and small-number-of-nodes scenario, 
> but that one will require much deeper testing and is likely to take a while 
> to complete.


