[GitHub] metron issue #1275: METRON-1878: Add Metron as a Knox service

2018-12-07 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1275
  
EDIT - those are the REST logs, not the management UI logs. When I shut down 
the management UI, the exceptions stop.


---


[GitHub] metron issue #1275: METRON-1878: Add Metron as a Knox service

2018-12-07 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1275
  
I will be looking into this further, but I see a lot of the following in 
the management UI logs. This repeated for more than 80k lines in a matter of 
minutes:

```
18/12/06 20:13:23 INFO zookeeper.ClientCnxn: Opening socket connection to server node1/192.168.66.121:2181. Will not attempt to authenticate using SASL (unknown error)
18/12/06 20:13:23 WARN zookeeper.ClientCnxn: Session 0x1678019c6bb0099 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
18/12/06 20:13:23 ERROR controller.RestExceptionHandler: Encountered error: Unable to get column metadata
org.apache.metron.rest.RestException: Unable to get column metadata
    at org.apache.metron.rest.service.impl.SearchServiceImpl.search(SearchServiceImpl.java:95)
    at org.apache.metron.rest.controller.SearchController.search(SearchController.java:54)
    at sun.reflect.GeneratedMethodAccessor239.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:209)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:102)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:877)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:783)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:991)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:925)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:974)
    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:877)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:661)
    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:851)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:320)
    at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
    at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
    at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:119)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
    at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
    at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:111)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334
```


---


[GitHub] metron issue #1275: METRON-1878: Add Metron as a Knox service

2018-12-07 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1275
  
@merrimanr Where does this leave the management UI if I enable Knox?


---


[GitHub] metron issue #1292: METRON-1925 Provide Verbose View of Profile Results in R...

2018-12-06 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1292
  
@ottobackwards Maybe another optional array could be used, similar to the 
geoget functions, that would allow you to specify a list of desired fields to 
return.


https://github.com/apache/metron/blob/c08cd07f36cd9bf2608a586a209bf809130a069a/metron-platform/metron-enrichment/src/test/java/org/apache/metron/enrichment/stellar/GeoEnrichmentFunctionsTest.java#L140
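
To make the suggestion concrete, here is a rough sketch of what an optional field list could do to a result map. `FieldSelector` and the field names used with it are made up for illustration; this is not Stellar or Metron code, and it assumes that a missing or empty list means "return everything".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Metron or Stellar.
class FieldSelector {

  /**
   * Returns only the requested fields from a result map.
   * A null or empty field list is treated as "return every field".
   * Requested fields that are absent from the result are skipped.
   */
  static Map<String, Object> select(Map<String, Object> result, List<String> fields) {
    if (fields == null || fields.isEmpty()) {
      return result;
    }
    Map<String, Object> selected = new LinkedHashMap<>();
    for (String field : fields) {
      if (result.containsKey(field)) {
        selected.put(field, result.get(field));
      }
    }
    return selected;
  }
}
```

Keeping the list optional would leave existing callers untouched while letting new callers trim the output.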


---


[GitHub] metron issue #1292: METRON-1925 Provide Verbose View of Profile Results in R...

2018-12-06 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1292
  
This seems pretty useful, thanks for the submission @nickwallen.

> For lack of a better name, I just called it PROFILE_VIEW. I would be open 
to alternatives. I did not want to add additional options to the already 
complex PROFILE_GET.

Heh, that was going to be my first question. I think simplifying the 
client-facing function signatures, as you've done, makes sense. I don't see any 
modifications to the existing PROFILE_GET function, though. The underlying 
logic seems like it should be nearly identical. Is there any common 
functionality that could be pulled out and shared between the two?
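
Roughly what I have in mind, as a sketch only: both functions delegate to one shared lookup and differ only in how they shape the result. Every name here (`ProfileFunctionsSketch`, `Measurement`, `fetch`, the map keys) is invented for illustration and does not reflect the actual profiler client classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are not Metron's classes.
class ProfileFunctionsSketch {

  /** One stored measurement of a profile. */
  static final class Measurement {
    final String profile;
    final String entity;
    final long period;
    final Object value;

    Measurement(String profile, String entity, long period, Object value) {
      this.profile = profile;
      this.entity = entity;
      this.period = period;
      this.value = value;
    }
  }

  /** Shared lookup: the single place that knows how to find measurements. */
  static List<Measurement> fetch(List<Measurement> store, String profile, String entity) {
    List<Measurement> found = new ArrayList<>();
    for (Measurement m : store) {
      if (m.profile.equals(profile) && m.entity.equals(entity)) {
        found.add(m);
      }
    }
    return found;
  }

  /** PROFILE_GET-style result: the values only. */
  static List<Object> get(List<Measurement> store, String profile, String entity) {
    List<Object> values = new ArrayList<>();
    for (Measurement m : fetch(store, profile, entity)) {
      values.add(m.value);
    }
    return values;
  }

  /** PROFILE_VIEW-style result: one map per measurement, metadata included. */
  static List<Map<String, Object>> view(List<Measurement> store, String profile, String entity) {
    List<Map<String, Object>> views = new ArrayList<>();
    for (Measurement m : fetch(store, profile, entity)) {
      Map<String, Object> view = new LinkedHashMap<>();
      view.put("profile", m.profile);
      view.put("entity", m.entity);
      view.put("period", m.period);
      view.put("value", m.value);
      views.add(view);
    }
    return views;
  }
}
```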


---


[GitHub] metron issue #1269: METRON-1879 Allow Elasticsearch to Auto-Generate the Doc...

2018-12-04 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1269
  
Refresh me on this - prior to this change, did we provide both a doc id and 
a guid that were identical? And now with this change you would expect to see 
guid != doc id (because the guid is generated by Metron and the doc id is now 
generated by ES), except where otherwise configured for backwards compatibility?


---


[GitHub] metron pull request #1254: METRON-1849 Elasticsearch Index Write Functionali...

2018-12-04 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1254#discussion_r238883436
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/dao/ElasticsearchDao.java
 ---
@@ -196,7 +196,7 @@ public ElasticsearchDao 
withRefreshPolicy(WriteRequest.RefreshPolicy refreshPoli
   }
 
   protected Optional getIndexName(String guid, String sensorType) 
throws IOException {
-return updateDao.getIndexName(guid, sensorType);
+return updateDao.findIndexNameByGUID(guid, sensorType);
--- End diff --

Is sensorType not a component of retrieving the index name? Also, would we 
want any parity between the updateDao's find method name and the 
ElasticsearchDao's getIndexName method name?


---


[GitHub] metron pull request #1254: METRON-1849 Elasticsearch Index Write Functionali...

2018-12-04 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1254#discussion_r238881060
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/ElasticsearchWriter.java
 ---
@@ -56,90 +60,111 @@
*/
   private transient ElasticsearchClient client;
 
+  /**
+   * Responsible for writing documents.
+   *
+   * Uses a {@link TupleBasedDocument} to maintain the relationship 
between
+   * a {@link Tuple} and the document created from the contents of that 
tuple. If
+   * a document cannot be written, the associated tuple needs to be failed.
+   */
+  private transient BulkDocumentWriter documentWriter;
+
   /**
* A simple data formatter used to build the appropriate Elasticsearch 
index name.
*/
   private SimpleDateFormat dateFormat;
 
-
   @Override
   public void init(Map stormConf, TopologyContext topologyContext, 
WriterConfiguration configurations) {
-
 Map globalConfiguration = 
configurations.getGlobalConfig();
-client = ElasticsearchClientFactory.create(globalConfiguration);
 dateFormat = ElasticsearchUtils.getIndexFormat(globalConfiguration);
+
+// only create the document writer, if one does not already exist. 
useful for testing.
+if(documentWriter == null) {
+  client = ElasticsearchClientFactory.create(globalConfiguration);
+  documentWriter = new ElasticsearchBulkDocumentWriter<>(client);
+}
   }
 
   @Override
-  public BulkWriterResponse write(String sensorType, WriterConfiguration 
configurations, Iterable tuples, List messages) throws 
Exception {
+  public BulkWriterResponse write(String sensorType,
+  WriterConfiguration configurations,
+  Iterable tuplesIter,
+  List messages) {
 
 // fetch the field name converter for this sensor type
 FieldNameConverter fieldNameConverter = 
FieldNameConverters.create(sensorType, configurations);
+String indexPostfix = dateFormat.format(new Date());
+String indexName = ElasticsearchUtils.getIndexName(sensorType, 
indexPostfix, configurations);
+
+// the number of tuples must match the number of messages
+List tuples = Lists.newArrayList(tuplesIter);
+int batchSize = tuples.size();
+if(messages.size() != batchSize) {
+  throw new IllegalStateException(format("Expect same number of tuples 
and messages; |tuples|=%d, |messages|=%d",
+  tuples.size(), messages.size()));
+}
 
-final String indexPostfix = dateFormat.format(new Date());
-BulkRequest bulkRequest = new BulkRequest();
-for(JSONObject message: messages) {
+// create a document from each message
+List documents = new ArrayList<>();
+for(int i=0; i
https://github.com/apache/metron/blob/master/metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/common/utils/ConversionUtils.java#L39
 for conversions.


---


[GitHub] metron pull request #1254: METRON-1849 Elasticsearch Index Write Functionali...

2018-12-04 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1254#discussion_r238876879
  
--- Diff: 
metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/update/Document.java
 ---
@@ -89,46 +91,29 @@ public void setGuid(String guid) {
 this.guid = guid;
   }
 
-  @Override
-  public String toString() {
-return "Document{" +
-"timestamp=" + timestamp +
-", document=" + document +
-", guid='" + guid + '\'' +
-", sensorType='" + sensorType + '\'' +
-'}';
-  }
-
   @Override
   public boolean equals(Object o) {
-if (this == o) {
-  return true;
-}
-if (o == null || getClass() != o.getClass()) {
-  return false;
-}
-
+if (this == o) return true;
+if (!(o instanceof Document)) return false;
 Document document1 = (Document) o;
-
-if (timestamp != null ? !timestamp.equals(document1.timestamp) : 
document1.timestamp != null) {
-  return false;
-}
-if (document != null ? !document.equals(document1.document) : 
document1.document != null) {
-  return false;
-}
-if (guid != null ? !guid.equals(document1.guid) : document1.guid != 
null) {
-  return false;
-}
-return sensorType != null ? sensorType.equals(document1.sensorType)
-: document1.sensorType == null;
+return Objects.equals(timestamp, document1.timestamp) &&
--- End diff --

Is this change to `equals` and `hashCode` auto-generated by IntelliJ?
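
For reference, this is the shape of the null-safe `java.util.Objects` idiom the new code appears to follow. `SimpleDocument` is a simplified stand-in with assumed field types, not the real `Document` class, and the `hashCode` shown is my assumption of the matching counterpart since the diff above is cut off before it.

```java
import java.util.Map;
import java.util.Objects;

// Simplified stand-in for the Document class in the diff; field types are assumed.
class SimpleDocument {
  private final Long timestamp;
  private final Map<String, Object> document;
  private final String guid;
  private final String sensorType;

  SimpleDocument(Map<String, Object> document, String guid, String sensorType, Long timestamp) {
    this.document = document;
    this.guid = guid;
    this.sensorType = sensorType;
    this.timestamp = timestamp;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SimpleDocument)) return false;
    SimpleDocument other = (SimpleDocument) o;
    // Objects.equals handles null on either side, replacing the ternary chains.
    return Objects.equals(timestamp, other.timestamp)
        && Objects.equals(document, other.document)
        && Objects.equals(guid, other.guid)
        && Objects.equals(sensorType, other.sensorType);
  }

  @Override
  public int hashCode() {
    // Must cover the same fields as equals so equal objects hash alike.
    return Objects.hash(timestamp, document, guid, sensorType);
  }
}
```

One behavioral difference worth noting: `instanceof` accepts subclasses, whereas the previous `getClass() != o.getClass()` check did not, so a subclass that overrides `equals` could break symmetry.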


---


[GitHub] metron pull request #1254: METRON-1849 Elasticsearch Index Write Functionali...

2018-12-04 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1254#discussion_r238882021
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/writer/ElasticsearchWriterTest.java
 ---
@@ -18,170 +18,241 @@
 
 package org.apache.metron.elasticsearch.writer;
 
-import static org.junit.Assert.assertEquals;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.when;
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.writer.BulkWriterResponse;
+import org.apache.storm.task.TopologyContext;
+import org.apache.storm.tuple.Tuple;
+import org.json.simple.JSONObject;
+import org.junit.Before;
+import org.junit.Test;
 
-import com.google.common.collect.ImmutableList;
+import java.util.ArrayList;
 import java.util.Collection;
 import java.util.HashMap;
+import java.util.List;
 import java.util.Map;
-import org.apache.metron.common.writer.BulkWriterResponse;
-import org.apache.storm.tuple.Tuple;
-import org.elasticsearch.action.bulk.BulkItemResponse;
-import org.elasticsearch.action.bulk.BulkResponse;
-import org.junit.Test;
+import java.util.UUID;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
 
 public class ElasticsearchWriterTest {
-@Test
-public void testSingleSuccesses() throws Exception {
-Tuple tuple1 = mock(Tuple.class);
 
-BulkResponse response = mock(BulkResponse.class);
-when(response.hasFailures()).thenReturn(false);
+Map stormConf;
+TopologyContext topologyContext;
+WriterConfiguration writerConfiguration;
 
-BulkWriterResponse expected = new BulkWriterResponse();
-expected.addSuccess(tuple1);
+@Before
+public void setup() {
+topologyContext = mock(TopologyContext.class);
 
-ElasticsearchWriter esWriter = new ElasticsearchWriter();
-BulkWriterResponse actual = 
esWriter.buildWriteReponse(ImmutableList.of(tuple1), response);
+writerConfiguration = mock(WriterConfiguration.class);
+when(writerConfiguration.getGlobalConfig()).thenReturn(globals());
 
-assertEquals("Response should have no errors and single success", 
expected, actual);
+stormConf = new HashMap();
 }
 
 @Test
-public void testMultipleSuccesses() throws Exception {
-Tuple tuple1 = mock(Tuple.class);
-Tuple tuple2 = mock(Tuple.class);
-
-BulkResponse response = mock(BulkResponse.class);
-when(response.hasFailures()).thenReturn(false);
+public void shouldWriteSuccessfully() {
+// create a writer where all writes will be successful
+float probabilityOfSuccess = 1.0F;
+ElasticsearchWriter esWriter = new ElasticsearchWriter();
+esWriter.setDocumentWriter( new 
BulkDocumentWriterStub<>(probabilityOfSuccess));
+esWriter.init(stormConf, topologyContext, writerConfiguration);
 
-BulkWriterResponse expected = new BulkWriterResponse();
-expected.addSuccess(tuple1);
-expected.addSuccess(tuple2);
+// create a tuple and a message associated with that tuple
+List tuples = createTuples(1);
+List messages = createMessages(1);
 
-ElasticsearchWriter esWriter = new ElasticsearchWriter();
-BulkWriterResponse actual = 
esWriter.buildWriteReponse(ImmutableList.of(tuple1, tuple2), response);
+BulkWriterResponse response = esWriter.write("bro", 
writerConfiguration, tuples, messages);
 
-assertEquals("Response should have no errors and two successes", 
expected, actual);
+// response should only contain successes
+assertFalse(response.hasErrors());
+assertTrue(response.getSuccesses().contains(tuples.get(0)));
 }
 
 @Test
-public void testSingleFailure() throws Exception {
-Tuple tuple1 = mock(Tuple.class);
-
-BulkResponse response = mock(BulkResponse.class);
-when(response.hasFailures()).thenReturn(true);
-
-Exception e = new IllegalStateException();
-BulkItemResponse itemResponse = buildBulkItemFailure(e);
-
 when(response.iterator()).thenReturn(ImmutableList.of(itemRespons


---


[GitHub] metron issue #1284: METRON-1867 Remove `/api/v1/update/replace` endpoint

2018-12-04 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1284
  
+1 by inspection pending details on the following: 

- Is this endpoint superseded by another implementation, or just removed 
altogether?
- Are any users of the REST API using this directly? Not sure if we have to 
deprecate this - i.e. I'm unclear on whether we consider the REST API 
client-facing, or if it's just middleware to us for the UI. 


---


[GitHub] metron pull request #1286: METRON-1889: Add any missing timestamp fields to ...

2018-12-03 Thread mmiklavc
Github user mmiklavc closed the pull request at:

https://github.com/apache/metron/pull/1286


---


[GitHub] metron issue #1286: METRON-1889: Add any missing timestamp fields to unified...

2018-12-03 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1286
  
Kick Travis


---


[GitHub] metron pull request #1286: METRON-1889: Add any missing timestamp fields to ...

2018-12-03 Thread mmiklavc
GitHub user mmiklavc reopened a pull request:

https://github.com/apache/metron/pull/1286

METRON-1889: Add any missing timestamp fields to unified enrichment topology

## Contributor Comments

https://issues.apache.org/jira/browse/METRON-1889

This is done in reference to this discussion on the mailing list about 
deprecating the split-join topology. 
https://lists.apache.org/thread.html/e3a0f5634db74d44c4e2f2e0399cc3f0777f1e0dce2e1f764cc0a39f@%3Cdev.metron.apache.org%3E

The split-join enrichment topology has the following timestamp keys:

**Adapter timestamps**

```
  "adapter:hostfromjsonlistadapter:begin:ts": "1542840777691",
  "adapter:hostfromjsonlistadapter:end:ts": "1542840777691",
  "adapter:threatinteladapter:begin:ts": "1542840784545",
  "adapter:threatinteladapter:end:ts": "1542840784545",
  "adapter:geoadapter:begin:ts": "1542840783196",
  "adapter:geoadapter:end:ts": "1542840783196"
```

**Split and Join Bolt timestamps**

```
  "enrichmentsplitterbolt:splitter:begin:ts": "1542840777617",
  "enrichmentsplitterbolt:splitter:end:ts": "1542840777617",
  "enrichmentjoinbolt:joiner:ts": "1542840783234",
  "threatintelsplitterbolt:splitter:begin:ts": "1542840783247",
  "threatintelsplitterbolt:splitter:end:ts": "1542840783247",
  "threatinteljoinbolt:joiner:ts": "1542840784555"
```

The unified enrichment topology currently only has the following timestamp
keys:

```
  "parallelenricher:splitter:begin:ts": "1542807098066",
  "parallelenricher:splitter:end:ts": "1542807098066",
  "parallelenricher:enrich:begin:ts": "1542807098066"
```

The enrichment end timestamp is missing in the case when there are no 
enrichments. Also, the adapter timestamps did not carry over from the 
split-join topology.

In this PR, I've added the missing `enrich:end:ts` for the case when there 
are no enrichments and added the adapter timestamps back in.
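
As a minimal illustration of the idea (not the actual code changed in this PR), the end timestamp can be written unconditionally after the enrichment loop, so an empty enrichment list still yields both keys. The class `EnrichTimestamps`, its `enrich` method, and the injected clock are hypothetical; only the two key names come from the output shown here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Hypothetical sketch; this is not the real enrichment bolt code.
class EnrichTimestamps {
  static final String BEGIN = "parallelenricher:enrich:begin:ts";
  static final String END = "parallelenricher:enrich:end:ts";

  /**
   * Applies each enrichment to a copy of the message and stamps begin/end times.
   * The clock is passed in so the behavior can be tested deterministically.
   */
  static Map<String, Object> enrich(Map<String, Object> message,
                                    List<Consumer<Map<String, Object>>> enrichments,
                                    LongSupplier clock) {
    Map<String, Object> out = new LinkedHashMap<>(message);
    out.put(BEGIN, String.valueOf(clock.getAsLong()));
    for (Consumer<Map<String, Object>> enrichment : enrichments) {
      enrichment.accept(out);
    }
    // Written outside the loop, so it is present even when the list is empty.
    out.put(END, String.valueOf(clock.getAsLong()));
    return out;
  }
}
```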

This is the new timestamp output for the unified enrichment topology after 
my changes:

**Adapter timestamps**

```
  "adapter:geoadapter:begin:ts": "1542882553866",
  "adapter:geoadapter:end:ts": "1542882553866",
  "adapter:threatinteladapter:begin:ts": "1542882553869",
  "adapter:threatinteladapter:end:ts": "1542882553870",
  "adapter:hostfromjsonlistadapter:begin:ts": "1542882553866",
  "adapter:hostfromjsonlistadapter:end:ts": "1542882553866"
```

**Enrichment bolt timestamps**

```
  "parallelenricher:enrich:begin:ts": "1542882553869",
  "parallelenricher:enrich:end:ts": "1542882553870",
  "parallelenricher:splitter:begin:ts": "1542882553869",
  "parallelenricher:splitter:end:ts": "1542882553869"
```
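
If it helps when reviewing the records, the paired `:begin:ts` / `:end:ts` keys can be turned into elapsed times with something like the sketch below. `TimestampDurations` is a throwaway helper I made up for this purpose; it assumes the values are epoch milliseconds stored as strings, as in the output above, and it ignores keys without a matching pair.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for eyeballing the timestamp fields; not part of Metron.
class TimestampDurations {
  private static final String BEGIN_SUFFIX = ":begin:ts";
  private static final String END_SUFFIX = ":end:ts";

  /**
   * Maps each key prefix (e.g. "adapter:geoadapter") to end minus begin.
   * Keys that lack a matching begin/end partner are skipped.
   */
  static Map<String, Long> durations(Map<String, String> message) {
    Map<String, Long> result = new TreeMap<>();
    for (Map.Entry<String, String> entry : message.entrySet()) {
      String key = entry.getKey();
      if (!key.endsWith(BEGIN_SUFFIX)) {
        continue;
      }
      String prefix = key.substring(0, key.length() - BEGIN_SUFFIX.length());
      String end = message.get(prefix + END_SUFFIX);
      if (end != null) {
        result.put(prefix, Long.parseLong(end) - Long.parseLong(entry.getValue()));
      }
    }
    return result;
  }
}
```

For the sample output above, this would report 1 ms for `adapter:threatinteladapter` and `parallelenricher:enrich`, and 0 ms for the rest.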

**Testing**

Spin up full dev and you should see bro records in Elasticsearch (or Solr) 
that now contain the adapter timestamps. By default, the unified enrichment 
topology should be running, but make sure of this when you evaluate the records 
in the bro index.

## Pull Request Checklist

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not, one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and/or integration tests to 
verify your changes?
- n/a If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?

[GitHub] metron pull request #1286: METRON-1889: Add any missing timestamp fields to ...

2018-11-29 Thread mmiklavc
GitHub user mmiklavc opened a pull request:

https://github.com/apache/metron/pull/1286

METRON-1889: Add any missing timestamp fields to unified enrichment topology


[GitHub] metron issue #1285: METRON-1913 metron-alert UI - Build broken by missing tr...

2018-11-29 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1285
  
I presume a successful run of Travis will prove this out. I'm +1 pending 
the build completing.


---


[GitHub] metron issue #1261: METRON-1860 [WIP] new developer option for ansible in do...

2018-11-29 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1261
  
@ottobackwards - yeah, that's probably reasonable. The base HDP image 
create option would be an additional improvement to existing functionality.


---


[GitHub] metron issue #1261: METRON-1860 [WIP] new developer option for ansible in do...

2018-11-29 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1261
  
@ottobackwards I like @justinleet's suggestion on having a base option. I'd 
imagine what we could have is a mechanism that allows you to build an image 
with just HDP of a specific version, and snapshot that if you like for general 
use, along with the option of just running against a standard image. I dunno, 
have we just re-hashed quickdev here? I can't recall specifically how or why it 
became problematic for us to maintain. It seems like there should be a path 
through this that allows us to not go stale, but also save the 20 min building 
the base installation, as Justin pointed out.


---


[GitHub] metron issue #1247: METRON-1845 Correct Test Data Load in Elasticsearch Inte...

2018-11-28 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1247
  
Still +1 on latest commit - that's a nice improvement. Glad the retry 
policy ended up working out!


---


[GitHub] metron issue #1247: METRON-1845 Correct Test Data Load in Elasticsearch Inte...

2018-11-28 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1247
  
lgtm @nickwallen, +1 pending Travis


---


[GitHub] metron issue #1247: METRON-1845 Correct Test Data Load in Elasticsearch Inte...

2018-11-28 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1247
  
As a side note, I tend to like the approach of using the DAO layers to 
read/write the way you've done here @nickwallen . I've used this approach in 
the past, and it helped with test coverage and simplifying the amount of extra 
custom test code required. It might be worth an overall discussion on how we 
want to architect testing other similar endpoints going forward.


---


[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-28 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r237204881
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java
 ---
@@ -194,35 +215,41 @@ public Client getClient() {
 return client;
   }
 
-  public BulkResponse add(String indexName, String sensorType, String... 
docs) throws IOException {
+  public void add(UpdateDao updateDao, String indexName, String 
sensorType, String... docs)
+  throws IOException, ParseException {
 List d = new ArrayList<>();
 Collections.addAll(d, docs);
-return add(indexName, sensorType, d);
+add(updateDao, indexName, sensorType, d);
   }
 
-  public BulkResponse add(String indexName, String sensorType, 
Iterable docs)
-  throws IOException {
-BulkRequestBuilder bulkRequest = getClient().prepareBulk();
-for (String doc : docs) {
-  IndexRequestBuilder indexRequestBuilder = getClient()
-  .prepareIndex(indexName, sensorType + "_doc");
-
-  indexRequestBuilder = indexRequestBuilder.setSource(doc);
-  Map esDoc = JSONUtils.INSTANCE
-  .load(doc, JSONUtils.MAP_SUPPLIER);
-  indexRequestBuilder.setId((String) esDoc.get(Constants.GUID));
-  Object ts = esDoc.get("timestamp");
-  if (ts != null) {
-indexRequestBuilder = 
indexRequestBuilder.setTimestamp(ts.toString());
-  }
-  bulkRequest.add(indexRequestBuilder);
-}
+  public void add(UpdateDao updateDao, String indexName, String 
sensorType, Iterable docs)
--- End diff --

The IndexDao handles all of the plumbing for the init that you're 
duplicating in this test class.


---


[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-28 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r237202358
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java
 ---
@@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws 
Exception {
 return es;
   }
 
-  protected static void loadTestData() throws ParseException, IOException {
+  protected static void loadTestData() throws Exception {
 ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
 
+// define the bro index template
+String broIndex = "bro_index_2017.01.01.01";
 JSONObject broTemplate = JSONUtils.INSTANCE.load(new 
File(broTemplatePath), JSONObject.class);
 addTestFieldMappings(broTemplate, "bro_doc");
-
es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
-.addMapping("bro_doc", 
JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
+es.getClient().admin().indices().prepareCreate(broIndex)
+.addMapping("bro_doc", 
JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
+
+// define the snort index template
+String snortIndex = "snort_index_2017.01.01.02";
 JSONObject snortTemplate = JSONUtils.INSTANCE.load(new 
File(snortTemplatePath), JSONObject.class);
 addTestFieldMappings(snortTemplate, "snort_doc");
-
es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
-.addMapping("snort_doc", 
JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
-
-BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
-.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
-JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
-for (Object o : broArray) {
-  JSONObject jsonObject = (JSONObject) o;
-  IndexRequestBuilder indexRequestBuilder = es.getClient()
-  .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
-  indexRequestBuilder = indexRequestBuilder.setId((String) 
jsonObject.get("guid"));
-  indexRequestBuilder = 
indexRequestBuilder.setSource(jsonObject.toJSONString());
-  indexRequestBuilder = indexRequestBuilder
-  .setTimestamp(jsonObject.get("timestamp").toString());
-  bulkRequest.add(indexRequestBuilder);
+es.getClient().admin().indices().prepareCreate(snortIndex)
+.addMapping("snort_doc", 
JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
+
+// setup the classes required to write the test data
+AccessConfig accessConfig = createAccessConfig();
+ElasticsearchClient client = 
ElasticsearchUtils.getClient(createGlobalConfig());
+ElasticsearchRetrieveLatestDao retrieveLatestDao = new 
ElasticsearchRetrieveLatestDao(client);
+ElasticsearchColumnMetadataDao columnMetadataDao = new 
ElasticsearchColumnMetadataDao(client);
+ElasticsearchRequestSubmitter requestSubmitter = new 
ElasticsearchRequestSubmitter(client);
+ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, 
accessConfig, retrieveLatestDao);
+ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, 
accessConfig, columnMetadataDao, requestSubmitter);
+
+// write the test documents for Bro
+List broDocuments = new ArrayList<>();
+for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
+  broDocuments.add(((JSONObject) broObject).toJSONString());
 }
-JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
-for (Object o : snortArray) {
-  JSONObject jsonObject = (JSONObject) o;
-  IndexRequestBuilder indexRequestBuilder = es.getClient()
-  .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
-  indexRequestBuilder = indexRequestBuilder.setId((String) 
jsonObject.get("guid"));
-  indexRequestBuilder = 
indexRequestBuilder.setSource(jsonObject.toJSONString());
-  indexRequestBuilder = indexRequestBuilder
-  .setTimestamp(jsonObject.get("timestamp").toString());
-  bulkRequest.add(indexRequestBuilder);
+es.add(updateDao, broIndex, "bro", broDocuments);
+
+// write the test documents for Snort
+List snortDocuments = new ArrayList<>();
+for (Object snortObject: (JSONArray) new 
JSONParser().parse(snortData)) {
+  snortDocuments.add(((JSONObject) snortObject).toJSONString()

[GitHub] metron issue #1259: METRON-1867 Remove `/api/v1/update/replace` endpoint

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1259
  
> The /api/v1/update/replace endpoint is no longer used. This is dead code 
and should be removed.

Why does this depend on 3 other PRs? Shouldn't this come first, or at least 
be independent of the others?


---


[GitHub] metron pull request #1254: METRON-1849 Elasticsearch Index Write Functionali...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1254#discussion_r236530989
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/ElasticsearchWriter.java
 ---
@@ -56,90 +58,107 @@
*/
   private transient ElasticsearchClient client;
 
+  /**
+   * Responsible for writing documents.
+   *
+   * Uses a {@link TupleBasedDocument} to maintain the relationship 
between
+   * a {@link Tuple} and the document created from the contents of that 
tuple. If
+   * a document cannot be written, the associated tuple needs to be failed.
+   */
+  private transient BulkDocumentWriter documentWriter;
+
   /**
* A simple data formatter used to build the appropriate Elasticsearch 
index name.
*/
   private SimpleDateFormat dateFormat;
 
-
   @Override
   public void init(Map stormConf, TopologyContext topologyContext, 
WriterConfiguration configurations) {
-
 Map globalConfiguration = 
configurations.getGlobalConfig();
-client = ElasticsearchClientFactory.create(globalConfiguration);
 dateFormat = ElasticsearchUtils.getIndexFormat(globalConfiguration);
+
+// only create the document writer, if one does not already exist. 
useful for testing.
+if(documentWriter == null) {
+  client = ElasticsearchClientFactory.create(globalConfiguration);
+  documentWriter = new ElasticsearchBulkDocumentWriter<>(client);
+}
   }
 
   @Override
-  public BulkWriterResponse write(String sensorType, WriterConfiguration 
configurations, Iterable tuples, List messages) throws 
Exception {
+  public BulkWriterResponse write(String sensorType,
+  WriterConfiguration configurations,
+  Iterable tuplesIter,
+  List messages) {
 
 // fetch the field name converter for this sensor type
 FieldNameConverter fieldNameConverter = 
FieldNameConverters.create(sensorType, configurations);
+String indexPostfix = dateFormat.format(new Date());
+String indexName = ElasticsearchUtils.getIndexName(sensorType, 
indexPostfix, configurations);
+
+// the number of tuples must match the number of messages
+List tuples = Lists.newArrayList(tuplesIter);
+int batchSize = tuples.size();
+if(messages.size() != batchSize) {
+  throw new IllegalStateException(format("Expect same number of tuples 
and messages; |tuples|=%d, |messages|=%d",
+  tuples.size(), messages.size()));
+}
 
-final String indexPostfix = dateFormat.format(new Date());
-BulkRequest bulkRequest = new BulkRequest();
-for(JSONObject message: messages) {
+// create a document from each message
+List documents = new ArrayList<>();
+for(int i=0; i {
+  List successfulTuples = docs.stream().map(doc -> 
doc.getTuple()).collect(Collectors.toList());
--- End diff --

Same comment on streams as above.


---


[GitHub] metron pull request #1254: METRON-1849 Elasticsearch Index Write Functionali...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1254#discussion_r236530583
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/bulk/ElasticsearchBulkDocumentWriter.java
 ---
@@ -0,0 +1,184 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.bulk;
+
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.elasticsearch.client.ElasticsearchClient;
+import org.apache.metron.indexing.dao.update.Document;
+import org.elasticsearch.action.DocWriteRequest;
+import org.elasticsearch.action.bulk.BulkItemResponse;
+import org.elasticsearch.action.bulk.BulkRequest;
+import org.elasticsearch.action.bulk.BulkResponse;
+import org.elasticsearch.action.index.IndexRequest;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+/**
+ * Writes documents to an Elasticsearch index in bulk.
+ *
+ * @param  The type of document to write.
+ */
+public class ElasticsearchBulkDocumentWriter 
implements BulkDocumentWriter {
+
+/**
+ * A {@link Document} along with the index it will be written to.
+ */
+private class Indexable {
+D document;
+String index;
+
+public Indexable(D document, String index) {
+this.document = document;
+this.index = index;
+}
+}
+
+private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+private Optional onSuccess;
+private Optional onFailure;
+private ElasticsearchClient client;
+private List documents;
+
+public ElasticsearchBulkDocumentWriter(ElasticsearchClient client) {
+this.client = client;
+this.onSuccess = Optional.empty();
+this.onFailure = Optional.empty();
+this.documents = new ArrayList<>();
+}
+
+@Override
+public void onSuccess(SuccessListener onSuccess) {
+this.onSuccess = Optional.of(onSuccess);
+}
+
+@Override
+public void onFailure(FailureListener onFailure) {
+this.onFailure = Optional.of(onFailure);
+}
+
+@Override
+public void addDocument(D document, String index) {
+documents.add(new Indexable(document, index));
+LOG.debug("Adding document to batch; document={}, index={}", 
document, index);
+}
+
+@Override
+public void write() {
+try {
+// create an index request for each document
+List requests = documents
--- End diff --

Please be careful using streams. In general, but specifically here. Our ES 
endpoints are typically where we bottleneck in any tuning exercise and adding 
additional overhead may prove troublesome. I'd normally not stress too much on 
"premature optimization" but I think we're safely very much post-premature at 
this stage of the game with our Lucene stores.


https://jaxenter.com/java-performance-tutorial-how-fast-are-the-java-8-streams-118830.html
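
To make the suggestion concrete, here is a minimal sketch of the loop form next to the stream form. `TupleExtraction` and `Doc` are hypothetical stand-ins for illustration only, not classes from this PR, and whether the difference shows up in indexing throughput is something to measure rather than assume.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class TupleExtraction {

  /** Hypothetical stand-in for a document that remembers its source tuple. */
  static final class Doc {
    private final String tuple;

    Doc(String tuple) {
      this.tuple = tuple;
    }

    String getTuple() {
      return tuple;
    }
  }

  /** Stream form, similar in shape to the code under review. */
  static List<String> tuplesWithStream(List<Doc> docs) {
    return docs.stream().map(Doc::getTuple).collect(Collectors.toList());
  }

  /** Loop form: same result, with the output list sized up front. */
  static List<String> tuplesWithLoop(List<Doc> docs) {
    List<String> tuples = new ArrayList<>(docs.size());
    for (Doc doc : docs) {
      tuples.add(doc.getTuple());
    }
    return tuples;
  }

  public static void main(String[] args) {
    List<Doc> docs = Arrays.asList(new Doc("t1"), new Doc("t2"));
    // Both forms should produce equal lists.
    System.out.println(tuplesWithLoop(docs).equals(tuplesWithStream(docs)));
  }
}
```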


---


[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r236524985
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java
 ---
@@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws 
Exception {
 return es;
   }
 
-  protected static void loadTestData() throws ParseException, IOException {
+  protected static void loadTestData() throws Exception {
 ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
 
+// define the bro index template
+String broIndex = "bro_index_2017.01.01.01";
 JSONObject broTemplate = JSONUtils.INSTANCE.load(new 
File(broTemplatePath), JSONObject.class);
 addTestFieldMappings(broTemplate, "bro_doc");
-
es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
-.addMapping("bro_doc", 
JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
+es.getClient().admin().indices().prepareCreate(broIndex)
+.addMapping("bro_doc", 
JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
+
+// define the snort index template
+String snortIndex = "snort_index_2017.01.01.02";
 JSONObject snortTemplate = JSONUtils.INSTANCE.load(new 
File(snortTemplatePath), JSONObject.class);
 addTestFieldMappings(snortTemplate, "snort_doc");
-
es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
-.addMapping("snort_doc", 
JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
-
-BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
-.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
-JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
-for (Object o : broArray) {
-  JSONObject jsonObject = (JSONObject) o;
-  IndexRequestBuilder indexRequestBuilder = es.getClient()
-  .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
-  indexRequestBuilder = indexRequestBuilder.setId((String) 
jsonObject.get("guid"));
-  indexRequestBuilder = 
indexRequestBuilder.setSource(jsonObject.toJSONString());
-  indexRequestBuilder = indexRequestBuilder
-  .setTimestamp(jsonObject.get("timestamp").toString());
-  bulkRequest.add(indexRequestBuilder);
+es.getClient().admin().indices().prepareCreate(snortIndex)
+.addMapping("snort_doc", 
JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
+
+// setup the classes required to write the test data
+AccessConfig accessConfig = createAccessConfig();
+ElasticsearchClient client = 
ElasticsearchUtils.getClient(createGlobalConfig());
+ElasticsearchRetrieveLatestDao retrieveLatestDao = new 
ElasticsearchRetrieveLatestDao(client);
+ElasticsearchColumnMetadataDao columnMetadataDao = new 
ElasticsearchColumnMetadataDao(client);
+ElasticsearchRequestSubmitter requestSubmitter = new 
ElasticsearchRequestSubmitter(client);
+ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, 
accessConfig, retrieveLatestDao);
+ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, 
accessConfig, columnMetadataDao, requestSubmitter);
+
+// write the test documents for Bro
+List broDocuments = new ArrayList<>();
+for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
+  broDocuments.add(((JSONObject) broObject).toJSONString());
 }
-JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
-for (Object o : snortArray) {
-  JSONObject jsonObject = (JSONObject) o;
-  IndexRequestBuilder indexRequestBuilder = es.getClient()
-  .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
-  indexRequestBuilder = indexRequestBuilder.setId((String) 
jsonObject.get("guid"));
-  indexRequestBuilder = 
indexRequestBuilder.setSource(jsonObject.toJSONString());
-  indexRequestBuilder = indexRequestBuilder
-  .setTimestamp(jsonObject.get("timestamp").toString());
-  bulkRequest.add(indexRequestBuilder);
+es.add(updateDao, broIndex, "bro", broDocuments);
+
+// write the test documents for Snort
+List snortDocuments = new ArrayList<>();
+for (Object snortObject: (JSONArray) new 
JSONParser().parse(snortData)) {
+  snortDocuments.add(((JSONObject) snortObject).toJSONString()

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r236521989
  
--- Diff: 
metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java
 ---
@@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
 exception.expectMessage("Document contains at least one immense term 
in field=\"error_hash\"");
 getDao().update(errorDoc, Optional.of("error"));
   }
+
+  @Test
+  @Override
+  public void test() throws Exception {
--- End diff --

What do these changes have to do with the ES DAO read/write approach change? 
This class is testing Solr - should this be in a separate PR?


---


[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r236519370
  
--- Diff: 
metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java
 ---
@@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
 exception.expectMessage("Document contains at least one immense term 
in field=\"error_hash\"");
 getDao().update(errorDoc, Optional.of("error"));
   }
+
+  @Test
+  @Override
+  public void test() throws Exception {
+List> inputData = new ArrayList<>();
+for(int i = 0; i < 10;++i) {
+  final String name = "message" + i;
+  inputData.add(
+  new HashMap() {{
+put("source.type", SENSOR_NAME);
+put("name" , name);
+put("timestamp", System.currentTimeMillis());
+put(Constants.GUID, name);
+  }}
+  );
+}
+addTestData(getIndexName(), SENSOR_NAME, inputData);
+List> docs = null;
+for(int t = 0;t < MAX_RETRIES;++t, Thread.sleep(SLEEP_MS)) {
+  docs = getIndexedTestData(getIndexName(), SENSOR_NAME);
+  if(docs.size() >= 10) {
+break;
+  }
+}
+Assert.assertEquals(10, docs.size());
+//modify the first message and add a new field
+{
+  Map message0 = new HashMap(inputData.get(0)) {{
+put("new-field", "metron");
+  }};
+  String guid = "" + message0.get(Constants.GUID);
+  Document update = getDao().replace(new ReplaceRequest(){{
+setReplacement(message0);
+setGuid(guid);
+setSensorType(SENSOR_NAME);
+setIndex(getIndexName());
+  }}, Optional.empty());
+
+  Assert.assertEquals(message0, update.getDocument());
+  Assert.assertEquals(1, getMockHTable().size());
+  findUpdatedDoc(message0, guid, SENSOR_NAME);
+  {
+//ensure hbase is up to date
+Get g = new Get(HBaseDao.Key.toBytes(new HBaseDao.Key(guid, 
SENSOR_NAME)));
+Result r = getMockHTable().get(g);
+NavigableMap columns = 
r.getFamilyMap(CF.getBytes());
+Assert.assertEquals(1, columns.size());
+Assert.assertEquals(message0
+, JSONUtils.INSTANCE.load(new 
String(columns.lastEntry().getValue())
+, JSONUtils.MAP_SUPPLIER)
+);
+  }
+  {
+//ensure ES is up-to-date
--- End diff --

Isn't this a Solr test?


---


[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r236520379
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java
 ---
@@ -97,48 +118,81 @@ protected static InMemoryComponent startIndex() throws 
Exception {
 return es;
   }
 
-  protected static void loadTestData() throws ParseException, IOException {
+  protected static void loadTestData() throws Exception {
 ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
 
+// define the bro index template
+String broIndex = "bro_index_2017.01.01.01";
 JSONObject broTemplate = JSONUtils.INSTANCE.load(new 
File(broTemplatePath), JSONObject.class);
 addTestFieldMappings(broTemplate, "bro_doc");
-
es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
-.addMapping("bro_doc", 
JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
+es.getClient().admin().indices().prepareCreate(broIndex)
+.addMapping("bro_doc", 
JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
+
+// define the snort index template
+String snortIndex = "snort_index_2017.01.01.02";
 JSONObject snortTemplate = JSONUtils.INSTANCE.load(new 
File(snortTemplatePath), JSONObject.class);
 addTestFieldMappings(snortTemplate, "snort_doc");
-
es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
-.addMapping("snort_doc", 
JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
-
-BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
-.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
-JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
-for (Object o : broArray) {
-  JSONObject jsonObject = (JSONObject) o;
-  IndexRequestBuilder indexRequestBuilder = es.getClient()
-  .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
-  indexRequestBuilder = indexRequestBuilder.setId((String) 
jsonObject.get("guid"));
-  indexRequestBuilder = 
indexRequestBuilder.setSource(jsonObject.toJSONString());
-  indexRequestBuilder = indexRequestBuilder
-  .setTimestamp(jsonObject.get("timestamp").toString());
-  bulkRequest.add(indexRequestBuilder);
+es.getClient().admin().indices().prepareCreate(snortIndex)
+.addMapping("snort_doc", 
JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
+
+// setup the classes required to write the test data
+AccessConfig accessConfig = createAccessConfig();
+ElasticsearchClient client = 
ElasticsearchUtils.getClient(createGlobalConfig());
+ElasticsearchRetrieveLatestDao retrieveLatestDao = new 
ElasticsearchRetrieveLatestDao(client);
+ElasticsearchColumnMetadataDao columnMetadataDao = new 
ElasticsearchColumnMetadataDao(client);
+ElasticsearchRequestSubmitter requestSubmitter = new 
ElasticsearchRequestSubmitter(client);
+ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, 
accessConfig, retrieveLatestDao);
+ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, 
accessConfig, columnMetadataDao, requestSubmitter);
--- End diff --

Oh wow, do we not have a master factory for stitching these all together in 
the desired default manner? Not sure if this helps, but how about this -> 
https://github.com/apache/metron/blob/master/metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/dao/ElasticsearchDao.java#L97


---


[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r236521617
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java
 ---
@@ -194,35 +215,41 @@ public Client getClient() {
 return client;
   }
 
-  public BulkResponse add(String indexName, String sensorType, String... 
docs) throws IOException {
+  public void add(UpdateDao updateDao, String indexName, String 
sensorType, String... docs)
+  throws IOException, ParseException {
 List d = new ArrayList<>();
 Collections.addAll(d, docs);
-return add(indexName, sensorType, d);
+add(updateDao, indexName, sensorType, d);
   }
 
-  public BulkResponse add(String indexName, String sensorType, 
Iterable docs)
-  throws IOException {
-BulkRequestBuilder bulkRequest = getClient().prepareBulk();
-for (String doc : docs) {
-  IndexRequestBuilder indexRequestBuilder = getClient()
-  .prepareIndex(indexName, sensorType + "_doc");
-
-  indexRequestBuilder = indexRequestBuilder.setSource(doc);
-  Map esDoc = JSONUtils.INSTANCE
-  .load(doc, JSONUtils.MAP_SUPPLIER);
-  indexRequestBuilder.setId((String) esDoc.get(Constants.GUID));
-  Object ts = esDoc.get("timestamp");
-  if (ts != null) {
-indexRequestBuilder = 
indexRequestBuilder.setTimestamp(ts.toString());
-  }
-  bulkRequest.add(indexRequestBuilder);
-}
+  public void add(UpdateDao updateDao, String indexName, String 
sensorType, Iterable docs)
--- End diff --

Might it be better to just use IndexDao, which `extends UpdateDao, 
SearchDao, RetrieveLatestDao, ColumnMetadataDao`? To that end, if we're looking 
to route all of this through the ES component in that fashion, it might make 
sense to simply replace the internal `private Client client;` and instead use 
the new desired IndexDao for the proxied calls to ES. It could be set up at 
construction time of the component and remove the need to do the same thing for 
every test that uses the component class, unless they actually want to do 
something custom and pass in their own dao during init.
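
A rough sketch of what I have in mind, under loud assumptions: `SimpleIndexDao`, `InMemoryDao`, and `IndexComponent` are hypothetical, heavily simplified stand-ins, not Metron's actual `IndexDao` or `ElasticSearchComponent` signatures. The point is only that the component owns a default dao created at construction time, and a test can still inject its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ComponentWithDao {

  /** Hypothetical, simplified stand-in for an index DAO. */
  interface SimpleIndexDao {
    void write(String indexName, String sensorType, String doc);
  }

  /** In-memory DAO so the sketch runs without a search cluster. */
  static final class InMemoryDao implements SimpleIndexDao {
    final List<String> written = new ArrayList<>();

    @Override
    public void write(String indexName, String sensorType, String doc) {
      written.add(indexName + "/" + sensorType + "_doc:" + doc);
    }
  }

  /** Test component that owns its DAO instead of asking each test to build one. */
  static final class IndexComponent {
    private final SimpleIndexDao dao;

    /** Default wiring, done once at construction time. */
    IndexComponent() {
      this(new InMemoryDao());
    }

    /** Tests that need something custom pass in their own DAO. */
    IndexComponent(SimpleIndexDao dao) {
      this.dao = dao;
    }

    void add(String indexName, String sensorType, Iterable<String> docs) {
      for (String doc : docs) {
        dao.write(indexName, sensorType, doc);
      }
    }
  }

  public static void main(String[] args) {
    InMemoryDao dao = new InMemoryDao();
    IndexComponent es = new IndexComponent(dao);
    es.add("bro_index_2017.01.01.01", "bro", Arrays.asList("{\"guid\":\"a\"}"));
    System.out.println(dao.written);
  }
}
```

With this shape the individual tests only call `add(...)` and never construct the DAOs themselves.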


---


[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r236521838
  
--- Diff: 
metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java
 ---
@@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
 exception.expectMessage("Document contains at least one immense term 
in field=\"error_hash\"");
 getDao().update(errorDoc, Optional.of("error"));
   }
+
+  @Test
+  @Override
+  public void test() throws Exception {
+List> inputData = new ArrayList<>();
+for(int i = 0; i < 10;++i) {
+  final String name = "message" + i;
+  inputData.add(
+  new HashMap() {{
+put("source.type", SENSOR_NAME);
+put("name" , name);
+put("timestamp", System.currentTimeMillis());
+put(Constants.GUID, name);
+  }}
+  );
+}
+addTestData(getIndexName(), SENSOR_NAME, inputData);
+List> docs = null;
+for(int t = 0;t < MAX_RETRIES;++t, Thread.sleep(SLEEP_MS)) {
+  docs = getIndexedTestData(getIndexName(), SENSOR_NAME);
+  if(docs.size() >= 10) {
+break;
+  }
+}
+Assert.assertEquals(10, docs.size());
+//modify the first message and add a new field
+{
+  Map message0 = new HashMap(inputData.get(0)) {{
+put("new-field", "metron");
+  }};
+  String guid = "" + message0.get(Constants.GUID);
+  Document update = getDao().replace(new ReplaceRequest(){{
+setReplacement(message0);
+setGuid(guid);
+setSensorType(SENSOR_NAME);
+setIndex(getIndexName());
+  }}, Optional.empty());
+
+  Assert.assertEquals(message0, update.getDocument());
+  Assert.assertEquals(1, getMockHTable().size());
+  findUpdatedDoc(message0, guid, SENSOR_NAME);
+  {
+//ensure hbase is up to date
+Get g = new Get(HBaseDao.Key.toBytes(new HBaseDao.Key(guid, 
SENSOR_NAME)));
+Result r = getMockHTable().get(g);
+NavigableMap columns = 
r.getFamilyMap(CF.getBytes());
+Assert.assertEquals(1, columns.size());
+Assert.assertEquals(message0
+, JSONUtils.INSTANCE.load(new 
String(columns.lastEntry().getValue())
+, JSONUtils.MAP_SUPPLIER)
+);
+  }
+  {
+//ensure ES is up-to-date
+long cnt = 0;
+for (int t = 0; t < MAX_RETRIES && cnt == 0; ++t, 
Thread.sleep(SLEEP_MS)) {
+  docs = getIndexedTestData(getIndexName(), SENSOR_NAME);
+  cnt = docs
+  .stream()
+  .filter(d -> 
message0.get("new-field").equals(d.get("new-field")))
+  .count();
+}
+Assert.assertNotEquals("Data store is not updated!", cnt, 0);
+  }
+}
+//modify the same message and modify the new field
+{
+  Map message0 = new HashMap(inputData.get(0)) {{
+put("new-field", "metron2");
+  }};
+  String guid = "" + message0.get(Constants.GUID);
+  Document update = getDao().replace(new ReplaceRequest(){{
+setReplacement(message0);
+setGuid(guid);
+setSensorType(SENSOR_NAME);
+setIndex(getIndexName());
+  }}, Optional.empty());
+  Assert.assertEquals(message0, update.getDocument());
+  Assert.assertEquals(1, getMockHTable().size());
+  Document doc = getDao().getLatest(guid, SENSOR_NAME);
+  Assert.assertEquals(message0, doc.getDocument());
+  findUpdatedDoc(message0, guid, SENSOR_NAME);
+  {
+//ensure hbase is up to date
+Get g = new Get(HBaseDao.Key.toBytes(new HBaseDao.Key(guid, 
SENSOR_NAME)));
+Result r = getMockHTable().get(g);
+NavigableMap columns = 
r.getFamilyMap(CF.getBytes());
+Assert.assertEquals(2, columns.size());
+Assert.assertEquals(message0, JSONUtils.INSTANCE.load(new 
String(columns.lastEntry().getValue())
+, JSONUtils.MAP_SUPPLIER)
+);
+Assert.assertNotEquals(message0, JSONUtils.INSTANCE.load(new 
String(columns.firstEntry().getValue())
+, JSONUtils.MAP_SUPPLIER)
+);
+  }
+  {
+//ensure ES is up-to-date
--- End diff --

Solr


---


[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1247#discussion_r236440028
  
--- Diff: 
metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java
 ---
@@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws 
Exception {
 return es;
   }
 
-  protected static void loadTestData() throws ParseException, IOException {
+  protected static void loadTestData() throws Exception {
 ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
 
+// define the bro index template
+String broIndex = "bro_index_2017.01.01.01";
 JSONObject broTemplate = JSONUtils.INSTANCE.load(new 
File(broTemplatePath), JSONObject.class);
 addTestFieldMappings(broTemplate, "bro_doc");
-
es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
-.addMapping("bro_doc", 
JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
+es.getClient().admin().indices().prepareCreate(broIndex)
+.addMapping("bro_doc", 
JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
+
+// define the snort index template
+String snortIndex = "snort_index_2017.01.01.02";
 JSONObject snortTemplate = JSONUtils.INSTANCE.load(new 
File(snortTemplatePath), JSONObject.class);
 addTestFieldMappings(snortTemplate, "snort_doc");
-
es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
-.addMapping("snort_doc", 
JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
-
-BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
-.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
-JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
-for (Object o : broArray) {
-  JSONObject jsonObject = (JSONObject) o;
-  IndexRequestBuilder indexRequestBuilder = es.getClient()
-  .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
-  indexRequestBuilder = indexRequestBuilder.setId((String) 
jsonObject.get("guid"));
-  indexRequestBuilder = 
indexRequestBuilder.setSource(jsonObject.toJSONString());
-  indexRequestBuilder = indexRequestBuilder
-  .setTimestamp(jsonObject.get("timestamp").toString());
-  bulkRequest.add(indexRequestBuilder);
+es.getClient().admin().indices().prepareCreate(snortIndex)
+.addMapping("snort_doc", 
JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
+
+// setup the classes required to write the test data
+AccessConfig accessConfig = createAccessConfig();
+ElasticsearchClient client = 
ElasticsearchUtils.getClient(createGlobalConfig());
+ElasticsearchRetrieveLatestDao retrieveLatestDao = new 
ElasticsearchRetrieveLatestDao(client);
+ElasticsearchColumnMetadataDao columnMetadataDao = new 
ElasticsearchColumnMetadataDao(client);
+ElasticsearchRequestSubmitter requestSubmitter = new 
ElasticsearchRequestSubmitter(client);
+ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, 
accessConfig, retrieveLatestDao);
+ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, 
accessConfig, columnMetadataDao, requestSubmitter);
+
+// write the test documents for Bro
+List broDocuments = new ArrayList<>();
+for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
+  broDocuments.add(((JSONObject) broObject).toJSONString());
 }
-JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
-for (Object o : snortArray) {
-  JSONObject jsonObject = (JSONObject) o;
-  IndexRequestBuilder indexRequestBuilder = es.getClient()
-  .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
-  indexRequestBuilder = indexRequestBuilder.setId((String) 
jsonObject.get("guid"));
-  indexRequestBuilder = 
indexRequestBuilder.setSource(jsonObject.toJSONString());
-  indexRequestBuilder = indexRequestBuilder
-  .setTimestamp(jsonObject.get("timestamp").toString());
-  bulkRequest.add(indexRequestBuilder);
+es.add(updateDao, broIndex, "bro", broDocuments);
+
+// write the test documents for Snort
+List snortDocuments = new ArrayList<>();
+for (Object snortObject: (JSONArray) new 
JSONParser().parse(snortData)) {
+  snortDocuments.add(((JSONObject) snortObject).toJSONString()

[GitHub] metron pull request #1274: METRON-1887: Add logging to the ClasspathFunction...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1274#discussion_r236430858
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/common/utils/VFSClassloaderUtil.java
 ---
@@ -112,14 +112,18 @@ public static FileSystemManager generateVfs() throws 
FileSystemException {
* @throws FileSystemException
*/
   public static Optional configureClassloader(String paths) 
throws FileSystemException {
+LOG.debug("Configuring class loader with paths = {}", paths);
 if(paths.trim().isEmpty()) {
--- End diff --

We haven't encountered any instances of `paths` being null that I'm aware of 
- I'd like to start with this and add more checks as needed.
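
If a null check does turn out to be needed later, one possible shape is sketched below. `ClasspathArgs.hasPaths` is a hypothetical helper for illustration, not something in `VFSClassloaderUtil` today.

```java
class ClasspathArgs {

  /**
   * Returns true only when the argument is non-null and contains
   * something other than whitespace.
   */
  static boolean hasPaths(String paths) {
    return paths != null && !paths.trim().isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(hasPaths(null));
    System.out.println(hasPaths("   "));
    System.out.println(hasPaths("/opt/libs/*.jar"));
  }
}
```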


---


[GitHub] metron issue #1261: METRON-1860 [WIP] new developer option for ansible in do...

2018-11-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1261
  
@JonZeolla - would it make sense to add ShellCheck to Travis? Seems like a 
lot of potentially useful detail here.


---


[GitHub] metron issue #1249: METRON-1815: Separate metron-parsers into metron-parsers...

2018-11-20 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1249
  
@justinleet @nickwallen - Doesn't matter to me either way. But again, 
regardless of what you do, please just add/create an appropriate description to 
each pom `description` element, provide a sensible `name`, and add something in 
the module's README to briefly describe what the project does and what belongs 
there.


---


[GitHub] metron pull request #1245: METRON-1795: Initial Commit for Regular Expressio...

2018-11-20 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1245#discussion_r235117335
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/Constants.java
 ---
@@ -127,5 +127,40 @@ public String getType() {
 }
   }
 
+   public enum ParserConfigConstants {
--- End diff --

Agreed on the previous change regarding use of `static`. In addition, I 
think we want to move this list of parser config constants. The 
constants you've added are very specific to this function, and the `Constants` 
class is really intended for more global scope items. You can probably just 
make this an inner enum in your `RegexParser` class as it's very specific to 
that class. Alternatively, you might take a look at @merrimanr's PR for 
Stellar REST calls for an example of where you can put configuration if you 
need something more complex - 
https://github.com/apache/metron/pull/1250/files#diff-1f3a2a3b1b044494c022cca77223c182.
 Again, I think you're best off using an inner enum in this case.
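
As a rough sketch of the inner enum approach, assuming a parser class along the lines of `RegexParser`. The class name `RegexParserSketch` and the two config keys below are made up for illustration and are not taken from the PR.

```java
import java.util.Map;

// Hypothetical stand-in for the parser under review.
class RegexParserSketch {

  // Config keys live next to the only class that reads them.
  // The key names here are illustrative assumptions.
  enum ParserConfigConstants {
    RECORD_TYPE_REGEX("recordTypeRegex"),
    MESSAGE_HEADER_REGEX("messageHeaderRegex");

    private final String key;

    ParserConfigConstants(String key) {
      this.key = key;
    }

    String getKey() {
      return key;
    }

    // Looks up this key in a parser config map; returns null when it is absent.
    Object get(Map<String, Object> config) {
      return config.get(key);
    }
  }

  private final String recordTypeRegex;

  RegexParserSketch(Map<String, Object> config) {
    Object value = ParserConfigConstants.RECORD_TYPE_REGEX.get(config);
    this.recordTypeRegex = value == null ? null : value.toString();
  }

  String getRecordTypeRegex() {
    return recordTypeRegex;
  }
}
```

Keeping the enum inside the parser means the global `Constants` class stays limited to items that are shared across modules.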


---


[GitHub] metron issue #1274: METRON-1887: Add logging to the ClasspathFunctionResolve...

2018-11-19 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1274
  
Note - github is showing an extra commit from Apache for Justin that hasn't 
synced yet. Should hopefully be in sync soon.


---


[GitHub] metron pull request #1274: METRON-1887: Add logging to the ClasspathFunction...

2018-11-19 Thread mmiklavc
GitHub user mmiklavc opened a pull request:

https://github.com/apache/metron/pull/1274

METRON-1887: Add logging to the ClasspathFunctionResolver

## Contributor Comments

https://issues.apache.org/jira/browse/METRON-1887

From the Jira description:
We had a user reporting non-deterministic NPE's here - 
https://github.com/apache/metron/blob/master/metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/resolver/ClasspathFunctionResolver.java#L250.
 The purpose of this ticket is to more gracefully handle the NPE and add some 
additional debugging information to better track down the source of this issue 
should it come up again.


## Pull Request Checklist

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- n/a Have you written or updated unit tests and/or integration tests to 
verify your changes?
- n/a If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- n/a Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and then verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

#### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mmiklavc/metron METRON-1887-classpath-resolver

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1274.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1274


commit bfa7491f160abc3e01bb1c6f1c3927eb5cec7021
Author: justinleet 
Date:   2018-11-19T14:11:29Z

METRON-1872 Move rat plugin away from snapshot version (justinleet) closes 
apache/metron#1264

commit 0e50720da2c589ff0956164dfc71a12790205be4
Author: Michael Miklavcic 
Date:   2018-11-19T22:18:26Z

METRON-1887: Add logging to the ClasspathFunctionResolver




---


[GitHub] metron issue #1268: METRON-1877: Nested IF ELSE statements can cause parse e...

2018-11-19 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1268
  
Hm, this looks possibly related to, but not the same as what was of concern 
in 1247. In the client migration, there was some uncertainty in the 
ElasticsearchSearchIntegrationTest. But this failure appears to be in the 
indexing integration test. It's actually an error, not a normal failure. And it 
appears to happen before any tests can start to run. Can you open a ticket? We 
should track the intermittent failure, but I think it's unrelated to this PR.

Incidentally, I caught something in the logs I never noticed before. `90399 
[main] INFO  o.e.n.Node - version[5.6.2-SNAPSHOT], pid[21275], 
build[Unknown/Unknown], OS[Linux/4.4.0-101-generic/amd64], JVM[Oracle 
Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_151/25.151-b12]`. This 
might also be worth looking into as I'm not sure if/why we should be depending 
on a SNAPSHOT version.


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-11-14 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
I'm about to merge this and wanted to give some added feedback. @anandsubbu 
and I made some additional changes to the tuning options, specifically setting 
`topology.acker.executors` equal to the number of inbound Kafka topic 
partitions, and we were able to see runs with no failures.


---


[GitHub] metron issue #1249: METRON-1815: Separate metron-parsers into metron-parsers...

2018-11-14 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1249
  
I don't want to bog this down too much. I'm good with either the recent rev 
you proposed or simply adding the appropriate notes in the pom internals and 
dev docs. Chef's choice.


---


[GitHub] metron issue #1249: METRON-1815: Separate metron-parsers into metron-parsers...

2018-11-13 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1249
  
I'm on board with the most recent proposal, just one small nit. What's the 
significance of using the different, yet still extremely similar words "parser" 
and "parsing?" I'd almost be more inclined to use a term like "normalize," 
since that's effectively the other major function of our parser topology.


---


[GitHub] metron issue #1261: METRON-1860 [WIP] new developer option for ansible in do...

2018-11-13 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1261
  
@ottobackwards thanks for the submission. Per recent community comments, it 
sounds like we could use some improvements to how our build/deploy dep versions 
interact with other tooling that may require other versions of things. 

Are these the correct dependency listings for each host/container?

The Docker container will be pre-configured with:

- Java 8
- Ansible 2.4.0+
- Python 2.7
- Maven 3.3.9
- C++11 compliant compiler, like GCC

And the developer host machine would now need to manage versions of:

- Vagrant 2.0+
- Vagrant Hostmanager Plugin
- Virtualbox 5.0+
- Docker

The deployment goes through these steps:

1. You mount m2 repo and your code dir in the Docker instance.
2. build_and_run optionally spins up Vagrant in VirtualBox.
3. build_and_run optionally creates Docker instance with pre-reqs for 
building and deploying Metron.
4. build_and_run runs the build and then calls the Ansible deployment 
scripts from within Docker.

At the end, you have an ephemeral Docker instance + Vagrant instance that 
has the running Metron instance?


---


[GitHub] metron issue #1258: METRON-1864 fix issue where daylight savings breaks test...

2018-11-08 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1258
  
@justinleet We do use mock clocks in other parts of the platform, but for 
the sake of this fix I think this is fine. @ottobackwards Can we just throw a 
comment in the test so it's obvious that this is intentionally without 
assertions here? 
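
A minimal sketch of the mock clock idea, assuming the code under test can accept a `java.time.Clock` rather than reading system time directly. `FixedClockSketch` and its methods are hypothetical names, not existing Metron classes. It pins the clock on either side of the 2018 US fall-back transition so the result does not depend on when the test runs.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical example: time-dependent logic reads from an injected Clock.
class FixedClockSketch {

  // Builds a clock frozen at the given UTC instant, reporting in the given zone.
  static Clock fixedAt(String isoInstant, String zoneId) {
    return Clock.fixed(Instant.parse(isoInstant), ZoneId.of(zoneId));
  }

  // Stand-in for the logic under test: the local hour of day per the clock.
  static int hourOfDay(Clock clock) {
    return ZonedDateTime.now(clock).getHour();
  }
}
```

For example, `hourOfDay(fixedAt("2018-11-04T05:30:00Z", "America/New_York"))` and `hourOfDay(fixedAt("2018-11-04T06:30:00Z", "America/New_York"))` both return 1, because the second instant falls after clocks were set back an hour. A test built this way can keep its assertions across daylight savings changes.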


---


[GitHub] metron issue #1258: METRON-1864 fix issue where daylight savings breaks test...

2018-11-08 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1258
  
Thanks @ottobackwards and @justinleet!

Did you mean to completely remove the assertions for that test? It seems 
like we would still want to validate the results somehow.

Also, since the context for this PR came from discussions in Slack that 
don't show up on the dev list, can you add a comment about the 
motivation/origin for this fix?


---


[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-08 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1250
  
@merrimanr Pending a successful Travis run, I'm +1 by inspection. Thanks!


---


[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-08 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1250
  
Travis failure does not appear related here


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-11-07 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
Thanks @nickwallen. I pushed one more commit to make a note of this Jira in 
Upgrading.md. I verified it looks correct in the site-book and checked it off 
on the Jira tasks as well. I guess just a quick look (it's like 5 lines) when 
you get a moment would be great.

The last remaining bit before I merge this will be sharing some results 
around the performance/regression testing performed by @anandsubbu.


---


[GitHub] metron pull request #1242: METRON-1834: Migrate Elasticsearch from Transport...

2018-11-07 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1242#discussion_r231704912
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/client/ElasticsearchClient.java
 ---
@@ -0,0 +1,147 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.client;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Splitter;
+import com.google.common.collect.Iterables;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.http.HttpEntity;
+import org.apache.http.entity.StringEntity;
+import org.apache.metron.common.utils.JSONUtils;
+import org.apache.metron.elasticsearch.utils.FieldMapping;
+import org.apache.metron.elasticsearch.utils.FieldProperties;
+import org.elasticsearch.client.Response;
+import org.elasticsearch.client.RestClient;
+import org.elasticsearch.client.RestHighLevelClient;
+
+/**
+ * Wrapper around the Elasticsearch REST clients. Exposes capabilities of 
the low and high-level clients.
+ */
+public class ElasticsearchClient implements AutoCloseable{
+  private RestClient lowLevelClient;
+  private RestHighLevelClient highLevelClient;
+
+  public ElasticsearchClient(RestClient lowLevelClient, 
RestHighLevelClient highLevelClient) {
+this.lowLevelClient = lowLevelClient;
+this.highLevelClient = highLevelClient;
+  }
+
+  public RestClient getLowLevelClient() {
+return lowLevelClient;
+  }
+
+  public RestHighLevelClient getHighLevelClient() {
+return highLevelClient;
+  }
+
+  @Override
+  public void close() throws IOException {
+if(lowLevelClient != null) {
+  lowLevelClient.close();
+}
+  }
+
+  public void putMapping(String index, String type, String source) throws 
IOException {
+HttpEntity entity = new StringEntity(source);
+Response response = lowLevelClient.performRequest("PUT"
+, "/" + index + "/_mapping/" + type
+, Collections.emptyMap()
+, entity
+);
+
+if(response.getStatusLine().getStatusCode() != 200) {
+  String responseStr = 
IOUtils.toString(response.getEntity().getContent());
+  throw new IllegalStateException("Got a " + 
response.getStatusLine().getStatusCode() + " due to " + responseStr);
+}
+  }
+
+  public String[] getIndices() throws IOException {
--- End diff --

Not unit tests, but the integration tests definitely cover this bit.


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-11-07 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
## Testing Notes For X-Pack AND SSL (I know, gettin fancy):

1. Install X-Pack

This will also install the certgen tool which can be used for 
generating certificates for SSL.

```
/usr/share/elasticsearch/bin/elasticsearch-plugin install x-pack
```

1. Setup X-Pack User

username = xpack_client_user 
password = changeme 
role = superuser

```
sudo /usr/share/elasticsearch/bin/x-pack/users useradd xpack_client_user -p changeme -r superuser
```

1. Setup cert using the ES certgen tools

Run the certgen tool. You can store the certs in 
/etc/elasticsearch/ssl/certs.zip when prompted. Use "node1" as the instance 
name in fulldev because you'll want it to match your host. You can leave the IP 
and DNS details blank.

```
/usr/share/elasticsearch/bin/x-pack/certgen --pass
```

Extract the certs

```
cd /etc/elasticsearch/ssl
unzip certs.zip
# I flattened all ca/certs so it looks as follows
ls -1 /etc/elasticsearch/ssl
ca.crt
ca.key
esnode.crt
esnode.key
```

1. Setup Elasticsearch to use the certs

https://www.elastic.co/guide/en/x-pack/5.6/ssl-tls.html

1. Modify /etc/elasticsearch/elasticsearch.yml

```
xpack.ssl.key: /etc/elasticsearch/ssl/esnode.key
xpack.ssl.certificate: /etc/elasticsearch/ssl/esnode.crt
xpack.ssl.certificate_authorities: [ "/etc/elasticsearch/ssl/ca.crt" ]
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
```

1. Setup the client truststore

1. Import the Certificate Authority (CA). 

* Specify an alias of your choosing. I chose "elasticCA". 
* You'll also be prompted for a password, which must be at least 6 
characters. I used "apachemetron".
* When prompted to "Trust this certificate?" type yes and hit enter.

```
keytool -import -alias elasticCA -file /etc/elasticsearch/ssl/ca.crt -keystore clienttruststore.jks
```

1. Import the data node certificate.

* Specify an alias of your choosing. In fulldev use "node1" as it 
will need to match your hostname.
* Enter the password that you set when creating the truststore.

```
keytool -importcert -keystore clienttruststore.jks -alias node1 -file /etc/elasticsearch/ssl/esnode.crt
```

1. Put the truststore in the storm user home dir for the purpose of our 
tests.

```
mv clienttruststore.jks /home/storm/
chown storm:hadoop /home/storm/clienttruststore.jks
```

1. Configure the Elasticsearch client in Metron

1. Load the passwords in HDFS for x-pack and the truststore

```
echo changeme > /tmp/xpack-password
echo apachemetron > /tmp/truststore-password
sudo -u hdfs hdfs dfs -mkdir /apps/metron/elasticsearch/
sudo -u hdfs hdfs dfs -put /tmp/xpack-password /apps/metron/elasticsearch/
sudo -u hdfs hdfs dfs -put /tmp/truststore-password /apps/metron/elasticsearch/
sudo -u hdfs hdfs dfs -chown metron:metron /apps/metron/elasticsearch/*
```

1. Modify the Metron global config with the SSL and X-Pack properties

* Pull down the latest global config

```
$METRON_HOME/bin/zk_load_configs.sh -m PULL -o $METRON_HOME/config/zookeeper -z $ZOOKEEPER -f

```

* Update the configuration by adding the es.client.settings for 
xpack and SSL.

```
"es.client.settings" : {
"xpack.username" : "xpack_client_user",
"xpack.password.file" : 
"/apps/metron/elasticsearch/xpack-password",
"ssl.enabled" : true,
"keystore.type" : "jks",
"keystore.path" : "/home/storm/clienttruststore.jks",
"keystore.password.file" : 
"/apps/metron/elasticsearch/truststore-password"
}
```

* Push the changes to Zookeeper

```
$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper -z $ZOOKEEPER
# Con

[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-11-07 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
> @mmiklavc Keep in mind that I already have the open PR that refactors the 
data load in those tests to use the `ElasticsearchUpdateDao` #1247. I will work 
through the conflicts after this goes in.
> 
> Are you happy with this now? Is it done done?

This is good to go now - I just finished testing SSL.


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-11-06 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
@nickwallen I updated the ElasticsearchSearchIntegrationTest a bit. I'm now 
using the new client to setup the bro/snort templates, create the indices, and 
load the data.


---


[GitHub] metron pull request #1242: METRON-1834: Migrate Elasticsearch from Transport...

2018-11-06 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1242#discussion_r231369990
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/ElasticsearchWriter.java
 ---
@@ -76,7 +76,8 @@ public BulkWriterResponse write(String sensorType, 
WriterConfiguration configura
 FieldNameConverter fieldNameConverter = 
FieldNameConverters.create(sensorType, configurations);
 
 final String indexPostfix = dateFormat.format(new Date());
-BulkRequestBuilder bulkRequest = client.prepareBulk();
+BulkRequest bulkRequest = new BulkRequest();
+//BulkRequestBuilder bulkRequest = client.prepareBulk();
--- End diff --

Done


---


[GitHub] metron issue #1251: METRON-1853: Add shutdown hook to Stellar BaseFunctionRe...

2018-11-06 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1251
  
@ottobackwards Any other feedback on this?


---


[GitHub] metron issue #1251: METRON-1853: Add shutdown hook to Stellar BaseFunctionRe...

2018-11-06 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1251
  
### Testing

#### Create a custom Stellar function

See the following for reference - 
https://github.com/apache/metron/blob/master/metron-stellar/stellar-common/3rdPartyStellar.md

1. Setup project dirs

```
mkdir custom-stellar && cd custom-stellar
mkdir -p src/main/java/com/thirdparty/stellar
```

1. Create a pom.xml

```
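<project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.thirdparty</groupId>
  <artifactId>stellar-funcs</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>Stellar Functions</name>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.metron</groupId>
      <artifactId>stellar-common</artifactId>
      <version>0.6.1</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>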
```

1. Create a custom Stellar function

```
package com.thirdparty.stellar;

import org.apache.metron.stellar.common.utils.ConversionUtils;
import org.apache.metron.stellar.dsl.Context;
import org.apache.metron.stellar.dsl.ParseException;
import org.apache.metron.stellar.dsl.Stellar;
import org.apache.metron.stellar.dsl.StellarFunction;

import java.io.IOException;
import java.util.List;

public class StellarCustom {

  @Stellar(name = "NOW",
      description = "Simple now time",
      params = {
          "throwException - whether we should throw an exception on close()"
      },
      returns = "Time in millis"
  )
  public static class Now implements StellarFunction {

    private boolean isInitialized = false;
    private boolean throwException = false;

    public Object apply(List<Object> args, Context context) throws ParseException {
      // An optional first argument controls whether close() should fail.
      if (args.size() == 1) {
        throwException = ConversionUtils.convert(args.get(0), Boolean.class);
      }
      return System.currentTimeMillis();
    }

    public void initialize(Context context) {
      try {
        System.out.println("Initializing function.");
      } finally {
        isInitialized = true;
      }
    }

    public boolean isInitialized() {
      return isInitialized;
    }

    public void close() throws IOException {
      System.out.println("Test function called close()!");
      if (throwException) {
        NullPointerException cause = new NullPointerException("Don't point at me!");
        throw new IOException("Something icky happened.", cause);
      }
    }
  }

}
```

1. Build it

`mvn clean install`

1. Ship it

`scp custom-stellar/target/stellar-funcs-1.0-SNAPSHOT.jar root@node1:/tmp/`

1. Setup HDFS stellar

```
sudo -u hdfs hdfs dfs -mkdir /apps/metron/stellar
sudo -u hdfs hdfs dfs -put /tmp/stellar-funcs-1.0-SNAPSHOT.jar /apps/metron/stellar
sudo -u hdfs hdfs dfs -chown -R metron:metron /apps/metron/stellar
```

1. Setup global config for Stellar HDFS location

```
$METRON_HOME/bin/zk_load_configs.sh -m PULL -o $METRON_HOME/config/zookeeper -z $ZOOKEEPER -f
vim $METRON_HOME/config/zookeeper/global.json
# add this -> "stellar.function.paths" : "hdfs://node1:8020/apps/metron/stellar/.*.jar"
# update global.json in ZK
$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper -z $ZOOKEEPER
```

1. Setup a parser or enrichment with the new function. (more specific 
details to come)

Don't want an exception thrown 
`NOW(false)`
Restart parser or enrichment using this function
Should get data still
Stop the topology - should see a notice about shutdown in the logs.

1. Same as before but throw exception on shutdown. (more specific details 
to come)

WANT an exception thrown 
`NOW(true)`
 

[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-11-06 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
Adapted from 
https://github.com/apache/metron/pull/840#issuecomment-349103038

# Test Script

Run full dev and verify you see data populating the alerts UI.

Testing Instructions beyond the normal smoke test (i.e. letting data
flow through to the indices and checking them).

# Preliminaries

Setup env vars. I like to do something like the following:
```
echo export METRON_HOST=node1 >> /root/.bashrc && \
echo export HDP_HOME=/usr/hdp/current >> /root/.bashrc && \
echo export KAFKA_HOME=/usr/hdp/current/kafka-broker >> /root/.bashrc && \
export SOLR_VERSION="6.6.2" && \
echo export SOLR_VERSION="$SOLR_VERSION" >> /root/.bashrc && \
echo export SOLR_HOME="/var/solr/solr-\${SOLR_VERSION}" >> /root/.bashrc && \
echo export ELASTIC_HOME="/usr/share/elasticsearch" >> /root/.bashrc && \
echo export KIBANA_HOME="/usr/share/kibana" >> /root/.bashrc && \
echo export ZOOKEEPER=\${METRON_HOST}:2181 >> /root/.bashrc && \
echo export BROKERLIST=\${METRON_HOST}:6667 >> /root/.bashrc && \
echo export STORM_UI=http://\${METRON_HOST}:8744 >> /root/.bashrc && \
echo export ELASTIC=http://\${METRON_HOST}:9200 >> /root/.bashrc && \
echo export ES_HOST=http://\${METRON_HOST}:9200 >> /root/.bashrc && \
echo export KIBANA=http://\${METRON_HOST}:5000 >> /root/.bashrc && \
export METRON_VERSION="0.6.1" && \
echo export METRON_VERSION="$METRON_VERSION" >> /root/.bashrc && \
echo export METRON_HOME="/usr/metron/\${METRON_VERSION}" >> /root/.bashrc && \
source /root/.bashrc 
```

# Deploy the dummy parser
* Edit `$METRON_HOME/config/zookeeper/parsers/dummy.json`:
```
{
  "parserClassName":"org.apache.metron.parsers.json.JSONMapParser",
  "sensorTopic":"dummy"
}
```
* Create the dummy kafka topic:
  `/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --create --topic dummy --partitions 1 --replication-factor 1`
* Persist config changes: `$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper -z node1:2181`
* Start via `$METRON_HOME/bin/start_parser_topology.sh -k node1:6667 -z node1:2181 -s dummy`

# Send dummy data through
* Edit `~/msg.json` with the following content:
```
{ "guid" : "guid0", "sensor.type" : "dummy", "timestamp" : 100 }
```
* Send `msg.json` through to kafka via `cat ~/msg.json | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic dummy`
* Validate data has been written to the index:
```
curl -XPOST 'http://localhost:9200/dummy*/_search?pretty' 
```

## Test Case: Update via patch
* Patch the message in ES and create a new field 'project' by executing
  the following:
```
curl -u user:password -X PATCH --header 'Content-Type: application/json' --header 'Accept: */*' -d '{
  "guid" : "guid0",
  "sensorType" : "dummy",
  "patch" : [
    {
      "op": "add",
      "path": "/project",
      "value": "metron"
    }
  ]
}' 'http://node1:8082/api/v1/update/patch'
```
* Validate that the message has a field 'project':
```
curl -XPOST 'http://localhost:9200/dummy*/_search?pretty' -d '
{
  "_source" : [ "project" ]
}
'
```
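
To illustrate what the patch request above is expected to do to the document, here is a simplified sketch of a single top-level `add` operation applied to a map. `PatchSketch` is a hypothetical helper and not how Metron applies patches; it only handles paths of the form `/field`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a JSON Patch style "add" on a flat document.
class PatchSketch {

  // Returns a copy of doc with the field named by path set to value.
  // Only top-level paths such as "/project" are supported in this sketch.
  static Map<String, Object> applyAdd(Map<String, Object> doc, String path, Object value) {
    if (path == null || !path.startsWith("/") || path.length() < 2
        || path.indexOf('/', 1) >= 0) {
      throw new IllegalArgumentException(
          "Only top-level paths like /field are supported: " + path);
    }
    Map<String, Object> patched = new HashMap<>(doc);
    patched.put(path.substring(1), value);
    return patched;
  }
}
```

Applied to the dummy message with path `/project` and value `metron`, the returned map keeps `guid`, `sensor.type`, and `timestamp` and gains a `project` field, which is what the `_search` validation step checks for.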

## Test Case: Update via replace 
* Replace the message in ES, making a couple of modifications:
  * new field `new_field` == "brand new"
  * modified `timestamp` == 7
Execute the following:
```
curl -u user:password -X POST --header 'Content-Type: application/json' --header 'Accept: */*' -d '{
  "guid" : "guid0",
  "sensorType" : "dummy",
  "replacement" : {
    "source:type": "dummy",
    "guid" : "guid0",
    "new_field" : "brand new",
    "timestamp" : 7
  }
}' 'http://node1:8082/api/v1/update/replace'
```
* Validate that the message has a field 'new_field':
```
curl -XPOST 'http://localhost:9200/dummy*/_search?pretty' -d '
{
  "_source" : [ "new_field", "timestamp" ]
}
'

[GitHub] metron issue #1253: METRON-1857 Fix Metaalert Nested Alert Field Name in Ind...

2018-11-05 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1253
  
lgtm, +1 by inspection.


---


[GitHub] metron issue #1251: METRON-1853: Add shutdown hook to Stellar BaseFunctionRe...

2018-11-05 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1251
  
@ottobackwards Ok, pushed out an update for that change along with 
accompanying unit test.


---


[GitHub] metron issue #1251: METRON-1853: Add shutdown hook to Stellar BaseFunctionRe...

2018-11-05 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1251
  
> Should the resolver throw if there is a call after it is closed?
> or just return nothing? Do we have a test for that? Either way we should.

What do you think of a no-op on multiple invocations of close? This would 
be consistent with the `Closeable` interface.


```
/**
 * Closes this stream and releases any system resources associated
 * with it. If the stream is already closed then invoking this
 * method has no effect.
...
 */
public void close() throws IOException;
```
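
A minimal sketch of what a no-op on repeated `close()` calls could look like. `IdempotentCloseSketch` is a hypothetical stand-in, not the actual resolver, and the counter only exists to make the behavior observable.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example of a Closeable whose close() is safe to call repeatedly.
class IdempotentCloseSketch implements Closeable {

  private final AtomicBoolean closed = new AtomicBoolean(false);
  private int releaseCount = 0;

  // Narrower than Closeable.close(): this sketch never throws IOException.
  @Override
  public void close() {
    // Only the first caller flips the flag; later calls return immediately.
    if (!closed.compareAndSet(false, true)) {
      return;
    }
    releaseCount++; // stands in for releasing the underlying resources
  }

  boolean isClosed() {
    return closed.get();
  }

  int getReleaseCount() {
    return releaseCount;
  }
}
```

Calling `close()` twice on one instance leaves `getReleaseCount()` at 1, which is also the shape a unit test for this behavior could take.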



---


[GitHub] metron issue #1252: METRON-1855: Make unified enrichment topology the defaul...

2018-11-05 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1252
  
Pertaining to the topology args, our elasticsearch_master.py script always 
passes in args based on the active topology. I also added a note for changing 
the setting in Ambari as well as the manual process, per your suggestion, to 
Upgrading.md.

```
# which enrichment topology needs started?
if self.__params.enrichment_topology == "Unified":
    topology_flux = "{0}/flux/enrichment/remote-unified.yaml".format(self.__params.metron_home)
    topology_props = "{0}/config/enrichment-unified.properties".format(self.__params.metron_home)
elif self.__params.enrichment_topology == "Split-Join":
    topology_flux = "{0}/flux/enrichment/remote-splitjoin.yaml".format(self.__params.metron_home)
    topology_props = "{0}/config/enrichment-splitjoin.properties".format(self.__params.metron_home)
else:
    raise Fail("Unexpected enrichment topology; name=" + self.__params.enrichment_topology)

# start the topology
start_cmd_template = """{0}/bin/start_enrichment_topology.sh --remote {1} --filter {2}"""
Logger.info('Starting ' + self.__enrichment_topology)
start_cmd = start_cmd_template.format(self.__params.metron_home, topology_flux, topology_props)
Execute(start_cmd, user=self.__params.metron_user, tries=3, try_sleep=5, logoutput=True)
```


---


[GitHub] metron issue #1252: METRON-1855: Make unified enrichment topology the defaul...

2018-11-05 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1252
  
@nickwallen Let me know what you think of the latest changes.

Side note, I just noticed this new "Resolve Conversation" option.


---


[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

2018-11-05 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1252#discussion_r230869589
  
--- Diff: metron-platform/metron-enrichment/README.md ---
@@ -76,6 +62,19 @@ intel bolt, the configurations will be taken from the 
respective join bolt
 parallelism.  When proper ambari support for this is added, we will add
 its own property.
 
+### Split-Join Enrichment Topology
+
+The now-deprecated split/join topology is also available and performs 
enrichments in parallel.
+This poses some issues in terms of ease of tuning and reasoning about 
performance.
+
+![Architecture](enrichment_arch.png)
+
+ Using It
+
+In order to use the older, deprecated topology, you will need to
+* Edit `$METRON_HOME/bin/start_enrichment_topology.sh` and adjust it to 
use `remote-splitjoin.yaml` instead of `remote-unified.yaml`
--- End diff --

Nevermind, just re-read that suggestion. Sounds good to me @nickwallen.


---


[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

2018-11-05 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1252#discussion_r230867139
  
--- Diff: 
metron-platform/metron-enrichment/src/main/scripts/start_enrichment_topology.sh 
---
@@ -20,7 +20,7 @@ METRON_VERSION=${project.version}
 METRON_HOME=/usr/metron/$METRON_VERSION
 TOPOLOGY_JAR=${project.artifactId}-$METRON_VERSION-uber.jar
 
-# there are two enrichment topologies.  by default, the split-join 
enrichment topology is executed
+# There are two enrichment topologies. By default, the unified enrichment 
topology is executed. Split-join is now deprecated.
 SPLIT_JOIN_ARGS="--remote 
$METRON_HOME/flux/enrichment/remote-splitjoin.yaml --filter 
$METRON_HOME/config/enrichment-splitjoin.properties"
 UNIFIED_ARGS="--remote $METRON_HOME/flux/enrichment/remote-unified.yaml 
--filter $METRON_HOME/config/enrichment-unified.properties"
--- End diff --

> @mmiklavc Should we add a small blurb in `Upgrading.md` to document how 
users who are upgrading can revert to the existing functionality?

Hrm, our upgrading doc appears to list items on a per-release basis, and we 
don't have anything new since 0.5.0. Here's what I'll do. Add an upgrading from 
0.6.0 to 0.6.1 notice and make sure we change the version accordingly in the 
next release.


---


[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

2018-11-05 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1252#discussion_r230864542
  
--- Diff: metron-platform/metron-enrichment/README.md ---
@@ -76,6 +62,19 @@ intel bolt, the configurations will be taken from the 
respective join bolt
 parallelism.  When proper ambari support for this is added, we will add
 its own property.
 
+### Split-Join Enrichment Topology
+
+The now-deprecated split/join topology is also available and performs 
enrichments in parallel.
+This poses some issues in terms of ease of tuning and reasoning about 
performance.
+
+![Architecture](enrichment_arch.png)
+
+ Using It
+
+In order to use the older, deprecated topology, you will need to
+* Edit `$METRON_HOME/bin/start_enrichment_topology.sh` and adjust it to 
use `remote-splitjoin.yaml` instead of `remote-unified.yaml`
--- End diff --

This actually wasn't net-new, it's the way we currently expose the choice. 
I would prefer to keep it as-is and manual because the desire is NOT to have 
customers use the deprecated topology.


---


[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

2018-11-03 Thread mmiklavc
GitHub user mmiklavc opened a pull request:

https://github.com/apache/metron/pull/1252

METRON-1855: Make unified enrichment topology the default and deprecate 
split-join

## Contributor Comments

https://issues.apache.org/jira/browse/METRON-1855

Deprecates the split-join topology in favor of the simpler, more performant 
unified enrichment topology. Encompasses the following changes:

- The MPack is configured with the Unified topology as the new default
- Unified topology properties appear first in the Ambari Enrichment config 
section and in the enrichment type dropdown.
- Documentation changed to emphasize the unified topology
- Performance docs make note of the split-join topology deprecation in 
favor of the unified topology.

I'll make another note of it in the DISCUSS thread, but I think we should 
make a goal of marking split-join deprecated in this upcoming release and 
removing it altogether shortly thereafter.

DISCUSS thread - 
https://lists.apache.org/thread.html/6cfc883de28a5cb41f26d0523522d4b93272ac954e5713c80a35675e@%3Cdev.metron.apache.org%3E

**Testing**

- Spin up full dev
- Verify that the unified enrichment topology is now running by default. 
- View the Storm UI topology details for enrichment and verify that there 
is no longer a split/join bolt.
- Verify the unified enrichment properties show up first under the 
enrichments config section and that "unified" is active in the enrichment type 
dropdown.
- Change enrichment type to Split Join and verify that the deprecated 
topology is still able to be run.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- n/a If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and then verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mmiklavc/metron deprecate-split-join

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1252.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1252


commit 25e2b657b8f077006cabd1f79d1e7bb2aea28b13
Author: Michael Miklavcic 
Date:   2018-11-03T22:11:44Z

Make unified enrichment topo

[GitHub] metron pull request #1251: METRON-1853: Add shutdown hook to Stellar BaseFun...

2018-11-03 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1251#discussion_r230560963
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/resolver/BaseFunctionResolver.java
 ---
@@ -94,6 +95,16 @@ public void initialize(Context context) {
 this.context = context;
   }
 
+  /**
+   * Close the Stellar functions.
+   */
+  @Override
+  public void close() throws IOException {
+for (StellarFunctionInfo info : getFunctionInfo()) {
--- End diff --

What about something like this?
```
public void close() throws IOException {
  // Collect failures per function so one bad close() does not skip the rest.
  Map<String, Throwable> errors = new HashMap<>();
  for (StellarFunctionInfo info : getFunctionInfo()) {
    try {
      info.getFunction().close();
    } catch (Throwable t) {
      errors.put(info.getName(), t);
    }
  }
  if (!errors.isEmpty()) {
    StringBuilder sb = new StringBuilder();
    sb.append("Unable to close Stellar functions:");
    for (Map.Entry<String, Throwable> e : errors.entrySet()) {
      Throwable throwable = e.getValue();
      String eText = String
          .format("Exception - Function: %s; Message: %s; Cause: %s", e.getKey(),
              throwable.getMessage(), throwable.getCause());
      sb.append(System.lineSeparator());
      sb.append(eText);
    }
    throw new IOException(sb.toString());
  }
}
```
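
For comparison, here is a minimal, self-contained sketch of the same "close everything, then report" idea. It is a variation, not the code proposed above: it attaches each failure with `Throwable.addSuppressed` so the original stack traces survive, rather than flattening them into a message string. `CloseAllSketch` and the function names are hypothetical stand-ins, not Metron classes.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a resolver; not Metron's BaseFunctionResolver.
class CloseAllSketch {

  /** Closes every resource, collecting failures instead of stopping at the first. */
  static void closeAll(Map<String, Closeable> resources) throws IOException {
    IOException aggregate = null;
    for (Map.Entry<String, Closeable> e : resources.entrySet()) {
      try {
        e.getValue().close();
      } catch (Throwable t) {
        if (aggregate == null) {
          aggregate = new IOException("Unable to close one or more functions");
        }
        // Keep the original exception (and its stack trace) attached to the aggregate.
        aggregate.addSuppressed(new IOException("Function: " + e.getKey(), t));
      }
    }
    if (aggregate != null) {
      throw aggregate;
    }
  }

  public static void main(String[] args) {
    Map<String, Closeable> resources = new LinkedHashMap<>();
    resources.put("OK_FN", () -> { });
    resources.put("BAD_FN", () -> { throw new IOException("boom"); });
    try {
      closeAll(resources);
    } catch (IOException ex) {
      System.out.println(ex.getMessage() + ", failures: " + ex.getSuppressed().length);
    }
  }
}
```

The trade-off is that callers must inspect `getSuppressed()` to see the individual failures, whereas the string-building version puts everything in one log line.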


---


[GitHub] metron pull request #1251: METRON-1853: Add shutdown hook to Stellar BaseFun...

2018-11-03 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1251#discussion_r230560387
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/resolver/BaseFunctionResolver.java
 ---
@@ -94,6 +95,16 @@ public void initialize(Context context) {
 this.context = context;
   }
 
+  /**
+   * Close the Stellar functions.
+   */
+  @Override
+  public void close() throws IOException {
+for (StellarFunctionInfo info : getFunctionInfo()) {
--- End diff --

On a related note, we may want to consider timeouts at some point, but we 
don't do this for initialization either.
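
If timeouts were ever added, one possible shape is sketched below. This is only an illustration using the standard `java.util.concurrent` classes; `TimedCloseSketch` and `closeWithTimeout` are hypothetical names, and nothing like this exists in the PR.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; bounds how long a single close() call may take.
class TimedCloseSketch {

  /** Returns true if close() finished cleanly within the timeout, false otherwise. */
  static boolean closeWithTimeout(Closeable resource, long timeoutMillis) {
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "timed-close");
      t.setDaemon(true); // a hung close() must not keep the JVM alive
      return t;
    });
    Future<?> result = executor.submit(() -> {
      resource.close();
      return null;
    });
    try {
      result.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      result.cancel(true); // interrupt the stuck close()
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } catch (ExecutionException e) {
      return false; // close() itself threw
    } finally {
      executor.shutdownNow();
    }
  }
}
```

A real implementation would probably share one executor across all functions rather than create one per call; the per-call executor here just keeps the sketch short.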


---


[GitHub] metron pull request #1251: METRON-1853: Add shutdown hook to Stellar BaseFun...

2018-11-03 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1251#discussion_r230560366
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/resolver/FunctionResolver.java
 ---
@@ -43,4 +45,11 @@
* @param context Context used to initialize.
*/
   void initialize(Context context);
+
+  /**
+   * Perform any cleanup necessary for the loaded Stellar functions.
+   */
--- End diff --

Default is probably sufficient


---


[GitHub] metron pull request #1251: METRON-1853: Add shutdown hook to Stellar BaseFun...

2018-11-02 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1251#discussion_r230485589
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/StellarFunction.java
 ---
@@ -23,4 +23,5 @@
   Object apply(List args, Context context) throws ParseException;
   void initialize(Context context);
   boolean isInitialized();
--- End diff --

Yeah, was adding that as I posted.


---


[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-02 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1250
  
Throwing this out there for consideration - 
https://github.com/apache/metron/pull/1251


---


[GitHub] metron pull request #1251: METRON-1853: Add shutdown hook to Stellar BaseFun...

2018-11-02 Thread mmiklavc
GitHub user mmiklavc opened a pull request:

https://github.com/apache/metron/pull/1251

METRON-1853: Add shutdown hook to Stellar BaseFunctionResolver

## Contributor Comments

https://issues.apache.org/jira/browse/METRON-1853

Noodling on a method to add shutdown hooks to Stellar. Modified the 
StellarFunction interface to include a teardown method. Added a no-op 
implementation to BaseStellarFunction to make it easy for implementations that 
don't need a teardown.

Rationale - some functions may have long-lived activities that include 
closeable resources. Adding the teardown method enables function implementers 
to close any resources opened during init gracefully.
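
A rough sketch of the idea, using simplified stand-in types rather than the real Stellar interfaces (`SketchFunction`, `BaseSketchFunction`, `EchoFunction`, and `ConnectionBackedFunction` are all hypothetical names):

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical, simplified shapes for illustration; not the real Stellar types.
interface SketchFunction extends Closeable {
  Object apply(Object arg);

  void initialize();
}

// The base class supplies a no-op close() so most functions need no extra code.
abstract class BaseSketchFunction implements SketchFunction {
  @Override
  public void initialize() { }

  @Override
  public void close() throws IOException { }
}

// A stateless function: inherits the no-op teardown untouched.
class EchoFunction extends BaseSketchFunction {
  @Override
  public Object apply(Object arg) {
    return arg;
  }
}

// A function that holds a long-lived resource and overrides close().
class ConnectionBackedFunction extends BaseSketchFunction {
  private boolean open;

  @Override
  public void initialize() {
    open = true; // stands in for opening a client or connection pool
  }

  @Override
  public Object apply(Object arg) {
    if (!open) {
      throw new IllegalStateException("function is not initialized or already closed");
    }
    return String.valueOf(arg).toUpperCase();
  }

  @Override
  public void close() throws IOException {
    open = false; // stands in for releasing the resource
  }

  boolean isOpen() {
    return open;
  }
}
```

Only functions that actually own a resource need to override `close()`; everything else keeps working unchanged.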

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [ ] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and then verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mmiklavc/metron stellar-shutdown-hook

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1251.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1251


commit f5607432293956fc8085690cd979c3113be4b176
Author: Michael Miklavcic 
Date:   2018-11-02T06:00:12Z

Start noodling on implementation.

commit 15af28e6660a1291140bf1333d75c2502de1a96e
Author: Michael Miklavcic 
Date:   2018-11-02T19:03:06Z

Added teardown capability to BaseFunctionResolver




---


[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-02 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1250
  
It's not just a style issue. It's an architectural issue, otherwise I 
wouldn't care.


---


[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-02 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1250
  
I'm just not sure we *need* to shut them down. It's not like there are 
writing or state-related concerns that push the need to finish with a formal 
shutdown.  The only time we would ever shut down a client connection would be 
when we're killing the topology, right? 

I really think we need to get this client code out of the bolts. I looked a 
bit through the Stellar code and if we *really* think we need this, I think it 
can be accomplished through a change to the `FunctionResolver` classes. I'm 
going to noodle on this a bit.


---


[GitHub] metron issue #1250: METRON-1850: Stellar REST function

2018-11-02 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1250
  
@merrimanr - about the need for a close/shutdown hook - why is it needed in 
the first place? I don't believe we shut down ZooKeeper connections explicitly. 
They end when the topology spins down. Why should this be any different?


---


[GitHub] metron issue #1226: METRON-1803: Integrate Cypress with Travis

2018-11-02 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1226
  
For some reason the Travis check shows as still in progress on my end even 
though it shows as passed when I click details. I'm still +1 on this. Thanks 
@tiborm.


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-01 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230200490
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/bolt/ConfiguredParserBolt.java
 ---
@@ -36,4 +44,20 @@ protected SensorParserConfig 
getSensorParserConfig(String sensorType) {
 return getConfigurations().getSensorParserConfig(sensorType);
   }
 
+  @Override
+  public void prepare(Map stormConf, TopologyContext context, 
OutputCollector collector) {
--- End diff --

Same as the comment for enrichment - do we want this tied to the bolts this 
way?


---


[GitHub] metron pull request #1250: METRON-1850: Stellar REST function

2018-11-01 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1250#discussion_r230194731
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/dsl/functions/RestConfig.java
 ---
@@ -0,0 +1,147 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.stellar.dsl.functions;
--- End diff --

How do other Stellar functions handle config? This is the only config class 
I see in stellar-common and I'm wondering if there's another idiom. That aside, 
there are some existing config examples out there, e.g. PcapConfig and 
PcapOptions. Any specific reason you're using this approach rather than an enum 
for your options?


---


[GitHub] metron issue #1226: METRON-1803: Integrate Cypress with Travis

2018-10-30 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1226
  
Excellent addition, though it looks like this change broke your tests.


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-10-29 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
@ottobackwards @nickwallen - just to shed some additional light on the 
choices made here, take a look at the DAO classes. The new 
`ElasticsearchClient` Metron client class lays some foundation towards wrapping 
the ES API. What this PR explicitly does *not* do is completely abstract away 
the ES dependencies. I believe that's a task that we should perform, but I do 
not believe it makes sense to couple it with the task of swapping out the 
client. 


---


[GitHub] metron pull request #1242: METRON-1834: Migrate Elasticsearch from Transport...

2018-10-29 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1242#discussion_r229066940
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/utils/ElasticsearchUtils.java
 ---
@@ -124,102 +129,146 @@ public static String getBaseIndexName(String 
indexName) {
   }
 
   /**
-   * Instantiates an Elasticsearch client based on es.client.class, if 
set. Defaults to
-   * org.elasticsearch.transport.client.PreBuiltTransportClient.
+   * Instantiates an Elasticsearch client
*
* @param globalConfiguration Metron global config
-   * @return
+   * @return new es client
*/
-  public static TransportClient getClient(Map 
globalConfiguration) {
-Set customESSettings = new HashSet<>();
-customESSettings.addAll(Arrays.asList("es.client.class", 
USERNAME_CONFIG_KEY, PWD_FILE_CONFIG_KEY));
-Settings.Builder settingsBuilder = Settings.builder();
-Map esSettings = getEsSettings(globalConfiguration);
-for (Map.Entry entry : esSettings.entrySet()) {
-  String key = entry.getKey();
-  String value = entry.getValue();
-  if (!customESSettings.contains(key)) {
-settingsBuilder.put(key, value);
-  }
-}
-settingsBuilder.put("cluster.name", 
globalConfiguration.get("es.clustername"));
-settingsBuilder.put("client.transport.ping_timeout", 
esSettings.getOrDefault("client.transport.ping_timeout","500s"));
-setXPackSecurityOrNone(settingsBuilder, esSettings);
-
-try {
-  LOG.info("Number of available processors in Netty: {}", 
NettyRuntimeWrapper.availableProcessors());
-  // Netty sets available processors statically and if an attempt is 
made to set it more than
-  // once an IllegalStateException is thrown by 
NettyRuntime.setAvailableProcessors(NettyRuntime.java:87)
-  // 
https://discuss.elastic.co/t/getting-availableprocessors-is-already-set-to-1-rejecting-1-illegalstateexception-exception/103082
-  // 
https://discuss.elastic.co/t/elasticsearch-5-4-1-availableprocessors-is-already-set/88036
-  System.setProperty("es.set.netty.runtime.available.processors", 
"false");
-  TransportClient client = 
createTransportClient(settingsBuilder.build(), esSettings);
-  for (HostnamePort hp : getIps(globalConfiguration)) {
-client.addTransportAddress(
-new 
InetSocketTransportAddress(InetAddress.getByName(hp.hostname), hp.port)
-);
-  }
-  return client;
-} catch (UnknownHostException exception) {
-  throw new RuntimeException(exception);
-}
+  public static ElasticsearchClient getClient(Map 
globalConfiguration) {
--- End diff --

@ottobackwards I was looking at that, but with the callbacks in the ES 
client API, a builder pattern is really difficult to use here.


---


[GitHub] metron issue #1226: METRON-1803: Integrate Cypress with Travis

2018-10-29 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1226
  
I think I'm good on this @tiborm. I also ran the tests locally and 
everything passed as expected. I did notice that this adds 2 minutes to the 
metron-alerts build - I expect that is because this change is completely 
additive and is not replacing any Protractor tests at this time. Nice work, 
thanks for the contribution. +1


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-10-29 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
I think we're all on the same page @ottobackwards, @nickwallen, and 
@cestella . As I mentioned early on, this effort was somewhat unique as it was 
built off of some POC work from @cestella from a couple months back, and I 
wanted to enhance that work (e.g. addressing the TODO's, bugs, and 
configuration issues mentioned) while refactoring/changing as little as 
possible. That's on me for not having gotten out in front of polishing a few of 
the common/reusable pieces a bit more, so your comments are well received. 
Updates soon to follow.


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-10-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
I've started addressing some of the questions you had @nickwallen. Not 
quite finished yet, but the rest should be soon to follow. I'd like to clean up 
the way we handle the ElasticsearchClient and its instantiation a bit.


---


[GitHub] metron pull request #1242: METRON-1834: Migrate Elasticsearch from Transport...

2018-10-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1242#discussion_r228666108
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/utils/ElasticsearchUtils.java
 ---
@@ -124,102 +129,146 @@ public static String getBaseIndexName(String 
indexName) {
   }
 
   /**
-   * Instantiates an Elasticsearch client based on es.client.class, if 
set. Defaults to
-   * org.elasticsearch.transport.client.PreBuiltTransportClient.
+   * Instantiates an Elasticsearch client
*
* @param globalConfiguration Metron global config
-   * @return
+   * @return new es client
*/
-  public static TransportClient getClient(Map 
globalConfiguration) {
-Set customESSettings = new HashSet<>();
-customESSettings.addAll(Arrays.asList("es.client.class", 
USERNAME_CONFIG_KEY, PWD_FILE_CONFIG_KEY));
-Settings.Builder settingsBuilder = Settings.builder();
-Map esSettings = getEsSettings(globalConfiguration);
-for (Map.Entry entry : esSettings.entrySet()) {
-  String key = entry.getKey();
-  String value = entry.getValue();
-  if (!customESSettings.contains(key)) {
-settingsBuilder.put(key, value);
-  }
-}
-settingsBuilder.put("cluster.name", 
globalConfiguration.get("es.clustername"));
-settingsBuilder.put("client.transport.ping_timeout", 
esSettings.getOrDefault("client.transport.ping_timeout","500s"));
-setXPackSecurityOrNone(settingsBuilder, esSettings);
-
-try {
-  LOG.info("Number of available processors in Netty: {}", 
NettyRuntimeWrapper.availableProcessors());
-  // Netty sets available processors statically and if an attempt is 
made to set it more than
-  // once an IllegalStateException is thrown by 
NettyRuntime.setAvailableProcessors(NettyRuntime.java:87)
-  // 
https://discuss.elastic.co/t/getting-availableprocessors-is-already-set-to-1-rejecting-1-illegalstateexception-exception/103082
-  // 
https://discuss.elastic.co/t/elasticsearch-5-4-1-availableprocessors-is-already-set/88036
-  System.setProperty("es.set.netty.runtime.available.processors", 
"false");
-  TransportClient client = 
createTransportClient(settingsBuilder.build(), esSettings);
-  for (HostnamePort hp : getIps(globalConfiguration)) {
-client.addTransportAddress(
-new 
InetSocketTransportAddress(InetAddress.getByName(hp.hostname), hp.port)
-);
-  }
-  return client;
-} catch (UnknownHostException exception) {
-  throw new RuntimeException(exception);
-}
+  public static ElasticsearchClient getClient(Map 
globalConfiguration) {
--- End diff --

I'll refactor this a bit.


---


[GitHub] metron pull request #1242: METRON-1834: Migrate Elasticsearch from Transport...

2018-10-26 Thread mmiklavc
Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1242#discussion_r228664884
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/config/ElasticsearchClientOptions.java
 ---
@@ -0,0 +1,60 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.metron.elasticsearch.config;
+
+import org.apache.metron.common.configuration.ConfigOption;
+
+public enum ElasticsearchClientOptions implements ConfigOption {
--- End diff --

One is an enum of the options for the elasticsearch client. The other is 
the config used to access the configuration settings provided via global config 
(or other method, if a different context were ever used).
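
To make the split concrete, here is a small sketch of the pattern with made-up names. The option keys, defaults, and both type names (`ClientOptionSketch`, `ClientConfigSketch`) are assumptions for illustration; they are not the real `ElasticsearchClientOptions` values or the `ConfigOption` contract.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical option names and defaults; the enum only says WHICH settings exist.
enum ClientOptionSketch {
  CONNECTION_TIMEOUT_MILLIS("connection.timeout.millis", 1000),
  NUM_CLIENT_THREADS("num.client.threads", 1);

  private final String key;
  private final int defaultValue;

  ClientOptionSketch(String key, int defaultValue) {
    this.key = key;
    this.defaultValue = defaultValue;
  }

  String getKey() {
    return key;
  }

  int getDefaultValue() {
    return defaultValue;
  }
}

// The config object knows HOW to read those settings from a global-config-style map.
class ClientConfigSketch {
  private final Map<String, Object> settings;

  ClientConfigSketch(Map<String, Object> settings) {
    this.settings = new HashMap<>(settings); // defensive copy
  }

  int get(ClientOptionSketch option) {
    Object raw = settings.get(option.getKey());
    return raw instanceof Number ? ((Number) raw).intValue() : option.getDefaultValue();
  }
}
```

Keeping the keys in an enum means a different source of settings could be swapped in behind the config class without touching the option definitions.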


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-10-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
Thanks for the feedback/viewpoint @cestella . I think @nickwallen has some 
good points, in particular about why/when to choose one client or the other. 
I'm adding some of that detail now, and am linking aggressively to the ES docs 
so that we're not simply reproducing their documentation. In general, I do NOT 
think we should be writing javadoc like this - 
https://docs.oracle.com/javase/8/docs/api/java/util/Arrays.html. It is 
certainly detailed, but arguably pedantic for our purposes. Oh, and we should 
also be documenting Stellar functions - which is a feature we already support 
via the annotations you added some time ago.


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-10-26 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
I'm happy to clarify variable and method names here, but I want to be 
careful about overdoing it on the javadoc. It gets stale, is not testable, and 
is far more often overlooked than the code itself. I'm all for javadoc on 
public APIs.


---


[GitHub] metron issue #789: METRON-1233: Remove description of Global configuration f...

2018-10-24 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/789
  
@DimDroll @ottobackwards - All of our topologies pull in the global config. 
One such more recent example is that we now provide an option for specifying 
batching details for enrichment:
```
enrichment.writer.batchSize
enrichment.writer.batchTimeout
```

The reason for this is that we don't have a global-only-per-topology type 
of configuration, with the exception of parsers because of how they can be 
deployed independently. If you think there's some better clarification that 
could be made, I'm open to it. But I think the link between parsers, 
enrichment, indexing, and the global configs should be maintained as it is 
relevant.
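
As a rough illustration of how a topology component might consume those two keys from the global config, here is a sketch. Only the key names come from the settings above; `BatchSettingsSketch`, `intSetting`, and the fallback values used with it are hypothetical, and this is not how Metron's writers actually read them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reader for batching settings held in a global-config-style map.
class BatchSettingsSketch {
  static final String BATCH_SIZE = "enrichment.writer.batchSize";
  static final String BATCH_TIMEOUT = "enrichment.writer.batchTimeout";

  /** Reads an integer setting, accepting numbers or numeric strings, else the fallback. */
  static int intSetting(Map<String, Object> globalConfig, String key, int fallback) {
    Object raw = globalConfig.get(key);
    if (raw instanceof Number) {
      return ((Number) raw).intValue();
    }
    if (raw instanceof String) {
      try {
        return Integer.parseInt(((String) raw).trim());
      } catch (NumberFormatException e) {
        return fallback; // unparseable value: fall back rather than fail the topology
      }
    }
    return fallback;
  }

  public static void main(String[] args) {
    Map<String, Object> globalConfig = new HashMap<>();
    globalConfig.put(BATCH_SIZE, 25);
    // The fallbacks below are arbitrary placeholders, not Metron defaults.
    System.out.println("batchSize=" + intSetting(globalConfig, BATCH_SIZE, 15));
    System.out.println("batchTimeout=" + intSetting(globalConfig, BATCH_TIMEOUT, 0));
  }
}
```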




---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-10-23 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
Note, I force pushed this branch with revised history after the revert done 
by @nickwallen on https://github.com/apache/metron/pull/1218. This is 
effectively a merge with the latest master minus 1218 changes while we resolve 
the latent issues with ES and doc ids and guids.


---


[GitHub] metron issue #1243: METRON-1831 Project Version Substitution Not Working

2018-10-19 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1243
  
+1 by inspection @nickwallen, thanks for the fix.


---


[GitHub] metron issue #1242: METRON-1834: Migrate Elasticsearch from TransportClient ...

2018-10-19 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1242
  
A recent merge with https://github.com/apache/metron/pull/1218 has caused a 
number of integration test failures. Looking into it.


---


[GitHub] metron pull request #1242: METRON-1834: Migrate Elasticsearch from Transport...

2018-10-19 Thread mmiklavc
GitHub user mmiklavc opened a pull request:

https://github.com/apache/metron/pull/1242

METRON-1834: Migrate Elasticsearch from TransportClient to new Java REST API

## Contributor Comments

https://issues.apache.org/jira/browse/METRON-1834

This task has been a long time coming after having completed the ES upgrade 
in https://issues.apache.org/jira/browse/METRON-939. Motivation for completing 
this now is that Elasticsearch will be deprecating use of the TransportClient 
in v7.x. This PR migrates the Elasticsearch client from TransportClient to the 
newer Java REST API. 

1. 
https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.6/client.html
2. 
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/5.6/java-rest-overview.html
3. 
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/5.6/java-rest-high-level-migration.html

This builds off and finishes work started by @cestella here - 
https://github.com/cestella/incubator-metron/tree/es_rest_client. I condensed 
his branch into 1 flattened commit and built on top of it in order to provide 
attribution.

I have a number of tasks I'm still working through, but I wanted to get the 
review process started. I've minimally validated X-Pack auth and will have some 
follow-up for SSL. Test plans and a breakdown of the changes will follow 
soon. For starters, full dev should continue to work as normal and you should 
see data flowing into indexes for bro, snort, and yaf. There are some 
additional changes to how this client will be configured, which I'll be 
documenting shortly. The new client no longer takes a Map of settings 
now that it leverages the Apache HTTP Async Client 
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/5.6/java-rest-low-usage-dependencies.html
 under the hood. This meant choosing a set of properties to expose and 
translating them to the builder pattern. Again, I'll have a write-up 
of this in the migration guide and update the READMEs accordingly.
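
To make the "properties to builder" translation concrete, here is a minimal, self-contained sketch of the idea. Everything in it is an assumption for illustration: `SketchClientConfig` is a stand-in for a builder-configured client and is not the Elasticsearch API, and the property keys (`es.host`, `es.port`, `es.connect.timeout.millis`) and default values are hypothetical rather than the ones this PR exposes.

```java
import java.util.Map;

/** Stand-in for a builder-configured client; hypothetical, not the Elasticsearch API. */
final class SketchClientConfig {
  final String host;
  final int port;
  final int connectTimeoutMillis;

  private SketchClientConfig(Builder b) {
    this.host = b.host;
    this.port = b.port;
    this.connectTimeoutMillis = b.connectTimeoutMillis;
  }

  static final class Builder {
    private String host = "localhost";        // assumed default
    private int port = 9200;                  // assumed default
    private int connectTimeoutMillis = 1000;  // assumed default

    Builder host(String host) { this.host = host; return this; }
    Builder port(int port) { this.port = port; return this; }
    Builder connectTimeoutMillis(int millis) { this.connectTimeoutMillis = millis; return this; }
    SketchClientConfig build() { return new SketchClientConfig(this); }
  }

  /** Translates only the explicitly supported keys; unknown keys are ignored. */
  static SketchClientConfig fromProperties(Map<String, Object> props) {
    Builder builder = new Builder();
    if (props.containsKey("es.host")) {
      builder.host(String.valueOf(props.get("es.host")));
    }
    if (props.containsKey("es.port")) {
      builder.port(Integer.parseInt(String.valueOf(props.get("es.port"))));
    }
    if (props.containsKey("es.connect.timeout.millis")) {
      builder.connectTimeoutMillis(
          Integer.parseInt(String.valueOf(props.get("es.connect.timeout.millis"))));
    }
    return builder.build();
  }
}
```

The trade-off in this style is that only a chosen, documented set of keys reaches the client, instead of an open-ended Map being passed straight through.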

NOTE: This checks off 2 items from this follow-on list 
https://github.com/apache/metron/pull/840#issuecomment-347281776

1. Fix Log4j logging problem - classpath issues
2. Migrate to new ES REST client

Per discussion in the Metron Slack channel, I will be updating the Jira 
ticket with a series of tasks to be completed prior to acceptance, including 
performance regression testing compared with the old API.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```

- [ ] Have you written or updated unit tests and/or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and then verify the changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Pleas

[GitHub] metron pull request #1241: METRON-1833: Management UI incorrectly displaying...

2018-10-19 Thread mmiklavc
GitHub user mmiklavc opened a pull request:

https://github.com/apache/metron/pull/1241

METRON-1833: Management UI incorrectly displaying sensor topology latency 
units as seconds instead of millis

## Contributor Comments

https://issues.apache.org/jira/browse/METRON-1833

Management UI is displaying units for topology latency as seconds instead 
of milliseconds. This value should be identical to that reported in the Storm 
UI. The change is straightforward - spin up full dev and the UI should show 
"ms" instead of "s".

## Pull Request Checklist

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```

- N/A Have you written or updated unit tests and/or integration tests to 
verify your changes?
- N/A If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- N/A Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and then verify the changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mmiklavc/metron ui-storm-latency

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1241.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1241


commit 4b26720e0581721470ea39198abee6bb5fa7bea6
Author: Michael Miklavcic 
Date:   2018-10-19T19:34:49Z

Fix latency measurement.




---


[GitHub] metron issue #1235: METRON-1823 Refactor Elasticsearch Configuration Setting...

2018-10-18 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1235
  
@justinleet correct, I have a number of changes that will overlap with or 
replace this in the ES client upgrade/migration. In particular the utils class 
changes.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-18 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1213
  
Good suggestion @ottobackwards. lgtm @merrimanr, still +1 from me.


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-17 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1213
  
lgtm, +1 by inspection. Thanks for the collaborative effort on this 
@merrimanr and @ottobackwards!


---


[GitHub] metron issue #1213: METRON-1681: Decouple the ParserBolt from the Parse exec...

2018-10-16 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1213
  
@ottobackwards - @merrimanr and I had a chat about this offline and 
realized there was another alternative to what we had come up with in the 
multiline grok and syslog PRs. Basically, we mark the existing methods in 
MessageParser `@deprecated` and add a new default implementation for 
`parse(..)` that throws a `NotImplementedException`. This does clean up some of 
the shrapnel we had discussed on 1234 such that there's no further need for a 
decorator.
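
A minimal sketch of that shape, under stated assumptions: `SketchMessageParser` is a simplified stand-in and not the real `MessageParser` signature, `parseAll` is a hypothetical name for whatever replacement method is chosen, and the JDK's `UnsupportedOperationException` is swapped in for `NotImplementedException` so the example needs no extra dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Simplified stand-in for the parser interface; not the real Metron signatures. */
interface SketchMessageParser<T> {

  /** Legacy method: deprecated, and no longer required of implementers. */
  @Deprecated
  default List<T> parse(byte[] rawMessage) {
    throw new UnsupportedOperationException("parse(..) is deprecated and not implemented");
  }

  /** Legacy method: deprecated, delegates to parse(..) as older parsers expect. */
  @Deprecated
  default Optional<List<T>> parseOptional(byte[] rawMessage) {
    return Optional.ofNullable(parse(rawMessage));
  }

  /** Hypothetical replacement; the default falls back to the legacy path. */
  default List<T> parseAll(byte[] rawMessage) {
    return parseOptional(rawMessage).orElse(Collections.emptyList());
  }
}

/** An existing custom parser keeps working without changes beyond the deprecation warning. */
final class LegacyLineParser implements SketchMessageParser<String> {
  @Override
  @Deprecated
  public List<String> parse(byte[] rawMessage) {
    return Collections.singletonList(new String(rawMessage, StandardCharsets.UTF_8));
  }
}

/** A new parser overrides only the replacement method and needs no decorator. */
final class MultilineParser implements SketchMessageParser<String> {
  @Override
  public List<String> parseAll(byte[] rawMessage) {
    return Arrays.asList(new String(rawMessage, StandardCharsets.UTF_8).split("\n"));
  }
}
```

With this layout, callers move to the replacement method right away, while custom parsers that only implement the deprecated methods continue to function during the migration window.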

@merrimanr - the only thing missing here I think is to mark the `parse` and 
`parseOptional` methods `@deprecated`. Also, I didn't see a delete of 
`MultilineMessageParser` in your latest commits to go with that refactor/merge. 
We probably want to get rid of it altogether if we take this alternate 
approach. Also, I think we should make a more public statement on the user and 
dev list about deprecating the methods if we all agree on this approach to give 
users time to migrate existing custom parser implementations.




---


[GitHub] metron issue #870: METRON-1364: Add an implementation of Robust PCA outlier ...

2018-10-15 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/870
  
Any updates on this PR?


---


[GitHub] metron issue #526: Metron-846: Add E2E tests for metron management ui

2018-10-15 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/526
  
Is this PR still relevant with the ongoing Cypress work?


---

