[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495275#comment-16495275
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/1022


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495266#comment-16495266
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1022
  
+1 by inspection. This is great @nickwallen, thanks for the contribution.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494330#comment-16494330
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user cestella commented on the issue:

https://github.com/apache/metron/pull/1022
  
Ok, I'm comfortable with the change now.  +1 by inspection, pending 
@mmiklavc 


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494329#comment-16494329
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r191585931
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/field/FieldNameConverters.java
 ---
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.metron.common.field;
+
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+
+/**
+ * Enumerates a set of {@link FieldNameConverter} implementations.
+ *
+ * Provides shared instances of each {@link FieldNameConverter}.
+ *
+ * Allows the field name converter to be specified using a short-hand
+ * name, rather than the entire fully-qualified class name.
+ */
+public enum FieldNameConverters {
--- End diff --

Ok.  Check 'er out.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494273#comment-16494273
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r191574312
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/field/FieldNameConverters.java
 ---
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.metron.common.field;
+
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+
+/**
+ * Enumerates a set of {@link FieldNameConverter} implementations.
+ *
+ * Provides shared instances of each {@link FieldNameConverter}.
+ *
+ * Allows the field name converter to be specified using a short-hand
+ * name, rather than the entire fully-qualified class name.
+ */
+public enum FieldNameConverters {
--- End diff --

Minor nit, can we have this implement `FieldNameConverter` so the enum can 
be used directly without calling `get()`?


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494268#comment-16494268
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/1022
  
All review feedback should have been addressed.  Let me know if there is 
anything else @mmiklavc @cestella 


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491144#comment-16491144
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190977652
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

I can add the `create` to the `FieldNameConverters`.  That makes sense 
here.  Its going to be slightly different than what you have above, but is what 
you were going for.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491097#comment-16491097
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190970420
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

Ok, I see what you mean. Agreed, it would be good to have some tests around 
that `write` method at some point, but I think it's ok to keep that out of 
scope for this particular PR. We have a WriterBoltIntegrationTest, but it's 
only covering parsers right now. At a later time we could look at expanding it 
to cover multiple writers and topologies and provide hooks on those builders to 
enable one-time setup and teardown across a suite of tests.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491077#comment-16491077
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190966698
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/SharedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+
+/**
+ * A {@link FieldNameConverterFactory} that returns instances of {@link 
FieldNameConverter}
+ * that are shared and reused.
+ *
+ * The instances are created and managed by the {@link 
FieldNameConverters} class.
+ */
+public class SharedFieldNameConverterFactory implements 
FieldNameConverterFactory {
+
--- End diff --

This looks like it's moving in the right direction, but as @mmiklavc gave 
an example for, why would we not replace this class with a `static 
create(String sensorType, WriterConfig config)` method on the enum?


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491065#comment-16491065
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190964321
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

> I'm not sure I follow about the testing

If you look closely at those tests, all that class tests is the response 
using `buildWriteReponse`.  It does not actually test the class using the 
`write` method.  When you try to do that, it blows up because of dependency 
issues.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491024#comment-16491024
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190956913
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

Yes, agreed on the possible need for handling other writers. Is there a 
reason we wouldn't put that logic in the enum factory rather than creating 
another new class for it? It wouldn't even need to depend on the 
WriterConfiguration, actually.

```
public enum FieldNameConverters implements FieldNameConverter {

  NOOP(new NoopFieldNameConverter()),
  DEDOT(new DeDotFieldNameConverter());

  private FieldNameConverter converter;

  FieldNameConverters(FieldNameConverter converter) {
this.converter = converter;
  }

  public FieldNameConverter create(String sensorType, Optional 
converterName) {
// default to the 'DEDOT' field name converter to maintain backwards 
compatibility
   return FieldNameConverters.valueOf(converterName.orElse("DEDOT"));
  }

  @Override
  public String convert(String originalField) {
return converter.convert(originalField);
  }
}
```

I'm not sure I follow about the testing - do you mean integration testing? 
We have an ElasticsearchWriterTest class that @justinleet wrote - 
https://github.com/apache/metron/blob/master/metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/writer/ElasticsearchWriterTest.java


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> 

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490992#comment-16490992
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190950314
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

The prior commit 'b18c3bf' was just me fixing my original caching 
implementation, so at least I had a version of that with the bug fixed.

The latest commit 'cee3994' removes the cache and relies on the 
`FactoryNameConverters` class to hand out shared instances.  This is what we 
have discussed.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490968#comment-16490968
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190946427
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

I actually liked what you had with your enum pattern. I was imaging a very 
minor change to your code, like below. The other change would be having the 
default converter value provided by the config class, or making it an 
Optional if you wanted the es writer to own setting the default:

```
public interface FieldNameConverter {
String convert(String originalField);
}

public enum FieldNameConverters implements FieldNameConverter {

  NOOP(new NoopFieldNameConverter()),
  DEDOT(new DeDotFieldNameConverter());

  private FieldNameConverter converter;

  FieldNameConverters(FieldNameConverter converter) {
this.converter = converter;
  }

  @Override
  public String convert(String originalField) {
return converter.convert(originalField);
  }
}

public class ElasticsearchWriter implements BulkMessageWriter, 
Serializable {
write( ... ) {
String converterType = config.getFieldNameConverter(sensorType);
// or this -> String converterType = 
config.getFieldNameConverter(sensorType).orElse("DEDOT");
// fetch the field name converter for this sensor type
FieldNameConverter fieldNameConverter = 
FieldNameConverters.valueOf(converterType);
final String indexPostfix = dateFormat.format(new Date());
BulkRequestBuilder bulkRequest = client.prepareBulk();
for(JSONObject message: messages) {
  JSONObject esDoc = new JSONObject();
  for(Object k : message.keySet()){
copyField(k.toString(), message, esDoc, fieldNameConverter);
  }
  ...
}
...
}

private void copyField(... FieldNameConverter fieldNameConverter) {
 String destinationFieldName = 
fieldNameConverter.convert(sourceFieldName);
...
}
...
}

```


> Allow user to change field name conversion 

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490876#comment-16490876
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/1022
  
FYI - I also confirmed that the configuration value can be changed using 
the Advanced mode in the Management UI.

![screen shot 2018-05-25 at 11 31 17 
am](https://user-images.githubusercontent.com/2475409/40552989-53c6a118-600f-11e8-9b03-4cadf351d641.png)




> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490857#comment-16490857
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190926866
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

I still need a place to house the (sourceType, WriterConfiguration) -> 
FieldNameConverter logic.  I will remove the cache mechanism though.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490817#comment-16490817
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190915687
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

I definitely appreciate the thinking around state though I think we can 
safely leave that out for now. This isn't client-facing, so refactoring later 
would have less consequence than something we expose via config or the UI.

I think you could probably use the enum as a factory directly and use 
FieldNameConverter converter = FieldNameConverters.valueOf(converterName).get() 
in ElasticsearchWriter line 80 directly. We do this a number of places as a 
combination strategy/factory pattern. 


https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/utils/CompressionStrategies.java


https://github.com/apache/metron/blob/master/metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/UnifiedEnrichmentBolt.java#L194

I definitely get wanting to provide a factory, and I think you effectively 
get that with the enum you already created.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing 

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489883#comment-16489883
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190742938
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

So you're saying, hey why not just...
1. Take out the caching and just have something like a 
`SimpleFieldNameConverterFactory`. This would translate the (sourceType, 
WriterConfiguration) to the right `FieldNameConverter`.
2. The caching would be handled by `FieldNameConverters` in the sense that 
they are effectively singletons.

I think that would work if all `FieldNameConverter` implementations are 
stateless.  And they all are stateless today, but I tried not to make that a 
constraint.  I actually tried to make this work whether they are stateless or 
not.  

But alas there is a bug in what I did because I need each call to 
FieldNameConverters.get() to return a new instance.

I'd be open to removing the cache, if its not needed and we think 
`FieldNameConverter`s would always remain stateless.






> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489850#comment-16489850
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190738333
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
+ * to this configuration is recognized once the old {@link 
FieldNameConverter} expires
+ * from the cache.
+ *
+ * Defining a shorter expiration interval allows config changes to be 
recognized more
+ * quickly, but also can impact performance negatively.
+ */
+public class CachedFieldNameConverterFactory implements 
FieldNameConverterFactory {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * A cache that contains a {@link FieldNameConverter} for each sensor 
type.
+   *
+   * A user can alter the {@link FieldNameConverter} for a given sensor at 
any time
+   * by altering the Indexing configuration.  The actual {@link 
FieldNameConverter}
+   * in use for a given sensor will only change once the original 
converter has
+   * expired from the cache.
+   */
+  private Cache fieldNameConverters;
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory}.
+   *
+   * @param expires The duration before {@link FieldNameConverter}s are 
expired.
+   * @param expiresUnits The units before {@link FieldNameConverter}s are 
expired.
+   */
+  public CachedFieldNameConverterFactory(int expires, TimeUnit 
expiresUnits) {
+
+fieldNameConverters = createFieldNameConverterCache(expires, 
expiresUnits);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} where the cache 
expires after 5 minutes.
+   */
+  public CachedFieldNameConverterFactory() {
+
+this(5, TimeUnit.MINUTES);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} using the given 
cache.  This should only
+   * be used for testing.
+   *
+   * @param fieldNameConverters A {@link Cache} containing {@link 
FieldNameConverter}s.
+   */
+  public CachedFieldNameConverterFactory(Cache 
fieldNameConverters) {
+
  

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489843#comment-16489843
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190737770
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
+ * to this configuration is recognized once the old {@link 
FieldNameConverter} expires
+ * from the cache.
+ *
+ * Defining a shorter expiration interval allows config changes to be 
recognized more
+ * quickly, but also can impact performance negatively.
+ */
+public class CachedFieldNameConverterFactory implements 
FieldNameConverterFactory {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * A cache that contains a {@link FieldNameConverter} for each sensor 
type.
+   *
+   * A user can alter the {@link FieldNameConverter} for a given sensor at 
any time
+   * by altering the Indexing configuration.  The actual {@link 
FieldNameConverter}
+   * in use for a given sensor will only change once the original 
converter has
+   * expired from the cache.
+   */
+  private Cache fieldNameConverters;
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory}.
+   *
+   * @param expires The duration before {@link FieldNameConverter}s are 
expired.
+   * @param expiresUnits The units before {@link FieldNameConverter}s are 
expired.
+   */
+  public CachedFieldNameConverterFactory(int expires, TimeUnit 
expiresUnits) {
+
+fieldNameConverters = createFieldNameConverterCache(expires, 
expiresUnits);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} where the cache 
expires after 5 minutes.
+   */
+  public CachedFieldNameConverterFactory() {
+
+this(5, TimeUnit.MINUTES);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} using the given 
cache.  This should only
+   * be used for testing.
+   *
+   * @param fieldNameConverters A {@link Cache} containing {@link 
FieldNameConverter}s.
+   */
+  public CachedFieldNameConverterFactory(Cache 
fieldNameConverters) {
+

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489802#comment-16489802
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190730655
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
+ * to this configuration is recognized once the old {@link 
FieldNameConverter} expires
+ * from the cache.
+ *
+ * Defining a shorter expiration interval allows config changes to be 
recognized more
+ * quickly, but also can impact performance negatively.
+ */
+public class CachedFieldNameConverterFactory implements 
FieldNameConverterFactory {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * A cache that contains a {@link FieldNameConverter} for each sensor 
type.
+   *
+   * A user can alter the {@link FieldNameConverter} for a given sensor at 
any time
+   * by altering the Indexing configuration.  The actual {@link 
FieldNameConverter}
+   * in use for a given sensor will only change once the original 
converter has
+   * expired from the cache.
+   */
+  private Cache fieldNameConverters;
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory}.
+   *
+   * @param expires The duration before {@link FieldNameConverter}s are 
expired.
+   * @param expiresUnits The units before {@link FieldNameConverter}s are 
expired.
+   */
+  public CachedFieldNameConverterFactory(int expires, TimeUnit 
expiresUnits) {
+
+fieldNameConverters = createFieldNameConverterCache(expires, 
expiresUnits);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} where the cache 
expires after 5 minutes.
+   */
+  public CachedFieldNameConverterFactory() {
+
+this(5, TimeUnit.MINUTES);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} using the given 
cache.  This should only
+   * be used for testing.
+   *
+   * @param fieldNameConverters A {@link Cache} containing {@link 
FieldNameConverter}s.
+   */
+  public CachedFieldNameConverterFactory(Cache 
fieldNameConverters) {
+
  

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489705#comment-16489705
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190715353
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

To help clarify, I could have implemented a different 
`FieldNameConverterFactory`; maybe one called 
`SimpleFieldNameConverterFactory`.  

This factory wouldn't even have to use a cache.  Every time 
`ElasticsearchWriter` calls `create(String sensorType, WriterConfiguration 
config)`, it could just instantiate a new instance of the right 
`FieldNameConverter`.  

But of course, I didn't do that because this occurs for every field in 
every message that is written to Elasticsearch.  The cache is there for 
performance.

I offer this only as an example to help clarify the purpose of this class.


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489635#comment-16489635
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190702283
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

Maybe I am misunderstanding you, but the ConfigurationsUpdater ensures that 
our configuration classes stay in-sync with Zk.  That's now what this class 
does. 

This maintains a `FieldNameConverter` instance for each source type that is 
then used by the `ElasticsearchWriter` to do the field name conversion.  Are 
you proposing a different way to do this?


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489624#comment-16489624
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190701068
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
+ * to this configuration is recognized once the old {@link 
FieldNameConverter} expires
+ * from the cache.
+ *
+ * Defining a shorter expiration interval allows config changes to be 
recognized more
+ * quickly, but also can impact performance negatively.
+ */
+public class CachedFieldNameConverterFactory implements 
FieldNameConverterFactory {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * A cache that contains a {@link FieldNameConverter} for each sensor 
type.
+   *
+   * A user can alter the {@link FieldNameConverter} for a given sensor at 
any time
+   * by altering the Indexing configuration.  The actual {@link 
FieldNameConverter}
+   * in use for a given sensor will only change once the original 
converter has
+   * expired from the cache.
+   */
+  private Cache fieldNameConverters;
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory}.
+   *
+   * @param expires The duration before {@link FieldNameConverter}s are 
expired.
+   * @param expiresUnits The units before {@link FieldNameConverter}s are 
expired.
+   */
+  public CachedFieldNameConverterFactory(int expires, TimeUnit 
expiresUnits) {
+
+fieldNameConverters = createFieldNameConverterCache(expires, 
expiresUnits);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} where the cache 
expires after 5 minutes.
+   */
+  public CachedFieldNameConverterFactory() {
+
+this(5, TimeUnit.MINUTES);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} using the given 
cache.  This should only
+   * be used for testing.
+   *
+   * @param fieldNameConverters A {@link Cache} containing {@link 
FieldNameConverter}s.
+   */
+  public CachedFieldNameConverterFactory(Cache 
fieldNameConverters) {
+

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489622#comment-16489622
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190698390
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
+ * to this configuration is recognized once the old {@link 
FieldNameConverter} expires
+ * from the cache.
+ *
+ * Defining a shorter expiration interval allows config changes to be 
recognized more
+ * quickly, but also can impact performance negatively.
+ */
+public class CachedFieldNameConverterFactory implements 
FieldNameConverterFactory {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * A cache that contains a {@link FieldNameConverter} for each sensor 
type.
+   *
+   * A user can alter the {@link FieldNameConverter} for a given sensor at 
any time
+   * by altering the Indexing configuration.  The actual {@link 
FieldNameConverter}
+   * in use for a given sensor will only change once the original 
converter has
+   * expired from the cache.
+   */
+  private Cache fieldNameConverters;
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory}.
+   *
+   * @param expires The duration before {@link FieldNameConverter}s are 
expired.
+   * @param expiresUnits The units before {@link FieldNameConverter}s are 
expired.
+   */
+  public CachedFieldNameConverterFactory(int expires, TimeUnit 
expiresUnits) {
+
+fieldNameConverters = createFieldNameConverterCache(expires, 
expiresUnits);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} where the cache 
expires after 5 minutes.
+   */
+  public CachedFieldNameConverterFactory() {
+
+this(5, TimeUnit.MINUTES);
+  }
+
+  /**
+   * Creates a {@link CachedFieldNameConverterFactory} using the given 
cache.  This should only
+   * be used for testing.
+   *
+   * @param fieldNameConverters A {@link Cache} containing {@link 
FieldNameConverter}s.
+   */
+  public CachedFieldNameConverterFactory(Cache 
fieldNameConverters) {
+
  

[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489623#comment-16489623
 ] 

ASF GitHub Bot commented on METRON-1569:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1022#discussion_r190700071
  
--- Diff: 
metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java
 ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * Each sensor type can use a different {@link FieldNameConverter} 
implementation.
+ *
+ * The {@link WriterConfiguration} allows a user to define the {@link 
FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * The {@link FieldNameConverter}s are maintained in a cache for a 
fixed period of time
+ * after they are created.  Once they expire, the {@link 
WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * The user can change the {@link FieldNameConverter} in use at 
runtime. A change
--- End diff --

I want to say that this is handled through the ConfigurationsUpdater 
classes already already. I believe the writer passes in an updated config on 
each call.


https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/zookeeper/configurations/ConfigurationsUpdater.java


> Allow user to change field name conversion when indexing to Elasticsearch
> -
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
>  Issue Type: Improvement
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a 
> message before it is written to Elasticsearch.  Right now this mechanism is 
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
> field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
> problem.
> A user should be able to configure the field name transformation when writing 
> to Elasticsearch, as needed.  
> While it might have been simpler to just remove the de-dotting mechanism, 
> this would break backwards compatibility.  Providing users with a means to 
> configure this mechanism provides them with an upgrade path.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1569) Allow user to change field name conversion when indexing to Elasticsearch

2018-05-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484274#comment-16484274
 ] 

ASF GitHub Bot commented on METRON-1569:


GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/1022

METRON-1569 Allow user to change field name conversion when indexing …

The `ElasticsearchWriter` has a mechanism to transform the field names of a 
message before it is written to Elasticsearch.  Right now this mechanism is 
hard-coded to replace all '.' dots with ':' colons.

This mechanism was needed for Elasticsearch 2.x which did not allow dots in 
field names.  Now that Metron supports Elasticsearch 5.x this is no longer a 
problem. A user should be able to configure the field name transformation when 
writing to Elasticsearch, as needed.  

While it might have been simpler to just remove the de-dotting mechanism, 
this would break backwards compatibility.  Taking this approach provides users 
with an upgrade path.

## Changes

This change allows the user to configure the field name converter as part 
of the index writer configuration.  

Acceptable values include the following.
* `DEDOT`: Replaces all '.' with ':' which is the default, backwards 
compatible behavior.
* `NOOP`: No field name change.

If no "fieldNameConverter" is defined, it defaults to using `DEDOT` which 
maintains backwards compatibility.

A cache of `FieldNameConverter`s is maintained since the index writer 
configuration can be changed at run-time and each sensor has its own index 
writer configuration.

An example configuration looks-like the following.
```
{
  "hdfs" : {
"enabled" : false
  },
  "elasticsearch" : {
"index" : "bro",
"batchSize" : 5,
"enabled" : true,
"fieldNameConverter": "NOOP"
  },
  "solr" : {
"enabled" : false
  }
} 
```

## Code Changes

* Added the `fieldNameConverter` parameter to the Index writer 
configuration.

* Moved the `FieldNameConverter` implementations to a dedicated package in 
`metron-common`.

* Renamed `ElasticsearchFieldNameConverter` to `DeDotFieldNameConverter`.

* Implemented the `NoopFieldNameConverter` which does not modify the field 
name.

* Created `FieldNameConverters` class that allows a user to specify either 
`DEDOT` or `NOOP` to choose the appropriate implementation.

* Implemented a `CachedFieldNameConverterFactory` that encapsulates all the 
logic for choosing and instantiating the appropriate `FieldNameConverter`.  

* Updated `ElasticsearchWriter` to use the 
`CachedFieldNameConverterFactory`.

* Updated the README to document the new configuration parameter.


## Manual Testing

1. Launch a development environment and login.

```
vagrant ssh
sudo su -
source /etc/default/metron
```

1. Validate the environment by ensuring alerts are visible in the Alerts UI 
and that the Ambari Service Check completes successfully.  This ensures that 
the change is backwards compatible.

1. Login to the Storm UI and enable DEBUG logging for 
`org.apache.metron.common` and `org.apache.metron.elasticsearch`.

1. The Storm worker logs in 
`/var/log/storm/worker-artifacts/random_access_indexing*/worker.log` should 
contain the following log statements, if you have enabled DEBUG logging 
correctly.  This shows that the default `DEDOT` converter is in-use.

```
2018-05-22 14:38:... [DEBUG] Renamed dotted field; 
original=source.type, new=source:type
2018-05-22 14:38:... [DEBUG] Renamed dotted field; 
original=adapter.geoadapter.end.ts, new=adapter:geoadapter:end:ts
2018-05-22 14:38:... [DEBUG] Renamed dotted field; 
original=threatintelsplitterbolt.splitter.end.ts, 
new=threatintelsplitterbolt:splitter:end:ts
2018-05-22 14:38:... [DEBUG] Renamed dotted field; 
original=adapter.threatinteladapter.begin.ts, 
new=adapter:threatinteladapter:begin:ts
2018-05-22 14:38:... [DEBUG] Renamed dotted field; 
original=enrichments.geo.ip_dst_addr.location_point, 
new=enrichments:geo:ip_dst_addr:location_point
2018-05-22 14:38:... [DEBUG] Renamed dotted field; 
original=adapter.threatinteladapter.end.ts, 
new=adapter:threatinteladapter:end:ts
2018-05-22 14:38:... [DEBUG] Renamed dotted field; 
original=enrichmentsplitterbolt.splitter.end.ts, 
new=enrichmentsplitterbolt:splitter:end:ts
```

1. Launch the REPL.

```
./bin/stellar -z $ZOOKEEPER
```

1. Change the field name converter to NOOP.

```
[Stellar]>>> conf := SHELL_EDIT()
{
  "hdfs" : {