[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2019-04-24 Thread Nick Allen (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825074#comment-16825074
 ] 

Nick Allen commented on METRON-1005:


[~justinleet] yes, you are correct.  I will do that.

> Create Decodable Row Key for Profiler
> -
>
> Key: METRON-1005
> URL: https://issues.apache.org/jira/browse/METRON-1005
> Project: Metron
> Issue Type: Improvement
> Affects Versions: 0.3.0
> Reporter: Nick Allen
> Assignee: Nick Allen
> Priority: Major
> Fix For: Next + 1
>
>
> To be able to answer the types of questions that I outlined in METRON-450, we 
> need a row key that is decodable.  Right now there is no logic to decode a 
> row key, nor is the existing row key easily decodable.  
> Once the row keys can be decoded, you could scan all of the row keys in the 
> Profiler's HBase table, decode each of them, and extract things like the 
> names of all your profiles, the names of entities within a profile, and the 
> period duration of a given profile.
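
As a rough illustration of what "decodable" means here, the following is a minimal sketch, assuming a hypothetical layout of length-prefixed UTF-8 fields followed by a fixed-width period. The class name `RowKeySketch`, its methods, and the field order are illustrative assumptions and are not the Profiler's actual row key format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch only: not the Profiler's actual row key format.
class RowKeySketch {

  // Encodes the key as [len][profile][len][entity][period], big-endian.
  static byte[] encode(String profile, String entity, long period) {
    byte[] p = profile.getBytes(StandardCharsets.UTF_8);
    byte[] e = entity.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buffer = ByteBuffer.allocate(
        Integer.BYTES + p.length + Integer.BYTES + e.length + Long.BYTES);
    buffer.putInt(p.length).put(p).putInt(e.length).put(e).putLong(period);
    return buffer.array();
  }

  // Reads one length-prefixed UTF-8 string from the buffer.
  static String readString(ByteBuffer buffer) {
    byte[] bytes = new byte[buffer.getInt()];
    buffer.get(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Decodes a key produced by encode() into {profile, entity, period}.
  static String[] decode(byte[] rowKey) {
    ByteBuffer buffer = ByteBuffer.wrap(rowKey);
    String profile = readString(buffer);
    String entity = readString(buffer);
    long period = buffer.getLong();
    return new String[] { profile, entity, Long.toString(period) };
  }
}
```

Because each variable-length field carries its own length, a scan over the table could call `decode` on every row key and collect the distinct profile and entity names without consulting any other metadata.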



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2019-04-23 Thread Justin Leet (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824739#comment-16824739
 ] 

Justin Leet commented on METRON-1005:
-

[~nickwallen] This Jira can be closed and the fix version removed, right?



[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2018-02-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354270#comment-16354270
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen closed the pull request at:

https://github.com/apache/metron/pull/622




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2018-01-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308099#comment-16308099
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user justinleet commented on the issue:

https://github.com/apache/metron/pull/622
  
@nickwallen I haven't been following this discussion, but this seems like a 
useful feature / enhancement that has been hanging around for a while since 
active discussion petered out. What are the next steps here?  Does this PR need 
changes?  Should the discussion be revived on the user lists?  It doesn't seem 
like there was any consensus on the approach, but again, I like this 
enhancement a lot.




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100398#comment-16100398
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129370546
  
--- Diff: metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java ---
@@ -0,0 +1,402 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_PERIOD;
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_PERIOD_UNITS;
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_SALT_DIVISOR;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys.  A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key.  Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ * 
+ * magic number - Helps to validate the row key.
+ * version - The version number of the row key.
+ * salt - A salt that helps prevent hot-spotting.
+ * profile - The name of the profile.
+ * entity - The name of the entity being profiled.
+ * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * 
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
+
+  /**
+   * Defines the byte order when encoding and decoding the row keys.
+   *
+   * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :)
+   */
+  private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN;
+
+  /**
+   * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys.
+   */
+  private static final int MAX_FIELD_LENGTH = 1000;
+
+  /**
+   * A magic number embedded in each row key to help validate the row key and byte ordering when decoding.
+   */
+  protected static final short MAGIC_NUMBER = 77;
+
+  /**
+   * The version number of the row keys supported by this builder.
+   */
+  protected static final byte VERSION = (byte) 1;
+
+  /**
+   * A salt can be prepended to the row key to help prevent hot-spotting.  The salt
+   * divisor is used to generate the salt.  The salt divisor should be roughly equal
+   * to the number of nodes in the Hbase cluster.
+   */
+  private int saltDivisor;
+
+  /**
+   * The duration of each profile period in milliseconds.
+   */
+  private long periodDurationMillis;
+
+  public DecodableRowKeyBuilder() {
+    this(PROFILER_SALT_DIVISOR.getDefault(Integer.class),
+        PROFILER_PERIOD.getDefault(Long.class),
+        TimeUnit.valueOf(PROFILER_PERIOD_UNITS.getDefault(String.class)));
+  }
+
+  public 
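
The Javadoc in the diff above says that periods start at the epoch and that the salt divisor is used to generate the salt, but the quoted diff is cut off before showing how. The following is a minimal sketch of one common way to do both, assuming the salt is derived from an MD5 hash of the period. The class `PeriodAndSaltSketch` and its method names are hypothetical, and the pull request's actual calculation may differ.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch only: the real builder's calculation may differ.
class PeriodAndSaltSketch {

  // Periods count up from the epoch: period 0 covers [0, duration).
  static long periodOf(long epochMillis, long duration, TimeUnit units) {
    return epochMillis / units.toMillis(duration);
  }

  // Derives a salt in [0, saltDivisor) from a hash of the period, so that
  // consecutive periods are spread across the key space.
  static int saltOf(long period, int saltDivisor) {
    try {
      MessageDigest digest = MessageDigest.getInstance("MD5");
      byte[] periodBytes = ByteBuffer.allocate(Long.BYTES).putLong(period).array();
      byte[] hash = digest.digest(periodBytes);
      // Use the first four bytes of the hash as an int, then reduce it.
      int firstFour = ByteBuffer.wrap(hash).getInt();
      return Math.floorMod(firstFour, saltDivisor);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

`Math.floorMod` is used rather than `%` so that a negative hash value still yields a non-negative salt. Because the salt is a pure function of the period, a reader can recompute it when building the row keys for a time range.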

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100388#comment-16100388
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622
  
@cestella,
> Would this approach require scans on read in the critical path?

I don't think decoding rowkeys is on any critical path.  You only need to 
look up a Profile by serial number (or hash) when decoding rowkeys.  No?




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100379#comment-16100379
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622
  
@nickwallen brought up the issue of wildcard queries on our rowkeys.  It 
has always bothered me that we can't do wildcard queries on groups.  If you 
have, for example, a single groupBy based on day of week, that's just 7 
possible values, and if you want them all you could just do 7 queries and 
combine them.  But if you have three groupBys with 7, 31, and 256 possible 
values, then to simulate a wildcard query you would have to do over 55,000 
individual queries (7 x 31 x 256 = 55,552)!  Of course you would just do an 
HBase scan instead, but it would require a full table scan to select the 
desired time range.
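
For reference, the fan-out quoted above can be checked with a short sketch. The class `WildcardFanOut` is purely illustrative and only multiplies the number of possible values of each group.

```java
// Illustrative only: counts the exact-match queries needed to cover
// every combination of group values.
class WildcardFanOut {

  static long pointQueries(int... groupCardinalities) {
    long total = 1;
    for (int cardinality : groupCardinalities) {
      total *= cardinality;
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(pointQueries(7));           // prints 7
    System.out.println(pointQueries(7, 31, 256));  // prints 55552
  }
}
```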

I propose that we re-order the rowkey elements to support prefix queries on 
Profile and time range, with wildcarding primarily for groups and secondarily 
for entities, i.e.:
\\\\\\

So if I want the results for all rows in a time range regarding entity 
"192.168.222.123" regardless of group, I can query it, and if I want all rows 
in a time range regardless of entity value or group, I can query that too, as 
efficiently as an ordinary time range query.  What do you think?
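
The effect of such a re-ordering can be sketched with an in-memory sorted map standing in for HBase, which also keeps rows sorted by row key. This is a minimal sketch under assumptions: the string layout `profile|period|entity|group`, the `|` delimiter, the zero-padded period, and the class `PrefixScanSketch` are all hypothetical, and the exact element order proposed in the comment above is not recoverable from this archive.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch only: a TreeMap stands in for HBase's sorted row keys.
class PrefixScanSketch {

  // Assumed layout: profile | zero-padded period | entity | group.
  // Zero padding makes lexicographic order match numeric period order.
  static String key(String profile, long period, String entity, String group) {
    return String.format("%s|%010d|%s|%s", profile, period, entity, group);
  }

  // Returns every row for a profile with a period in [startPeriod, endPeriod],
  // regardless of entity or group, as one contiguous range of sorted keys.
  static SortedMap<String, Double> scan(TreeMap<String, Double> table,
                                        String profile, long startPeriod, long endPeriod) {
    String start = String.format("%s|%010d|", profile, startPeriod);
    String stop = String.format("%s|%010d|", profile, endPeriod + 1);
    return table.subMap(start, stop);
  }
}
```

With the profile and period leading the key, the wildcard over entity and group costs one range scan rather than one query per combination. The sketch assumes non-negative periods of at most ten digits and field values that never contain the delimiter. Narrowing to a single entity would be a filter applied over the same range.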





[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100364#comment-16100364
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user cestella commented on the issue:

https://github.com/apache/metron/pull/622
  
@mattf-horton Would this approach require scans on read in the critical 
path?




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100356#comment-16100356
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622
  
@cestella , tl;dr: The discussion of serial numbers is a distraction.  
Let's just use the profileHash and forget the serial number.  It was a 
micro-optimization.

Answer to your question:  Two cases:
- If you have the profileHash, then you can look up the Profile using an 
HBase wildcard query for rowkey <profileHash>*, and since the profileHash 
is unique, it will be essentially as efficient as using the full rowkey.
- If you are trying to decode a rowkey and only have the serial number then 
I stated some assumptions: "The expectation is that we will seldom (almost 
never) need to reference back to the Profile specification, and the total usage 
of Profile specs will be human-scale finite, **so it is okay to "scan" the 
ProfileSpecs table to find the full Profile spec referenced by a profileSN.** 
If this is not true, use the full hash as both the rowkey in the PeriodSpecs 
table, and as the reference element in the Profile rowkeys."
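
The two cases above can be sketched with an in-memory map standing in for the table of profile specifications. This is a minimal sketch under assumptions: the class `ProfileSpecLookupSketch`, the nested `Spec` type, and the use of SHA-256 for the profileHash are hypothetical and are not taken from the pull request.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only: the table layout and names are assumptions.
class ProfileSpecLookupSketch {

  // A stored profile specification plus the short serial number
  // that would be embedded in the data rowkeys.
  static final class Spec {
    final int serialNumber;
    final String definition;

    Spec(int serialNumber, String definition) {
      this.serialNumber = serialNumber;
      this.definition = definition;
    }
  }

  // Hex-encoded SHA-256 of the definition; stands in for the "profileHash".
  static String profileHash(String definition) {
    try {
      byte[] hash = MessageDigest.getInstance("SHA-256")
          .digest(definition.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Case 1: the hash is the table key, so the lookup is a direct get.
  static Optional<Spec> byHash(Map<String, Spec> specsByHash, String hash) {
    return Optional.ofNullable(specsByHash.get(hash));
  }

  // Case 2: only the serial number is known, so every stored spec is examined.
  static Optional<Spec> bySerialNumber(Map<String, Spec> specsByHash, int serialNumber) {
    return specsByHash.values().stream()
        .filter(spec -> spec.serialNumber == serialNumber)
        .findFirst();
  }
}
```

The second lookup is linear in the number of stored specifications, which is the "scan" the quoted assumption accepts as tolerable while that number stays small.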




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100329#comment-16100329
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user cestella commented on the issue:

https://github.com/apache/metron/pull/622
  
@mattf-horton Wouldn't you have to use the serial number to retrieve 
profiles?




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100316#comment-16100316
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129356740
  
--- Diff: metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java ---
@@ -0,0 +1,382 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys.  A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key.  Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ * 
+ * magic number - Helps to validate the row key.
+ * version - The version number of the row key.
+ * salt - A salt that helps prevent hot-spotting.
+ * profile - The name of the profile.
+ * entity - The name of the entity being profiled.
+ * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * 
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
+
+  /**
+   * Defines the byte order when encoding and decoding the row keys.
+   *
+   * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :)
+   */
+  private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN;
+
+  /**
+   * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys.
+   */
+  private static final int MAX_FIELD_LENGTH = 1000;
+
+  /**
+   * A magic number embedded in each row key to help validate the row key and byte ordering when decoding.
+   */
+  protected static final short MAGIC_NUMBER = 77;
+
+  /**
+   * The version number of the row keys supported by this builder.
+   */
+  protected static final byte VERSION = (byte) 1;
+
+  /**
+   * A salt can be prepended to the row key to help prevent hot-spotting.  The salt
+   * divisor is used to generate the salt.  The salt divisor should be roughly equal
+   * to the number of nodes in the Hbase cluster.
+   */
+  private int saltDivisor;
+
+  /**
+   * The duration of each profile period in milliseconds.
+   */
+  private long periodDurationMillis;
+
+  public DecodableRowKeyBuilder() {
+    this(1000, 15, TimeUnit.MINUTES);
+  }
+
+  public DecodableRowKeyBuilder(int saltDivisor, long duration, TimeUnit units) {
+    this.saltDivisor = saltDivisor;
+    this.periodDurationMillis = units.toMillis(duration);
+  }
+
+  /**
+   * Builds a list of row keys necessary to retrieve profile measurements over
+   * a time horizon.
+   *
+   * @param profile The name of the profile.
+   * @param entity The name of the entity.
+   * @param groups The group(s) used to sort 

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100315#comment-16100315
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129356631
  
--- Diff: metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java ---

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100324#comment-16100324
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129358521
  
--- Diff: metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java ---

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100293#comment-16100293
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622
  
@cestella, we would not need to keep an index resident in memory.  Most of 
the time we would just have the active Profiles in memory, exactly as we do 
today.  You only need to retrieve the Profile by serial number on the rare 
occasions that you have to decode row keys.  That said, it's fine with me to 
just use the profileHash; I agree it decreases complexity.




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100096#comment-16100096
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129319336
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_PERIOD;
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_PERIOD_UNITS;
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_SALT_DIVISOR;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys.  A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key.  Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ * 
+ * magic number - Helps to validate the row key.
+ * version - The version number of the row key.
+ * salt - A salt that helps prevent hot-spotting.
+ * profile - The name of the profile.
+ * entity - The name of the entity being profiled.
+ * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * 
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
+
+  /**
+   * Defines the byte order when encoding and decoding the row keys.
+   *
+   * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :)
+   */
+  private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN;
+
+  /**
+   * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys.
+   */
+  private static final int MAX_FIELD_LENGTH = 1000;
+
+  /**
+   * A magic number embedded in each row key to help validate the row key and byte ordering when decoding.
+   */
+  protected static final short MAGIC_NUMBER = 77;
+
+  /**
+   * The version number of the row keys supported by this builder.
+   */
+  protected static final byte VERSION = (byte) 1;
+
+  /**
+   * A salt can be prepended to the row key to help prevent hot-spotting.  The salt
+   * divisor is used to generate the salt.  The salt divisor should be roughly equal
+   * to the number of nodes in the HBase cluster.
+   */
+  private int saltDivisor;
+
+  /**
+   * The duration of each profile period in milliseconds.
+   */
+  private long periodDurationMillis;
+
+  public DecodableRowKeyBuilder() {
+    this(PROFILER_SALT_DIVISOR.getDefault(Integer.class),
+        PROFILER_PERIOD.getDefault(Long.class),
+        TimeUnit.valueOf(PROFILER_PERIOD_UNITS.getDefault(String.class)));
+  }
+
+  public 
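
The field list in the Javadoc above can be illustrated with a small round-trip sketch. This is not the code from the pull request: the class name RowKeySketch, the 4-byte salt, the 4-byte length prefix on each string, and the explicit group count are all assumptions made for illustration. Only the magic number, version, maximum field length, and big-endian byte order come from the quoted diff.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the DecodableRowKeyBuilder from the pull request.
class RowKeySketch {
  static final short MAGIC_NUMBER = 77;      // value taken from the quoted diff
  static final byte VERSION = 1;             // value taken from the quoted diff
  static final int MAX_FIELD_LENGTH = 1000;  // value taken from the quoted diff

  /** Holds the fields recovered from a row key. */
  static final class Decoded {
    final int salt; final String profile; final String entity;
    final List<String> groups; final long period;
    Decoded(int salt, String profile, String entity, List<String> groups, long period) {
      this.salt = salt; this.profile = profile; this.entity = entity;
      this.groups = groups; this.period = period;
    }
  }

  // Assumed layout: magic(2) version(1) salt(4) profile entity groupCount(4) groups... period(8).
  // Each string is written as a 4-byte length followed by its UTF-8 bytes.
  static byte[] encode(int salt, String profile, String entity, List<String> groups, long period) {
    List<byte[]> fields = new ArrayList<>();
    fields.add(profile.getBytes(StandardCharsets.UTF_8));
    fields.add(entity.getBytes(StandardCharsets.UTF_8));
    for (String group : groups) fields.add(group.getBytes(StandardCharsets.UTF_8));
    int size = Short.BYTES + Byte.BYTES + Integer.BYTES + Integer.BYTES + Long.BYTES;
    for (byte[] field : fields) size += Integer.BYTES + field.length;

    ByteBuffer buf = ByteBuffer.allocate(size).order(ByteOrder.BIG_ENDIAN);
    buf.putShort(MAGIC_NUMBER).put(VERSION).putInt(salt);
    putField(buf, fields.get(0));
    putField(buf, fields.get(1));
    buf.putInt(groups.size());
    for (int i = 2; i < fields.size(); i++) putField(buf, fields.get(i));
    buf.putLong(period);
    return buf.array();
  }

  // A truncated key surfaces as java.nio.BufferUnderflowException from the getX calls.
  static Decoded decode(byte[] key) {
    ByteBuffer buf = ByteBuffer.wrap(key).order(ByteOrder.BIG_ENDIAN);
    if (buf.getShort() != MAGIC_NUMBER) throw new IllegalArgumentException("bad magic number");
    if (buf.get() != VERSION) throw new IllegalArgumentException("unsupported version");
    int salt = buf.getInt();
    String profile = getField(buf);
    String entity = getField(buf);
    int groupCount = buf.getInt();
    if (groupCount < 0 || groupCount > MAX_FIELD_LENGTH) {
      throw new IllegalArgumentException("invalid group count: " + groupCount);
    }
    List<String> groups = new ArrayList<>();
    for (int i = 0; i < groupCount; i++) groups.add(getField(buf));
    return new Decoded(salt, profile, entity, groups, buf.getLong());
  }

  private static void putField(ByteBuffer buf, byte[] bytes) {
    buf.putInt(bytes.length).put(bytes);
  }

  private static String getField(ByteBuffer buf) {
    int length = buf.getInt();
    if (length < 0 || length > MAX_FIELD_LENGTH || length > buf.remaining()) {
      throw new IllegalArgumentException("invalid field length: " + length);
    }
    byte[] bytes = new byte[length];
    buf.get(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] key = encode(7, "profile1", "10.0.0.1", Arrays.asList("weekday"), 42L);
    Decoded d = decode(key);
    System.out.println(d.profile + " " + d.entity + " " + d.groups + " " + d.period);
  }
}
```

Checking the magic number first means a key written with a different layout or byte order is likely to be rejected before any length field is trusted, and the length cap keeps a corrupt key from triggering a huge allocation.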

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100090#comment-16100090
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129318110
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,382 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys.  A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key.  Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ * 
+ * magic number - Helps to validate the row key.
+ * version - The version number of the row key.
+ * salt - A salt that helps prevent hot-spotting.
+ * profile - The name of the profile.
+ * entity - The name of the entity being profiled.
+ * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * 
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
+
+  /**
+   * Defines the byte order when encoding and decoding the row keys.
+   *
+   * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :)
+   */
+  private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN;
+
+  /**
+   * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys.
+   */
+  private static final int MAX_FIELD_LENGTH = 1000;
+
+  /**
+   * A magic number embedded in each row key to help validate the row key and byte ordering when decoding.
+   */
+  protected static final short MAGIC_NUMBER = 77;
+
+  /**
+   * The version number of the row keys supported by this builder.
+   */
+  protected static final byte VERSION = (byte) 1;
+
+  /**
+   * A salt can be prepended to the row key to help prevent hot-spotting.  The salt
+   * divisor is used to generate the salt.  The salt divisor should be roughly equal
+   * to the number of nodes in the HBase cluster.
+   */
+  private int saltDivisor;
+
+  /**
+   * The duration of each profile period in milliseconds.
+   */
+  private long periodDurationMillis;
+
+  public DecodableRowKeyBuilder() {
+    this(1000, 15, TimeUnit.MINUTES);
+  }
+
+  public DecodableRowKeyBuilder(int saltDivisor, long duration, TimeUnit units) {
+    this.saltDivisor = saltDivisor;
+    this.periodDurationMillis = units.toMillis(duration);
+  }
+
+  /**
+   * Builds a list of row keys necessary to retrieve profile measurements over
+   * a time horizon.
+   *
+   * @param profile The name of the profile.
+   * @param entity The name of the entity.

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100061#comment-16100061
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129309663
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/SaltyRowKeyBuilder.java
 ---
@@ -81,20 +99,19 @@ public SaltyRowKeyBuilder(int saltDivisor, long duration, TimeUnit units) {
    * @return All of the row keys necessary to retrieve the profile measurements.
    */
   @Override
-  public List rowKeys(String profile, String entity, List groups, long start, long end) {
+  public List encode(String profile, String entity, List groups, long start, long end) {
     // be forgiving of out-of-order start and end times; order is critical to this algorithm
     end = Math.max(start, end);
     start = Math.min(start, end);
--- End diff --

This does look fishy.  I will open a separate JIRA to track it.  
Thanks, @mattf-horton!
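
For readers following along, the problem with the two quoted lines can be shown in isolation. Because `end` is overwritten with the maximum before it is read again, the later `Math.min(start, end)` always returns `start`, so swapped arguments collapse to a single point instead of being reordered. The class and method names below are hypothetical, and the second method is one possible fix, not necessarily the one adopted in the follow-up JIRA.

```java
// Hypothetical demonstration class; only the two statements in asQuoted() come from the diff.
class TimeRangeSketch {

  /** The ordering logic exactly as quoted in the diff. */
  static long[] asQuoted(long start, long end) {
    end = Math.max(start, end);
    start = Math.min(start, end);  // 'end' already holds the max, so 'start' never changes
    return new long[] { start, end };
  }

  /** One possible fix: compute both bounds from the original arguments. */
  static long[] normalized(long start, long end) {
    long lower = Math.min(start, end);
    long upper = Math.max(start, end);
    return new long[] { lower, upper };
  }

  public static void main(String[] args) {
    long[] quoted = asQuoted(20L, 10L);
    long[] fixed = normalized(20L, 10L);
    System.out.println(quoted[0] + ".." + quoted[1]);  // prints 20..20
    System.out.println(fixed[0] + ".." + fixed[1]);    // prints 10..20
  }
}
```

When the arguments are already in order, both versions return the same range, which is likely why the issue went unnoticed.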




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100051#comment-16100051
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129306225
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099761#comment-16099761
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129252678
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099756#comment-16099756
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129252112
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099609#comment-16099609
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129227588
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099596#comment-16099596
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129226069
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_PERIOD;
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_PERIOD_UNITS;
+import static org.apache.metron.profiler.ProfilerClientConfig.PROFILER_SALT_DIVISOR;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys.  A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key.  Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ *
+ * magic number - Helps to validate the row key.
+ * version - The version number of the row key.
+ * salt - A salt that helps prevent hot-spotting.
+ * profile - The name of the profile.
+ * entity - The name of the entity being profiled.
+ * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * 
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
+
+  /**
+   * Defines the byte order when encoding and decoding the row keys.
+   *
+   * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :)
+   */
+  private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN;
+
+  /**
+   * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys.
+   */
+  private static final int MAX_FIELD_LENGTH = 1000;
+
+  /**
+   * A magic number embedded in each row key to help validate the row key and byte ordering when decoding.
+   */
+  protected static final short MAGIC_NUMBER = 77;
+
+  /**
+   * The version number of the row keys supported by this builder.
+   */
+  protected static final byte VERSION = (byte) 1;
+
+  /**
+   * A salt can be prepended to the row key to help prevent hot-spotting.  The salt
+   * divisor is used to generate the salt.  The salt divisor should be roughly equal
+   * to the number of nodes in the HBase cluster.
+   */
+  private int saltDivisor;
+
+  /**
+   * The duration of each profile period in milliseconds.
+   */
+  private long periodDurationMillis;
+
+  public DecodableRowKeyBuilder() {
+    this(PROFILER_SALT_DIVISOR.getDefault(Integer.class),
+        PROFILER_PERIOD.getDefault(Long.class),
+        TimeUnit.valueOf(PROFILER_PERIOD_UNITS.getDefault(String.class)));
+  }
+
+  public 
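
The quoted diff is cut off before the encoding logic, so the following is only a rough sketch of how a key with the fields listed in the class Javadoc could be written and read back with a big-endian `ByteBuffer`. The field widths, the length-prefixed strings, the group count, the salt calculation, and the `RowKeySketch` / `Decoded` names are all assumptions made for illustration; none of them are taken from the pull request.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical layout for illustration only (not taken from the pull request):
//   [magic:short][version:byte][salt:int][profile][entity][groupCount:int][group...][period:long]
// Each string is written as [length:int][UTF-8 bytes].
final class RowKeySketch {

  static final short MAGIC_NUMBER = 77;
  static final byte VERSION = 1;
  static final int MAX_FIELD_LENGTH = 1000;

  /** Hypothetical holder for the decoded fields. */
  static final class Decoded {
    int salt;
    String profile;
    String entity;
    List<String> groups = new ArrayList<>();
    long period;
  }

  static byte[] encode(String profile, String entity, List<String> groups, long period, int saltDivisor) {
    List<byte[]> fields = new ArrayList<>();
    fields.add(profile.getBytes(StandardCharsets.UTF_8));
    fields.add(entity.getBytes(StandardCharsets.UTF_8));
    for (String group : groups) {
      fields.add(group.getBytes(StandardCharsets.UTF_8));
    }
    // Fixed-width part: magic + version + salt + group count + period.
    int size = Short.BYTES + Byte.BYTES + Integer.BYTES + Integer.BYTES + Long.BYTES;
    for (byte[] field : fields) {
      size += Integer.BYTES + field.length;
    }
    ByteBuffer buffer = ByteBuffer.allocate(size).order(ByteOrder.BIG_ENDIAN);
    buffer.putShort(MAGIC_NUMBER);
    buffer.put(VERSION);
    // Assumption: salt is a simple modulo of the period; the real builder may hash differently.
    buffer.putInt(Math.floorMod(Long.hashCode(period), saltDivisor));
    putString(buffer, fields.get(0));
    putString(buffer, fields.get(1));
    buffer.putInt(groups.size());
    for (int i = 2; i < fields.size(); i++) {
      putString(buffer, fields.get(i));
    }
    buffer.putLong(period);
    return buffer.array();
  }

  static Decoded decode(byte[] rowKey) {
    try {
      ByteBuffer buffer = ByteBuffer.wrap(rowKey).order(ByteOrder.BIG_ENDIAN);
      if (buffer.getShort() != MAGIC_NUMBER) {
        throw new IllegalArgumentException("bad magic number");
      }
      if (buffer.get() != VERSION) {
        throw new IllegalArgumentException("unsupported version");
      }
      Decoded decoded = new Decoded();
      decoded.salt = buffer.getInt();
      decoded.profile = getString(buffer);
      decoded.entity = getString(buffer);
      int groupCount = buffer.getInt();
      if (groupCount < 0 || groupCount > MAX_FIELD_LENGTH) {
        throw new IllegalArgumentException("implausible group count: " + groupCount);
      }
      for (int i = 0; i < groupCount; i++) {
        decoded.groups.add(getString(buffer));
      }
      decoded.period = buffer.getLong();
      return decoded;
    } catch (BufferUnderflowException e) {
      throw new IllegalArgumentException("row key is truncated", e);
    }
  }

  private static void putString(ByteBuffer buffer, byte[] utf8) {
    buffer.putInt(utf8.length);
    buffer.put(utf8);
  }

  private static String getString(ByteBuffer buffer) {
    int length = buffer.getInt();
    // Guard against oddly encoded keys before allocating.
    if (length < 0 || length > MAX_FIELD_LENGTH) {
      throw new IllegalArgumentException("implausible field length: " + length);
    }
    byte[] bytes = new byte[length];
    buffer.get(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }
}
```

In this sketch the magic number and version are checked first, so a key written by some other builder is rejected before any variable-length field is read.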

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099566#comment-16099566
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user simonellistonball commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129221553
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099368#comment-16099368
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129187853
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099365#comment-16099365
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129186527
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099367#comment-16099367
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129193892
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,382 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys.  A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key.  Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ *
+ * magic number - Helps to validate the row key.
+ * version - The version number of the row key.
+ * salt - A salt that helps prevent hot-spotting.
+ * profile - The name of the profile.
+ * entity - The name of the entity being profiled.
+ * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * 
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
+
+  /**
+   * Defines the byte order when encoding and decoding the row keys.
+   *
+   * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :)
+   */
+  private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN;
+
+  /**
+   * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys.
+   */
+  private static final int MAX_FIELD_LENGTH = 1000;
+
+  /**
+   * A magic number embedded in each row key to help validate the row key and byte ordering when decoding.
+   */
+  protected static final short MAGIC_NUMBER = 77;
+
+  /**
+   * The version number of the row keys supported by this builder.
+   */
+  protected static final byte VERSION = (byte) 1;
+
+  /**
+   * A salt can be prepended to the row key to help prevent hot-spotting.  The salt
+   * divisor is used to generate the salt.  The salt divisor should be roughly equal
+   * to the number of nodes in the HBase cluster.
+   */
+  private int saltDivisor;
+
+  /**
+   * The duration of each profile period in milliseconds.
+   */
+  private long periodDurationMillis;
+
+  public DecodableRowKeyBuilder() {
+    this(1000, 15, TimeUnit.MINUTES);
+  }
+
+  public DecodableRowKeyBuilder(int saltDivisor, long duration, TimeUnit units) {
+    this.saltDivisor = saltDivisor;
+    this.periodDurationMillis = units.toMillis(duration);
+  }
+
+  /**
+   * Builds a list of row keys necessary to retrieve profile measurements over
+   * a time horizon.
+   *
+   * @param profile The name of the profile.
+   * @param entity The name of the entity.
+   * @param groups The group(s) used to sort 

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099366#comment-16099366
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r129190841
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,402 @@

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095933#comment-16095933
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622
  
And btw, since there is no easily expressed algorithm for the NLP part of 
the problem, I'm +1 on doing both a decodable rowkey and a ToC.  For the 
existing profiles that @cestella expressed concern about, I would point out 
that as long as one DOES have the Profile specs still lying around, it's 
actually easy to re-write the old Profiles into new format with decodable 
rowkeys.  That is a very modest-sized program; the main problem is noticing 
and dealing with duplicate-titled Profiles that have different periodDurations.  But 
the info I pointed out in the paper helps sufficiently, I think.


> Create Decodable Row Key for Profiler
> -
>
> Key: METRON-1005
> URL: https://issues.apache.org/jira/browse/METRON-1005
> Project: Metron
>  Issue Type: Improvement
>Affects Versions: 0.3.0
>Reporter: Nick Allen
>Assignee: Nick Allen
> Fix For: Next + 1
>
>
> To be able to answer the types of questions that I outlined in METRON-450, we 
> need a row key that is decodable.  Right now there is no logic to decode a 
> row key, nor is the existing row key easily decodable.  
> Once the row keys can be decoded, you could scan all of the row keys in the 
> Profiler's HBase table, decode each of them and extract things like, the 
> names of all your profiles, the names of entities within a profile, the 
> period duration of a given profile.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088597#comment-16088597
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/622
  
> Create a Profile Audit Log table in HBase

This is an interesting idea, Matt.  I like it.  Along these same lines I 
was thinking of a table of contents.  I think your audit log idea is one way to 
implement a table of contents. 

I have looked at OpenTSDB, a TSDB implementation backed by HBase, and it 
uses a table of contents approach.  The ToC records metadata about the time 
series data that is stored. 

But I don't think these ideas are mutually exclusive with a decodable row 
key.  The decodable row key would allow us to rebuild the ToC should it become 
corrupted or lost.
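
As a rough illustration of that rebuild step, the sketch below folds a stream of already-decoded row keys into a per-profile table of contents. The `TocSketch`, `DecodedKey`, and `TocEntry` types are hypothetical, and the HBase scan that would supply the keys is deliberately left out; this is only one way it might look.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class TocSketch {

  /** Hypothetical result of decoding a single row key. */
  static final class DecodedKey {
    final String profile;
    final String entity;
    final long periodDurationMillis;

    DecodedKey(String profile, String entity, long periodDurationMillis) {
      this.profile = profile;
      this.entity = entity;
      this.periodDurationMillis = periodDurationMillis;
    }
  }

  /** Hypothetical ToC entry: what is known about one profile name. */
  static final class TocEntry {
    final Set<String> entities = new TreeSet<>();
    final Set<Long> periodDurations = new TreeSet<>();
  }

  /** Rebuilds the table of contents from decoded keys, e.g. the output of a full table scan. */
  static Map<String, TocEntry> rebuild(Iterable<DecodedKey> keys) {
    Map<String, TocEntry> toc = new TreeMap<>();
    for (DecodedKey key : keys) {
      TocEntry entry = toc.computeIfAbsent(key.profile, name -> new TocEntry());
      entry.entities.add(key.entity);
      // More than one duration under the same name means two differently configured profiles share it.
      entry.periodDurations.add(key.periodDurationMillis);
    }
    return toc;
  }
}
```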

Are you thinking that a decodable row key is not needed at all?




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088589#comment-16088589
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/622
  
> Your proposal has the advantage of making data in HBase self-identifying 
(if one has the key), which I always like. However, it's a large change and 
induces yet more complexity

What do you find unnecessarily complex here?  The code base was already 
designed to accept different row key implementations.  So this change involves 
the following.

1. The new decodable row key 
2. Profiler client logic to instantiate row key builders
3. Profiler client logic to pass parameters to the instantiated row key 
builders

I would agree that item 3 is unnecessarily complex.  That's where I 
wanted feedback.  I think just passing parameters through an interface method 
would simplify this a lot.
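
To make the interface-method idea concrete, here is a minimal sketch of what it could look like. The `ConfigurableRowKeyBuilder` interface, its `configure` and `describe` methods, the parameter names, and the `SaltedBuilder` class are all hypothetical and are not the interfaces in the pull request; the defaults mirror the `1000` and 15 minute values used in an earlier revision of the diff.

```java
import java.util.Map;

final class BuilderFactorySketch {

  /** Hypothetical interface: a single method carries all builder parameters. */
  interface ConfigurableRowKeyBuilder {
    void configure(Map<String, Object> params);

    String describe();
  }

  /** Hypothetical builder that reads only the parameters it understands. */
  static final class SaltedBuilder implements ConfigurableRowKeyBuilder {
    private int saltDivisor = 1000;
    private long periodDurationMillis = 15 * 60 * 1000L;

    @Override
    public void configure(Map<String, Object> params) {
      Object divisor = params.get("saltDivisor");
      if (divisor instanceof Number) {
        saltDivisor = ((Number) divisor).intValue();
      }
      Object duration = params.get("periodDurationMillis");
      if (duration instanceof Number) {
        periodDurationMillis = ((Number) duration).longValue();
      }
    }

    @Override
    public String describe() {
      return "saltDivisor=" + saltDivisor + ", periodDurationMillis=" + periodDurationMillis;
    }
  }

  /** Instantiates a builder by class name, then hands it its parameters in one call. */
  static ConfigurableRowKeyBuilder create(String className, Map<String, Object> params) {
    try {
      Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
      ConfigurableRowKeyBuilder builder = (ConfigurableRowKeyBuilder) instance;
      builder.configure(params);
      return builder;
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("cannot create row key builder: " + className, e);
    }
  }
}
```

With this shape the client only needs a class name and a parameter map; each builder decides which entries it cares about.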





[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087750#comment-16087750
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622
  
Your proposal has the advantage of making data in HBase self-identifying 
(if one has the key), which I always like.  However, it's a large change and 
induces yet more complexity.  There's an alternative I've been noodling 
occasionally, which I put forward here for consideration:

Create a Profile Audit Log table in HBase.  Every time a Profiler is 
configured, started, or stopped, make one entry in the audit log.  The idea is 
to be able to answer exactly the kinds of questions posed in METRON-450, so the 
records should include things like the configuration, the first and last 
timestamps, and perhaps the key builder parameters.  This would prevent 
historical profiles from being "lost" because the would-be querier doesn't have 
access to the exact config parameters used to write the profile.

For the sake of housekeeping, one might do a scan, daily and/or at system 
restart, to ensure that (a) the set of Profiles with a "start" but not an "end" 
recorded in the audit log, and (b) the set of currently running Profiles, are 
actually consistent, and record "inferred end" entries in the audit log for 
orphans found.

This solution is somewhat backward-applicable to existing Profile data; I 
think there are brute-force ways to scan the existing HBase tables and infer 
audit log entries, especially if historical configuration data is still 
available.  We could write such a scanner.
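
The housekeeping scan described above could be sketched roughly as follows.  This is only an illustration under assumptions: the `AuditEntry` record shape, the event names, and the `inferEnds` helper are hypothetical, and a real audit log would live in HBase rather than in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class AuditLogSketch {

  /** Hypothetical audit log entry; the field names are illustrative only. */
  static final class AuditEntry {
    final String profile;
    final String event;     // e.g. "start", "end", "inferred end"
    final long timestamp;

    AuditEntry(String profile, String event, long timestamp) {
      this.profile = profile;
      this.event = event;
      this.timestamp = timestamp;
    }
  }

  /**
   * Compares (a) profiles with a "start" but no "end" in the audit log against
   * (b) profiles that are currently running.  Anything in (a) but not in (b)
   * is an orphan and receives an "inferred end" entry.
   */
  static List<AuditEntry> inferEnds(Set<String> startedWithoutEnd, Set<String> running, long now) {
    List<AuditEntry> inferred = new ArrayList<>();
    for (String profile : new TreeSet<>(startedWithoutEnd)) {  // sorted for stable output
      if (!running.contains(profile)) {
        inferred.add(new AuditEntry(profile, "inferred end", now));
      }
    }
    return inferred;
  }

  public static void main(String[] args) {
    Set<String> open = new HashSet<>(Arrays.asList("profile-a", "profile-b"));
    Set<String> running = new HashSet<>(Arrays.asList("profile-b"));
    for (AuditEntry entry : inferEnds(open, running, System.currentTimeMillis())) {
      System.out.println(entry.profile + " " + entry.event);
    }
  }
}
```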




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086421#comment-16086421
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127331446
  
--- Diff: 
metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/RowKeyBuilderFactory.java
 ---
@@ -0,0 +1,125 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.client.stellar;
+
+import org.apache.commons.beanutils.PropertyUtils;
+import org.apache.commons.lang3.ClassUtils;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.profiler.hbase.RowKeyBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.reflect.InvocationTargetException;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_PERIOD;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_PERIOD_UNITS;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_ROW_KEY_BUILDER;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_SALT_DIVISOR;
+
+/**
+ * A Factory class that can create a RowKeyBuilder based on global property values.
+ */
+public class RowKeyBuilderFactory {
+
+  private static final Logger LOG = LoggerFactory.getLogger(RowKeyBuilderFactory.class);
+
+  /**
+   * Create a RowKeyBuilder.
+   * @param global The global properties.
+   * @return A RowKeyBuilder instantiated using the global property values.
+   */
+  public static RowKeyBuilder create(Map global) {
+String rowKeyBuilderClass = PROFILER_ROW_KEY_BUILDER.get(global, String.class);
+LOG.debug("profiler client: {}={}", PROFILER_ROW_KEY_BUILDER, rowKeyBuilderClass);
+
+// instantiate the RowKeyBuilder
+RowKeyBuilder builder = ReflectionUtils.createInstance(rowKeyBuilderClass);
+setSaltDivisor(global, builder);
+setPeriodDuration(global, builder);
--- End diff --

If I had some IoC-like functionality like Flux or Spring here, then this 
wouldn't be a problem at all.




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086413#comment-16086413
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127329289
  
--- Diff: 
metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/RowKeyBuilderFactory.java
 ---
@@ -0,0 +1,125 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.client.stellar;
+
+import org.apache.commons.beanutils.PropertyUtils;
+import org.apache.commons.lang3.ClassUtils;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.profiler.hbase.RowKeyBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.reflect.InvocationTargetException;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_PERIOD;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_PERIOD_UNITS;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_ROW_KEY_BUILDER;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_SALT_DIVISOR;
+
+/**
+ * A Factory class that can create a RowKeyBuilder based on global property values.
+ */
+public class RowKeyBuilderFactory {
+
+  private static final Logger LOG = LoggerFactory.getLogger(RowKeyBuilderFactory.class);
+
+  /**
+   * Create a RowKeyBuilder.
+   * @param global The global properties.
+   * @return A RowKeyBuilder instantiated using the global property values.
+   */
+  public static RowKeyBuilder create(Map global) {
+String rowKeyBuilderClass = PROFILER_ROW_KEY_BUILDER.get(global, String.class);
+LOG.debug("profiler client: {}={}", PROFILER_ROW_KEY_BUILDER, rowKeyBuilderClass);
+
+// instantiate the RowKeyBuilder
+RowKeyBuilder builder = ReflectionUtils.createInstance(rowKeyBuilderClass);
+setSaltDivisor(global, builder);
+setPeriodDuration(global, builder);
+
+return builder;
+  }
+
+  /**
+   * Set the period duration on the RowKeyBuilder.
+   * @param global The global properties from Zk.
+   * @param builder The RowKeyBuilder implementation.
+   */
+  private static void setPeriodDuration(Map global, RowKeyBuilder builder) {
+
+// how long is the profile period?
+long duration = PROFILER_PERIOD.get(global, Long.class);
+LOG.debug("profiler client: {}={}", PROFILER_PERIOD, duration);
+
+// which units are used to define the profile period?
+String configuredUnits = PROFILER_PERIOD_UNITS.get(global, String.class);
+TimeUnit units = TimeUnit.valueOf(configuredUnits);
+LOG.debug("profiler client: {}={}", PROFILER_PERIOD_UNITS, units);
+
+// set the period duration
+final String periodDurationProperty = "periodDurationMillis";
+setProperty(builder, periodDurationProperty, units.toMillis(duration));
+  }
+
+  /**
+   * Set the salt divisor property on the RowKeyBuilder.
+   * @param global The global properties from Zk.
+   * @param builder The RowKeyBuilder implementation.
+   */
+  private static void setSaltDivisor(Map global, RowKeyBuilder builder) {
+
+// what is the salt divisor?
+Integer saltDivisor = PROFILER_SALT_DIVISOR.get(global, Integer.class);
+LOG.debug("profiler client: {}={}", PROFILER_SALT_DIVISOR, saltDivisor);
+
+final String saltDivisorProperty = "saltDivisor";
+setProperty(builder, saltDivisorProperty, saltDivisor);
--- End diff --

This basically sets the 'salt 

[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086410#comment-16086410
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127327126
  
--- Diff: 
metron-analytics/metron-profiler/src/main/flux/profiler/remote.yaml ---
@@ -29,7 +29,7 @@ components:
 - name: "saltDivisor"
--- End diff --

Notice that the legacy `RowKeyBuilder`, the `SaltyRowKeyBuilder`, is still 
the default.  If a user wants to use the new `RowKeyBuilder` then they need to 
change the flux file here and specify 
`org.apache.metron.profiler.hbase.DecodableRowKeyBuilder`.




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086417#comment-16086417
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127328660
  
--- Diff: 
metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/GetProfile.java
 ---
@@ -216,21 +211,7 @@ private ColumnBuilder getColumnBuilder(Map global) {
* @param global The global configuration.
*/
   private RowKeyBuilder getRowKeyBuilder(Map global) {
-
-// how long is the profile period?
-long duration = PROFILER_PERIOD.get(global, Long.class);
-LOG.debug("profiler client: {}={}", PROFILER_PERIOD, duration);
-
-// which units are used to define the profile period?
-String configuredUnits = PROFILER_PERIOD_UNITS.get(global, String.class);
-TimeUnit units = TimeUnit.valueOf(configuredUnits);
-LOG.debug("profiler client: {}={}", PROFILER_PERIOD_UNITS, units);
-
-// what is the salt divisor?
-Integer saltDivisor = PROFILER_SALT_DIVISOR.get(global, Integer.class);
-LOG.debug("profiler client: {}={}", PROFILER_SALT_DIVISOR, saltDivisor);
-
-return new SaltyRowKeyBuilder(saltDivisor, duration, units);
+return RowKeyBuilderFactory.create(global);
--- End diff --

This is where we need to instantiate the `RowKeyBuilder` for the Profiler 
Client API.  As I discuss in another thread, the logic got complex and 
kind of nasty, so I encapsulated it in its own `RowKeyBuilderFactory`.  See that 
class for further discussion of why it is kind of nasty.




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086414#comment-16086414
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127329675
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/SaltyRowKeyBuilder.java
 ---
@@ -44,7 +46,17 @@
  * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
  * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
  * 
+ *
+ * This row key builder has no logic to decode a row key, nor is the row key generated by this builder
+ * easily decodable.  More specifically, the profile, entity, groups and period that make up the row key
+ * cannot be extracted from a previously generated row key.  This makes it difficult to answer questions
+ * like; What entities are included in this profile?  What is the period for this profile?  Use the
+ * DecodableRowKeyBuilder instead.
+ *
+ * @deprecated Replaced by DecodableRowKeyBuilder
+ * @see DecodableRowKeyBuilder
  */
+@Deprecated
 public class SaltyRowKeyBuilder implements RowKeyBuilder {
--- End diff --

I marked the old `RowKeyBuilder` as deprecated.




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086412#comment-16086412
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127329548
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,382 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys.  A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key.  Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ * 
+ * magic number - Helps to validate the row key.
+ * version - The version number of the row key.
+ * salt - A salt that helps prevent hot-spotting.
+ * profile - The name of the profile.
+ * entity - The name of the entity being profiled.
+ * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * 
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
--- End diff --

The new `RowKeyBuilder` implementation that is decodable.  Everyone should 
just use this, but the older implementation is left for backwards compatibility.
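
To make "decodable" concrete, here is a minimal round-trip sketch.  It is not the layout used by `DecodableRowKeyBuilder`, whose encoding is not shown in this thread: the length-prefixed string fields are an assumption, the salt and group fields are omitted for brevity, and `DecodableKeySketch`, `encode`, and `decode` are hypothetical names.  The magic number and version values mirror the constants that appear in the diff for the class.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class DecodableKeySketch {

  static final short MAGIC_NUMBER = 77;
  static final byte VERSION = 1;

  /** Encodes magic number, version, profile, entity and period (assumed layout). */
  static byte[] encode(String profile, String entity, long period) {
    byte[] p = profile.getBytes(StandardCharsets.UTF_8);
    byte[] e = entity.getBytes(StandardCharsets.UTF_8);
    int size = Short.BYTES + Byte.BYTES
        + Integer.BYTES + p.length
        + Integer.BYTES + e.length
        + Long.BYTES;
    ByteBuffer buffer = ByteBuffer.allocate(size).order(ByteOrder.BIG_ENDIAN);
    buffer.putShort(MAGIC_NUMBER);
    buffer.put(VERSION);
    buffer.putInt(p.length);   // length prefix makes the field recoverable
    buffer.put(p);
    buffer.putInt(e.length);
    buffer.put(e);
    buffer.putLong(period);
    return buffer.array();
  }

  /** Decodes a key produced by encode(); returns {profile, entity, period}. */
  static String[] decode(byte[] rowKey) {
    ByteBuffer buffer = ByteBuffer.wrap(rowKey).order(ByteOrder.BIG_ENDIAN);
    if (buffer.getShort() != MAGIC_NUMBER) {
      throw new IllegalArgumentException("bad magic number");
    }
    if (buffer.get() != VERSION) {
      throw new IllegalArgumentException("unsupported version");
    }
    String profile = readString(buffer);
    String entity = readString(buffer);
    long period = buffer.getLong();
    return new String[] { profile, entity, Long.toString(period) };
  }

  private static String readString(ByteBuffer buffer) {
    int length = buffer.getInt();
    if (length < 0 || length > buffer.remaining()) {
      throw new IllegalArgumentException("bad field length");
    }
    byte[] bytes = new byte[length];
    buffer.get(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String[] fields = decode(encode("profile1", "entity1", 1234L));
    System.out.println(String.join(",", fields));
  }
}
```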




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086416#comment-16086416
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127330773
  
--- Diff: 
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
 ---
@@ -0,0 +1,382 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys.  A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key.  Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ * 
+ * magic number - Helps to validate the row key.
+ * version - The version number of the row key.
+ * salt - A salt that helps prevent hot-spotting.
+ * profile - The name of the profile.
+ * entity - The name of the entity being profiled.
+ * group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * 
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
+
+  /**
+   * Defines the byte order when encoding and decoding the row keys.
+   *
+   * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :)
+   */
+  private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN;
+
+  /**
+   * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys.
+   */
+  private static final int MAX_FIELD_LENGTH = 1000;
+
+  /**
+   * A magic number embedded in each row key to help validate the row key and byte ordering when decoding.
+   */
+  protected static final short MAGIC_NUMBER = 77;
+
+  /**
+   * The version number of the row keys supported by this builder.
+   */
+  protected static final byte VERSION = (byte) 1;
--- End diff --

I added a `VERSION` field to the row key, hoping that this might help 
future changes to the `RowKeyBuilder`.  With this, I could potentially start to 
parse the row key and then choose the right `RowKeyBuilder` implementation; the 
one used to create the row key.  This would make row key changes seamless to 
users.
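
A rough sketch of that idea: read only the key's header, then look up which builder wrote it.  The registry, the `builderFor` helper, and the assumption that the magic number and version are the first three bytes of the key are all hypothetical; the diff here shows only the constants, not the encoding.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class VersionDispatchSketch {

  static final short MAGIC_NUMBER = 77;

  /** Hypothetical registry from row key version to the builder that wrote it. */
  static final Map<Byte, String> BUILDERS = new HashMap<>();
  static {
    BUILDERS.put((byte) 1, "DecodableRowKeyBuilder");
  }

  /**
   * Peeks at the header of a row key and names the matching builder.
   * Returns empty when the key is too short, lacks the magic number,
   * or carries a version that is not registered.
   */
  static Optional<String> builderFor(byte[] rowKey) {
    if (rowKey.length < Short.BYTES + Byte.BYTES) {
      return Optional.empty();
    }
    ByteBuffer buffer = ByteBuffer.wrap(rowKey).order(ByteOrder.BIG_ENDIAN);
    if (buffer.getShort() != MAGIC_NUMBER) {
      return Optional.empty();
    }
    byte version = buffer.get();
    return Optional.ofNullable(BUILDERS.get(version));
  }

  public static void main(String[] args) {
    byte[] versionOneKey = { 0, 77, 1, 42 };
    System.out.println(builderFor(versionOneKey).orElse("unknown"));
  }
}
```

A caller could treat an empty result as "possibly a legacy key", although a legacy key might by chance begin with the same bytes as the magic number, so this check alone is not conclusive.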



[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086415#comment-16086415
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127326080
  
--- Diff: 
metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/RowKeyBuilderFactory.java
 ---
@@ -0,0 +1,125 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.client.stellar;
+
+import org.apache.commons.beanutils.PropertyUtils;
+import org.apache.commons.lang3.ClassUtils;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.profiler.hbase.RowKeyBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.reflect.InvocationTargetException;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_PERIOD;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_PERIOD_UNITS;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_ROW_KEY_BUILDER;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_SALT_DIVISOR;
+
+/**
+ * A Factory class that can create a RowKeyBuilder based on global property values.
+ */
+public class RowKeyBuilderFactory {
+
+  private static final Logger LOG = LoggerFactory.getLogger(RowKeyBuilderFactory.class);
+
+  /**
+   * Create a RowKeyBuilder.
+   * @param global The global properties.
+   * @return A RowKeyBuilder instantiated using the global property values.
+   */
+  public static RowKeyBuilder create(Map global) {
+String rowKeyBuilderClass = PROFILER_ROW_KEY_BUILDER.get(global, String.class);
+LOG.debug("profiler client: {}={}", PROFILER_ROW_KEY_BUILDER, rowKeyBuilderClass);
+
+// instantiate the RowKeyBuilder
+RowKeyBuilder builder = ReflectionUtils.createInstance(rowKeyBuilderClass);
+setSaltDivisor(global, builder);
+setPeriodDuration(global, builder);
--- End diff --

I don't really like how I go about setting the salt divisor and period 
duration on the `RowKeyBuilder`.  There are no methods in the `RowKeyBuilder` 
interface to set these values.  I could add something like 
`RowKeyBuilder.setSaltDivisor`, but I was trying not to pollute that interface 
with variables like salt divisor that may not apply to all RowKeyBuilder 
implementations.
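
For readers unfamiliar with the approach being criticized, the sketch below shows the general shape of setting a property by name only when the target object exposes a matching setter.  It uses plain `java.lang.reflect` rather than the Commons BeanUtils `PropertyUtils` that the factory imports, and `setIfPresent`, `SaltedBuilder`, and `PlainBuilder` are hypothetical names, so treat it as an approximation rather than the code in the pull request.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class PropertySetterSketch {

  /** Hypothetical builder that exposes a bean-style setter. */
  public static class SaltedBuilder {
    private int saltDivisor;
    public int getSaltDivisor() { return saltDivisor; }
    public void setSaltDivisor(int saltDivisor) { this.saltDivisor = saltDivisor; }
  }

  /** Hypothetical builder for which a salt divisor makes no sense. */
  public static class PlainBuilder { }

  /**
   * Calls setXxx(value) on the target if such a one-argument public method exists.
   * Returns false, leaving the target untouched, when the property is not supported.
   */
  static boolean setIfPresent(Object target, String property, Object value) {
    String setter = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
    for (Method method : target.getClass().getMethods()) {
      if (method.getName().equals(setter) && method.getParameterCount() == 1) {
        try {
          method.invoke(target, value);
          return true;
        } catch (IllegalAccessException | InvocationTargetException e) {
          throw new IllegalStateException("unable to set " + property, e);
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    SaltedBuilder salted = new SaltedBuilder();
    System.out.println(setIfPresent(salted, "saltDivisor", 10) + " " + salted.getSaltDivisor());
    System.out.println(setIfPresent(new PlainBuilder(), "saltDivisor", 10));
  }
}
```

The trade-off is the one raised in the comment: the interface stays free of implementation-specific setters, but the property names become strings that the compiler cannot check.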




[jira] [Commented] (METRON-1005) Create Decodable Row Key for Profiler

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086409#comment-16086409
 ] 

ASF GitHub Bot commented on METRON-1005:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r127326600
  
--- Diff: 
metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/stellar/RowKeyBuilderFactory.java
 ---
@@ -0,0 +1,125 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.client.stellar;
+
+import org.apache.commons.beanutils.PropertyUtils;
+import org.apache.commons.lang3.ClassUtils;
+import org.apache.metron.common.utils.ReflectionUtils;
+import org.apache.metron.profiler.hbase.RowKeyBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.reflect.InvocationTargetException;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_PERIOD;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_PERIOD_UNITS;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_ROW_KEY_BUILDER;
+import static org.apache.metron.profiler.client.stellar.ProfilerConfig.PROFILER_SALT_DIVISOR;
+
+/**
+ * A Factory class that can create a RowKeyBuilder based on global property values.
+ */
+public class RowKeyBuilderFactory {
+
+  private static final Logger LOG = LoggerFactory.getLogger(RowKeyBuilderFactory.class);
+
+  /**
+   * Create a RowKeyBuilder.
+   * @param global The global properties.
+   * @return A RowKeyBuilder instantiated using the global property values.
+   */
+  public static RowKeyBuilder create(Map global) {
+    String rowKeyBuilderClass = PROFILER_ROW_KEY_BUILDER.get(global, String.class);
+    LOG.debug("profiler client: {}={}", PROFILER_ROW_KEY_BUILDER, rowKeyBuilderClass);
+
+    // instantiate the RowKeyBuilder
+    RowKeyBuilder builder = ReflectionUtils.createInstance(rowKeyBuilderClass);
+    setSaltDivisor(global, builder);
+    setPeriodDuration(global, builder);
--- End diff --

But I think this actually turned out worse than the alternative of just 
adding `RowKeyBuilder.setSaltDivisor` and polluting the interface.
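
For comparison, a rough sketch of that alternative, with the setters declared on the interface itself. Everything here is hypothetical: `KeyBuilder`, `SaltedKeyBuilder`, `PlainKeyBuilder` and `configure` are illustrative names, and the use of Java 8 `default` no-op methods is just one assumed way to keep implementations that lack these properties from having to override anything. It is not something proposed in the pull request.

```java
import java.util.concurrent.TimeUnit;

class InterfaceSetterSketch {

  /** Hypothetical stand-in for RowKeyBuilder with the setters on the interface. */
  public interface KeyBuilder {
    // No-op defaults: implementations without these properties simply ignore them.
    default void setSaltDivisor(int saltDivisor) { }
    default void setPeriodDuration(long duration, TimeUnit units) { }
  }

  /** Hypothetical implementation that uses both properties. */
  public static class SaltedKeyBuilder implements KeyBuilder {
    private int saltDivisor;
    private long periodMillis;

    @Override
    public void setSaltDivisor(int saltDivisor) {
      this.saltDivisor = saltDivisor;
    }

    @Override
    public void setPeriodDuration(long duration, TimeUnit units) {
      this.periodMillis = units.toMillis(duration);
    }

    public int getSaltDivisor() { return saltDivisor; }
    public long getPeriodMillis() { return periodMillis; }
  }

  /** Hypothetical implementation that needs neither property. */
  public static class PlainKeyBuilder implements KeyBuilder { }

  /** The factory can call the setters directly; no reflection is involved. */
  public static KeyBuilder configure(KeyBuilder builder, int saltDivisor,
                                     long duration, TimeUnit units) {
    builder.setSaltDivisor(saltDivisor);
    builder.setPeriodDuration(duration, units);
    return builder;
  }

  public static void main(String[] args) {
    SaltedKeyBuilder salted = new SaltedKeyBuilder();
    configure(salted, 1000, 15, TimeUnit.MINUTES);
    System.out.println(salted.getSaltDivisor() + " " + salted.getPeriodMillis());
    configure(new PlainKeyBuilder(), 1000, 15, TimeUnit.MINUTES);
  }
}
```

The cost is the one already noted: the interface now carries properties that some implementations will never use. In exchange, the calls are checked at compile time.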

