[
https://issues.apache.org/jira/browse/METRON-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086416#comment-16086416
]
ASF GitHub Bot commented on METRON-1005:
----------------------------------------
Github user nickwallen commented on a diff in the pull request:
https://github.com/apache/metron/pull/622#discussion_r127330773
--- Diff:
metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/DecodableRowKeyBuilder.java
---
@@ -0,0 +1,382 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.hbase;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.metron.profiler.ProfilePeriod;
+
+import java.nio.BufferUnderflowException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Responsible for building the row keys used to store profile data in HBase.
+ *
+ * This builder generates decodable row keys. A decodable row key is one that can be interrogated to extract
+ * the constituent components of that row key. Given a previously generated row key this builder
+ * can extract the profile name, entity name, group name(s), period duration, and period.
+ *
+ * The row key is composed of the following fields.
+ * <ul>
+ * <li>magic number - Helps to validate the row key.</li>
+ * <li>version - The version number of the row key.</li>
+ * <li>salt - A salt that helps prevent hot-spotting.
+ * <li>profile - The name of the profile.
+ * <li>entity - The name of the entity being profiled.
+ * <li>group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
+ * <li>period - The period in which the measurement was taken. The first period starts at the epoch and increases monotonically.
+ * </ul>
+ */
+public class DecodableRowKeyBuilder implements RowKeyBuilder {
+
+ /**
+ * Defines the byte order when encoding and decoding the row keys.
+ *
+ * Making this configurable is likely not necessary and is left as a practice exercise for the reader. :)
+ */
+ private static final ByteOrder byteOrder = ByteOrder.BIG_ENDIAN;
+
+ /**
+ * Defines some level of sane max field length to avoid any shenanigans with oddly encoded row keys.
+ */
+ private static final int MAX_FIELD_LENGTH = 1000;
+
+ /**
+ * A magic number embedded in each row key to help validate the row key and byte ordering when decoding.
+ */
+ protected static final short MAGIC_NUMBER = 77;
+
+ /**
+ * The version number of the row keys supported by this builder.
+ */
+ protected static final byte VERSION = (byte) 1;
--- End diff --
I added a `VERSION` field to the row key in the hope that it will make
future changes to the `RowKeyBuilder` easier. With it, I could parse the start
of a row key and then choose the `RowKeyBuilder` implementation that created
it. That would make row key changes seamless to users.
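A minimal sketch of that idea, not the code in this PR. It assumes the key
starts with the magic number (a `short`) followed by the version (a `byte`),
big-endian, which matches the field order in the Javadoc above but is not
confirmed by the part of the diff shown here. The class and method names
(`RowKeyVersionSketch`, `encode`, `readVersion`, `chooseDecoder`) are
hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class RowKeyVersionSketch {

    // Values taken from the diff; the 3-byte header layout is an assumption.
    static final short MAGIC_NUMBER = 77;
    static final byte VERSION_1 = 1;
    static final int HEADER_LENGTH = Short.BYTES + Byte.BYTES;

    /** Hypothetical helper: prefixes a payload with the magic number and version. */
    static byte[] encode(byte version, byte[] payload) {
        ByteBuffer buffer = ByteBuffer
                .allocate(HEADER_LENGTH + payload.length)
                .order(ByteOrder.BIG_ENDIAN);
        buffer.putShort(MAGIC_NUMBER).put(version).put(payload);
        return buffer.array();
    }

    /** Validates the magic number, then returns the version byte that follows it. */
    static byte readVersion(byte[] rowKey) {
        if (rowKey.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("Row key too short: " + rowKey.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(rowKey).order(ByteOrder.BIG_ENDIAN);
        short magic = buffer.getShort();
        if (magic != MAGIC_NUMBER) {
            throw new IllegalArgumentException("Unexpected magic number: " + magic);
        }
        return buffer.get();
    }

    /**
     * Picks a decoder by version. A real implementation would return a
     * RowKeyBuilder instance; a name is returned here to keep the sketch
     * free of Metron dependencies.
     */
    static String chooseDecoder(byte[] rowKey) {
        byte version = readVersion(rowKey);
        switch (version) {
            case VERSION_1:
                return "DecodableRowKeyBuilder";
            default:
                throw new IllegalArgumentException("Unsupported row key version: " + version);
        }
    }
}
```

Because only the header is read before dispatching, a later row key version
could change everything after the version byte without affecting how older
keys are recognized.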
> Create Decodable Row Key for Profiler
> -------------------------------------
>
> Key: METRON-1005
> URL: https://issues.apache.org/jira/browse/METRON-1005
> Project: Metron
> Issue Type: Improvement
> Affects Versions: 0.3.0
> Reporter: Nick Allen
> Assignee: Nick Allen
> Fix For: Next + 1
>
>
> To be able to answer the types of questions that I outlined in METRON-450, we
> need a row key that is decodable. Right now there is no logic to decode a
> row key, nor is the existing row key easily decodable.
> Once the row keys can be decoded, you could scan all of the row keys in the
> Profiler's HBase table, decode each of them, and extract things like the
> names of all your profiles, the names of entities within a profile, and the
> period duration of a given profile.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)