This is an automated email from the ASF dual-hosted git repository.
mmiklavcic pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/metron.git
The following commit(s) were added to refs/heads/master by this push:
new b8e426c METRON-1795: General Purpose Regex Parser (jadeepsinh2 via
mmiklavc) closes apache/metron#1245
b8e426c is described below
commit b8e426c755a5969e24dba50f5d8fa81d1ccb472d
Author: jagdeepsingh2 <[email protected]>
AuthorDate: Mon Dec 17 09:44:50 2018 -0700
METRON-1795: General Purpose Regex Parser (jadeepsinh2 via mmiklavc) closes
apache/metron#1245
---
metron-platform/metron-parsers/README.md | 90 +++++
.../parsers/regex/RegularExpressionsParser.java | 435 +++++++++++++++++++++
.../regex/RegularExpressionsParserTest.java | 275 +++++++++++++
3 files changed, 800 insertions(+)
diff --git a/metron-platform/metron-parsers/README.md
b/metron-platform/metron-parsers/README.md
index cfcf6ed..5aff84a 100644
--- a/metron-platform/metron-parsers/README.md
+++ b/metron-platform/metron-parsers/README.md
@@ -52,6 +52,96 @@ There are two general types types of parsers:
This is using the default value for `wrapEntityName` if that property
is not set.
* `wrapEntityName` : Sets the name to use when wrapping JSON using
`wrapInEntityArray`. The `jsonpQuery` should reference this name.
* A field called `timestamp` is expected to exist and, if it does not,
then current time is inserted.
+ * Regular Expressions Parser
+ * `recordTypeRegex` : A regular expression that uniquely identifies a record type.
+ * `messageHeaderRegex` : A regular expression used to extract fields from the part of a message that is common across all messages.
+ * `convertCamelCaseToUnderScore` : If this property is set to true, the parser automatically converts all camel-case property names to underscore-separated names.
+ For example, the following conversions happen automatically:
+
+ ```
+ ipSrcAddr -> ip_src_addr
+ ipDstAddr -> ip_dst_addr
+ ipSrcPort -> ip_src_port
+ ```
+ Note this property may be necessary, because Java does not support underscores in named group names. If your property naming conventions require underscores in property names, use this property.
+
+ * `fields` : A JSON list of maps containing a record type to regular expression mapping.
+
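The `convertCamelCaseToUnderScore` conversion described above can be sketched in plain Java. This is a minimal illustration only: the class `CamelCaseKeys` and its methods are hypothetical names, and the regex-based rewrite is a standard-library stand-in for the Guava `CaseFormat` call used by the parser in this commit.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Metron: shows the key renaming only.
final class CamelCaseKeys {

    // Insert '_' between a lower-case letter or digit and the upper-case
    // letter that follows it, then lower-case the whole key.
    static String toUnderscore(String key) {
        return key.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase(Locale.ROOT);
    }

    // Return a copy of the parsed fields with every key converted.
    static Map<String, Object> convertKeys(Map<String, Object> fields) {
        Map<String, Object> converted = new LinkedHashMap<>();
        fields.forEach((key, value) -> converted.put(toUnderscore(key), value));
        return converted;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("ipSrcAddr", "22.22.22.22");
        fields.put("ipSrcPort", "55555");
        System.out.println(convertKeys(fields));
    }
}
```

Named groups can therefore stay in camel case inside the regular expressions while the emitted field names follow the underscore convention.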
+ A complete configuration example would look like:
+
+ ```json
+ "convertCamelCaseToUnderScore": true,
+ "recordTypeRegex": "kernel|syslog",
+ "messageHeaderRegex": "(?<syslogPriority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestamp>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<syslogHost>(?<=\\s).*?(?=\\s))",
+ "fields": [
+ {
+ "recordType": "kernel",
+ "regex": ".*(?<eventInfo>(?<=\\]|\\w\\:).*?(?=$))"
+ },
+ {
+ "recordType": "syslog",
+ "regex": ".*(?<processid>(?<=PID\\s=\\s).*?(?=\\sLine)).*(?<filePath>(?<=64\\s)\/([A-Za-z0-9_-]+\/)+(?=\\w))(?<fileName>.*?(?=\")).*(?<eventInfo>(?<=\").*?(?=$))"
+ }
+ ]
+ ```
+ **Note**: `messageHeaderRegex` and `regex` (within `fields`) can also be specified as lists, e.g.
+ ```json
+ "messageHeaderRegex": [
+ "regular expression 1",
+ "regular expression 2"
+ ]
+ ```
+ Where **regular expression 1** and **regular expression 2** are valid regular expressions that may have named
+ groups, which are extracted into fields. The list is evaluated in order until a
+ matching regular expression is found.
+
+ **messageHeaderRegex** is run against every message, so all messages are expected to contain the fields it extracts.
+ In that sense **messageHeaderRegex** captures the HCF (highest common factor) of all messages.
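The in-order evaluation of a `messageHeaderRegex` list can be sketched as follows. This is a simplified illustration under stated assumptions, not the parser's code: `HeaderRegexSketch` and `extractHeader` are hypothetical names, and patterns are compiled on every call here, whereas the parser compiles them once in `configure`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: first matching header regex wins.
final class HeaderRegexSketch {

    // Finds the names of named groups, e.g. "(?<syslogHost>" yields "syslogHost".
    private static final Pattern NAMED_GROUP =
        Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");

    static Map<String, String> extractHeader(List<String> headerRegexes, String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String regex : headerRegexes) {
            Matcher m = Pattern.compile(regex).matcher(message);
            if (m.find()) {                       // partial match is enough for the header
                Matcher names = NAMED_GROUP.matcher(regex);
                while (names.find()) {
                    String name = names.group(1);
                    String value = m.group(name);
                    if (value != null && !value.trim().isEmpty()) {
                        fields.put(name, value.trim());
                    }
                }
                break;                            // remaining expressions are skipped
            }
        }
        return fields;
    }
}
```

Because evaluation stops at the first expression that matches, more specific header expressions should be listed before more general ones.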
+
+ **recordTypeRegex** can be a more advanced regular expression containing named groups. For example
+
+ "recordTypeRegex": "(?<process>(?<=\\s)\\b(kernel|syslog)\\b(?=\\[|:))"
+
+ Here all the named groups (`process` in the example above) are extracted as fields.
+
+ Although named groups in recordType are completely optional, you may still want to extract them there for the following reasons:
+
+ 1. Since the **recordType** regular expression is already being matched and we are already paying the price for that match,
+ we can extract certain fields as a by-product of this match.
+ 2. Most likely the **recordType** field is common across all the messages. Hence extracting it in the recordType (or messageHeaderRegex) regular expression
+ reduces the overall complexity of the regular expressions in the regex field.
+
+ **regex** within a field can also be a list of regular expressions. In this case the regular expressions in the list are tried in order until a match is found. Once a full match is found, the remaining regular expressions are ignored.
+
+ ```json
+ "regex": [ "record type specific regular expression 1",
+ "record type specific regular expression 2"]
+
+ ```
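The two-step lookup described above (identify the record type, then try that record type's expressions until one fully matches) can be sketched as follows. The names `RecordTypeSketch`, `recordType`, and `firstFullMatch` are hypothetical and the sketch omits field extraction; it only shows which expression would be selected.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of record type detection and first-full-match selection.
final class RecordTypeSketch {

    // The record type is the text matched by recordTypeRegex, lower-cased
    // so it can be used as a lookup key for the configured "fields" entries.
    static Optional<String> recordType(Pattern recordTypePattern, String message) {
        Matcher m = recordTypePattern.matcher(message);
        return m.find()
            ? Optional.of(m.group().toLowerCase(Locale.ROOT))
            : Optional.empty();
    }

    // Returns the index of the first pattern that matches the whole message,
    // or -1 if none does. Later patterns are not evaluated after a match.
    static int firstFullMatch(List<Pattern> candidates, String message) {
        for (int i = 0; i < candidates.size(); i++) {
            if (candidates.get(i).matcher(message).matches()) {
                return i;
            }
        }
        return -1;
    }
}
```

Note the difference in matching mode: the record type only needs to be found somewhere in the message, while a record-type-specific expression has to match the entire message, which is why the examples typically start with `.*`.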
+
+ **timestamp**
+
+ Since this parser is a general purpose parser, it populates the timestamp field with the current UTC timestamp. The actual timestamp value can be overridden later using Stellar.
+ For example, in the case of syslog timestamps, one could use the following Stellar construct to override the timestamp value.
+ Let us say you parsed the actual timestamp from the raw log:
+
+ <38>Jun 20 15:01:17 hostName sshd[11672]: Accepted publickey for prod from 55.55.55.55 port 66666 ssh2
+
+ syslogTimestamp="Jun 20 15:01:17"
+
+ Then something like below could be used to override the timestamp.
+
+ ```
+ "timestamp_str": "FORMAT('%s%s%s', YEAR(),' ',syslogTimestamp)",
+ "timestamp":"TO_EPOCH_TIMESTAMP(timestamp_str, 'yyyy MMM dd HH:mm:ss' )"
+ ```
+
+ OR, if you want to factor in the timezone
+
+ ```
+ "timestamp":"TO_EPOCH_TIMESTAMP(timestamp_str, timestamp_format, timezone_name )"
+ ```
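For readers more familiar with Java than Stellar, the same computation (prepend a year, parse with `yyyy MMM dd HH:mm:ss`, convert to epoch milliseconds in a chosen time zone) might look like the sketch below. It uses `java.time` and is only an illustration of the arithmetic; `SyslogTimestampSketch` and `toEpochMillis` are hypothetical names, and this is not how Stellar implements `TO_EPOCH_TIMESTAMP`.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch: syslog timestamps carry no year and no zone,
// so both must be supplied by the caller.
final class SyslogTimestampSketch {

    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy MMM dd HH:mm:ss", Locale.ENGLISH);

    static long toEpochMillis(int year, String syslogTimestamp, ZoneId zone) {
        // Mirrors FORMAT('%s%s%s', YEAR(), ' ', syslogTimestamp)
        LocalDateTime local = LocalDateTime.parse(year + " " + syslogTimestamp, FORMAT);
        // Interpret the wall-clock time in the given zone
        return local.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(toEpochMillis(2018, "Jun 20 15:01:17", ZoneId.of("UTC")));
    }
}
```

The pattern above expects a two-digit day of month, so a space-padded day such as `Jun  5` would need a different format string.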
## Parser Error Routing
diff --git
a/metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/regex/RegularExpressionsParser.java
b/metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/regex/RegularExpressionsParser.java
new file mode 100644
index 0000000..c9f1ec9
--- /dev/null
+++
b/metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/regex/RegularExpressionsParser.java
@@ -0,0 +1,435 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
contributor license
+ * agreements. See the NOTICE file distributed with this work for additional
information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the
License. You may obtain a
+ * copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express
+ * or implied. See the License for the specific language governing permissions
and limitations under
+ * the License.
+ */
+
+package org.apache.metron.parsers.regex;
+
+import com.google.common.base.CaseFormat;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.metron.common.Constants;
+import org.apache.metron.parsers.BasicParser;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.nio.charset.Charset;
+import java.util.*;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+import java.util.regex.PatternSyntaxException;
+
+//@formatter:off
+/**
+ * General purpose class to parse unstructured text message into a json
object. This class parses
+ * the message as per supplied parser config as part of sensor config. Sensor
parser config example:
+ *
+ * <pre>
+ * <code>
+ * "convertCamelCaseToUnderScore": true,
+ * "recordTypeRegex":
"(?<process>(?<=\\s)\\b(kernel|syslog)\\b(?=\\[|:))",
+ * "messageHeaderRegex":
"(?<syslogpriority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestamp>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<syslogHost>(?<=\\s).*?(?=\\s))",
+ * "fields": [
+ * {
+ * "recordType": "kernel",
+ * "regex": ".*(?<eventInfo>(?<=\\]|\\w\\:).*?(?=$))"
+ * },
+ * {
+ * "recordType": "syslog",
+ * "regex":
".*(?<processid>(?<=PID\\s=\\s).*?(?=\\sLine)).*(?<filePath>(?<=64\\s)\/([A-Za-z0-9_-]+\/)+(?=\\w))(?<fileName>.*?(?=\")).*(?<eventInfo>(?<=\").*?(?=$))"
+ * }
+ * ]
+ * </code>
+ * </pre>
+ *
+ * Note: messageHeaderRegex could be specified as lists also e.g.
+ *
+ * <pre>
+ * <code>
+ * "messageHeaderRegex": [
+ * "regular expression 1",
+ * "regular expression 2"
+ * ]
+ * </code>
+ * </pre>
+ *
+ * Where <strong>regular expression 1</strong> and <strong>regular expression 2</strong> are valid regular expressions that may have named
+ * groups, which are extracted into fields. The list is evaluated in order until a
+ * matching regular expression is found.<br>
+ * <br>
+ *
+ * <strong>Configuration fields explanation</strong>
+ *
+ * <pre>
+ * recordTypeRegex : used to specify a regular expression to distinctly
identify a record type.
+ * messageHeaderRegex : used to specify a regular expression to extract
fields from a message part which is common across all the messages.
+ * e.g. rhel logs looks like
+ * <code>
+ * <7>Jun 26 16:18:01 hostName kernel: SELinux: initialized (dev tmpfs, type
tmpfs), uses transition SIDs
+ * </code>
+ * <br>
+ * </pre>
+ *
+ * Here message structure (<7>Jun 26 16:18:01 hostName kernel) is common
across all messages.
+ * Hence messageHeaderRegex could be used to extract fields from this part.
+ *
+ * fields : json list of objects containing recordType and regex. regex could
be a further list e.g.
+ *
+ * <pre>
+ * <code>
+ * "regex": [ "record type specific regular expression 1",
+ * "record type specific regular expression 2"]
+ *
+ * </code>
+ * </pre>
+ *
+ * <strong>Limitation</strong> <br>
+ * Currently the named groups in java regular expressions have a limitation.
Only following
+ * characters could be used to name a named group. A capturing group can also
be assigned a "name",
+ * a named-capturing group, and then be back-referenced later by the "name".
Group names are
+ * composed of the following characters. The first character must be a letter.
+ *
+ * <pre>
+ * <code>
+ * The uppercase letters 'A' through 'Z' ('\u0041' through '\u005a'),
+ * The lowercase letters 'a' through 'z' ('\u0061' through '\u007a'),
+ * The digits '0' through '9' ('\u0030' through '\u0039'),
+ * </code>
+ * </pre>
+ *
+ * This means that an _ (underscore), cannot be used as part of a named group
name. E.g. this is an
+ * invalid regular expression
<code>.*(?<event_info>(?<=\\]|\\w\\:).*?(?=$))</code>
+ *
+ * However, this limitation can be easily overcome by adding a parser
configuration setting.
+ *
+ * <code>
+ * "convertCamelCaseToUnderScore": true,
+ * </code>
+ * If the above property is added to the parserConfig object of the sensor parser configuration, this parser automatically converts all camel-case property names to underscore-separated names.
+ * For example, the following conversions happen automatically:
+ *
+ * <code>
+ * ipSrcAddr -> ip_src_addr
+ * ipDstAddr -> ip_dst_addr
+ * ipSrcPort -> ip_src_port
+ * </code>
+ * etc.
+ */
+//@formatter:on
+public class RegularExpressionsParser extends BasicParser {
+
+ protected static final Logger LOG =
+ LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+ private static final Charset UTF_8 = Charset.forName("UTF-8");
+
+ private List<Map<String, Object>> fields;
+ private Map<String, Object> parserConfig;
+ private final Pattern namedGroupPattern =
Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");
+ Pattern capitalLettersPattern = Pattern.compile("^.*[A-Z]+.*$");
+ private Pattern recordTypePattern;
+ private final Set<String> recordTypePatternNamedGroups = new HashSet<>();
+ private final Map<String, Map<Pattern, Set<String>>> recordTypePatternMap =
+ new LinkedHashMap<>();
+ private final Map<Pattern, Set<String>> messageHeaderPatternsMap = new
LinkedHashMap<>();
+
+ /**
+ * Parses an unstructured text message into a json object based upon the
regular expression
+ * configuration supplied.
+ *
+ * @param rawMessage incoming unstructured raw text.
+ * @return List of json parsed json objects. In this case list will have a
single element only.
+ */
+ @Override
+ public List<JSONObject> parse(byte[] rawMessage) {
+ String originalMessage = null;
+ try {
+ originalMessage = new String(rawMessage, UTF_8).trim();
+ LOG.debug(" raw message. {}", originalMessage);
+ if (originalMessage.isEmpty()) {
+ LOG.warn("Message is empty.");
+ return Arrays.asList(new JSONObject());
+ }
+ } catch (Exception e) {
+ LOG.error("[Metron] Could not read raw message. {}", originalMessage, e);
+ throw new RuntimeException(e.getMessage(), e);
+ }
+
+ JSONObject parsedJson = new JSONObject();
+ if (messageHeaderPatternsMap.size() > 0) {
+ parsedJson.putAll(extractHeaderFields(originalMessage));
+ }
+ parsedJson.putAll(parse(originalMessage));
+ parsedJson.put(Constants.Fields.ORIGINAL.getName(), originalMessage);
+ /**
+ * Populate the output json with default timestamp.
+ */
+ parsedJson.put(Constants.Fields.TIMESTAMP.getName(),
System.currentTimeMillis());
+ applyFieldTransformations(parsedJson);
+ return Arrays.asList(parsedJson);
+ }
+
+ private void applyFieldTransformations(JSONObject parsedJson) {
+ if
(getParserConfig().get(ParserConfigConstants.CONVERT_CAMELCASE_TO_UNDERSCORE.getName())
+ != null && (Boolean) getParserConfig()
+
.get(ParserConfigConstants.CONVERT_CAMELCASE_TO_UNDERSCORE.getName())) {
+ convertCamelCaseToUnderScore(parsedJson);
+ }
+
+ }
+
+ // @formatter:off
+ /**
+ * This method is called during the parser initialization. It parses the
parser
+ * configuration and configures the parser accordingly. It then initializes
+ * instance variables.
+ *
+ * @param parserConfig ParserConfig(Map<String, Object>) supplied to the
sensor.
+ * @see
org.apache.metron.parsers.interfaces.Configurable#configure(java.util.Map)<br>
+ * <br>
+ */
+ // @formatter:on
+ @Override
+ public void configure(Map<String, Object> parserConfig) {
+ setParserConfig(parserConfig);
+ setFields((List<Map<String, Object>>) getParserConfig()
+ .get(ParserConfigConstants.FIELDS.getName()));
+ String recordTypeRegex =
+ (String)
getParserConfig().get(ParserConfigConstants.RECORD_TYPE_REGEX.getName());
+
+ if (StringUtils.isBlank(recordTypeRegex)) {
+ LOG.error("Invalid config :recordTypeRegex is missing in
parserConfig");
+ throw new IllegalStateException(
+ "Invalid config :recordTypeRegex is missing in parserConfig");
+ }
+
+ setRecordTypePattern(recordTypeRegex);
+ recordTypePatternNamedGroups.addAll(getNamedGroups(recordTypeRegex));
+ List<Map<String, Object>> fields =
+ (List<Map<String, Object>>)
getParserConfig().get(ParserConfigConstants.FIELDS.getName());
+
+ try {
+ configureRecordTypePatterns(fields);
+ configureMessageHeaderPattern();
+ } catch (PatternSyntaxException e) {
+ LOG.error("Invalid config : {} ", e.getMessage());
+ throw new IllegalStateException("Invalid config : " +
e.getMessage());
+ }
+
+ validateConfig();
+ }
+
+ private void configureMessageHeaderPattern() {
+ if
(getParserConfig().get(ParserConfigConstants.MESSAGE_HEADER.getName()) != null)
{
+ if (getParserConfig()
+ .get(ParserConfigConstants.MESSAGE_HEADER.getName())
instanceof List) {
+ List<String> messageHeaderPatternList = (List<String>)
getParserConfig()
+ .get(ParserConfigConstants.MESSAGE_HEADER.getName());
+ for (String messageHeaderPatternStr :
messageHeaderPatternList) {
+
messageHeaderPatternsMap.put(Pattern.compile(messageHeaderPatternStr),
+ getNamedGroups(messageHeaderPatternStr));
+ }
+ } else if (getParserConfig()
+ .get(ParserConfigConstants.MESSAGE_HEADER.getName())
instanceof String) {
+ String messageHeaderPatternStr =
+ (String)
getParserConfig().get(ParserConfigConstants.MESSAGE_HEADER.getName());
+ if (StringUtils.isNotBlank(messageHeaderPatternStr)) {
+
messageHeaderPatternsMap.put(Pattern.compile(messageHeaderPatternStr),
+ getNamedGroups(messageHeaderPatternStr));
+ }
+ }
+ }
+ }
+
+ private void configureRecordTypePatterns(List<Map<String, Object>> fields)
{
+
+ for (Map<String, Object> field : fields) {
+ if (field.get(ParserConfigConstants.RECORD_TYPE.getName()) != null
+ && field.get(ParserConfigConstants.REGEX.getName()) != null) {
+ String recordType =
+ ((String)
field.get(ParserConfigConstants.RECORD_TYPE.getName())).toLowerCase();
+ recordTypePatternMap.put(recordType, new LinkedHashMap<>());
+ if (field.get(ParserConfigConstants.REGEX.getName())
instanceof List) {
+ List<String> regexList =
+ (List<String>)
field.get(ParserConfigConstants.REGEX.getName());
+ regexList.forEach(s -> {
+ recordTypePatternMap.get(recordType)
+ .put(Pattern.compile(s), getNamedGroups(s));
+ });
+ } else if (field.get(ParserConfigConstants.REGEX.getName())
instanceof String) {
+ recordTypePatternMap.get(recordType).put(
+ Pattern.compile((String)
field.get(ParserConfigConstants.REGEX.getName())),
+ getNamedGroups((String)
field.get(ParserConfigConstants.REGEX.getName())));
+ }
+ }
+ }
+ }
+
+ private void setRecordTypePattern(String recordTypeRegex) {
+ if (recordTypeRegex != null) {
+ recordTypePattern = Pattern.compile(recordTypeRegex);
+ }
+ }
+
+ private JSONObject parse(String originalMessage) {
+ JSONObject parsedJson = new JSONObject();
+ Optional<String> recordIdentifier = getField(recordTypePattern,
originalMessage);
+ if (recordIdentifier.isPresent()) {
+ extractNamedGroups(parsedJson, recordIdentifier.get(),
originalMessage);
+ }
+ /*
+ * Extract fields(named groups) from record type regular expression
+ */
+ Matcher matcher = recordTypePattern.matcher(originalMessage);
+ if (matcher.find()) {
+ for (String namedGroup : recordTypePatternNamedGroups) {
+ if (matcher.group(namedGroup) != null) {
+ parsedJson.put(namedGroup,
matcher.group(namedGroup).trim());
+ }
+ }
+ }
+ return parsedJson;
+ }
+
+ private void extractNamedGroups(Map<String, Object> json, String
recordType,
+ String originalMessage) {
+ Map<Pattern, Set<String>> patternMap =
recordTypePatternMap.get(recordType.toLowerCase());
+ if (patternMap != null) {
+ for (Map.Entry<Pattern, Set<String>> entry :
patternMap.entrySet()) {
+ Pattern pattern = entry.getKey();
+ Set<String> namedGroups = entry.getValue();
+ if (pattern != null && namedGroups != null &&
namedGroups.size() > 0) {
+ Matcher m = pattern.matcher(originalMessage);
+ if (m.matches()) {
+ LOG.debug("RecordType : {} Trying regex : {} for
message : {} ", recordType,
+ pattern.toString(), originalMessage);
+ for (String namedGroup : namedGroups) {
+ if (m.group(namedGroup) != null) {
+ json.put(namedGroup,
m.group(namedGroup).trim());
+ }
+ }
+ break;
+ }
+ }
+ }
+ } else {
+ LOG.warn("No pattern found for record type : {}", recordType);
+ }
+ }
+
+ public Optional<String> getField(Pattern pattern, String originalMessage) {
+ Matcher matcher = pattern.matcher(originalMessage);
+ if (matcher.find()) {
+ return Optional.of(matcher.group());
+ }
+ return Optional.empty();
+ }
+
+ private Set<String> getNamedGroups(String regex) {
+ Set<String> namedGroups = new TreeSet<>();
+ Matcher matcher = namedGroupPattern.matcher(regex);
+ while (matcher.find()) {
+ namedGroups.add(matcher.group(1));
+ }
+ return namedGroups;
+ }
+
+ private Map<String, Object> extractHeaderFields(String originalMessage) {
+ Map<String, Object> messageHeaderJson = new JSONObject();
+ for (Map.Entry<Pattern, Set<String>> syslogPatternEntry :
messageHeaderPatternsMap
+ .entrySet()) {
+ Matcher m = syslogPatternEntry.getKey().matcher(originalMessage);
+ if (m.find()) {
+ for (String namedGroup : syslogPatternEntry.getValue()) {
+ if (StringUtils.isNotBlank(m.group(namedGroup))) {
+ messageHeaderJson.put(namedGroup,
m.group(namedGroup).trim());
+ }
+ }
+ break;
+ }
+ }
+ return messageHeaderJson;
+ }
+
+ @Override
+ public void init() {
+ LOG.info("RegularExpressions parser initialised.");
+ }
+
+ public void validateConfig() {
+ if (getFields() == null) {
+ LOG.error("Invalid config : fields is missing in parserConfig");
+ throw new IllegalStateException("Invalid config :fields is missing
in parserConfig");
+ }
+ }
+
+ private void convertCamelCaseToUnderScore(Map<String, Object> json) {
+ Map<String, String> oldKeyNewKeyMap = new HashMap<>();
+ for (Map.Entry<String, Object> entry : json.entrySet()) {
+ if (capitalLettersPattern.matcher(entry.getKey()).matches()) {
+ oldKeyNewKeyMap.put(entry.getKey(),
+ CaseFormat.UPPER_CAMEL.to(CaseFormat.LOWER_UNDERSCORE,
entry.getKey()));
+ }
+ }
+ oldKeyNewKeyMap.forEach((oldKey, newKey) -> json.put(newKey,
json.remove(oldKey)));
+ }
+
+ public List<Map<String, Object>> getFields() {
+ return fields;
+ }
+
+ public void setFields(List<Map<String, Object>> fields) {
+ this.fields = fields;
+ }
+
+ public Map<String, Object> getParserConfig() {
+ return parserConfig;
+ }
+
+ public void setParserConfig(Map<String, Object> parserConfig) {
+ this.parserConfig = parserConfig;
+ }
+
+ enum ParserConfigConstants {
+ //@formatter:off
+ RECORD_TYPE("recordType"),
+ RECORD_TYPE_REGEX("recordTypeRegex"),
+ REGEX("regex"),
+ FIELDS("fields"),
+ MESSAGE_HEADER("messageHeaderRegex"),
+ CONVERT_CAMELCASE_TO_UNDERSCORE("convertCamelCaseToUnderScore");
+ //@formatter:on
+ private final String name;
+ private static Map<String, ParserConfigConstants> nameToField;
+
+ ParserConfigConstants(String name) {
+ this.name = name;
+ }
+
+ public String getName() {
+ return name;
+ }
+
+ static {
+ nameToField = new HashMap<>();
+ for (ParserConfigConstants f : ParserConfigConstants.values()) {
+ nameToField.put(f.getName(), f);
+ }
+ }
+
+ public static ParserConfigConstants fromString(String fieldName) {
+ return nameToField.get(fieldName);
+ }
+ }
+}
diff --git
a/metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/regex/RegularExpressionsParserTest.java
b/metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/regex/RegularExpressionsParserTest.java
new file mode 100644
index 0000000..5097ec0
--- /dev/null
+++
b/metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/regex/RegularExpressionsParserTest.java
@@ -0,0 +1,275 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
contributor license
+ * agreements. See the NOTICE file distributed with this work for additional
information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the
License. You may obtain a
+ * copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express
+ * or implied. See the License for the specific language governing permissions
and limitations under
+ * the License.
+ */
+package org.apache.metron.parsers.regex;
+
+import org.adrianwalker.multilinestring.Multiline;
+import org.json.simple.JSONObject;
+import org.json.simple.parser.JSONParser;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class RegularExpressionsParserTest {
+
+ private RegularExpressionsParser regularExpressionsParser;
+
+ @Before
+ public void setUp() throws Exception {
+ regularExpressionsParser = new RegularExpressionsParser();
+ }
+
+ //@formatter:off
+ /**
+ {
+ "convertCamelCaseToUnderScore": true,
+ "messageHeaderRegex":
"(?<syslogpriority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestampDeviceOriginal>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<deviceName>(?<=\\s).*?(?=\\s))",
+ "recordTypeRegex":
"(?<dstProcessName>(?<=\\s)\\b(kesl|sshd|run-parts|kernel|vsftpd|ftpd|su)\\b(?=\\[|:))",
+ "fields": [
+ {
+ "recordType": "kesl",
+ "regex": ".*(?<eventInfo>(?<=\\:).*?(?=$))"
+ },
+ {
+ "recordType": "run-parts",
+ "regex": ".*(?<eventInfo>(?<=\\sparts).*?(?=$))"
+ },
+ {
+ "recordType": "sshd",
+ "regex": [
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=\\sfor)).*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom)).*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport)).*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s)).*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*(?=:\\s)).*?(?<encryptionAlgorithm>(?<=:\\s).+?(?=\\s)).*(?<correlationId>(?<=\\s).+?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=\\sfor)).*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom)).*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport)).*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s)).*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<appProtocol>(?<=Protocol:).*?(?=;)).*?(?<sshClient>(?<=Client:).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<appProtocol>(?<=\\]:).*?(?=:)).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<encryptionAlgorithm>(?<=Enc:\\s).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<encryptionAlgorithm>(?<=Enc:\\s).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=for)).*?(?<dstUserId>(?<=for).*?(?=from)).*?(?<ipSrcAddr>(?<=from).*?(?=port)).*?(?<ipSrcPort>(?<=port).*?(?=\\s)).*?(?<appProtocol>(?<=\\s).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\]))]:\\s.*?(?<eventInfo>subsystem.*?(?=by\\suser)).*?(?<srcUserId>(?<=user).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<action>(?<=Received).*?(?=from)).*?(?<ipSrcAddr>(?<=from).*?(?=:)).*?(?<eventInfo>(?<=11:).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s)Server\\slistening(?=\\s)).*?(?<ipSrcAddr>(?<=\\son\\s).*?(?=port)).*?(?<ipSrcPort>(?<=port\\s)\\d{1,6}(?=\\.)).*$",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s)Invalid
user(?=\\s)).*?(?<dstUserId>(?<=\\s).*?(?=from)).*?(?<ipSrcAddr>(?<=from\\s).*(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<subProcess>(?<=]:\\s).*\\)(?=:)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<logname>(?<=logname=).*?(?=\\s)).*(?<dstUserId>(?<=uid=).*?(?=\\s)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=ruser=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<userId>(?<=user=).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<logname>(?<=logname=).*?(?=\\s)).*(?<dstUserId>(?<=uid=).*?(?=\\s)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=ruser=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<userId>(?<=user=).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=for)).*?(?<dstUserId>(?<=\\sfor).*?(?=\\[)).*?(?<subProcess>(?<=\\[).*?(?=\\])).*$",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:\\s)Excess
permission or bad ownership on
file(?=\\s\\/)).*?(?<filePath>(?<=\\s).*(?=\\/)).*?(?<fileName>(?<=\\/).*(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=;)).*$",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=\\d)).*$",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=$))"
+ ]
+ },
+ {
+ "recordType": "kernel",
+ "regex": [
+
".*(?<connectedDeviceName>(?<=\\:\\susb).*?(?=\\:)).*?(?<eventInfo>(?<=\\:).*?(?=$))",
+
".*(?<subProcess>(?<=\\:\\s).*?(?=\\:)).*?(?<eventInfo>(?<=\\:).*?(?=$))"
+ ]
+ },
+ {
+ "recordType": "vsftpd",
+ "regex":
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<subProcess>(?<=]:\\s).*\\)(?=:)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=user=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<dstUserId>(?<=user=).*?(?=$))"
+ },
+ {
+ "recordType": "ftpd",
+ "regex": [
+
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=FROM)).*(?<srcHost>(?<=\\s).*?(?=\\s)).*(?<ipSrcAddr>(?<=\\s).*?(?=,)).*(?<dstUserId>(?<=,).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=from)).*(?<srcHost>(?<=\\s).*?(?=\\s)).*(?<ipSrcAddr>(?<=\\s).*?(?=,)).*(?<dstUserId>(?<=,).*?(?=$))"
+ ]
+ },
+ {
+ "recordType": "su",
+ "regex": [
+
".*(?<eventInfo>(?<=:\\s).*(?=for)).*(?<dstUserId>(?<=user=).*?(?=to)).*(?<responseCode>(?<=to).*?(?=$))"
+ ]
+ }
+ ]
+ }
+ */
+ @Multiline
+ public static String parserConfig1;
+ //@formatter:on
+
+
+ @Test
+ public void testSSHDParse() throws Exception {
+ String message =
+ "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey
for prod from 22.22.22.22 port 55555 ssh2";
+
+ JSONObject parserConfig = (JSONObject) new
JSONParser().parse(parserConfig1);
+ regularExpressionsParser.configure(parserConfig);
+ JSONObject parsed = parse(message);
+ // Expected
+ Map<String, Object> expectedJson = new HashMap<>();
+ Assert.assertEquals(parsed.get("device_name"), "deviceName");
+ Assert.assertEquals(parsed.get("dst_process_name"), "sshd");
+ Assert.assertEquals(parsed.get("dst_process_id"), "11672");
+ Assert.assertEquals(parsed.get("dst_user_id"), "prod");
+ Assert.assertEquals(parsed.get("ip_src_addr"), "22.22.22.22");
+ Assert.assertEquals(parsed.get("ip_src_port"), "55555");
+ Assert.assertEquals(parsed.get("app_protocol"), "ssh2");
+ Assert.assertEquals(parsed.get("original_string"),
+ "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey
for prod from 22.22.22.22 port 55555 ssh2");
+ Assert.assertTrue(parsed.containsKey("timestamp"));
+
+ }
+
+ //@formatter:off
+ /**
+ {
+ "convertCamelCaseToUnderScore": true,
+ "recordTypeRegex":
"(?<dstProcessName>(?<=\\s)\\b(kesl|sshd|run-parts|kernel|vsftpd|ftpd|su)\\b(?=\\[|:))",
+ "fields": [
+ {
+ "recordType": "kesl",
+ "regex": ".*(?<eventInfo>(?<=\\:).*?(?=$))"
+ },
+ {
+ "recordType": "run-parts",
+ "regex": ".*(?<eventInfo>(?<=\\sparts).*?(?=$))"
+ },
+ {
+ "recordType": "sshd",
+ "regex": [
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=\\sfor)).*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom)).*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport)).*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s)).*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*(?=:\\s)).*?(?<encryptionAlgorithm>(?<=:\\s).+?(?=\\s)).*(?<correlationId>(?<=\\s).+?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=\\sfor)).*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom)).*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport)).*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s)).*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<appProtocol>(?<=Protocol:).*?(?=;)).*?(?<sshClient>(?<=Client:).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<appProtocol>(?<=\\]:).*?(?=:)).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<encryptionAlgorithm>(?<=Enc:\\s).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<ipDstAddr>(?<=Remote:).*?(?=\\-)).*?(?<ipDstPort>(?<=\\-).*?(?=;)).*?(?<encryptionAlgorithm>(?<=Enc:\\s).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=for)).*?(?<dstUserId>(?<=for).*?(?=from)).*?(?<ipSrcAddr>(?<=from).*?(?=port)).*?(?<ipSrcPort>(?<=port).*?(?=\\s)).*?(?<appProtocol>(?<=\\s).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\]))]:\\s.*?(?<eventInfo>subsystem.*?(?=by\\suser)).*?(?<srcUserId>(?<=user).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<action>(?<=Received).*?(?=from)).*?(?<ipSrcAddr>(?<=from).*?(?=:)).*?(?<eventInfo>(?<=11:).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s)Server\\slistening(?=\\s)).*?(?<ipSrcAddr>(?<=\\son\\s).*?(?=port)).*?(?<ipSrcPort>(?<=port\\s)\\d{1,6}(?=\\.)).*$",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s)Invalid user(?=\\s)).*?(?<dstUserId>(?<=\\s).*?(?=from)).*?(?<ipSrcAddr>(?<=from\\s).*(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<subProcess>(?<=]:\\s).*\\)(?=:)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<logname>(?<=logname=).*?(?=\\s)).*(?<dstUserId>(?<=uid=).*?(?=\\s)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=ruser=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<userId>(?<=user=).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<logname>(?<=logname=).*?(?=\\s)).*(?<dstUserId>(?<=uid=).*?(?=\\s)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=ruser=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<userId>(?<=user=).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=\\]:\\s).*?(?=for)).*?(?<dstUserId>(?<=\\sfor).*?(?=\\[)).*?(?<subProcess>(?<=\\[).*?(?=\\])).*$",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:\\s)Excess permission or bad ownership on file(?=\\s\\/)).*?(?<filePath>(?<=\\s).*(?=\\/)).*?(?<fileName>(?<=\\/).*(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=;)).*$",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=\\d)).*$",
+
".*(?<dstProcessId>(?<=\\[).*?(?=\\])).*?(?<eventInfo>(?<=:).*?(?=$))"
+ ]
+ },
+ {
+ "recordType": "kernel",
+ "regex": [
+
".*(?<connectedDeviceName>(?<=\\:\\susb).*?(?=\\:)).*?(?<eventInfo>(?<=\\:).*?(?=$))",
+
".*(?<subProcess>(?<=\\:\\s).*?(?=\\:)).*?(?<eventInfo>(?<=\\:).*?(?=$))"
+ ]
+ },
+ {
+ "recordType": "vsftpd",
+ "regex":
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<subProcess>(?<=]:\\s).*\\)(?=:)).*(?<eventInfo>(?<=:\\s).*(?=;)).*(?<effectiveUserId>(?<=euid=).*?(?=\\s)).*(?<sessionName>(?<=tty=).*?(?=\\s)).*(?<srcUserId>(?<=user=).*?(?=\\s)).*(?<ipSrcAddr>(?<=rhost=).*?(?=\\s)).*(?<dstUserId>(?<=user=).*?(?=$))"
+ },
+ {
+ "recordType": "ftpd",
+ "regex": [
+
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=FROM)).*(?<srcHost>(?<=\\s).*?(?=\\s)).*(?<ipSrcAddr>(?<=\\s).*?(?=,)).*(?<dstUserId>(?<=,).*?(?=$))",
+
".*(?<dstProcessId>(?<=\\[).*?(?=]:\\s)).*(?<eventInfo>(?<=:\\s).*(?=from)).*(?<srcHost>(?<=\\s).*?(?=\\s)).*(?<ipSrcAddr>(?<=\\s).*?(?=,)).*(?<dstUserId>(?<=,).*?(?=$))"
+ ]
+ },
+ {
+ "recordType": "su",
+ "regex": [
+
".*(?<eventInfo>(?<=:\\s).*(?=for)).*(?<dstUserId>(?<=user=).*?(?=to)).*(?<responseCode>(?<=to).*?(?=$))"
+ ]
+ }
+ ]
+ }
+ */
+ @Multiline
+ public static String parserConfigNoMessageHeader;
+ //@formatter:on
+
+ @Test
+ public void testNoMessageHeaderRegex() throws Exception {
+ String message =
+ "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2";
+ JSONObject parserConfig = (JSONObject) new JSONParser().parse(parserConfigNoMessageHeader);
+ regularExpressionsParser.configure(parserConfig);
+ JSONObject parsed = parse(message);
+ // Expected
+
+ // JUnit's assertEquals takes (expected, actual); keeping that order makes failure messages accurate.
+ Assert.assertEquals("sshd", parsed.get("dst_process_name"));
+ Assert.assertEquals("11672", parsed.get("dst_process_id"));
+ Assert.assertEquals("prod", parsed.get("dst_user_id"));
+ Assert.assertEquals("22.22.22.22", parsed.get("ip_src_addr"));
+ Assert.assertEquals("55555", parsed.get("ip_src_port"));
+ Assert.assertEquals("ssh2", parsed.get("app_protocol"));
+ Assert.assertEquals(
+ "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2",
+ parsed.get("original_string"));
+ Assert.assertTrue(parsed.containsKey("timestamp"));
+
+ }
+
+ //@formatter:off
+ /**
+ {
+ "messageHeaderRegex":
"(?<syslog_priority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestampDeviceOriginal>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<deviceName>(?<=\\s).*?(?=\\s))",
+ "recordTypeRegex":
"(?<dstProcessName>(?<=\\s)\\b(tch-replicant|audispd|syslog)\\b(?=\\[|:))",
+ "fields": [
+ {
+ "recordType": "syslog",
+ "regex":
".*(?<dstProcessId>(?<=PID\\s=\\s).*?(?=\\sLine)).*"
+ }
+ ]
+ }
+ */
+ @Multiline
+ public static String invalidParserConfig;
+ //@formatter:on
+
+ @Test(expected = IllegalStateException.class)
+ public void testMalformedRegex() throws Exception {
+ String message =
+ "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2";
+ JSONObject parserConfig = (JSONObject) new JSONParser().parse(invalidParserConfig);
+ regularExpressionsParser.configure(parserConfig);
+ parse(message);
+ }
+
+ //@formatter:off
+ /**
+ {
+ "messageHeaderRegex":
"(?<syslog_priority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestampDeviceOriginal>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<deviceName>(?<=\\s).*?(?=\\s))",
+ "fields": [
+ {
+ "recordType": "syslog",
+ "regex":
".*(?<dstProcessId>(?<=PID\\s=\\s).*?(?=\\sLine)).*"
+ }
+ ]
+ }
+ */
+ @Multiline
+ public static String noRecordTypeParserConfig;
+ //@formatter:on
+
+ @Test(expected = IllegalStateException.class)
+ public void testNoRecordTypeRegex() throws Exception {
+ String message =
+ "<38>Jun 20 15:01:17 deviceName sshd[11672]: Accepted publickey for prod from 22.22.22.22 port 55555 ssh2";
+ JSONObject parserConfig = (JSONObject) new JSONParser().parse(noRecordTypeParserConfig);
+ regularExpressionsParser.configure(parserConfig);
+ parse(message);
+ }
+
+ private JSONObject parse(String message) throws Exception {
+ List<JSONObject> result = regularExpressionsParser.parse(message.getBytes());
+ if (!result.isEmpty()) {
+ return result.get(0);
+ }
+ throw new Exception("Could not parse : " + message);
+ }
+}
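
For readers following the sshd patterns in the test configs above, here is a minimal standalone sketch of how a lookaround-delimited named-group regex pulls fields out of the sample message, and how camel-case group names map to the underscore keys the assertions check. It is not taken from RegularExpressionsParser: the class name NamedGroupSketch, the helper names, and the shortened regex (the eventInfo group is omitted) are assumptions made purely for illustration, and the real parser may do this differently.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; hypothetical class, not part of this commit.
class NamedGroupSketch {

  // Shortened variant of one sshd field regex from the test config.
  // Each named group is bounded by a lookbehind and a lookahead, so the
  // delimiters themselves are not captured.
  static final Pattern SSHD = Pattern.compile(
      ".*(?<dstProcessId>(?<=\\[).*?(?=\\]))"
          + ".*?(?<dstUserId>(?<=\\sfor\\s).*?(?=\\sfrom))"
          + ".*?(?<ipSrcAddr>(?<=\\sfrom\\s).*?(?=\\sport))"
          + ".*?(?<ipSrcPort>(?<=\\sport\\s).*?(?=\\s))"
          + ".*?(?<appProtocol>(?<=port\\s\\d{1,5}\\s).*?(?=$))");

  // Group names are listed explicitly so the sketch does not depend on
  // newer JDK APIs for enumerating named groups.
  static final String[] GROUPS =
      {"dstProcessId", "dstUserId", "ipSrcAddr", "ipSrcPort", "appProtocol"};

  // Assumed conversion rule: insert '_' before each upper-case letter that
  // follows a lower-case letter or digit, then lower-case the result.
  static String toUnderscore(String camel) {
    return camel.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase(Locale.ROOT);
  }

  // Returns the extracted fields keyed by underscore names, or an empty
  // map when the message does not match.
  static Map<String, String> extract(String message) {
    Map<String, String> fields = new LinkedHashMap<>();
    Matcher m = SSHD.matcher(message);
    if (m.find()) {
      for (String group : GROUPS) {
        fields.put(toUnderscore(group), m.group(group));
      }
    }
    return fields;
  }

  public static void main(String[] args) {
    String message = "<38>Jun 20 15:01:17 deviceName sshd[11672]: "
        + "Accepted publickey for prod from 22.22.22.22 port 55555 ssh2";
    extract(message).forEach((k, v) -> System.out.println(k + " = " + v));
  }
}
```

Running main on the sample syslog line used by the tests prints one key/value pair per named group. Note that Java group names may contain only ASCII letters and digits, which is why the configs name groups in camel case and rely on a later conversion to underscore-separated keys.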