Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2805#discussion_r197142251
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ParseSyslog5424.java ---
    @@ -0,0 +1,174 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.nifi.processors.standard;
    +
    +import com.github.palindromicity.syslog.NilPolicy;
    +import org.apache.nifi.annotation.behavior.EventDriven;
    +import org.apache.nifi.annotation.behavior.InputRequirement;
    +import org.apache.nifi.annotation.behavior.InputRequirement.Requirement;
    +import org.apache.nifi.annotation.behavior.SideEffectFree;
    +import org.apache.nifi.annotation.behavior.SupportsBatching;
    +import org.apache.nifi.annotation.behavior.WritesAttribute;
    +import org.apache.nifi.annotation.behavior.WritesAttributes;
    +import org.apache.nifi.annotation.documentation.CapabilityDescription;
    +import org.apache.nifi.annotation.documentation.SeeAlso;
    +import org.apache.nifi.annotation.documentation.Tags;
    +import org.apache.nifi.components.AllowableValue;
    +import org.apache.nifi.components.PropertyDescriptor;
    +import org.apache.nifi.expression.ExpressionLanguageScope;
    +import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.AbstractProcessor;
    +import org.apache.nifi.processor.ProcessContext;
    +import org.apache.nifi.processor.ProcessSession;
    +import org.apache.nifi.processor.Relationship;
    +import org.apache.nifi.processor.exception.ProcessException;
    +import org.apache.nifi.processor.io.InputStreamCallback;
    +import org.apache.nifi.processor.util.StandardValidators;
    +import org.apache.nifi.processors.standard.syslog.StrictSyslog5424Parser;
    +import org.apache.nifi.processors.standard.syslog.Syslog5424Event;
    +import org.apache.nifi.stream.io.StreamUtils;
    +
    +import java.io.IOException;
    +import java.io.InputStream;
    +import java.nio.charset.Charset;
    +import java.util.ArrayList;
    +import java.util.HashSet;
    +import java.util.List;
    +import java.util.Set;
    +
    +
    +@EventDriven
    +@SideEffectFree
    +@SupportsBatching
    +@InputRequirement(Requirement.INPUT_REQUIRED)
    +@Tags({"logs", "syslog", "syslog5424", "attributes", "system", "event", "message"})
    +@CapabilityDescription("Attempts to parse the contents of a well formed Syslog message in accordance to RFC5424 " +
    +        "format and adds attributes to the FlowFile for each of the parts of the Syslog message, including Structured Data." +
    +        "Structured Data will be written to attributes as on attribute per item id + parameter "+
    +        "see https://tools.ietf.org/html/rfc5424." +
    +        "Note: ParseSyslog5424 follows the specification more closely than ParseSyslog.  If your Syslog producer " +
    +        "does not follow the spec closely, with regards to using '-' for missing header entries for example, those logs " +
    +        "will fail with this parser, where they would not fail with ParseSyslog.")
    +@WritesAttributes({@WritesAttribute(attribute = "syslog.priority", description = "The priority of the Syslog message."),
    +    @WritesAttribute(attribute = "syslog.severity", description = "The severity of the Syslog message derived from the priority."),
    +    @WritesAttribute(attribute = "syslog.facility", description = "The facility of the Syslog message derived from the priority."),
    +    @WritesAttribute(attribute = "syslog.version", description = "The optional version from the Syslog message."),
    +    @WritesAttribute(attribute = "syslog.timestamp", description = "The timestamp of the Syslog message."),
    +    @WritesAttribute(attribute = "syslog.hostname", description = "The hostname or IP address of the Syslog message."),
    +    @WritesAttribute(attribute = "syslog.appname", description = "The appname of the Syslog message."),
    +    @WritesAttribute(attribute = "syslog.procid", description = "The procid of the Syslog message."),
    +    @WritesAttribute(attribute = "syslog.messageid", description = "The messageid the Syslog message."),
    +    @WritesAttribute(attribute = "syslog.structuredData", description = "Multiple entries per structuredData of the Syslog message."),
    +    @WritesAttribute(attribute = "syslog.sender", description = "The hostname of the Syslog server that sent the message."),
    +    @WritesAttribute(attribute = "syslog.body", description = "The body of the Syslog message, everything after the hostname.")})
    --- End diff --
    
    I guess it depends on what users are most likely to do with syslog
    messages after this processor. Will it be more common that a user wants the
    full message in the flow file content, because they are going to send this
    data to an external system that accepts the full message? Or will it be more
    common that they just want the message body?
    
    I lean towards leaving the flow file content as is, since that is what
    ParseSyslog does, and just letting users control whether or not the message
    body is added as an attribute when parsing. Most cases won't need the message
    body in an attribute; users just want to use one of the other fields to make
    a routing/filtering decision and then send the whole message somewhere.
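    
    A rough sketch of that idea, written without the NiFi API so it stands
    alone. The class name, the method name, and the includeBody flag are
    hypothetical and are not taken from this PR; in the processor the flag would
    presumably come from a boolean property. The flow file content is not
    touched here, only the attribute map changes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR: builds the attributes to write
// after parsing, leaving the flow file content untouched.
public class SyslogAttributeSketch {

    // Attribute name taken from the @WritesAttribute list in the diff.
    static final String BODY_ATTRIBUTE = "syslog.body";

    /**
     * @param parsedFields header fields already extracted by the parser
     * @param body         the message body, may be null
     * @param includeBody  assumed user-controlled flag (hypothetical)
     */
    static Map<String, String> toAttributes(Map<String, String> parsedFields,
                                            String body,
                                            boolean includeBody) {
        // Copy so the caller's map is never modified.
        Map<String, String> attributes = new LinkedHashMap<>(parsedFields);
        if (includeBody && body != null) {
            attributes.put(BODY_ATTRIBUTE, body);
        }
        return attributes;
    }
}
```

    With includeBody set to false, routing on fields such as syslog.hostname
    still works, and the full message remains available in the content for
    whatever system receives it next.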


---
