[ 
https://issues.apache.org/jira/browse/NIFI-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603264#comment-16603264
 ] 

ASF GitHub Bot commented on NIFI-5147:
--------------------------------------

Github user alopresto commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2980#discussion_r214979341
  
    --- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/security/util/crypto/HashService.java
 ---
    @@ -0,0 +1,121 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.security.util.crypto;
    +
    +import java.nio.charset.Charset;
    +import java.nio.charset.StandardCharsets;
    +import org.apache.commons.codec.binary.Hex;
    +import org.apache.commons.codec.digest.DigestUtils;
    +import org.bouncycastle.crypto.digests.Blake2bDigest;
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +/**
    + * This class provides a generic service for cryptographic hashing. It is 
used in
    + * {@link org.apache.nifi.processors.standard.CalculateAttributeHash} and
    + * {@link org.apache.nifi.processors.standard.HashContent}.
    + * <p>
    + * See also:
    + * * {@link HashAlgorithm}
    + */
    +public class HashService {
    +    private static final Logger logger = 
LoggerFactory.getLogger(HashService.class);
    +
    +    /**
    +     * Returns the hex-encoded hash of the specified value.
    +     *
    +     * @param algorithm the hash algorithm to use
    +     * @param value     the value to hash (cannot be {@code null} but can 
be an empty String)
    +     * @param charset   the charset to use
    +     * @return the hash value in hex
    +     */
    +    public static String hashValue(HashAlgorithm algorithm, String value, 
Charset charset) {
    +        byte[] rawHash = hashValueRaw(algorithm, value, charset);
    +        return Hex.encodeHexString(rawHash);
    +    }
    +
    +    /**
    +     * Returns the hex-encoded hash of the specified value. The default 
charset ({@code StandardCharsets.UTF_8}) is used.
    +     *
    +     * @param algorithm the hash algorithm to use
    +     * @param value     the value to hash (cannot be {@code null} but can 
be an empty String)
    +     * @return the hash value in hex
    +     */
    +    public static String hashValue(HashAlgorithm algorithm, String value) {
    +        return hashValue(algorithm, value, StandardCharsets.UTF_8);
    +    }
    +
    +    /**
    +     * Returns the raw {@code byte[]} hash of the specified value.
    +     *
    +     * @param algorithm the hash algorithm to use
    +     * @param value     the value to hash (cannot be {@code null} but can 
be an empty String)
    +     * @param charset   the charset to use
    +     * @return the hash value in bytes
    +     */
    +    public static byte[] hashValueRaw(HashAlgorithm algorithm, String 
value, Charset charset) {
    +        if (value == null) {
    +            throw new IllegalArgumentException("The value cannot be null");
    +        }
    +        return hashValueRaw(algorithm, value.getBytes(charset));
    +    }
    +
    +    /**
    +     * Returns the raw {@code byte[]} hash of the specified value. The 
default charset ({@code StandardCharsets.UTF_8}) is used.
    +     *
    +     * @param algorithm the hash algorithm to use
    +     * @param value     the value to hash (cannot be {@code null} but can 
be an empty String)
    +     * @return the hash value in bytes
    +     */
    +    public static byte[] hashValueRaw(HashAlgorithm algorithm, String 
value) {
    +        return hashValueRaw(algorithm, value, StandardCharsets.UTF_8);
    +    }
    +
    +    /**
    +     * Returns the raw {@code byte[]} hash of the specified value.
    +     *
    +     * @param algorithm the hash algorithm to use
    +     * @param value     the value to hash
    +     * @return the hash value in bytes
    +     */
    +    public static byte[] hashValueRaw(HashAlgorithm algorithm, byte[] 
value) {
    +        if (algorithm == null) {
    +            throw new IllegalArgumentException("The hash algorithm cannot 
be null");
    +        }
    +        if (value == null) {
    +            throw new IllegalArgumentException("The value cannot be null");
    +        }
    +        if (algorithm.isBlake2()) {
    +            return blake2Hash(algorithm, value);
    +        } else {
    +            return traditionalHash(algorithm, value);
    +        }
    +    }
    +
    --- End diff ---
    
    I don't think it makes sense to move the execution logic into the enum. The 
enum is there to capture metadata about the acceptable values, while the logic 
is independent of that selection.
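
    The separation argued for above can be sketched as follows. This is a 
minimal, dependency-free illustration, not the PR code: `DemoAlgorithm` and 
`DemoHashService` are hypothetical names, and the real `HashService` routes 
Blake2 variants to BouncyCastle's `Blake2bDigest` rather than using only JCA 
algorithms.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// The enum captures metadata about the acceptable values only;
// no hashing logic lives here.
enum DemoAlgorithm {
    SHA_256("SHA-256"),
    SHA_512("SHA-512");

    private final String jcaName;

    DemoAlgorithm(String jcaName) {
        this.jcaName = jcaName;
    }

    String getJcaName() {
        return jcaName;
    }
}

public class DemoHashService {
    // The service owns the execution logic, keyed off the enum's metadata.
    public static byte[] hashValueRaw(DemoAlgorithm algorithm, byte[] value) {
        if (algorithm == null) {
            throw new IllegalArgumentException("The hash algorithm cannot be null");
        }
        if (value == null) {
            throw new IllegalArgumentException("The value cannot be null");
        }
        try {
            return MessageDigest.getInstance(algorithm.getJcaName()).digest(value);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] digest = hashValueRaw(DemoAlgorithm.SHA_256,
                "test".getBytes(StandardCharsets.UTF_8));
        // SHA-256 digests are 32 bytes
        System.out.println(digest.length);
    }
}
```

    With this shape, adding a new algorithm means adding an enum constant (and 
any routing metadata such as `isBlake2()`), while the dispatch stays in one 
place in the service.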


> Improve HashAttribute processor
> -------------------------------
>
>                 Key: NIFI-5147
>                 URL: https://issues.apache.org/jira/browse/NIFI-5147
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 1.6.0
>            Reporter: Andy LoPresto
>            Assignee: Otto Fowler
>            Priority: Major
>              Labels: hash, security
>             Fix For: 1.8.0
>
>
> The {{HashAttribute}} processor currently has surprising behavior. Barring 
> familiarity with the processor, a user would expect {{HashAttribute}} to 
> generate a hash value over one or more attributes. Instead, the processor as 
> implemented partitions incoming flowfiles into groups based on regular 
> expressions which match attribute values, and then generates a 
> (non-configurable) MD5 hash over the concatenation of the matching attribute 
> keys and values. 
> In addition:
> * the processor throws an error and routes to failure any incoming flowfile 
> which does not have all attributes specified in the processor
> * MD5 is cryptographically broken and widely deprecated
> * no other hash algorithms are available
> I am unaware of community use of this processor, but I do not want to break 
> backward compatibility. I propose the following steps:
> * Implement a new {{CalculateAttributeHash}} processor (an awkward name, but 
> the existing {{HashAttribute}} processor already occupies the desired name)
> ** This processor will perform the "standard" use case -- identify an 
> attribute, calculate the specified hash over the value, and write it to an 
> output attribute
> ** This processor will have a required property descriptor allowing a 
> dropdown menu of valid hash algorithms
> ** This processor will accept arbitrary dynamic properties identifying the 
> attributes to be hashed as a key, and the resulting attribute name as a value
> ** Example: I want to generate a SHA-512 hash on the attribute {{username}}, 
> and a flowfile enters the processor with {{username}} value {{alopresto}}. I 
> configure {{algorithm}} with {{SHA-512}} and add a dynamic property 
> {{username}} -- {{username_SHA512}}. The resulting flowfile will have 
> attribute {{username_SHA512}} with value 
> {{739b4f6722fb5de20125751c7a1a358b2a7eb8f07e530e4bf18561fbff93234908aa9d2577770c876bca9ede5ba784d5ce6081dbbdfe5ddd446678f223b8d632}}
> * Improve the documentation of this processor to explain the goal/expected 
> use case (?)
> * Link in processor documentation to new processor for standard use cases
> * Downgrade the alert when an incoming flowfile does not contain all 
> expected attributes: I propose changing the log severity from ERROR to INFO 
> while still routing to failure
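
The dynamic-property example above ({{username}} -> {{username_SHA512}}) can be 
sketched without the NiFi framework. This is a hypothetical illustration: 
{{AttributeHashExample}} and {{toHex}} are made-up names, and the PR itself 
hex-encodes via commons-codec's {{Hex.encodeHexString}} rather than a hand-rolled 
helper.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

public class AttributeHashExample {
    // Hex-encode a digest; the PR uses commons-codec's Hex.encodeHexString
    // for this instead.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Simulated flowfile attributes
        Map<String, String> attributes = new HashMap<>();
        attributes.put("username", "alopresto");

        // Dynamic property username -> username_SHA512 with algorithm SHA-512
        byte[] digest = MessageDigest.getInstance("SHA-512")
                .digest(attributes.get("username").getBytes(StandardCharsets.UTF_8));
        attributes.put("username_SHA512", toHex(digest));

        // A SHA-512 digest is 64 bytes, i.e. 128 hex characters
        System.out.println(attributes.get("username_SHA512").length());
    }
}
```

If the sketch is faithful, the resulting {{username_SHA512}} value should match 
the 128-character hex digest quoted in the example above.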



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
