[ 
https://issues.apache.org/jira/browse/NIFI-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150452#comment-16150452
 ] 

ASF GitHub Bot commented on NIFI-4326:
--------------------------------------

Github user btwood commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2111#discussion_r136566431
  
    --- Diff: nifi-nar-bundles/nifi-email-bundle/nifi-email-processors/src/main/java/org/apache/nifi/processors/email/ExtractEmailHeaders.java ---
    @@ -168,21 +173,40 @@ public void process(final InputStream rawIn) throws IOException {
                                 }
                             }
                         }
    -                    if (Array.getLength(originalMessage.getAllRecipients()) > 0) {
    -                        for (int toCount = 0; toCount < ArrayUtils.getLength(originalMessage.getRecipients(Message.RecipientType.TO)); toCount++) {
    -                            attributes.put(EMAIL_HEADER_TO + "." + toCount, originalMessage.getRecipients(Message.RecipientType.TO)[toCount].toString());
    +
    +                    // Get Non-Strict Recipient Addresses
    +                    InternetAddress[] recipients;
    +                    if (originalMessage.getHeader(Message.RecipientType.TO.toString(), ",") != null) {
    +                        recipients = InternetAddress.parseHeader(originalMessage.getHeader(Message.RecipientType.TO.toString(), ","), false);
    +                        for (int toCount = 0; toCount < ArrayUtils.getLength(recipients); toCount++) {
    +                            attributes.put(EMAIL_HEADER_TO + "." + toCount, recipients[toCount].toString());
                             }
    -                        for (int toCount = 0; toCount < ArrayUtils.getLength(originalMessage.getRecipients(Message.RecipientType.BCC)); toCount++) {
    -                            attributes.put(EMAIL_HEADER_BCC + "." + toCount, originalMessage.getRecipients(Message.RecipientType.BCC)[toCount].toString());
    +                    }
    +                    if (originalMessage.getHeader(Message.RecipientType.BCC.toString(), ",") != null) {
    +                        recipients = InternetAddress.parseHeader(originalMessage.getHeader(Message.RecipientType.BCC.toString(), ","), false);
    +                        for (int toCount = 0; toCount < ArrayUtils.getLength(recipients); toCount++) {
    +                            attributes.put(EMAIL_HEADER_BCC + "." + toCount, recipients[toCount].toString());
                             }
    -                        for (int toCount = 0; toCount < ArrayUtils.getLength(originalMessage.getRecipients(Message.RecipientType.CC)); toCount++) {
    -                            attributes.put(EMAIL_HEADER_CC + "." + toCount, originalMessage.getRecipients(Message.RecipientType.CC)[toCount].toString());
    +                    }
    +                    if (originalMessage.getHeader(Message.RecipientType.CC.toString(), ",") != null) {
    +                        recipients = InternetAddress.parseHeader(originalMessage.getHeader(Message.RecipientType.CC.toString(), ","), false);
    +                        for (int toCount = 0; toCount < ArrayUtils.getLength(recipients); toCount++) {
    +                            attributes.put(EMAIL_HEADER_CC + "." + toCount, recipients[toCount].toString());
                             }
                         }
    -                    // Incredibly enough RFC-2822 specified From as a "mailbox-list" so an array I returned by getFrom
    -                    for (int toCount = 0; toCount < ArrayUtils.getLength(originalMessage.getFrom()); toCount++) {
    -                        attributes.put(EMAIL_HEADER_FROM + "." + toCount, originalMessage.getFrom()[toCount].toString());
    +
    +                    // Get Non-Strict Sender Addresses
    +                    InternetAddress[] sender = null;
    +                    if (originalMessage.getHeader("From",",") != null) {
    +                        sender = (InternetAddress[])ArrayUtils.addAll(sender, InternetAddress.parseHeader(originalMessage.getHeader("From", ","), false));
    +                    }
    +                    if (originalMessage.getHeader("Sender",",") != null) {
    +                        sender = (InternetAddress[])ArrayUtils.addAll(sender, InternetAddress.parseHeader(originalMessage.getHeader("Sender", ","), false));
    --- End diff --
    
    My logic here was that I wanted ALL of the From/Sender addresses. Whether or 
    not From is a mailbox-list, and whether or not a Sender header is present, 
    this collects them all. Note that I'm merging the two headers, so if both are 
    present, the addresses from both will be added.
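
The merge described above can be sketched without JavaMail on the classpath. This is a minimal illustration, not the processor's code: plain strings stand in for `InternetAddress`, the class and method names are hypothetical, and the attribute prefix `email.headers.from` is an assumed value, since the diff does not show what `EMAIL_HEADER_FROM` contains.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the processor logic; plain strings replace InternetAddress.
class SenderMergeSketch {

    // Assumed prefix; the real value of EMAIL_HEADER_FROM is not shown in the diff.
    static final String FROM_PREFIX = "email.headers.from";

    // Concatenates two arrays, treating a null array as "header absent".
    static String[] concatNullSafe(String[] first, String[] second) {
        int firstLen = first == null ? 0 : first.length;
        int secondLen = second == null ? 0 : second.length;
        String[] merged = new String[firstLen + secondLen];
        if (first != null) {
            System.arraycopy(first, 0, merged, 0, firstLen);
        }
        if (second != null) {
            System.arraycopy(second, 0, merged, firstLen, secondLen);
        }
        return merged;
    }

    // Emits one indexed attribute per address: From entries first, then Sender entries.
    static Map<String, String> toAttributes(String[] from, String[] sender) {
        Map<String, String> attributes = new LinkedHashMap<>();
        String[] merged = concatNullSafe(from, sender);
        for (int i = 0; i < merged.length; i++) {
            attributes.put(FROM_PREFIX + "." + i, merged[i]);
        }
        return attributes;
    }
}
```

Unlike a null result, the helper here returns an empty array when both inputs are absent, so the loop needs no extra guard. Whether Sender addresses belong under the From attribute at all is the open question in this thread; the sketch only shows the mechanics.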
    
    Again, I'll have to re-read the RFC to see if this is correct. Based on the 
    implementation of getFrom() I found on grepcode, though, I figured it was.
    
    That may be a bad assumption, though: having read [RFC 
    822](https://www.ietf.org/rfc/rfc822.txt), I know a lot of implementations get 
    email addresses wrong. I've seen plenty of accepted mail with addresses like 
    " "@example.com. I think the breakdown is in the SHOULD/MUST contract, wherein 
    a mail server SHOULD accept that address.
    
    Let me read up on the additional RFCs and get back to you. I can also do 
    some digging in my mail archive to see what postfix has accepted/interpreted in 
    the past. I've seen a lot of email address regexes that break because they 
    don't accept "this is a valid address"@example.com as valid, even though 
    postfix accepted it.
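
To make the regex point concrete, here is a small sketch using a deliberately naive pattern. The pattern is written for illustration only and is not taken from NiFi, postfix, or any particular library; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative only: shows how a simplistic address regex rejects quoted local parts.
class NaiveAddressRegexSketch {

    // A typical "good enough" pattern: unquoted local part, '@', dotted domain.
    static final Pattern NAIVE =
            Pattern.compile("^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$");

    // Returns true when the naive pattern matches the whole address.
    static boolean naiveAccepts(String address) {
        return NAIVE.matcher(address).matches();
    }
}
```

With this pattern, `user@example.com` is accepted, while both `"this is a valid address"@example.com` and `" "@example.com` are rejected, because quotes and spaces are not in the local-part character class.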


> ExtractEmailHeaders.java unhandled Exceptions
> ---------------------------------------------
>
>                 Key: NIFI-4326
>                 URL: https://issues.apache.org/jira/browse/NIFI-4326
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.3.0
>         Environment: jdk 1.8.0_121-b13
>            Reporter: Benjamin Wood
>            Priority: Minor
>             Fix For: 1.4.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> The ExtractEmailHeaders processor throws a NullPointerException if there are 
> no TO, CC, or BCC recipients.
> If there are no recipients, "originalMessage.getAllRecipients()" returns null, 
> not a zero-length array.
> If an address is empty (<> or " "), then getRecipients() will throw an "Empty 
> Address" AddressException.
> It's possible this is only an issue with Oracle Java.
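
The null-return failure in the description can be reproduced in isolation. The sketch below assumes the `Array` in the removed line is `java.lang.reflect.Array`; the class and method names are hypothetical, and no JavaMail types are involved.

```java
import java.lang.reflect.Array;

// Illustrative only: contrasts a reflective length lookup with a null-safe one.
class NullSafeLengthSketch {

    // Mirrors the failing call shape: reflective length lookup on a possibly-null array.
    static boolean reflectiveLengthThrowsNpe(Object maybeArray) {
        try {
            Array.getLength(maybeArray);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Null-safe alternative: a missing array counts as zero elements.
    static int lengthOrZero(Object[] maybeArray) {
        return maybeArray == null ? 0 : maybeArray.length;
    }
}
```

A guard of the second kind, or a null check on the header as in the patch, avoids the exception when a message has no recipients.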



