RE: [EXT] Parsing Email Attachments

Peter Wicks (pwicks) Wed, 17 May 2017 22:55:06 -0700

Nick,

Try escaping your \n’s and see if that helps.


(?s)(.*\\n\\n${boundary}\\nContent-Type: text\/plain; charset="UTF-8"\\n\\n)(.*?)(\\n\\n${boundary}.*)

From: Nick Carenza [mailto:[email protected]]
Sent: Thursday, May 18, 2017 11:27 AM
To: [email protected]
Subject: [EXT] Parsing Email Attachments

Hey Nifi-ers,

I haven't had any luck parsing emails after consuming them with POP3.

I compose a simple plain-text message in Gmail, and it comes out like this 
(with many headers removed):

Delivered-To: [email protected]
Return-Path: <[email protected]>
MIME-Version: 1.0
Received: by 0.0.0.0 with HTTP; Tue, 16 May 2017 17:54:04 -0700 (PDT)
From: User <[email protected]>
Date: Tue, 16 May 2017 17:54:04 -0700
Subject: test subject
To: [email protected]
Content-Type: multipart/alternative; boundary="f403045f83d499711a054fadb980"

--f403045f83d499711a054fadb980
Content-Type: text/plain; charset="UTF-8"

test email body

--f403045f83d499711a054fadb980
Content-Type: text/html; charset="UTF-8"

<div dir="ltr">test email body</div>

--f403045f83d499711a054fadb980--

I just want the email body and ExtractEmailAttachments doesn't seem to extract 
the parts between the boundaries like I hoped it would.
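
For comparison outside NiFi, here is a minimal sketch of the same goal (pulling only the text/plain part out of a multipart/alternative message) using Python's standard email package. It is not a NiFi processor and says nothing about how ExtractEmailAttachments behaves; the function name plain_text_body is hypothetical, and the sample is the trimmed message above with the address headers left out.

```python
from email import message_from_string
from typing import Optional

# Trimmed copy of the sample message from this thread (address headers omitted).
RAW = (
    "MIME-Version: 1.0\n"
    "Subject: test subject\n"
    'Content-Type: multipart/alternative; boundary="f403045f83d499711a054fadb980"\n'
    "\n"
    "--f403045f83d499711a054fadb980\n"
    'Content-Type: text/plain; charset="UTF-8"\n'
    "\n"
    "test email body\n"
    "\n"
    "--f403045f83d499711a054fadb980\n"
    'Content-Type: text/html; charset="UTF-8"\n'
    "\n"
    '<div dir="ltr">test email body</div>\n'
    "\n"
    "--f403045f83d499711a054fadb980--\n"
)


def plain_text_body(raw: str) -> Optional[str]:
    """Return the first text/plain part of a message, or None if there is none."""
    msg = message_from_string(raw)
    # walk() visits the container first, then each MIME part in order.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace").strip()
    return None


if __name__ == "__main__":
    print(plain_text_body(RAW))
```

The MIME parser handles the boundary lookup itself, so no separate boundary attribute or regex is needed in this variant.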

So instead I use ExtractEmailHeaders to additionally extract the Content-Type 
header, and then pull just the boundary value out of it with an UpdateAttribute 
processor configured like:

boundary: ${email.headers.content-type:substringAfter('boundary="'):substringBefore('"'):prepend('--')}
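
As a sanity check on what that expression should produce, the same three steps can be sketched in plain Python. This only mirrors the happy path (the header contains boundary="..."); it makes no claim about how NiFi Expression Language treats a header without a quoted boundary, and boundary_delimiter is a hypothetical helper name.

```python
def boundary_delimiter(content_type: str) -> str:
    """Mimic substringAfter('boundary="'), substringBefore('"'), prepend('--')."""
    # Everything after the first occurrence of boundary="
    after = content_type.partition('boundary="')[2]
    # Everything before the next double quote
    value = after.partition('"')[0]
    # MIME delimiter lines are the boundary value prefixed with two hyphens
    return "--" + value


if __name__ == "__main__":
    header = 'multipart/alternative; boundary="f403045f83d499711a054fadb980"'
    print(boundary_delimiter(header))
```

For the sample message this yields the delimiter line that precedes each part, which is the value the regex below relies on.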

Then I wrote a sweet regex for ReplaceText to clean this up:

(?s)(.*\n\n${boundary}\nContent-Type: text\/plain; charset="UTF-8"\n\n)(.*?)(\n\n${boundary}.*)

[Inline image 1]

... but even though this works in regex testers and Sublime Text, it seems to 
have no effect in my flow.
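
The regex above can also be exercised outside NiFi. The sketch below rebuilds it in Python with the boundary substituted in, runs it against the trimmed sample message, and then repeats the test with CRLF line endings. Whether the content in the actual flow uses CRLF is an assumption and has not been verified here; the helper names build_pattern and extract_body and the newline parameter are hypothetical.

```python
import re
from typing import Optional

BOUNDARY = "--f403045f83d499711a054fadb980"

# Trimmed sample from this thread, with LF line endings.
SAMPLE = (
    'Content-Type: multipart/alternative; boundary="f403045f83d499711a054fadb980"\n'
    "\n"
    "--f403045f83d499711a054fadb980\n"
    'Content-Type: text/plain; charset="UTF-8"\n'
    "\n"
    "test email body\n"
    "\n"
    "--f403045f83d499711a054fadb980\n"
    'Content-Type: text/html; charset="UTF-8"\n'
    "\n"
    '<div dir="ltr">test email body</div>\n'
    "\n"
    "--f403045f83d499711a054fadb980--\n"
)


def build_pattern(boundary: str, newline: str = r"\n") -> "re.Pattern":
    """Rebuild the three-group regex; `newline` is the regex used for a line break."""
    b = re.escape(boundary)  # treat the boundary as literal text
    nl = newline
    return re.compile(
        "(?s)(.*" + nl + nl + b + nl
        + 'Content-Type: text/plain; charset="UTF-8"' + nl + nl
        + ")(.*?)(" + nl + nl + b + ".*)"
    )


def extract_body(text: str, boundary: str, newline: str = r"\n") -> Optional[str]:
    """Return capture group 2 (the plain-text body), or None if nothing matches."""
    match = build_pattern(boundary, newline).fullmatch(text)
    return match.group(2) if match else None


if __name__ == "__main__":
    crlf_sample = SAMPLE.replace("\n", "\r\n")
    print(extract_body(SAMPLE, BOUNDARY))                # LF input, original pattern
    print(extract_body(crlf_sample, BOUNDARY))           # CRLF input, original pattern
    print(extract_body(crlf_sample, BOUNDARY, r"\r?\n")) # CRLF input, tolerant pattern
```

If the tolerant variant matches where the original does not, line endings are worth checking alongside the escaping question.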

Anyone have any insight on this?

Thanks,
Nick
