Yes, exactly. The challenge would be what to do with the mime boundary header (include with content, or dump and rebuild with a merge).
Simon

> On 19 May 2016, at 12:45, Andre <[email protected]> wrote:
>
> Simon,
>
> Are you suggesting attributes similar to UnpackContent?
>
> If yes, seems like a great approach.
>
> Cheers
> On 19 May 2016 14:50, "Simon Elliston Ball" <[email protected]> wrote:
>
> Fantastic idea!
>
> Would SplitEmail not make sense to divide by the mime boundary? If you add
> fragment indices in the way other Split processors do, it would be easy to
> recombine an email after processing splits. To be honest, I'm not sure what
> the use case for doing so would be, but it feels consistent with the Split,
> Process, Merge pattern you see elsewhere in NiFi.
>
> Simon
>
>> On 19 May 2016, at 03:11, Joe Witt <[email protected]> wrote:
>>
>> Andre
>>
>> I like the idea. I'd suggest having 'ListenSMTP' go ahead and create
>> a good set of FlowFile attributes for things like
>> to/from/cc/subject/number of attachments/time/etc... that make sense
>> for a given e-mail. The body of the flowfile would be the entire
>> message which i believe would include the attachments themselves which
>> is fair game. If you did need/want to split out the attachments in
>> your flow then I'd say the 'ParseEmail' idea is good but perhaps call
>> it 'SplitEmail' or 'ExtractEmailAttachment' or something like that.
>>
>> Thanks
>> Joe
>>
>>> On Wed, May 18, 2016 at 7:43 PM, Andre F de Miranda <[email protected]> wrote:
>>> All,
>>>
>>> I have been considering writing a "ListenSMTP" processor and was wondering
>>> *what is the best way of dealing with multiple attachments*.
>>>
>>> Looking in here
>>>
>>> https://mail-archives.apache.org/mod_mbox/nifi-users/201602.mbox/%3ccaljk9a5ulcitnfo0dlsvd5d-jkcsqm+rqjxuruzwgrdbqad...@mail.gmail.com%3E
>>>
>>>
>>> I can read Joe suggesting not using attributes to store large volumes of
>>> data, so far so good, however, as far as I understand a flowfile can only
>>> contain one "content".
>>>
>>> Currently the way I envision this would be modular that taps into the
>>> pattern set by ListenSyslog / ParseSyslog:
>>>
>>> ListenSMTP - A processor that only provides an SMTP interface
>>>
>>> ParseEmail - A processor that reads the flowfile holding the email body and
>>> split it into 1 or more flowfiles containing the attached mime objects.
>>>
>>> The advantage here is that people can use FetchFile or to create a GetIMAP
>>> processor to parse messages.
>>>
>>> Would anyone have a different view on how to achieve this?
>>>
>>> I thank you in advance
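
To make the split-by-MIME-boundary idea discussed in this thread a bit more concrete, here is a rough sketch in Python using only the standard library `email` package. It is not NiFi code and not a proposed implementation. The `fragment.identifier` / `fragment.index` / `fragment.count` names follow the convention the Split processors are described as using above, while `email.boundary`, `split_email` and `build_sample` are hypothetical names made up for illustration. The sketch takes the "dump the boundary and keep it as an attribute" option, so a later merge step could rebuild the message.

```python
import uuid
from email import message_from_bytes, policy
from email.message import EmailMessage


def split_email(raw):
    """Split a raw message into one record per leaf MIME part."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Multipart containers only carry the boundary, so keep leaf parts only.
    parts = [p for p in msg.walk() if not p.is_multipart()]
    fragment_id = str(uuid.uuid4())  # shared by all splits of one email
    records = []
    for index, part in enumerate(parts):
        records.append({
            "attributes": {
                "fragment.identifier": fragment_id,
                "fragment.index": str(index),
                "fragment.count": str(len(parts)),
                "mime.type": part.get_content_type(),
                "filename": part.get_filename() or "",
                # Hypothetical attribute: original boundary, kept for a rebuild.
                "email.boundary": msg.get_boundary() or "",
            },
            # Decoded payload (base64 / quoted-printable removed).
            "content": part.get_payload(decode=True) or b"",
        })
    return records


def build_sample():
    """Build a small two-part test message (text body plus one attachment)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "receiver@example.com"
    msg["Subject"] = "demo"
    msg.set_content("See attached.")
    msg.add_attachment(b"\x00\x01", maintype="application",
                       subtype="octet-stream", filename="a.bin")
    return msg.as_bytes()


if __name__ == "__main__":
    for record in split_email(build_sample()):
        print(record["attributes"], len(record["content"]))
```

Sorting the records on `int(attributes["fragment.index"])` within one `fragment.identifier` restores the original part order, which is what a merge step would need in order to reassemble the email.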
