Andre

Unless I am missing something, I would stay on the side of "keeping it simple
and modularized": if any extraction, transformation, or modification of
anything is required, that is the job of another component. In fact, I faced a
very similar question a few days ago about attachments from IMAP/POP3.
As you already mentioned, using MimeMessageParser is straightforward and allows
one to restore an InputStream back into a javax.mail.Message, from which you can
extract and get to pretty much anything. You are already doing that in the
ExtractAttachment processor, so I would continue on that path.
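
To illustrate how small such a separate component could be, here is a rough
sketch of the TNEF check discussed further down in this thread. It assumes a
hypothetical helper class, TnefSniffer, which is not an existing NiFi or
Commons Email class; the method names are made up for illustration. It checks
the four-byte TNEF signature first (the "something better" than the name) and
falls back to the winmail.dat file name.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of NiFi or Commons Email. */
final class TnefSniffer {

    // A TNEF stream starts with the 32-bit signature 0x223E9F78, stored little-endian.
    private static final int TNEF_SIGNATURE = 0x223E9F78;

    private TnefSniffer() { }

    /** Returns true if the first four bytes match the TNEF signature. */
    static boolean hasTnefSignature(byte[] head) {
        if (head == null || head.length < 4) {
            return false;
        }
        // Assemble the little-endian 32-bit value from the first four bytes.
        int value = (head[0] & 0xFF)
                | (head[1] & 0xFF) << 8
                | (head[2] & 0xFF) << 16
                | (head[3] & 0xFF) << 24;
        return value == TNEF_SIGNATURE;
    }

    /** Weaker fallback: matches the conventional attachment name, ignoring case. */
    static boolean hasTnefName(String fileName) {
        return fileName != null
                && fileName.trim().toLowerCase(Locale.ROOT).equals("winmail.dat");
    }

    /** Treats an attachment as TNEF if either the signature or the name matches. */
    static boolean looksLikeTnef(String fileName, byte[] head) {
        return hasTnefSignature(head) || hasTnefName(fileName);
    }
}
```

Whichever component ends up owning this, it only needs the attachment name and
its first few bytes, so it does not have to touch the MIME parsing at all.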

That is of course my opinion, so it would be nice to see what others think.
Cheers
Oleg

> On Jul 24, 2016, at 8:38 AM, Andre <[email protected]> wrote:
> 
> I have raised NIFI-2380 to track this improvement.
> 
> While raising the ticket I was wondering:
> 
> are you happy to give the user the option to choose whether to extract the
> winmail.dat or not?
> 
> I mean something like:
> 
> - PROPERTY: "Extract Attachments within a TNEF (i.e. winmail.dat)": true /
> false
> 
> If yes, then every time decoding occurs we test the name (or something
> better, if that is possible) and then extract it. An attachment created
> by a TNEF file would have an attribute email.attachment.tnefdecoded (or
> whatever name we decide) set to yes.
> 
> If no, processing continues as it is today (i.e. purely based on Apache
> Commons MimeMessageParser).
> 
> 
> Another possible solution would be an additional processor, but IMNSHO this
> would be overkill and counterproductive.
> 
> Keen to hear your thoughts
> 
> On Sun, Jul 17, 2016 at 4:46 PM, Andre <[email protected]> wrote:
> 
>> Dan,
>> 
>> Ingesting Microsoft Journals seems like a great suggestion for a new
>> processor (ParseExchangeJournal?).
>> 
>> Regarding TNEF: As far as I know, Apache Commons Email does not parse
>> "winmail.dat" type attachments. As far as I understand, the only
>> ASL-compatible implementation of a TNEF extractor is Apache POI, and even
>> that implementation is not part of POI's main release.
>> 
>> If TNEF support is required we will either have to code from scratch or
>> perhaps use https://github.com/koodaamo/tnefparse together with
>> ExecuteScript (although since tnefparse is LGPL, this solution cannot be
>> packaged as part of NiFi).
>> 
>> Cheers
>> 
>> On Sun, Jul 17, 2016 at 10:53 AM, djmdata <[email protected]> wrote:
>> 
>>> What is the JIRA #?
>>> 
>>> I have a production system that reads email from a custom SMTP listener
>>> and places the SMTP payload into Kafka. A Storm topology reads messages
>>> from Kafka and parses the emails (Java code using JavaMail API) into
>>> useful info (subject, text, attachments, body, etc...).
>>> 
>>> I'm looking at plugging NiFi into this to replace the custom SMTP
>>> listener. If you had a processor that could act as a reliable (we can't
>>> lose emails) and performant SMTP listener alternative we would use it.
>>> 
>>> Your "email parser processor" is an interesting idea, but beware of the
>>> mess you'll find in the wild with email. In our case, we try to parse
>>> Exchange (full of non-standard wonders like "TNEF" attachments) as well as
>>> email from virtually anywhere (GMail, Yahoo, Joe's email client...). If you
>>> can crack that you'll be on to something. We have even more complexity in
>>> that we read "Microsoft Journals", which wrap the standard SMTP layout in a
>>> Microsoft layer (you'll see this at large Exchange shops doing this kind of
>>> thing for use cases like compliance).
>>> 
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://apache-nifi-developer-list.39713.n7.nabble.com/ListenSMTP-processor-tp10510p12827.html
>>> Sent from the Apache NiFi Developer List mailing list archive at
>>> Nabble.com.
>>> 
>> 
>> 
