I ported some code that does a good job of extracting the interesting parts
from an email reply:


http://smalltalkhub.com/#!/~pdebruic/EmailReplyParser

It has examples and can parse raw mails, whether text-only or multipart.

It's based on what GitHub uses:

 https://github.com/github/email_reply_parser


I see no reason why it couldn't also be adapted for use with an initial
email, as well as the replies.  
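
To illustrate what "extracting the interesting parts" means, here is a rough
sketch in Python. It is not the rule set used by EmailReplyParser or by
GitHub's email_reply_parser; the function name and the three heuristics
(signature delimiter, "On ... wrote:" header, ">" quoting) are assumptions
chosen for illustration only.

```python
import re

# Hypothetical, simplified heuristics; NOT the rules used by EmailReplyParser.
QUOTE_HEADER = re.compile(r"^On .*wrote:\s*$")


def visible_reply(body):
    """Return the part of a plain-text reply that the sender actually wrote."""
    kept = []
    for line in body.splitlines():
        if line.rstrip() == "--":          # conventional signature delimiter ("-- ")
            break
        if QUOTE_HEADER.match(line):       # "On <date>, <name> wrote:" header
            break
        if line.lstrip().startswith(">"):  # quoted line from an earlier message
            continue
        kept.append(line)
    return "\n".join(kept).strip()


reply = (
    "Thanks, that works.\n"
    "\n"
    "On Mon, Jul 6, 2015 at 8:41 AM, Someone wrote:\n"
    "> original text\n"
    "> more original text\n"
    "\n"
    "-- \n"
    "Signature\n"
)
print(visible_reply(reply))
```

An initial email has no quoted part, so the same function would simply return
its body minus the signature.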





Sven Van Caekenberghe-2 wrote
> With ZnHeaders and ZnMimePart you should get a long way in parsing mail
> boxes. I believe some people have already experimented with this, but I am
> not sure and I forgot.
> 
>> On 06 Jul 2015, at 16:11, Dmitri Zagidulin <dmitri@> wrote:
>> 
>> I've been doing some mailing list analysis recently (in Ruby), and would
>> be very interested in porting it over to Smalltalk. (I was actually
>> getting really frustrated at the lack of proper debugging setup in Ruby,
>> even though it had some great mail-related libraries). I was looking at
>> thread lengths, numbers of unanswered threads, etc.
>> 
>> Alexandre -- I haven't been able to find a good Mail parsing library for
>> Smalltalk (preferably one that reads the Mbox format natively), I'd be
>> curious to know what you end up using.
>> 
>> As for the download URL -- the link Marcus gave is, unfortunately, in
>> Pipermail's own format (a simplified version of mbox, really).
>> To get the actual .mbox file, you'd need to use this link:
>> 
>> http://lists.pharo.org/mailman/private/pharo-dev_lists.pharo.org.mbox/pharo-dev_lists.pharo.org.mbox
>> 
>> (Note that it requires you to authenticate with your mailing list email
>> and password (that you created when you first signed up for the mailing
>> list)). But once authenticated, you can download it with Zinc (or wget)
>> or whatever, and start processing it.
>> 
>> Let us know how it goes!
>> 
>> 
>> On Mon, Jul 6, 2015 at 8:41 AM, Thierry Goubier <thierry.goubier@> wrote:
>> 
>> 
>> 2015-07-06 14:29 GMT+02:00 Peter Uhnák <i.uhnak@>:
>> The archives are straight text files, in which the individual messages
>> are separated by a seemingly random number of LFs.
>> 
>> Actually they are valid mbox files. (At least my mutt opened it just fine.)
>> The separator is the "From " line, not newlines.
>> 
>> "From" followed by a space. Each message ends with a blank line.
>> 
>> https://en.wikipedia.org/wiki/Mbox, https://tools.ietf.org/html/rfc4155
>> 
>> It seems there are multiple, incompatible mbox formats.
>> 
>> Thierry
>> 
>>
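
The mbox framing described in the quoted thread (a "From " line starts each
message, a blank line ends it) can be sketched in Python as follows. This is a
minimal sketch under assumptions: the function name and the sample data are
made up, and it does not undo the ">From " escaping that some mbox variants
apply to body lines, which is one of the ways the formats differ.

```python
import email


def split_mbox(text):
    """Split mbox text into raw messages on lines that start with 'From '."""
    messages = []
    current = None
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            if current is not None:
                messages.append("".join(current))
            current = []  # the "From " separator line itself is dropped
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("".join(current))
    # Drop the blank line that terminates each message.
    return [m.rstrip("\n") + "\n" for m in messages]


# Made-up sample data, for illustration only.
sample = (
    "From alice@example.org Mon Jul  6 14:29:00 2015\n"
    "Subject: first\n"
    "\n"
    "Hello\n"
    "\n"
    "From bob@example.org Mon Jul  6 16:11:00 2015\n"
    "Subject: second\n"
    "\n"
    ">From here on, quoted\n"
    "\n"
)

msgs = split_mbox(sample)
subjects = [email.message_from_string(m)["Subject"] for m in msgs]
print(subjects)
```

For real archives, Python's standard `mailbox.mbox` class reads such files
directly; the hand-written splitter only makes the separator rule visible.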





--
View this message in context: 
http://forum.world.st/Getting-the-mbox-file-for-this-mailing-list-tp4835958p4836140.html
Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com.
