Re: [Evolution-hackers] extracting email to structured text (csv)

2008-06-29 Thread simon le bayon
Hi all of you,

Thanks for your replies (and so quickly :o)


I think I will do what you propose: create a new mbox with the messages I
want, then parse it to extract the data.

I'm more experienced in Python, but this is certainly a good way to
practice Ruby.

Would it be useful if I sent the results (scripts) to the
evolution-hackers list, or somewhere else?

thx
Simon


-- 
Simon Le Bayon
   
   ZINDEP
   25, rue de l'Ancienne Mairie 
   35230 Bourgbarré
   Tel : 02 99 57 79 73
   Fax : 02 99 57 03 69
   Mobile : 06 63 40 32 19
   url : http://www.zindep.com
   skype : slebayon



___
Evolution-hackers mailing list
Evolution-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/evolution-hackers


Re: [Evolution-hackers] extracting email to structured text (csv)

2008-06-27 Thread Reid Thompson
On Fri, 2008-06-27 at 10:45 +0200, simon le bayon wrote:
> Hi all of you,
>
> Thanks for your replies (and so quickly :o)
>
> I think I will do what you propose: create a new mbox with the messages I
> want, then parse it to extract the data.
>
> I'm more experienced in Python, but this is certainly a good way to
> practice Ruby.
>
> Would it be useful if I sent the results (scripts) to the
> evolution-hackers list, or somewhere else?
>
> thx
> Simon

   

I'd be interested in seeing/using it also.

reid



Re: [Evolution-hackers] extracting email to structured text (csv)

2008-06-21 Thread Reid Thompson

Reid Thompson wrote:

Tobias Mueller wrote:

Hey Simon :)

On 18.06.2008 19:52 simon le bayon wrote:

I'm a sociology PhD student with little expertise in IT, and I'd like to
extract thousands of emails from Evolution to a CSV or other structured
file.

I don't know whether mbox or maildir is structured enough, or whether
Evolution supports copying and pasting old mails into a new mbox account.
But if both are true, you might want to create a new mbox (or maildir)
account and copy your mails into it, so that all the mails end up in
the mbox file and you can mess around with your data.


HTH,
  Tobi






RubyMail, or one of the other Ruby mail libraries, may allow you to do
what you want.


http://www.rfc20.org/rubymail/docs/

http://www.rfc20.org/rubymail/

RubyMail has a parse_mbox call that might do what you want.

I found this on the web; it might be adaptable to what you want.

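#!/usr/bin/ruby -w
# Split a mbox file into $year-$month files
# Copyright (C) 2008 Joerg Jaspert
# BSD style license, on Debian see /usr/share/common-licenses/BSD
require 'pathname'
require 'rmail'

# Usage: pass the mbox file as the first argument.
# The split/ directory must already exist.
count = 0
File.open(Pathname.new(ARGV[0]), 'r') do |mbox|
  # parse_mbox yields each message in the mbox as a raw string
  RMail::Mailbox.parse_mbox(mbox) do |raw|
    count += 1
    print "#{count} mails\n"
    begin
      # Append the message to split/mail-YYMM, based on its Date header
      File.open(RMail::Parser.read(raw).header.date.strftime("split/mail-%y%m"), 'a') do |out|
        out.print(raw)
      end
    rescue NoMethodError
      # header.date is nil when the Date header is missing or unparsable
      print "Couldn't parse date header, ignoring broken spam mail\n"
    end
  end
end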



Ruby TMail is also very nice; it should allow you to do what you want.

http://tmail.rubyforge.org/reference/index.html notes that it recognizes
mbox, maildir, etc.
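
For the CSV goal that started the thread, here is a minimal sketch that uses only Ruby's standard library rather than rmail or tmail. It assumes the common mbox layout, in which every message starts with a line beginning "From " and body lines that would look like one are escaped as ">From ". The names split_mbox, header_value, mbox_to_csv and the FIELDS list are hypothetical, picked for illustration; they are not part of any library mentioned above.

```ruby
require 'csv'

# Headers to export, one CSV column each (an illustrative choice).
FIELDS = %w[Date From To Subject].freeze

# Hypothetical helper: split mbox input into raw messages.
# Assumes each message begins with a "From " separator line, which is dropped.
def split_mbox(io)
  messages = []
  current = nil
  io.each_line do |line|
    line = line.scrub('?') # replace invalid bytes so later regexes don't raise
    if line.start_with?('From ')
      messages << current if current
      current = String.new
    elsif current
      current << line
    end
  end
  messages << current if current
  messages
end

# Hypothetical helper: value of one header, or nil if it is absent.
# Only the header block (text before the first blank line) is searched,
# and folded continuation lines are joined with a single space.
def header_value(raw, name)
  head = raw.split(/\r?\n\r?\n/, 2).first.to_s
  unfolded = head.gsub(/\r?\n[ \t]+/, ' ')
  prefix = "#{name.downcase}:"
  line = unfolded.each_line.find { |l| l.downcase.start_with?(prefix) }
  line && line.split(':', 2).last.strip
end

# Build CSV text: a header row, then one row per message.
def mbox_to_csv(io)
  CSV.generate do |csv|
    csv << FIELDS
    split_mbox(io).each do |raw|
      csv << FIELDS.map { |field| header_value(raw, field) }
    end
  end
end

# When run as a script, read the mbox named on the command line.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  File.open(ARGV[0], 'r') { |mbox| print mbox_to_csv(mbox) }
end
```

Saved under a name of your choosing, it could be run as "ruby mbox2csv.rb mail.mbox > mail.csv". MIME-encoded header values (the "=?UTF-8?...?=" form) are written as-is, so a real mail library is probably still the better choice if those need decoding.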

