Hi Tony,
We are starting to work on MAILBOX-44 "Design and implement a
distributed mailbox using Hadoop" [1].
We will need to store the mail in Hadoop, and the JSON format (in an Avro
file) may be an option.
You said you are "still polishing for release" your JSON transformer.
Do you plan to release it as open source so that we could use it?
Thanks,
Eric
[1] https://issues.apache.org/jira/browse/MAILBOX-44
On 10/05/2011 10:00, Robert Burrell Donkin wrote:
> On Sun, May 8, 2011 at 2:44 PM, Tony Zakula <tonyzak...@gmail.com> wrote:
>> Not sure on where the project leaders want to go,
>
> Projects are community led here at Apache (see e.g. [1][2][3][4]). If
> there's development interest from the community and it's in scope for
> the project, then that's a direction the code will move in.
>
>> but I think being
>> able to store messages in different formats to be able to plug in to
>> systems would be great. Instead of each person writing their own
>> parser, most people would just plug in the larger piece to their system
>> and start there.
>
> +1
>
> This vision seems to fit with the work over at Tika [5] and Lucene [6].
>
>> I did not see where you specified what you are thinking about for
>> summer. Is that a link somewhere yet?
>
> The mailing lists (see [7] and e.g. [8]) are the primary tools we use
> here at Apache. Stuff only tends to get written down later, if at all.
> We've been throwing ideas around on the lists, hoping that people
> might pick some of them up and run with them ;-)
>
> Robert
>
> [1] http://www.apache.org/foundation/how-it-works.html
> [2] http://www.apache.org/foundation/getinvolved.html
> [3] http://jakarta.apache.org/site/contributing.html
> [4] http://www.apache.org/dev/contributors.html
> [5] http://tika.apache.org/
> [6] http://lucene.apache.org/
> [7] http://www.apache.org/dev/#mail
> [8] http://www.apache.org/dev/contrib-email-tips.html