[gsoc2011] Design and implement a distributed mailbox using Hadoop
------------------------------------------------------------------

                 Key: MAILBOX-44
                 URL: https://issues.apache.org/jira/browse/MAILBOX-44
             Project: James Mailbox
          Issue Type: New Feature
            Reporter: Eric Charles
            Assignee: Norman Maurer


Context: The mailbox subproject (http://james.apache.org/mailbox/) supports 
maildir, SQL databases (via JPA) and Java Content Repository (JCR) as mail 
storage technologies. This flexibility comes from an API design that 
abstracts mail storage from the mail protocols.

Task: We need to implement mailbox storage as a distributed system on top of 
Hadoop HDFS, using the James mailbox API. The first step is to design how to 
interact with Hadoop (native API, Gora from the Apache Incubator, ...) and to 
address the performance questions specific to loading and parsing mail in a 
distributed system (use map/reduce or not, use existing local Lucene indexes 
for search, ...). The second step is to implement the HDFS mailbox. The 
maildir mailbox is similar because it stores each mail as a file, so it can 
serve as an inspiration. A single James server will still be deployed because 
we don't have any distributed UID generation.
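
To make the intended shape concrete, here is a minimal sketch of a 
file-per-message store behind a small storage interface. It is an 
illustration under stated assumptions, not the James mailbox API: the names 
MessageStore, DirectoryMessageStore, append and read are hypothetical, a local 
temporary directory stands in for HDFS, and one counter is shared by all 
mailboxes for brevity. The in-process counter also shows why a single server 
is assumed: UIDs handed out this way are unique only while one process writes 
to the store.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only: none of these names come from the James mailbox API.
public class FilePerMessageSketch {

    /** Minimal storage abstraction: protocol code would depend on this, not on the backend. */
    interface MessageStore {
        long append(String mailbox, byte[] rawMessage) throws IOException;
        byte[] read(String mailbox, long uid) throws IOException;
    }

    /** Stores each message as one file, in the spirit of maildir. A local directory stands in for HDFS. */
    static final class DirectoryMessageStore implements MessageStore {
        private final Path root;
        // In-process counter (assumption): unique only while a single server writes to the store.
        private final AtomicLong nextUid = new AtomicLong(1);

        DirectoryMessageStore(Path root) {
            this.root = root;
        }

        @Override
        public long append(String mailbox, byte[] rawMessage) throws IOException {
            long uid = nextUid.getAndIncrement();
            // One directory per mailbox, one file per message named after its UID.
            Path dir = Files.createDirectories(root.resolve(mailbox));
            Files.write(dir.resolve(uid + ".eml"), rawMessage);
            return uid;
        }

        @Override
        public byte[] read(String mailbox, long uid) throws IOException {
            return Files.readAllBytes(root.resolve(mailbox).resolve(uid + ".eml"));
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("mailbox-sketch");
        MessageStore store = new DirectoryMessageStore(root);
        long first = store.append("INBOX", "Subject: hello\r\n\r\nbody".getBytes(StandardCharsets.UTF_8));
        long second = store.append("INBOX", "Subject: again\r\n\r\nbody".getBytes(StandardCharsets.UTF_8));
        System.out.println("stored UIDs " + first + " and " + second);
        System.out.println(new String(store.read("INBOX", first), StandardCharsets.UTF_8));
    }
}
```

A real HDFS-backed implementation would replace the java.nio.file calls with 
the Hadoop file system client and would need to settle the open design 
questions listed above, such as indexing and whether to use map/reduce.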

Mentor: eric at apache dot org

Complexity: medium 


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org
