[
https://issues.apache.org/jira/browse/MAILBOX-44?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049808#comment-13049808
]
stack commented on MAILBOX-44:
------------------------------
bq. First of welcome :)
Thank you.
That's sweet that you already have prior experience hacking this on top of a
store. I defer to experience!
Why a separate row for message metadata and content, if you don't mind me
asking, rather than one message per row with, say, content in one column
family and metadata in another? (It is probably best to keep cells no bigger
than N MB in HBase too... we say > 10MB is usually to be avoided, so the
splitting across cells probably applies to HBase too.)
Did you use an order-preserving partitioner?
Random IMAP queries sound ugly.
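To make the question concrete, here is a rough sketch of the one-row-per-message layout, modelled with plain Python dicts rather than the HBase client. The family names `m` and `c`, the row-key format, and the helper names are made up for illustration; only the 10MB cell limit comes from the rule of thumb above. Zero-padding the UID is one way to keep keys for a mailbox in UID order, which is what an ordered store would need for range scans.

```python
from typing import Dict

# Hypothetical names: the column families "m" (metadata) and "c" (content)
# and the row-key format are illustrative assumptions, not an HBase API.
MAX_CELL_BYTES = 10 * 1024 * 1024  # rule of thumb: avoid cells > 10MB

Row = Dict[str, Dict[str, bytes]]  # family -> qualifier -> value


def row_key(mailbox_id: str, uid: int) -> str:
    """Zero-pad the UID so lexicographic key order matches numeric UID order."""
    return f"{mailbox_id}/{uid:010d}"


def build_row(metadata: Dict[str, str], content: bytes,
              max_cell_bytes: int = MAX_CELL_BYTES) -> Row:
    """Lay out one message as a single row with two column families."""
    row: Row = {"m": {}, "c": {}}
    for name, value in metadata.items():
        row["m"][name] = value.encode("utf-8")
    # Split the body so that no single cell exceeds max_cell_bytes.
    for index, start in enumerate(range(0, len(content), max_cell_bytes)):
        row["c"][f"{index:06d}"] = content[start:start + max_cell_bytes]
    return row


def read_content(row: Row) -> bytes:
    """Reassemble the body by concatenating chunks in qualifier order."""
    return b"".join(row["c"][q] for q in sorted(row["c"]))
```

With this shape, a metadata-only read touches just the `m` family, and the body is only pulled from `c` when a client actually fetches it.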
> [gsoc2011] Design and implement a distributed mailbox using Hadoop
> ------------------------------------------------------------------
>
> Key: MAILBOX-44
> URL: https://issues.apache.org/jira/browse/MAILBOX-44
> Project: James Mailbox
> Issue Type: New Feature
> Reporter: Eric Charles
> Assignee: Norman Maurer
> Labels: gsoc2011
> Fix For: 0.3
>
>
> Context: The mailbox subproject (http://james.apache.org/mailbox/) supports
> maildir, SQL database (via JPA) and Java Content Repository (JCR) as
> technology for mail storage. This flexibility is achieved thanks to an API
> design that abstracts mail storage from the mail protocols.
> Task: We need to implement mailbox storage as a distributed system on top of
> Hadoop HDFS. The James mailbox API will be used. A first step is to design
> how to interact with Hadoop (native api, gora incubator at apache,...) and
> deal with specific performance questions related to mail loading/parsing in a
> distributed system (use map/reduce or not, use existing local lucene indexes
> for search,...). The second step is to implement the HDFS mailbox (maildir
> mailbox is similar because it stores mails as a file and can be an
> inspiration). A single James server will still be deployed because we don't
> have any distributed UID generation.
> Mentor: eric at apache dot org
> Complexity: medium