On Wed, Apr 28, 2010 at 5:55 AM, Norman Maurer <norman.mau...@googlemail.com> wrote:
<snip>

>>> I think it would be a good thing to "simplify" the api a bit to make
>>> it a bit easier to understand. So some points which came to my mind:
>>>
>>> 1) UidChangeTracking:
>>>
>>> Is this really necessary ? It does some kind of caching but I don't
>>> see something else for which it's useful. Why not just fire the events
>>> directly with a shared MailboxEventDispatcher which is the same for
>>> all Mailboxes?
>>
>> i'm not convinced it's needed but beware...
>>
>> this is one of the few areas retained from the design before i started
>> reworking. i had hoped to replace it but never really worked out how
>> to do that without crippling performance or breaking IMAP.
>
> I'm currently testing imap without the UidChangeTracker and so far it
> seems like it's not really slower than before..

it's only slower than the alternatives that required to make IMAP work
properly ;-)

IIRC UIDChangeTracker tracks UID changes made by concurrent sessions
accessing the same mailbox. the local caching should work for a user's
own changes. it's possible that some of the changes i've made have made
it redundant by now but i don't trust the functional concurrency tests.

>>> 2) Global Mailbox caching
>>>
>>> At the moment the Mailbox is cached in a HashMap. The problem with
>>> this is it will never get recycled by the GC. This can generate an OOM
>>> over a long time
>>
>> i run IMAP with approx 1.5G spread over around a hundred mailboxes.
>> i've never had an OOM. so i never bothered changing this.
>
> I think you use Torque right ? Maybe it behaves a bit differently there.

i inherited torque and this is one area i left alone ;-)

> I'm using JPA and it's reproducible by feeding a mailbox with ca 1
> million emails. You will see the memory usage just grow and grow..
> When I took a heap dump it seemed like the OpenJPA objects were never
> released, because they were held in the HashMap.

for torque the session needs to be held to manage concurrency (mailbox
access needs to be synchronized). for OpenJPA, sounds like the mailbox
structure needs to be there to manage synchronization and caching but
a new OpenJPA object needs to be created each time.

>>> The other problem with this is, the Mailbox should be "tied" to the
>>> MailboxSession. Let me explain why. For example in JCR we could use
>>> the User/Pass which is bound to the MailboxSession to access different
>>> parts of the JCR Repository etc..
>>
>> i thought this too originally but i couldn't work out how to do so
>> without crippling performance or breaking IMAP.
>
> Sure good performance is a must, but I would prefer to have a "good"
> api first ;)

this wasn't a good-performance issue but a usable-at-all one

when two sessions are accessing the same mailbox, there are a handful
of operations which require caching and concurrency control to
maintain correctness. there are a number of ways that this design
could work. mailbox et al is inherited, and probably not my first
choice.

i would prefer to revise the API by pushing the Mailbox functions into
MailboxManager, and so making it an internal feature which could be
varied by implementations. the namespace handling is problematic, so i
would then model namespace by a Mailbox object which could be passed
in to each method in the API.

>> IIRC these are related issues. the essential function is caching and
>> synchronization. in performance terms, i think much higher
>> performance could be achieved by replacement by something asynchronous
>> and event driven using a blocking queue. this would be a substantial
>> change.
>>
>
> I agree with you here. But as you outlined already, it's not an "easy"
> thing to do, without rewriting a lot of stuff.

very little rewriting but hard, and risky for the poorly tested
concurrent use cases. then again, maybe these don't work ATM.
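
for illustration only, a minimal Java sketch of the "asynchronous and event driven using a blocking queue" idea mentioned above. everything here is hypothetical: MailboxEventQueue, UidEvent and the listener shape are invented names, not the existing James API. sessions publish UID changes onto a queue and a single worker thread delivers them to listeners in order, so publishers never block on listener work.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch, not James code: one queue, one dispatcher thread.
public class MailboxEventQueue {

    // Hypothetical event: "message `uid` changed in `mailbox`".
    public static final class UidEvent {
        public final String mailbox;
        public final long uid;

        UidEvent(String mailbox, long uid) {
            this.mailbox = mailbox;
            this.uid = uid;
        }
    }

    // Sentinel that tells the worker to stop after draining earlier events.
    private static final UidEvent POISON = new UidEvent("", -1L);

    private final BlockingQueue<UidEvent> queue = new LinkedBlockingQueue<>();
    private final List<Consumer<UidEvent>> listeners = new CopyOnWriteArrayList<>();
    private final Thread worker;

    public MailboxEventQueue() {
        worker = new Thread(this::drain, "mailbox-events");
        worker.setDaemon(true);
        worker.start();
    }

    public void addListener(Consumer<UidEvent> listener) {
        listeners.add(listener);
    }

    // Called by any session thread; returns without waiting for listeners.
    public void publish(String mailbox, long uid) {
        queue.add(new UidEvent(mailbox, uid));
    }

    // Delivers everything published so far, then stops the worker.
    public void shutdown() {
        queue.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Worker loop: take() blocks until an event is available.
    private void drain() {
        try {
            while (true) {
                UidEvent event = queue.take();
                if (event == POISON) {
                    return;
                }
                for (Consumer<UidEvent> listener : listeners) {
                    listener.accept(event);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

a session would call publish("INBOX", uid) after a change and register a listener to hear about changes made elsewhere. this says nothing about the hard part, which is keeping IMAP's per-session view correct while events are in flight.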

the best place to start would be by creating some more concurrency
tests. there's an application that creates tests in the package
org.apache.james.imap.functional.builder in seda.

> I even tend to believe we should do something similar to what we have
> in SMTP/POP3. Just have some kind of LineHandler which pushes data into
> the processor when a CRLF was detected and so not using blocking
> streams as input at all.

the IMAP protocol makes this approach tricky, but in general yes. the
protocol handling foo is intended to address this, and should be quite
close now.

- robert
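
for illustration only, a minimal Java sketch of the LineHandler idea quoted above: bytes are fed in as they arrive and a handler is called once per complete CRLF-terminated line, so no blocking input stream is needed. CrlfLineFramer and its feed method are hypothetical names, not the SMTP/POP3 classes in James. it deliberately ignores IMAP literals ({n} followed by CRLF and n raw octets), which is one reason this approach is tricky for IMAP.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical sketch, not James code: push-style CRLF line framing.
public class CrlfLineFramer {

    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private final Consumer<String> lineHandler;
    private boolean lastWasCr = false;

    public CrlfLineFramer(Consumer<String> lineHandler) {
        this.lineHandler = lineHandler;
    }

    // Feed whatever bytes the transport delivered; a line may span chunks.
    public void feed(byte[] chunk) {
        for (byte b : chunk) {
            if (b == '\n' && lastWasCr) {
                // Complete line: drop the trailing CR that is still buffered.
                byte[] buf = pending.toByteArray();
                String line = new String(buf, 0, buf.length - 1, StandardCharsets.US_ASCII);
                pending.reset();
                lastWasCr = false;
                lineHandler.accept(line);
            } else {
                // Not a line end (a bare LF is kept as ordinary data).
                pending.write(b);
                lastWasCr = (b == '\r');
            }
        }
    }
}
```

the framer keeps partial input between calls, so a CR at the end of one chunk and an LF at the start of the next still produce one line.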