I'm building an application which has to provide "real-time" searching of emails as they come in. I have a number of search strings that I need to apply against each email as it comes in and then do something with the email based on which search string(s) get a hit.
My initial thought was to create a lucene index of the emails received in the last N seconds (where N is around 5 since I don't have to be quite real-time) in a memory directory, do my searches and then delete the index and create a new index for emails received in the next 5 seconds. I'm a little concerned because the number of search strings will probably grow over time and so there is a bit of a scalability issue-though I'm not sure there's anyway around that other than doing parallel processing on different machines. I'm wondering if anyone has any experience doing this kind of thing and has additional or alternate suggestions?? Scott