On Sun, 11 Jul 2010 07:22:51 -0700 (PDT) osocurious2 <ken.fos...@realestate.com> wrote:
> Gora,
> Our environment, currently under development, is very nearly the
> exact same thing as yours. My DB is currently only about 10GB,
> but likely to grow.
[...]

Thanks for your response. It is good to hear from people dealing
with similar issues.

> I'm still trying out different architectures to deal with this.
> I've tried doing a Bulk Copy from the DB to some flat files and
> importing from there. File handles seem to be more stable than
> database connections. But it brings its own issues to the party.

Yes, we tried that too, but creating the XMLs turned out to be just
as time-consuming. We ended up using multiple cores on several Solr
instances. Please see some further details in a separate response
to Willem.

> I'm also currently looking at using queuing (either MSMQ or
> Amazon's Simple Queue Service) so the database piece isn't used
> for 20 hours, but gets its part over fairly quickly. I haven't
> done this using DataImportHandler however, not sure yet how, so
> I'm writing my own Import manager.
[...]

We are considering using Amazon, but at this point I believe that
we will have the indexing time down to our requirements through
multiple cores on multiple Solr instances. The DataImportHandler
docs are pretty good, but I will try to get the time to write up
an example on using transformers, etc., which turned out to be a
little tricky. Or, at least, it took me some trial and error beyond
the available documentation.

> As to the GData handler and response writer. I would be very
> interested in OData versions, which wouldn't be too much of a
> stretch from GData to deal with. Would you be moving in that
> direction later? Or if you put your contrib out there could
> someone else (maybe me if time allows) be able to take it there?
> That would be a great addition for our work in a few months.

Yes, we would be happy to do that, though I do need to look at how
closely our solution meets the GData specifications.
Also, at the moment, we have only implemented the GET part, i.e.,
search results can only be retrieved through the GData interface.

> Good luck, and I'd love to keep in touch about your solutions,
> I'm sure I could get some great ideas from them for our own work.
[...]

Likewise, I am sure that we can learn much from you guys. Willem
and you have already given me some ideas. We should maybe start
getting use cases up on the Solr Wiki, or at least on a blog
somewhere.

Regards,
Gora