On 6/27/2013 9:19 AM, Learner wrote:
I was using ConcurrentUpdateSolrServer to index documents into Solr. Later I
needed portable indexing, so I started using EmbeddedSolrServer.
I created a multithreaded program that creates and submits documents in batches
of 100 to the embedded server (running inside the SolrJ indexing process), but
for some reason it takes more time to index the data than
ConcurrentUpdateSolrServer (CUSS) did. I had assumed that the embedded
server would take less time than the HTTP updates that CUSS makes,
so I am not sure why it is slower.
Is there a way to speed up indexing when using the embedded Solr
server (something like specifying the thread count and queue size, as with
CUSS)?
A lot more time has been spent optimizing the traditional Solr server
model than the embedded version.
If you want the same performance from Embedded that you get from
Concurrent, you'll need to use that object in multiple threads that you
create yourself. The Concurrent object handles all that threading for
you, but due to its nature, Embedded can't. You say that your program
is multithreaded, so I really don't know what's going on here.
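As a rough illustration of "use that object in multiple threads that you create yourself", here is a minimal sketch using only the JDK. DocSink and ParallelIndexer are hypothetical names standing in for the shared server object and your indexing code; they are not part of SolrJ, and the batch size and thread count are whatever you choose to pass in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the single shared server object
// (for example, an embedded server instance). Not a SolrJ type.
interface DocSink {
    void add(List<String> batch) throws Exception;
}

class ParallelIndexer {
    // Splits docs into batches and submits each batch from a fixed pool
    // of threads that all share the same sink. Returns the number of
    // documents handed to the sink.
    static int indexAll(List<String> docs, int batchSize, int threads, DocSink sink)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < docs.size(); i += batchSize) {
                int end = Math.min(i + batchSize, docs.size());
                // Copy the slice so each task owns its batch.
                List<String> batch = new ArrayList<>(docs.subList(i, end));
                pending.add(pool.submit(() -> {
                    sink.add(batch);
                    return batch.size();
                }));
            }
            int indexed = 0;
            for (Future<Integer> f : pending) {
                // get() rethrows any failure from the worker thread,
                // so errors are not silently lost.
                indexed += f.get();
            }
            return indexed;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because every batch result is collected through a Future, the caller sees failures here, which is one behavioural difference from a fire-and-forget client.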
An FYI on something that might have escaped your awareness: CUSS
swallows exceptions - it will never inform the calling application about
errors that occur, unless you override its handleError method in some
way, and I don't know what is required to make it do that. This is part
of why CUSS is so fast - it returns to the calling application
*immediately*, no matter what actually happens in the background while
talking to the server.
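To make the "returns immediately and swallows errors" behaviour concrete, here is a small JDK-only model of the pattern. FireAndForgetClient and RecordingClient are hypothetical classes written for this illustration, not the SolrJ implementation; the sketch only assumes that the error hook receives the Throwable, and that a subclass can override it to record failures for the application to inspect.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model of a fire-and-forget client. NOT the SolrJ class.
class FireAndForgetClient {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Returns immediately; the send happens later on the worker thread.
    void add(String doc) {
        worker.execute(() -> {
            try {
                send(doc);
            } catch (Exception e) {
                handleError(e);
            }
        });
    }

    // Simulated transport for the sketch: rejects empty documents.
    protected void send(String doc) throws Exception {
        if (doc.isEmpty()) {
            throw new IllegalArgumentException("empty document");
        }
    }

    // Default hook does nothing, so the caller never learns of the failure.
    protected void handleError(Throwable t) {
    }

    // Stops accepting work and waits for queued sends to finish.
    void close() throws InterruptedException {
        worker.shutdown();
        worker.awaitTermination(10, TimeUnit.SECONDS);
    }
}

// Subclass that records failures so the application can check them later.
class RecordingClient extends FireAndForgetClient {
    final List<Throwable> errors = new CopyOnWriteArrayList<>();

    @Override
    protected void handleError(Throwable t) {
        errors.add(t);
    }
}
```

With the base class, add("") appears to succeed; with RecordingClient, the application can inspect errors after close() and decide whether to retry or abort.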
Thanks,
Shawn