Re: URGENT HELP: Improving Solr indexing time
> <str name="Total Requests made to DataSource">16276</str>
> ...
> so I am doing a delta import of around 500,000 rows at a time.

http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
Re: URGENT HELP: Improving Solr indexing time
Thanks Fuad,

I have started optimizing my database structure. Since the tables hold a huge number of records, the optimization is taking time. I will post the results when it is complete.

Regards,
Rohit

From: Fuad Efendi f...@efendi.ca
To: solr-user@lucene.apache.org
Sent: Sun, 5 June, 2011 10:05:22 AM
Subject: Re: URGENT HELP: Improving Solr indexing time

> Hi Rohit, I am currently working on https://issues.apache.org/jira/browse/SOLR-2233, which fixes multithreading issues. [...]
URGENT HELP: Improving Solr indexing time
My Solr server takes a very long time to update the index. The table it indexes is huge, with more than 10 million records, but even so I feel this is a very long time to index. Below is a snapshot of the /dataimport page:

    <str name="status">busy</str>
    <str name="importResponse">A command is still running...</str>
    <lst name="statusMessages">
      <str name="Time Elapsed">1:53:39.664</str>
      <str name="Total Requests made to DataSource">16276</str>
      <str name="Total Rows Fetched">24237</str>
      <str name="Total Documents Processed">16273</str>
      <str name="Total Documents Skipped">0</str>
      <str name="Full Dump Started">2011-06-04 11:25:26</str>
    </lst>

How can I determine why this is happening, and how can I improve it? In all our tests on the local server before the migration we could index 5 million records in 4-5 hours, but now it is taking far too long on the live server.

Regards,
Rohit
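The figures in the status snapshot can be turned into a rough throughput estimate. The sketch below is only an illustration: it restores the XML tags that the archive stripped, wraps the block in a hypothetical `<response>` root so it parses as one document, and uses 4.5 hours as an assumed midpoint of the "4-5 hrs" local figure. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET

# Status block from the message with XML tags restored; the <response>
# wrapper is an assumption added so the fragment parses as one document.
STATUS_XML = """
<response>
  <str name="status">busy</str>
  <lst name="statusMessages">
    <str name="Time Elapsed">1:53:39.664</str>
    <str name="Total Requests made to DataSource">16276</str>
    <str name="Total Rows Fetched">24237</str>
    <str name="Total Documents Processed">16273</str>
  </lst>
</response>
"""


def parse_status(xml_text):
    """Collect every <str name="..."> element into a name -> text dict."""
    root = ET.fromstring(xml_text)
    return {el.get("name"): el.text for el in root.iter("str")}


def elapsed_seconds(hms):
    """Convert an H:MM:SS.mmm string into seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


stats = parse_status(STATUS_XML)
elapsed = elapsed_seconds(stats["Time Elapsed"])
docs = int(stats["Total Documents Processed"])
requests = int(stats["Total Requests made to DataSource"])

docs_per_second = docs / elapsed
requests_per_doc = requests / docs
# Projected time for 10 million records at the observed rate.
days_for_10m = 10_000_000 / docs_per_second / 86400
# Local baseline from the message: 5 million records in roughly 4.5 hours.
local_docs_per_second = 5_000_000 / (4.5 * 3600)

print("live docs/s: %.2f" % docs_per_second)
print("local docs/s: %.0f" % local_docs_per_second)
print("DataSource requests per doc: %.3f" % requests_per_doc)
print("days for 10M docs at live rate: %.1f" % days_for_10m)
```

At roughly one DataSource request per processed document, the import may be issuing a separate query for each row (for example through a sub-entity), although the thread does not show the data-config, so that reading is a guess to be checked against the actual configuration.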
Re: URGENT HELP: Improving Solr indexing time
How long does the query against the DB take (outside of Solr)? If that's slow, then it's going to take a while to update the index. You might need to figure out a way to break things up a bit, maybe by using a delta import instead of a full import.

Chris

On Jun 4, 2011, at 6:23 AM, Rohit Gupta wrote:
> My Solr server takes very long to update index. [...]
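As a minimal sketch of the delta-import suggestion, the DataImportHandler is driven by a `command` request parameter, so switching from a full to a delta import is mostly a matter of which URL is requested. The base URL below is an assumption (the thread does not give the host or core), `dih_url` is a hypothetical helper, and a delta import only works if the entity in data-config.xml defines the delta queries.

```python
from urllib.parse import urlencode

# Assumed base URL; the thread does not say where the Solr instance lives.
SOLR_BASE = "http://localhost:8983/solr"


def dih_url(base_url, command, **params):
    """Build a DataImportHandler request URL for the given command."""
    query = {"command": command}
    query.update(params)
    return base_url.rstrip("/") + "/dataimport?" + urlencode(query)


# Import only rows changed since the last run, committing when done.
delta_url = dih_url(SOLR_BASE, "delta-import", commit="true")
# Poll the same handler for progress figures like those in the snapshot.
status_url = dih_url(SOLR_BASE, "status")

print(delta_url)
print(status_url)
```

The URLs can then be requested with any HTTP client, or from a scheduled job, so that each run only has to process a bounded batch of changed rows.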
Re: URGENT HELP: Improving Solr indexing time
Rohit - you may have double posted. Did Otis's answer not help with your issue, or does it at least need a response to clarify?

On 4 June 2011 22:53, Chris Cowan chrisco...@plus3network.com wrote:
> How long does the query against the DB take (outside of Solr)? If that's slow
> then it's going to take a while to update the index. You might need to figure
> a way to break things up a bit, maybe use a delta import instead of a full
> import.
Re: URGENT HELP: Improving Solr indexing time
No, I didn't double post; maybe the message was still in my outbox and went out again.

The queries outside Solr don't take so long: returning around 50 rows takes 250 seconds, so I am doing a delta import of around 500,000 rows at a time. I have tried turning auto commit on, and things are moving a bit faster now. Are there any more tweaks I can make?

I am also planning to move to a master-slave model, but I am struggling to understand where exactly to start.

Regards,
Rohit

From: lee carroll lee.a.carr...@googlemail.com
To: solr-user@lucene.apache.org
Sent: Sun, 5 June, 2011 4:59:44 AM
Subject: Re: URGENT HELP: Improving Solr indexing time

> Rohit - you have double posted maybe - did Otis's answer not help with your
> issue or at least need a response to clarify ?
Re: URGENT HELP: Improving Solr indexing time
Hi Rohit,

I am currently working on https://issues.apache.org/jira/browse/SOLR-2233, which fixes multithreading issues.

How complex is your dataimport schema? SOLR-2233 (multithreading, better connection handling) improves performance, especially if the SQL is extremely complex and uses a few long-running CachedSqlEntityProcessors.

Also, check your SQL and indexes. In most cases you can _significantly_ improve performance simply by adding indexes appropriate to your specific SQL. I have noticed that even very experienced DBAs sometimes create an index on (KEY1, KEY2) while the developer executes a query with WHERE KEY2=? ORDER BY KEY1. Because KEY2 is not the leading column, such an index generally cannot be used to seek directly to the matching rows. Check everything...

Thanks,
--
Fuad Efendi
416-993-2060
Tokenizer Inc., Canada
Data Mining, Search Engines
http://www.tokenizer.ca

On 11-06-05 12:09 AM, Rohit Gupta ro...@in-rev.com wrote:
> No didn't double post, my be it was in my outbox and went out again. [...]
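Fuad's point about index column order can be checked on a small scale. The sketch below uses SQLite purely as a stand-in, since the thread never names the actual database; the table, column, and index names are hypothetical, and the exact wording of the query plan differs between engines and versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, key1 INTEGER, key2 INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO items (key1, key2, payload) VALUES (?, ?, ?)",
    [(i % 100, i % 7, "row %d" % i) for i in range(1000)],
)

# The index as created in Fuad's example: KEY1 first, then KEY2.
conn.execute("CREATE INDEX idx_key1_key2 ON items (key1, key2)")

SQL = "SELECT payload FROM items WHERE key2 = ? ORDER BY key1"
plan_before = query_plan(conn, SQL, (3,))

# An index that matches the query: filter column first, sort column second.
conn.execute("CREATE INDEX idx_key2_key1 ON items (key2, key1)")
plan_after = query_plan(conn, SQL, (3,))

print("before:", plan_before)
print("after: ", plan_after)
```

With only the (key1, key2) index the plan should report a scan, and once the (key2, key1) index exists it should report an index search on key2. Running the equivalent EXPLAIN against the real database for each query in the data-config is the analogous check.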