Re: URGENT HELP: Improving Solr indexing time

2011-06-13 Thread Alexey Serba
> <str name="Total Requests made to DataSource">16276</str>
> ...
> so I am doing a delta import of around 500,000 rows at a time.

http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
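
The wiki page linked above describes running a delta as a full-import with clean=false, so that a single SQL query fetches the changed rows. A minimal sketch of that pattern follows. The table name `item`, the column `last_modified`, the base URL, and the helper `import_url` are hypothetical and only for illustration; `${dataimporter.request.clean}` and `${dataimporter.last_index_time}` are the DataImportHandler variables that pattern relies on.

```python
from urllib.parse import urlencode

# SQL for the top-level entity in data-config.xml, following the wiki pattern.
# 'item' and 'last_modified' are hypothetical table/column names.
ENTITY_QUERY = (
    "SELECT * FROM item "
    "WHERE '${dataimporter.request.clean}' != 'false' "
    "OR last_modified > '${dataimporter.last_index_time}'"
)


def import_url(base_url: str, delta: bool) -> str:
    """Build a DataImportHandler full-import URL (hypothetical helper).

    With clean=false the existing index is kept and the WHERE clause above
    selects only rows changed since the last import; with clean=true the
    first condition is true for every row, so everything is re-imported.
    """
    params = {"command": "full-import", "clean": "false" if delta else "true"}
    return f"{base_url}/dataimport?{urlencode(params)}"


print(import_url("http://localhost:8983/solr", delta=True))
```

The same entity query serves both modes, so there is no separate deltaQuery or deltaImportQuery to maintain.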


Re: URGENT HELP: Improving Solr indexing time

2011-06-05 Thread Rohit Gupta
Thanks Fuad,

I have started optimizing my database structure. Since the tables are huge in
terms of records, the optimization is taking time.

I will update the results when complete.

Regards,
Rohit







URGENT HELP: Improving Solr indexing time

2011-06-04 Thread Rohit Gupta
My Solr server takes a very long time to update the index. The table it indexes is
huge, with more than 10 million records, but even so I feel this is a very long
time to index. Below is a snapshot of the /dataimport page:

<str name="status">busy</str>
<str name="importResponse">A command is still running...</str>
<lst name="statusMessages">
  <str name="Time Elapsed">1:53:39.664</str>
  <str name="Total Requests made to DataSource">16276</str>
  <str name="Total Rows Fetched">24237</str>
  <str name="Total Documents Processed">16273</str>
  <str name="Total Documents Skipped">0</str>
  <str name="Full Dump Started">2011-06-04 11:25:26</str>
</lst>

How can I determine why this is happening, and how can I improve it? During all
our tests on the local server before the migration we could index 5 million
records in 4-5 hours, but now it is taking too long on the live server.

Regards,
Rohit
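
To put the numbers in the status snapshot above side by side with the local test, here is a small sketch that only does arithmetic on the values quoted in the post. The helper name `elapsed_seconds` is made up for illustration.

```python
def elapsed_seconds(hms: str) -> float:
    """Convert an 'H:MM:SS.mmm' elapsed-time string to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Values copied from the status snapshot in the post.
elapsed = elapsed_seconds("1:53:39.664")
requests, rows, docs = 16276, 24237, 16273

# Documents per second on the live server so far.
live_rate = docs / elapsed

# Local test: 5 million records in 4-5 hours (slowest rate first).
local_rates = [5_000_000 / (hours * 3600) for hours in (5, 4)]

# How many data source requests were issued per indexed document.
requests_per_doc = requests / docs

print(f"live: {live_rate:.1f} docs/s")
print(f"local: {local_rates[0]:.0f}-{local_rates[1]:.0f} docs/s")
print(f"requests per document: {requests_per_doc:.2f}")
```

The live run works out to roughly 2.4 documents per second against roughly 280-350 per second locally, with about one data source request per document. That ratio is consistent with one SQL query being issued per document, though the snapshot alone does not prove it.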

Re: URGENT HELP: Improving Solr indexing time

2011-06-04 Thread Chris Cowan
How long does the query against the DB take (outside of Solr)? If that's slow,
then it's going to take a while to update the index. You might need to figure out
a way to break things up a bit, maybe by using a delta import instead of a full import.

Chris




Re: URGENT HELP: Improving Solr indexing time

2011-06-04 Thread lee carroll
Rohit - maybe you have double posted. Did Otis's answer not help with
your issue, or does it at least need a response to clarify?





Re: URGENT HELP: Improving Solr indexing time

2011-06-04 Thread Rohit Gupta
No, I didn't double post; maybe it was in my outbox and went out again.

The queries outside Solr don't take so long: to return around 50 rows it
takes 250 seconds, so I am doing a delta import of around 500,000 rows at a
time. I have tried turning autocommit on and things are moving a bit faster
now. Are there any more tweaks I can make?

Also, I am planning to move to a master-slave model, but I am failing to
understand where exactly to start.

Regards,
Rohit








Re: URGENT HELP: Improving Solr indexing time

2011-06-04 Thread Fuad Efendi
Hi Rohit,

I am currently working on https://issues.apache.org/jira/browse/SOLR-2233
which fixes multithreading issues.

How complex is your dataimport schema? SOLR-2233 (multithreading, better
connection handling) improves performance, especially if the SQL is
extremely complex and uses a few long-running CachedSqlEntityProcessors,
etc.

Also, check your SQL and indexes. In most cases you can _significantly_
improve performance by simply adding indexes appropriate for your specific
SQL. I have noticed that even very experienced DBAs sometimes create an index
on (KEY1, KEY2) while the developer executes a query with
WHERE KEY2=? ORDER BY KEY1. Check everything...
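
The KEY1/KEY2 point can be checked by looking at the query plan. Below is a minimal sketch using SQLite, chosen only because it ships with Python; the thread does not say which database is in use, and other planners may behave differently. The table `t`, the index names, and the helper `query_plan` are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, key1 INTEGER, key2 INTEGER)")
conn.executemany(
    "INSERT INTO t (key1, key2) VALUES (?, ?)",
    [(i % 100, i % 7) for i in range(10_000)],
)

sql = "SELECT id, key1, key2 FROM t WHERE key2 = ? ORDER BY key1"

# Index on (KEY1, KEY2): the leading column is not the one being filtered on.
conn.execute("CREATE INDEX idx_key1_key2 ON t (key1, key2)")
plan_before = query_plan(conn, sql, (3,))

# Index on (KEY2, KEY1): matches the filter first, then yields rows ordered by KEY1.
conn.execute("CREATE INDEX idx_key2_key1 ON t (key2, key1)")
plan_after = query_plan(conn, sql, (3,))

print(plan_before)
print(plan_after)
```

With only the first index the plan should be a full scan, while the second index should turn it into an index search on key2 with no separate sort step. The exact wording of the plan lines varies between SQLite versions.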

Thanks,


-- 
Fuad Efendi
416-993-2060
Tokenizer Inc., Canada
Data Mining, Search Engines
http://www.tokenizer.ca/






