On 2/7/2018 11:40 PM, Srinivas Kashyap wrote:
We have configured a Solr index server on Tomcat and fetch data from a database 
to index it. We have implemented delta-query indexing based on modify_ts.

What version of Solr? Just as an FYI: since version 5.0, running in user-provided containers (like Tomcat) is not a supported configuration.

https://wiki.apache.org/solr/WhyNoWar

In our data-config.xml we have a parent entity and 17 child entities. We have 18 
such Solr cores. When we call delta-import on a core, it executes 18 SQL queries 
against the database.

Each time, delta-import opens a new session on the database. Although each 
login and logout takes only a split second, we are seeing millions of logins 
and logouts at the database.

As per our DBA, logins and logouts are costly operations in terms of server 
resources.

Is there a way to reduce the number of logins and logouts and have a 
persistent DB connection from Solr?

Directly, with a JDBC driver configured in the dataimport handler? Probably not. But it looks like there may be a workaround -- setting up a JNDI datasource in your servlet container, and letting that handle the connection pooling for you.

http://lucene.472066.n3.nabble.com/how-to-configure-mysql-pool-connection-on-Solr-Server-tp4038974p4039040.html

It is likely that your container can set up connection pooling with most JDBC drivers, not just MySQL.
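
As a rough sketch of that workaround, assuming Tomcat as the container: a pooled DataSource is declared as a JNDI resource, and the dataimport handler's JdbcDataSource is pointed at it through its jndiName attribute instead of driver/url/user/password. The resource name jdbc/solrdb, the driver class, the URL, and the credentials below are placeholders, not values from this thread. Pool sizing attributes differ between Tomcat versions, so check the JNDI datasource documentation for the version in use.

```xml
<!-- Tomcat context for the Solr webapp. Every value here is a placeholder
     (assumption): substitute your own driver class, URL, and credentials.
     The JDBC driver jar must be visible to Tomcat itself, not only to Solr. -->
<Context>
  <Resource name="jdbc/solrdb"
            auth="Container"
            type="javax.sql.DataSource"
            driverClassName="your.jdbc.DriverClass"
            url="jdbc:yourdb://dbhost:1234/yourdb"
            username="solr_user"
            password="change_me"
            maxIdle="5"/>
</Context>

<!-- data-config.xml: look the pooled DataSource up by JNDI name rather than
     opening a direct JDBC connection for each import. -->
<dataConfig>
  <dataSource type="JdbcDataSource" jndiName="java:comp/env/jdbc/solrdb"/>
  <!-- existing <document> and <entity> definitions stay unchanged -->
</dataConfig>
```

With a setup along these lines, the container keeps the physical connections open and hands them out on request, so the database should see far fewer real logins.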

The dataimport handler is a useful module, but it has limitations. If you write your own indexing program that is fully aware of your source data, you're likely to get better results.

Something else to consider -- sometimes by clever use of SQL JOINs, you can put the information gathering done by child entities into the main query of the parent entity. If you can do that and eliminate all your child entities, then Solr will make exactly ONE query to your database for any import operation, and you won't need to worry about reusing open connections.
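
To illustrate the JOIN idea outside of Solr, here is a minimal Python sketch using an in-memory SQLite database. The tables parent and child, their columns, and the helper names are hypothetical, made up for this example. It runs one joined query, then folds the child rows back into one document per parent, which is roughly what a single-entity import (or a custom indexing program) would have to do.

```python
import sqlite3
from itertools import groupby


def make_demo_db():
    """Create a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE child (parent_id INTEGER, tag TEXT);
        INSERT INTO parent VALUES (1, 'alpha'), (2, 'beta');
        INSERT INTO child VALUES (1, 'x'), (1, 'y');
        """
    )
    return conn


def build_docs(conn):
    """Run ONE joined query and fold the child rows into their parent document."""
    rows = conn.execute(
        """
        SELECT p.id, p.name, c.tag
        FROM parent p
        LEFT JOIN child c ON c.parent_id = p.id
        ORDER BY p.id, c.tag
        """
    ).fetchall()

    docs = []
    # Rows arrive sorted by parent id, so consecutive rows belong to one parent.
    for (pid, name), group in groupby(rows, key=lambda r: (r[0], r[1])):
        # A parent without children yields a single row whose tag is NULL.
        tags = [r[2] for r in group if r[2] is not None]
        docs.append({"id": pid, "name": name, "tags": tags})
    return docs


if __name__ == "__main__":
    for doc in build_docs(make_demo_db()):
        print(doc)
```

The LEFT JOIN keeps parents that have no child rows, and the ORDER BY is what makes the grouping step safe. With 17 child entities the joined result can get wide or produce many rows per parent, so whether this beats separate queries depends on the data.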

Thanks,
Shawn
