Try readOnly="true" in the dataSource configuration.  This causes several 
defaults to get set in the JDBC connection, and often will solve problems like 
this. (see 
http://wiki.apache.org/solr/DataImportHandler#Configuring_JdbcDataSource)  
Also, try a batch size of 0 to let your jdbc driver pick what it thinks is 
optimal.  This might be better than 10000.

There is also an issue in that it doesn't explicitly close the resultset but 
relies on closing the connection to implicily close the child objects.  I know 
when I tried using DIH with Derby a while back this had at the least caused 
some log warnings, and it wouldn't work at all without readOnly=false.  Not 
sure abour PostgreSql.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Jasper Floor [mailto:jasper.fl...@m4n.nl] 
Sent: Thursday, June 14, 2012 8:21 AM
To: solr-user@lucene.apache.org
Subject: DIH idle in transaction forever

Hi all,

It seems that DIH always holds two connections open to the database.
One of them is almost always 'idle in transaction'. It may sometimes
seem to do a little work but then it goes idle again.


datasource definition:
        <dataSource name="df-stream-store-ds"
jndiName="java:ext_solr_datafeeds_dba" type="JdbcDataSource"
autoCommit="false" batchSize="10000" />

We have a datasource defined in the jndi:
        <no-tx-datasource>
                <jndi-name>ext_solr_datafeeds_dba</jndi-name>
                <security-domain>ext_solr_datafeeds_dba_realm</security-domain>
                
<connection-url>jdbc:postgresql://db1.live.mbuyu.nl/datafeeds</connection-url>
                <min-pool-size>0</min-pool-size>
                <max-pool-size>5</max-pool-size>
                
<transaction-isolation>TRANSACTION_READ_COMMITTED</transaction-isolation>
                <driver-class>org.postgresql.Driver</driver-class>
                <blocking-timeout-millis>30000</blocking-timeout-millis>
                <idle-timeout-minutes>5</idle-timeout-minutes>
                <new-connection-sql>SELECT 1</new-connection-sql>
                <check-valid-connection-sql>SELECT 
1</check-valid-connection-sql>
        </no-tx-datasource>


If we set autocommit to true then we get an OOM on indexing so that is
not an option.

Does anyone have any idea why this happens? I would guess that DIH
doesn't close the connection, but reading the code I can't be sure of
this. The ResultSet object should close itself once it reaches the
end.

mvg,
JAsper

Reply via email to