[jira] Commented: (SOLR-812) JDBC optimizations: setReadOnly, setMaxRows

Glen Newton (JIRA) Tue, 18 Nov 2008 10:14:46 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648685#action_12648685
 ]


Glen Newton commented on SOLR-812:
----------------------------------

This is a related issue, but since I just got involved with Solr yesterday and 
got a jira account today, I am reluctant to make a career-limiting error!  :-)

If it is indeed valid, perhaps someone else can make it a full-fledged separate 
issue!  

Perusing: JdbcDataSource  @version $Id: JdbcDataSource.java 696539 2008-09-18 
02:16:26Z ryan
Issue: MySQL fetchSize driver bug

Both in my experience and according to:
http://benjchristensen.wordpress.com/2008/05/27/mysql-jdbc-memory-usage-on-large-resultset/

MySQL does not properly handle any fetchSize other than Integer.MIN_VALUE: 
the entire ResultSet is transferred and loaded into memory, which for large 
ResultSets can result in an out-of-memory error.

In JdbcDataSource.java:
 175:  stmt.setFetchSize(batchSize);

where 
 57:  private int batchSize = FETCH_SIZE;

and 
 326:    private static final int FETCH_SIZE = 500;

As is, this code will trigger this bug for MySQL with large ResultSets. 
Even for smaller ResultSets that do not cause an out-of-memory error, holding 
the entire ResultSet in memory uses memory unnecessarily.

The workaround for this MySQL issue is:
  stmt.setFetchSize(Integer.MIN_VALUE);
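
A minimal sketch of how the workaround above might be applied only when the 
driver is MySQL, so that other databases still get the normal batchSize hint. 
The class FetchSizeSketch and both of its methods are hypothetical and not 
part of JdbcDataSource; detecting MySQL by the "jdbc:mysql:" URL prefix is an 
assumption. As far as I know, Connector/J also expects the statement to be 
forward-only and read-only before it will stream rows.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Solr's JdbcDataSource.
final class FetchSizeSketch {
    private FetchSizeSketch() {}

    // Picks the fetch size to request from the driver.
    // Assumption: a "jdbc:mysql:" URL prefix identifies MySQL Connector/J.
    static int effectiveFetchSize(String jdbcUrl, int batchSize) {
        boolean isMySql = jdbcUrl != null
                && jdbcUrl.toLowerCase(Locale.ROOT).startsWith("jdbc:mysql:");
        // Integer.MIN_VALUE asks Connector/J to stream rows one at a time
        // instead of buffering the whole ResultSet in memory.
        return isMySql ? Integer.MIN_VALUE : batchSize;
    }

    // Creates a forward-only, read-only statement with the chosen fetch size.
    static Statement createStatement(Connection conn, String jdbcUrl, int batchSize)
            throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(effectiveFetchSize(jdbcUrl, batchSize));
        return stmt;
    }
}
```

With the current default, effectiveFetchSize(url, 500) would return 
Integer.MIN_VALUE for a MySQL URL and 500 for anything else.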

From the blog entry, see also:
* http://javaquirks.blogspot.com/2007/12/mysql-streaming-result-set.html
* http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-implementation-notes.html



> JDBC optimizations: setReadOnly, setMaxRows
> -------------------------------------------
>
>                 Key: SOLR-812
>                 URL: https://issues.apache.org/jira/browse/SOLR-812
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.3
>            Reporter: David Smiley
>
> I'm looking at the DataImport code as of Solr v1.3 and using it with Postgres 
> and very large data sets, and I have some improvement suggestions.
> 1. call setReadOnly(true) on the connection.  DIH doesn't change the data so 
> this is obvious.
> 2. call setAutoCommit(false) on the connection.   (this is needed by Postgres 
> to ensure that the fetchSize hint actually works)
> 3. call setMaxRows(X) on the statement which is to be used when the 
> dataimport.jsp debugger is only grabbing X rows.  fetchSize is just a hint 
> and alone it isn't sufficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
