I think there is a bug in the 1.4 daily builds of the DataImportHandler that causes the batchSize parameter to be ignored. It was probably introduced by the more recent patches for resolving variables.

The affected code is in JdbcDataSource.java:

    String bsz = initProps.getProperty("batchSize");
    if (bsz != null) {
      bsz = (String) context.getVariableResolver().resolve(bsz);
      try {
        batchSize = Integer.parseInt(bsz);
        if (batchSize == -1)
          batchSize = Integer.MIN_VALUE;
      } catch (NumberFormatException e) {
        LOG.warn("Invalid batch size: " + bsz);
      }
    }


The call to context.getVariableResolver().resolve(bsz) is returning null, so Integer.parseInt throws a NumberFormatException and batchSize is never set to Integer.MIN_VALUE. MySQL won't use streaming result sets in that case, which can lead to the OOM we're seeing.
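
To make the failure path concrete, here is a minimal standalone sketch of the same parsing logic. It is not the actual Solr code: the class name, the Function-based resolver, and the default of 500 are assumptions made for illustration. It shows that a resolver returning null leaves the default in place, and that one possible guard (falling back to the raw attribute value) would restore the Integer.MIN_VALUE mapping.

```java
import java.util.function.Function;

public class BatchSizeDemo {
    // Assumed default, standing in for whatever value is used when batchSize is not applied.
    static final int DEFAULT_BATCH_SIZE = 500;

    // Same steps as the quoted snippet: parse the string, map -1 to Integer.MIN_VALUE,
    // and keep the default if parsing fails (Integer.parseInt(null) throws NumberFormatException).
    static int parse(String bsz) {
        int batchSize = DEFAULT_BATCH_SIZE;
        try {
            batchSize = Integer.parseInt(bsz);
            if (batchSize == -1) {
                batchSize = Integer.MIN_VALUE;
            }
        } catch (NumberFormatException e) {
            System.out.println("Invalid batch size: " + bsz);
        }
        return batchSize;
    }

    // Behaviour as quoted: the resolved value is parsed even when it is null.
    static int parseAsQuoted(String raw, Function<String, String> resolver) {
        return raw == null ? DEFAULT_BATCH_SIZE : parse(resolver.apply(raw));
    }

    // Hypothetical guard: if the resolver returns null, fall back to the raw value.
    static int parseWithFallback(String raw, Function<String, String> resolver) {
        if (raw == null) {
            return DEFAULT_BATCH_SIZE;
        }
        String resolved = resolver.apply(raw);
        return parse(resolved != null ? resolved : raw);
    }

    public static void main(String[] args) {
        // Stand-in for a variable resolver that fails to resolve a plain literal.
        Function<String, String> nullResolver = s -> null;
        System.out.println(parseAsQuoted("-1", nullResolver));
        System.out.println(parseWithFallback("-1", nullResolver));
    }
}
```

With the null-returning resolver, the first call logs the warning and keeps the default, while the second returns Integer.MIN_VALUE. Whether a fallback like this is the right fix for DIH is a separate question; the sketch only isolates the behaviour.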


If your log file contains this entry, as mine does, you're affected by this bug too.

Apr 15, 2009 1:21:58 PM org.apache.solr.handler.dataimport.JdbcDataSource init
WARNING: Invalid batch size: null



-Bryan




On Apr 13, 2009, at 11:48 PM, Noble Paul നോബിള്‍ नोब्ळ् wrote:

DIH streams 1 row at a time.

DIH is just a component in Solr. Solr indexing also takes a lot of memory.

On Tue, Apr 14, 2009 at 12:02 PM, Mani Kumar <manikumarchau...@gmail.com> wrote:
Yes, it's throwing the same OOM error, and from the same place...
Yes, I will try increasing the size... just curious: how does this dataimport
work?

Does it load the whole table into memory?

Is there any estimate of how much memory it needs to create an index for 1GB
of data?

thx
mani

On Tue, Apr 14, 2009 at 11:48 AM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:

On Tue, Apr 14, 2009 at 11:36 AM, Mani Kumar <manikumarchau...@gmail.com> wrote:

Hi Shalin:
Yes, I tried with the batchSize="-1" parameter as well.

Here is the config I tried with:

<dataConfig>

   <dataSource type="JdbcDataSource" batchSize="-1" name="sp"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/mydb_development"
user="root" password="******" />


I hope I have used the batchSize parameter in the right place.


Yes that is correct. Did it still throw OOM from the same place?

I'd suggest you increase the heap and see what works for you. Also try
-server on the JVM.

--
Regards,
Shalin Shekhar Mangar.





--
--Noble Paul
