Hello,

I am using Solr 1.4 (solr-2008-11-19) with Lucene 2.4 dropped in instead of 2.9.
I am indexing 500k records using the JDBC Data Import Request Handler.

Config:
  Linux openSUSE 10.2 (X86-64)
  Dual core dual core 64bit Xeon 3GHz Dell blade
  8GB RAM
  java version "1.6.0_07"
  Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
  Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)
  1GB heap for Tomcat
  DB: MySQL on separate but similar server

I am finding that when I do a full-import followed by another full-import, the import takes much longer the second and subsequent times:

  Run1 = 0:27:31.491
  Run2 = 1:14:44.821
  Run3 = 1:14:48.316
  Run4 = 2:15:12.296
  Run5 = 1:37:6.847

(I have run this ~10 times and got roughly the same results.) I have also monitored the load on the Solr machine and the database machine for any other activity that might have an impact. The final Lucene index size is 923MB.

The default is clean = 'true', so the index is cleared (emptied) each time. I am therefore concerned that subsequent runs take roughly 3 to 5 times as long as the first run.

Am I doing something wrong here? Any help would be appreciated.
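To put numbers on the slowdown, here is a small Python sketch that converts the run times above to seconds and expresses each as a multiple of Run1. It is only illustrative arithmetic and not part of the Solr setup; the helper names (to_seconds, slowdowns) are made up for this example, and Run2 is read as 1:14:44.821 on the assumption that the last separator is a decimal point.

```python
# Run times as reported for five consecutive full-imports (H:MM:SS.mmm).
# Assumption: Run2's final separator is a decimal point, i.e. 44.821 s.
RUNS = {
    "Run1": "0:27:31.491",
    "Run2": "1:14:44.821",
    "Run3": "1:14:48.316",
    "Run4": "2:15:12.296",
    "Run5": "1:37:6.847",
}


def to_seconds(hms: str) -> float:
    """Convert an 'H:MM:SS.mmm' string to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def slowdowns(runs: dict) -> dict:
    """Return each run's duration as a multiple of Run1's duration."""
    base = to_seconds(runs["Run1"])
    return {name: to_seconds(t) / base for name, t in runs.items()}


if __name__ == "__main__":
    for name, ratio in slowdowns(RUNS).items():
        print(f"{name}: {ratio:.2f}x Run1")
```

With these figures, Run2 and Run3 come out at about 2.7 times Run1, Run5 at about 3.5 times, and Run4 at about 4.9 times.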
I have appended my data-config.xml.

thanks,
Glen

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://blue01/dartejos"
              user="USER"
              password="PASSWD"/>
  <document name="products">
    <entity name="item"
            query="select Publisher.name as pub, Journal.title as jo,
                   Article.rawUrl as textpath, Journal.issn,
                   Volume.number as vol, Volume.coverYear as year,
                   Issue.number as iss, Article.id, Article.title as ti,
                   Article.abstract, Article.startPage as startPage,
                   Article.endPage as endPage
                   from Publisher, Journal, Volume, Issue, Article
                   where Publisher.id = Journal.publisherId
                   and Journal.id = Volume.journalId
                   and Volume.id = Issue.volumeId
                   and Issue.id = Article.issueId
                   limit 500000">
      <field column="id" name="id" />
      <field column="jo" name="id" />
      <field column="issn" name="id" />
      <field column="vol" name="id" />
      <field column="year" name="id" />
      <field column="iss" name="id" />
      <field name="abstract" column="abstract"/>
      <field name="title" column="title"/>
      <field name="pub" column="pub"/>
      <field name="textpath" column="textpath"/>
      <field name="startPage" column="startPage"/>
      <field name="endPage" column="endPage"/>
    </entity>
  </document>
</dataConfig>