Hello,

We've just started using Solr to provide search functionality for our application, with the DataImportHandler performing a delta-import every 1 fired by crontab. This works great; however, it does occasionally miss records that are added to the database while the delta-import is running.
Our data-config.xml has the following queries in its root entity:

    query="SELECT id, date_published, date_created, publish_flag FROM Item WHERE id > 0 AND record_type_id=0 ORDER BY id DESC"
    preImportDeleteQuery="SELECT item_id AS Id FROM gnpd_production.item_deletions"
    deletedPkQuery="SELECT item_id AS id FROM gnpd_production.item_deletions WHERE deletion_date >= SUBDATE('${dataimporter.last_index_time}', INTERVAL 5 MINUTE)"
    deltaImportQuery="SELECT id, date_published, date_created, publish_flag FROM Item WHERE id > 0 AND record_type_id=0 AND id=${dataimporter.delta.id} ORDER BY id DESC"
    deltaQuery="SELECT id, date_published, date_created, publish_flag FROM Item WHERE id > 0 AND record_type_id=0 AND sys_time_stamp >= SUBDATE('${dataimporter.last_index_time}', INTERVAL 1 MINUTE) ORDER BY id DESC">

I think the problem I'm having comes from the way Solr stores the last_index_time in conf/dataimport.properties. The wiki states:

    "When delta-import command is executed, it reads the start time stored in conf/dataimport.properties. It uses that timestamp to run delta queries and after completion, updates the timestamp in conf/dataimport.properties."

To me this seems to indicate that any records with a timestamp between when the data import starts and when it ends will be missed, as the last_index_time is set to when the import completes. This doesn't seem quite right to me. I would have expected the last_index_time to refer to when the data import was last STARTED, so that there were no gaps in the timestamps covered.

I changed the deltaQuery of our config to include the SUBDATE by INTERVAL 1 MINUTE statement to alleviate this problem, but that only covers cases where the delta-import takes less than a minute.

Any ideas as to how this can be overcome, other than increasing the INTERVAL to something larger?
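To make the gap I suspect more concrete, here is a minimal Python sketch. It is not Solr code, and it rests on my reading of the wiki, which may be wrong: I assume a running delta-import only sees rows stamped up to the moment it started, and that last_index_time is then set to the moment the import finished. The function name and all timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

def missed_records(record_times, import_start, import_end, lookback):
    """Return the timestamps that the NEXT delta-import would skip.

    Assumptions (hypothetical, based on my reading of the wiki):
      - the running import only picked up rows stamped <= import_start
      - last_index_time is then written as import_end
      - the next deltaQuery selects sys_time_stamp >= last_index_time - lookback
    """
    next_cutoff = import_end - lookback
    return [t for t in record_times if import_start < t < next_cutoff]

# Made-up example: rows inserted while an import is running.
start = datetime(2011, 1, 20, 12, 0, 0)
rows = [
    datetime(2011, 1, 20, 12, 0, 30),
    datetime(2011, 1, 20, 12, 1, 15),
    datetime(2011, 1, 20, 12, 2, 0),
]
one_minute = timedelta(minutes=1)

# Import takes 2m30s: the 1 minute look-back does not reach back far enough.
slow = missed_records(rows, start, start + timedelta(seconds=150), one_minute)

# Import takes 40s: the 1 minute look-back covers the whole run.
fast = missed_records(rows, start, start + timedelta(seconds=40), one_minute)

print(len(slow), len(fast))
```

Under those assumptions the slow run loses the first two rows and the fast run loses none, which matches what we see: the 1 minute look-back only helps while the import itself takes less than a minute.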
Regards
Barry Tucker
--
View this message in context: http://lucene.472066.n3.nabble.com/Delta-Import-occasionally-missing-records-tp2300877p2300877.html
Sent from the Solr - User mailing list archive at Nabble.com.