Thank you for your response.
In what way is 'timestamp' not perfect?
I've looked into the SolrEntityProcessor and added a timestamp field to our
index.
However, I'm struggling to work out a query to get the max value of the
timestamp field.
Also, does the SolrEntityProcessor entity appear before the root entity, or
does it wrap around the root entity?
On 22 January 2011 07:24, Lance Norskog-2 [via Lucene]
ml-node+2307215-627680969-326...@n3.nabble.com
wrote:
The timestamp thing is not perfect. You can instead do a search
against Solr and find the latest timestamp in the index. SOLR-1499
allows you to search against Solr in the DataImportHandler.
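
As a rough sketch of such a search: sort descending on the timestamp field and ask for a single row. The base URL `http://localhost:8983/solr` and the field name `timestamp` are assumptions for illustration, and the helper names are hypothetical.

```python
from urllib.parse import urlencode


def latest_timestamp_url(base_url, field="timestamp"):
    """Build a select URL that returns only the newest value of `field`."""
    params = {
        "q": "*:*",               # match every document
        "sort": f"{field} desc",  # newest first
        "rows": 1,                # only the top document is needed
        "fl": field,              # return just the timestamp field
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


def extract_latest(response, field="timestamp"):
    """Pull the value out of a parsed JSON response, or None if the index is empty."""
    docs = response["response"]["docs"]
    return docs[0][field] if docs else None


# Assumed local Solr instance; adjust to the real host and core.
url = latest_timestamp_url("http://localhost:8983/solr")
print(url)
```

The value returned this way reflects what is actually in the index, rather than when an import happened to run.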
On Fri, Jan 21, 2011 at 2:27 AM, btucker [hidden email] wrote:
Hello
We've just started using Solr to provide search functionality for our
application, with the DataImportHandler performing a delta-import every 1
fired by crontab. This works great; however, it does occasionally miss
records that are added to the database while the delta-import is running.
Our data-config.xml has the following queries in its root entity:
query=SELECT id, date_published, date_created, publish_flag FROM Item
  WHERE id > 0
  AND record_type_id=0
  ORDER BY id DESC
preImportDeleteQuery=SELECT item_id AS Id FROM gnpd_production.item_deletions
deletedPkQuery=SELECT item_id AS id FROM gnpd_production.item_deletions
  WHERE deletion_date >= SUBDATE('${dataimporter.last_index_time}', INTERVAL 5 MINUTE)
deltaImportQuery=SELECT id, date_published, date_created, publish_flag FROM Item
  WHERE id > 0
  AND record_type_id=0
  AND id=${dataimporter.delta.id}
  ORDER BY id DESC
deltaQuery=SELECT id, date_published, date_created, publish_flag FROM Item
  WHERE id > 0
  AND record_type_id=0
  AND sys_time_stamp >= SUBDATE('${dataimporter.last_index_time}', INTERVAL 1 MINUTE)
  ORDER BY id DESC
I think the problem I'm having comes from the way Solr stores the
last_index_time in conf/dataimport.properties. The wiki states:
"When delta-import command is executed, it reads the start time stored in
conf/dataimport.properties. It uses that timestamp to run delta queries and
after completion, updates the timestamp in conf/dataimport.properties."
To me, this seems to indicate that any records with a timestamp between
when the dataimport starts and ends will be missed, as the last_index_time
is set to when it completes the import.
This doesn't seem quite right to me. I would have expected the
last_index_time to refer to when the dataimport was last STARTED, so that
there were no gaps in the time range covered.
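
The gap can be illustrated with a small simulation. This is only a sketch of the behaviour as I read it from the wiki, not of Solr's actual code; the record ids, times, and the `delta_ids` helper are made up for the example.

```python
from datetime import datetime, timedelta


def delta_ids(records, last_index_time):
    """Ids of records stamped at or after last_index_time (the deltaQuery filter)."""
    return {rid for rid, stamp in records.items() if stamp >= last_index_time}


t0 = datetime(2011, 1, 21, 10, 0, 0)
run1_start, run1_end = t0, t0 + timedelta(seconds=90)

# Record 1 exists before run 1; record 2 is inserted while run 1 is executing.
records = {1: t0 - timedelta(minutes=5), 2: t0 + timedelta(seconds=30)}

# Run 1 only sees rows that existed when its query was issued at run1_start.
imported = {rid for rid, stamp in records.items() if stamp <= run1_start}

# Run 2 under each bookkeeping strategy.
picked_if_end_stored = delta_ids(records, run1_end)
picked_if_start_stored = delta_ids(records, run1_start)

missed_if_end_stored = set(records) - imported - picked_if_end_stored
missed_if_start_stored = set(records) - imported - picked_if_start_stored
print(missed_if_end_stored, missed_if_start_stored)  # prints: {2} set()
```

With the completion time stored, record 2 falls between the two runs and is never picked up; with the start time stored, the next run catches it.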
I changed the deltaQuery of our config to include the SUBDATE by INTERVAL
1 MINUTE statement to alleviate this problem, but it only covers cases
where the delta-import takes less than a minute.
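
To show why the one-minute overlap is only a partial fix, here is a minimal sketch, again assuming the completion time is what gets stored. The times and the `is_covered` helper are hypothetical.

```python
from datetime import datetime, timedelta


def is_covered(record_time, stored_time, overlap):
    """True if a deltaQuery filtering on `stored_time - overlap` picks up the record."""
    return record_time >= stored_time - overlap


# Hypothetical completion time written to conf/dataimport.properties.
run_end = datetime(2011, 1, 21, 10, 5, 0)
# The INTERVAL 1 MINUTE used in the deltaQuery.
overlap = timedelta(minutes=1)

# Inserted 30 seconds before the run finished: inside the overlap window.
late_insert = run_end - timedelta(seconds=30)
# Inserted 3 minutes before the run finished, during a run longer than a minute.
early_insert = run_end - timedelta(minutes=3)

print(is_covered(late_insert, run_end, overlap))
print(is_covered(early_insert, run_end, overlap))
```

The first record is recovered by the next run, while the second is still lost, so the overlap has to be at least as long as the slowest import.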
Any ideas as to how this can be overcome, other than increasing the
INTERVAL to something larger?
Regards
Barry Tucker
--
View this message in context:
http://lucene.472066.n3.nabble.com/Delta-Import-occasionally-missing-records-tp2300877p2300877.html
Sent from the Solr - User mailing list archive at Nabble.com.
--
Lance Norskog