Comments inline.

On Thu, Jul 17, 2008 at 5:00 AM, wojtekpia <[EMAIL PROTECTED]> wrote:
>
> I have two questions:
>
> 1. I am pulling data from 2 data sources using the DIH. I am using the
> deltaQuery functionality. Since the data sources pull data sequentially, I
> find that some data is getting unnecessarily re-indexed from my second data
> source. Hopefully this helps illustrate my problem:
>
> Assume last_index_time is 0.
> At time = 1, pull data from data source 1 with a query that includes
> "last_modified > '${dataimporter.last_index_time}'". Note that this pulls
> data for the time interval [0,1]. This step takes 1 time interval.
> At time = 2, data source 2 is polled with the same query. This step takes 1
> time interval. Note that this pulls data for the time interval [0,2].
> At t=3, last_index_time is set to 1
>
> Next time I run the DIH, I will be unnecessarily re-indexing data that
> appeared in data source 2 in the interval [1,2].
>
> Ideally, I'd like to have access to something like
> ${dataimporter.current_index_time}, so I could restrict my delta query to:
> "last_modified > '${dataimporter.last_index_time}' AND last_modified <
> '${dataimporter.current_index_time}'"
>
> Is this available?

It is not available, but it can be added easily. I shall include it in the next patch. If you want it earlier, it should only need a small modification in DocBuilder.java.

>
>
> 2. I have a transient table that I query with the DIH to load my index.
> After loading values into the index, I want to delete them from the
> transient table. Is there a way to do this from the DIH?
> I tried stuffing a
> delete statement into the deltaQuery attribute, but that didn't work:
>
> <dataConfig>
>   <dataSource driver="org.hsqldb.jdbcDriver"
>       url="jdbc:hsqldb:/temp/example/ex" user="sa" />
>   <document name="products">
>     <entity name="item" pk="ID" query="select * from item"
>         deltaQuery="select id from item where last_modified >
> '${dataimporter.last_index_time}'; delete from item where last_modified <
> '${dataimporter.last_index_time}'">
>     </entity>
>   </document>
> </dataConfig>

There is no straightforward way to achieve this. However, the last component to get a callback when indexing is finished is DataSource#close(). If you are adventurous enough, you can extend JdbcDataSource, override close() to invoke a delete query, and use that class as your DataSource.

>
> --
> View this message in context:
> http://www.nabble.com/DataImportHandler-current_index_time---post-completion-action-tp18498832p18498832.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
--Noble Paul
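
For illustration only, a minimal sketch of how the data-config might look with the close() approach described above. The class name com.example.CleanupJdbcDataSource is hypothetical: it stands for a subclass of JdbcDataSource whose close() runs the delete and then calls super.close(). The sketch also assumes that the type attribute of <dataSource> accepts a fully qualified class name. The delete statement is removed from deltaQuery, which now only selects the changed ids.

```xml
<dataConfig>
  <!-- type: hypothetical subclass of JdbcDataSource (an assumption, not a
       class shipped with Solr); its close() would issue the delete on the
       transient table once indexing has finished -->
  <dataSource type="com.example.CleanupJdbcDataSource"
      driver="org.hsqldb.jdbcDriver"
      url="jdbc:hsqldb:/temp/example/ex" user="sa" />
  <document name="products">
    <!-- deltaQuery contains a single select; cleanup is left to the data source -->
    <entity name="item" pk="ID" query="select * from item"
        deltaQuery="select id from item where last_modified > '${dataimporter.last_index_time}'">
    </entity>
  </document>
</dataConfig>
```

Keeping the cleanup in the data source rather than in deltaQuery means the query attribute stays a plain select, which is what the original attempt with two statements in one attribute was missing.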