Hello Paul,

thank you for your feedback. I will ask for an expiration date to be added to the DB and then run a process that updates the index accordingly.

Cheers,
Giovanni

On 3/18/09, Noble Paul നോബിള് नोब्ळ् <noble.p...@gmail.com> wrote:
>
> it is not possible to query details from Solr and find out deleted
> items using DIH
>
> you must maintain a deleted rows ids in the db or just flag them as
> deleted.
>
> --Noble
>
> On Wed, Mar 18, 2009 at 2:46 PM, Giovanni De Stefano
> <giovanni.destef...@gmail.com> wrote:
> > Hello Paul,
> >
> > thank you for your reply.
> >
> > The UPDATE in fact works fine: I only had to update the CREATION_TIME on
> > the DB :-)
> >
> > Regarding the deletedPkQuery, I understand it has to return the primary
> > keys that should be removed from the index (because they have been
> > removed from the DB) but I don't have any "deleted" flag on the DB.
> >
> > Basically the deletedPkQuery should be something like "select URI
> > *from_the_current_index* where URI is not in (select URI from TEST)"
> >
> > That is returning a subset of primary keys currently in the index and
> > that are not in the DB anymore. Is this possible?
> >
> > I am no DB expert...so ANY tip is very welcome!
> >
> > Thanks,
> > Giovanni
> >
> > On 3/18/09, Noble Paul നോബിള് नोब्ळ् <noble.p...@gmail.com> wrote:
> >>
> >> are you sure your schema.xml has a <uniqueKey> field to UPDATE docs.
> >>
> >> to remove deleted docs you must have deletedPkQuery attribute in the
> >> root entity
> >>
> >> On Tue, Mar 17, 2009 at 8:48 PM, Giovanni De Stefano
> >> <giovanni.destef...@gmail.com> wrote:
> >> > Hello all,
> >> >
> >> > I have a table TEST in an Oracle DB with the following columns: URI
> >> > (varchar), CONTENT (varchar), CREATION_TIME (date).
> >> >
> >> > The primary key both in the DB and Solr is URI.
> >> >
> >> > Here is my data-config.xml:
> >> >
> >> > <dataConfig>
> >> >   <dataSource
> >> >     driver="oracle.jdbc.driver.OracleDriver"
> >> >     url="jdbc:oracle:thin:@localhost:1521/XE"
> >> >     user="username"
> >> >     password="password"
> >> >   />
> >> >   <document name="Test">
> >> >     <entity
> >> >       name="test_item"
> >> >       pk="URI"
> >> >       query="select URI,CONTENT from TEST"
> >> >       deltaQuery="select URI,CONTENT from TEST where
> >> >         TO_CHAR(CREATION_TIME,'YYYY-MM-DD HH:MI:SS') >
> >> >         '${dataimporter.last_index_time}'"
> >> >     >
> >> >       <field column="URI" name="uri"/>
> >> >       <field column="CONTENT" name="content"/>
> >> >     </entity>
> >> >   </document>
> >> > </dataConfig>
> >> >
> >> > The problem is that anytime I perform a delta-import, the index keeps
> >> > being populated as if new documents were added. In other words, I am
> >> > not able to UPDATE an existing document or REMOVE a document that is
> >> > not anymore in the DB.
> >> >
> >> > What am I missing? How should I specify my deltaQuery?
> >> >
> >> > Thanks a lot in advance!
> >> >
> >> > Giovanni
> >> >
> >>
> >> --
> >> --Noble Paul
> >>
> >
> --
> --Noble Paul
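
A note on the set difference asked about in the thread (keys that are still in the index but no longer in TEST): since DIH cannot compute it, an external process could. Below is a minimal sketch in Python, assuming both key lists have already been fetched by some other means (the fetching is not shown). The function names `stale_ids` and `delete_message` and the sample keys are hypothetical. The generated string follows Solr's XML delete-by-id update message; posting it to the update handler and committing is left out.

```python
from xml.sax.saxutils import escape


def stale_ids(index_ids, db_ids):
    """Return the keys present in the index but no longer in the DB, sorted."""
    return sorted(set(index_ids) - set(db_ids))


def delete_message(ids):
    """Build a Solr XML delete-by-id message for the given keys."""
    # escape() protects characters such as & and < that may occur in a URI.
    body = "".join(f"<id>{escape(str(i))}</id>" for i in ids)
    return f"<delete>{body}</delete>"


# Hypothetical sample data standing in for the two key lists.
index_ids = ["uri:a", "uri:b", "uri:c"]
db_ids = ["uri:a", "uri:c"]

message = delete_message(stale_ids(index_ids, db_ids))
print(message)  # <delete><id>uri:b</id></delete>
```

If `stale_ids` returns an empty list there is nothing to delete, so a real process would probably skip the request in that case. For large tables, holding both key sets in memory may not be practical, which is one reason a "deleted" flag or an expiration date in the DB is likely the simpler route.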