Thanks, Raheel. That's the approach I took. I modified the deltaQuery like this:
deltaQuery="SELECT l.list_id AS id
            FROM lists l
            LEFT JOIN agg_list_view_stats agglvs ON agglvs.list_id = l.list_id
            WHERE l.status = 'ACTIVE' AND l.is_public = 1
              AND ( (('${dih.request.entity1}' = 'true') AND (l.modified_on > '${dih.last_index_time}'))
                 OR (('${dih.request.entity2}' = 'true') AND (agglvs.overall_view_modified_date > DATE_SUB(NOW(), INTERVAL 1 HOUR))) )"

Then I pass entity1=true for what was previously my first entity and entity2=true for what was previously my second entity.


On Tue, Jun 4, 2013 at 9:21 AM, Raheel Hasan <raheelhasan....@gmail.com> wrote:

> maybe this will help you:
> http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
>
>
> On Tue, Jun 4, 2013 at 8:38 PM, Arun Rangarajan <arunrangara...@gmail.com> wrote:
>
> > Shawn,
> >
> > Thanks for your reply. My data-config.xml actually has two entities. I sent
> > only the first entity in my previous email. Since I had not run any imports
> > on the 2nd entity, dataimport.properties did not have an entry for it yet.
> > This worked fine in 3.6.2, so looks like a bug in 4.2.1.
> >
> > For now, I am thinking that I can skip using the dih properties entirely.
> > For the first entity, I can look for documents that changed in the last 10
> > min in the DB and run the delta import cron job every 10 min. For the 2nd
> > entity, the interval is 1 hour. Of course, if one of the delta imports fail
> > this approach may skip some documents, but we do full import once a day so
> > those docs should eventually catch up. Guess that's the best I can get with
> > DIH for now!
> >
> >
> > On Tue, Jun 4, 2013 at 7:05 AM, Shawn Heisey <s...@elyograg.org> wrote:
> >
> > > On 6/4/2013 7:52 AM, Arun Rangarajan wrote:
> > > > I upgraded from Solr 3.6.2 to 4.2.1 and I am noticing that my data import
> > > > handler's delta import is actually doing a full import.
> > > >
> > > > <snip>
> > > >
> > > > What changed and how do I get delta import to only index the documents
> > > > that got modified after ${dih.Lists.last_index_time}'?
> > >
> > > It's a bug. I've built a test that shows the problem, but I haven't
> > > figured out yet how to actually fix it.
> > >
> > > https://issues.apache.org/jira/browse/SOLR-4788
> > >
> > > I now have one more data point to add to the mix that I didn't know
> > > before - it works in 3.6.2.
> > >
> > > It looks like you only have the one entity showing a last_indexed_time,
> > > so you should be able to use ${dih.last_index_time} instead of
> > > ${dih.Lists.last_index_time}.
> > >
> > > Thanks,
> > > Shawn
> > >
>
> --
> Regards,
> Raheel Hasan