I think delta imports only work on the parent entity, and cached child entities 
will load in full even if you only need to look up a few rows for the delta.  
Others, though, might know a way to get this to work.

Here are two possible workarounds.

On the child entity, specify:  
<entity processor="SqlEntityProcessor" name="media_tags_map" 
cacheImpl="${cache.impl}" />
When it is a full import, pass the parameter: cache.impl=SortedMapBackedCache. 
For delta imports, leave this blank.  This (I think) will give you a cache for 
the full-import and no cache for the deltas.
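
As a rough sketch of the first workaround, here is how the child entity from 
the config quoted below might look. This is untested. It assumes cacheKey and 
cacheLookup are honored by SqlEntityProcessor when cacheImpl is set, and that 
an empty cacheImpl really does turn caching off. Depending on your DIH version 
you may need to write the placeholder as ${dataimporter.request.cache.impl} 
instead. The host and core name in the comment are made up, and the 
deltaQuery/parentDeltaQuery attributes are left out for brevity.

```xml
<!-- Sketch only, untested. Hypothetical requests:
     full import:  http://localhost:8983/solr/collection1/dataimport?command=full-import&cache.impl=SortedMapBackedCache
     delta import: http://localhost:8983/solr/collection1/dataimport?command=delta-import -->
<entity name="media_tags_map"
        processor="SqlEntityProcessor"
        cacheImpl="${cache.impl}"
        cacheKey="media_id"
        cacheLookup="story_sentences.media_id"
        query="select tags_id as tags_id_media, * from media_tags_map"/>
```

One thing to watch: without a cache, the child query runs once per parent row, 
so for the delta case the query would probably also need a where clause that 
restricts it to the current parent's media_id.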

Another workaround is to include a subquery on your delta import like this:
Select * from table ${delta.subquery}
When it is a delta import, pass the parameter: delta.subquery=where 
blah in (select blah from parent_table ...)

This will cause it to cache only the entries needed for that delta import.
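
A sketch of the second workaround, again based on the child entity from the 
quoted config and untested. The where clause is left exactly as described 
above, so you would fill in the column and the parent-table condition 
yourself. The placeholder may need to be ${dataimporter.request.delta.subquery} 
depending on your DIH version.

```xml
<!-- Sketch only, untested.
     full import:  leave delta.subquery unset so the whole table is cached
     delta import: pass delta.subquery=where blah in (select blah from parent_table ...)
                   (URL-encoded in the request) -->
<entity name="media_tags_map"
        processor="CachedSqlEntityProcessor"
        cacheKey="media_id"
        cacheLookup="story_sentences.media_id"
        query="select tags_id as tags_id_media, * from media_tags_map ${delta.subquery}"/>
```

The cache stays on in both modes here; only the set of rows loaded into it 
changes.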

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: david.r.laroche...@gmail.com [mailto:david.r.laroche...@gmail.com] On 
Behalf Of David Larochelle
Sent: Monday, September 23, 2013 5:22 PM
To: solr-user
Subject: Using CachedSqlEntityProcessor with delta imports in DIH

I'm trying to use the CachedSqlEntityProcessor on a child entity that also
has a delta query.

Full imports and delta imports of the parent entity work fine; however, delta
imports for the child entity have no effect. If I remove the
processor="CachedSqlEntityProcessor" attribute from the child entity, the
delta import works flawlessly but the full import is very slow.
Here's my data-config.xml:


<dataConfig>
  <xi:include href="db-connection.xml"
      xmlns:xi="http://www.w3.org/2001/XInclude"/>
  <document>
    <entity name="story_sentences"
            pk="story_sentences_id"
            query="select story_sentences_id || '_ss' as id, 'ss' as
field_type, * from story_sentences"
            deltaImportQuery="select story_sentences_id || '_ss' as id,
'ss' as field_type, * from story_sentences where
story_sentences_id=${dataimporter.delta.id}"
            deltaQuery="SELECT story_sentences_id as id, story_sentences_id
from story_sentences where db_row_last_updated &gt;
'${dih.last_index_time}' ">
      <entity name="media_tags_map"
              pk="media_tags_map_id"
              query="select tags_id as tags_id_media, * from media_tags_map"
              cacheKey="media_id"
              cacheLookup="story_sentences.media_id"
              processor="CachedSqlEntityProcessor"
              deltaQuery="select media_tags_map_id, media_id::varchar from
media_tags_map where db_row_last_updated &gt; '${dih.last_index_time}' "
              parentDeltaQuery="select story_sentences_id as id from
story_sentences where media_id = ${media_tags_map.media_id}"
              >
      </entity>
    </entity>
  </document>
</dataConfig>


I need to be able to run delta imports based on the media_tags_map table in
addition to the story_sentences table.

Any idea why delta imports for media_tags_map won't work when the
CachedSqlEntityProcessor is used?

I've searched extensively but can't find an example that uses both
CachedSqlEntityProcessor and deltaQuery on the sub-entity or any
explanation of why the above configuration won't work as expected.

--

Thanks,

David
