Hi there,

My index is built from XML files that are downloaded on the fly.
This also includes downloading a mapping file that is used to resolve IDs in 
the main file (root entity) and map them onto names.

The basic functionality works - the supplier_name is set for each document.
However, the mapping file is downloaded with every iteration of the root 
entity. To avoid this and have the file downloaded only once, with the mapping 
cached, I have set the cacheKey and cacheLookup properties, but the file is 
still requested over and over again.

Has anyone worked with multiple different XML files, with mappings loaded via 
different DIH entities? I’d appreciate any samples or hints.
Or maybe someone is able to spot the error in the following configuration?

(The custom DataSource is a subclass of URLDataSource and handles Basic Auth as 
well as decompression.)


<dataConfig>
    <dataSource name="ds1"
                type="com.itagenten.solr.handler.dataimport.CompressedUrlDataSource"
                baseUrl="${dataimporter.request.baseurl}"
                encoding="UTF-8"
                connectionTimeout="5000"
                readTimeout="10000"/>
    <document>
        <entity name="product"
                pk="id"
                url="${dataimporter.request.filename}"
                processor="XPathEntityProcessor"
                stream="true"
                forEach="/root/products/product"
                transformer="DateFormatTransformer">

            <entity name="supplier"
                    dataSource="ds1"
                    cacheKey="id"
                    cacheLookup="product.supplier_id"
                    cacheImpl="SortedMapBackedCache"
                    url="supplier_mapping.xml"
                    processor="XPathEntityProcessor"
                    forEach="/root/suppliers/supplier">
                <field column="id" xpath="/root/suppliers/supplier/@supplier_id"/>
                <field column="supplier_name" xpath="/root/suppliers/supplier/@name"/>
            </entity>

            <field column="id" xpath="/root/products/product/@product_id"/>
            <!-- remaining fields omitted -->
        </entity>
    </document>
</dataConfig>
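
For context, a mapping file that matches the XPath expressions in the supplier 
entity would look roughly like the sketch below. The element and attribute names 
are taken from the configuration above; the IDs and names are made-up 
placeholders, not real data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical supplier_mapping.xml; all values are placeholders -->
<root>
    <suppliers>
        <!-- @supplier_id is mapped to column "id" (the cacheKey) -->
        <!-- @name is mapped to column "supplier_name" -->
        <supplier supplier_id="1001" name="Example Supplier A"/>
        <supplier supplier_id="1002" name="Example Supplier B"/>
    </suppliers>
</root>
```

The intent of cacheKey="id" and cacheLookup="product.supplier_id" is that each 
product's supplier_id is looked up against these rows from the cache instead of 
triggering a new download.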


Thank you!
Chantal
