There were 2 major changes to DIH Cache functionality in Solr 3.6, only 1 of 
which was carried to Solr 4.0:

- Solr 3.6 had 2 MAJOR changes:

1. We support pluggable caches so that you can write your own cache 
implementations and cache however you want.  The goal here is to allow you to 
cache to disk when you have to do large, complex joins and an in-memory cache 
could result in an OOM.  Also, you can specify "cacheImpl" with any 
EntityProcessor, not just SqlEntityProcessor.  So you can join child entities 
that come from XML, flat files, etc.  CachedSqlEntityProcessor is technically 
deprecated, as using it is the same as SqlEntityProcessor with 
cacheImpl="SortedMapBackedCache" specified.  This does a simple in-memory cache 
very similar to Solr 3.5 and prior. (see 
https://issues.apache.org/jira/browse/SOLR-2382)

2. Extensive work was done to try and make the "threads" parameter work in more 
situations.  This involved some rather invasive changes to the DIH Cache 
functionality. (see https://issues.apache.org/jira/browse/SOLR-3011)

- Solr 4.0 has #1 above, BUT NOT #2.  Rather the "threads" functionality was 
entirely removed.
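
As an illustration of change #1, a cached child entity could be configured 
roughly as in the sketch below.  This is only an example under assumptions: the 
entity names, table and column names, and the dataSource driver/url are 
hypothetical placeholders, not taken from any real configuration.  It uses 
SqlEntityProcessor with cacheImpl rather than CachedSqlEntityProcessor, with 
"cacheKey" naming the child column to cache on and "cacheLookup" naming the 
parent value to look up.

```xml
<dataConfig>
  <!-- Hypothetical JDBC settings; substitute your own driver, url and credentials -->
  <dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:/tmp/example" user="sa"/>
  <document>
    <!-- Parent entity: one Solr document per row (table and columns are made up) -->
    <entity name="ent1" query="SELECT id, name FROM parent_table">
      <!-- Child entity: all rows are read once and held in an in-memory cache,
           then joined to each parent row by matching cacheKey against cacheLookup -->
      <entity name="ent2"
              processor="SqlEntityProcessor"
              cacheImpl="SortedMapBackedCache"
              cacheKey="parent_id"
              cacheLookup="ent1.id"
              query="SELECT parent_id, detail FROM child_table"/>
    </entity>
  </document>
</dataConfig>
```

In principle a different cacheImpl (for example a custom disk-backed 
implementation) could be named in the same attribute without changing the rest 
of the entity definition.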

Consequently, if the problem is due to #2 (SOLR-3011), this isn't as big a 
problem because 3.x users can simply use the 3.5 DIH jar (but some use-cases 
involving "threads" work with the 3.6(.1) jar and not at all with 3.5, so users 
will have to pick and choose the best version to use for their instance).

My concern is that there may be issues with #1 (SOLR-2382).  That's why I'm 
asking whether, if at all possible, you can try this with Solr 4.0.  I have 
tested Solr 4.0 extensively here and caching seems to work exactly as it 
should.  However, DIH is flexible in how it can be configured, and there could 
be something broken that I have not uncovered myself.  Any issues that may 
exist with SOLR-2382 need to be identified and fixed in the 4.x branch as soon 
as possible.

I apologize for the late response.  I was away the past week.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: mechravi25 [mailto:mechrav...@yahoo.co.in] 
Sent: Tuesday, August 21, 2012 7:47 AM
To: solr-user@lucene.apache.org
Subject: RE: Dataimport Handler in solr 3.6.1

Hi James,

Thanks for the suggestions. 

Actually it is cacheLookup="ent1.id"; I had misspelt it. Also, I will
need the transformers mentioned, as there are other columns as well.

I actually tried using the 3.5 DIH jars in 3.6.1 and indexed the same data, and
the indexing was successful. But I wanted this to work with the 3.6.1 DIH. I just
came across the SOLR-2382 patch. I tried giving the following 

processor="CachedSqlEntityProcessor" cacheImpl="SortedMapBackedCache" 

in my DIH.xml file. For static fields in child entities, the indexing
happened fine, but for dynamic fields, only one of the dynamic fields
was indexed and the rest were skipped, even though the total number of rows
fetched from the datasource was correct.

Following are my questions:

1.) Is there a big difference between the Solr 3.5 and 3.6.1 DIH handler files?
For example, is any new feature added in the 3.6 DIH that is not present in 3.5?
2.) Am I missing something while giving the cacheImpl="SortedMapBackedCache"
in my DIH.xml, because of which dynamic fields are not indexed properly?
There is no change to my DIH file from my previous post apart from this
cacheImpl addition, and the dynamic fields are indexed properly if I do
not give this cacheImpl.

Thanks.




