[ 
https://issues.apache.org/jira/browse/SOLR-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravikanth Gangarapu updated SOLR-6439:
--------------------------------------

    Description: 
Solr Delta Import process will fail while running deltaQuery and 
parentDeltaQuery queries throwing Class cast exceptions if cacheImpl attribute 
is set on entity. The problem is if cacheImpl is set, then 
EntityProcessorBase's initCache will instantiate DIHCacheSupport instance and 
during  getNext() method, it will try to load from cacheLookUp based on 
cacheForeignKey. But, when in FIND_DELTA stage, we don't know the parent and we 
trying to find the parent. So, there is no reason to put/get the query results 
in cache as potentially the same deltaQuery is not run again. So, please fix 
the if condition to just go thru with rowiterator and get the deltaQuery 
results instead of the else condition with cache lookup etc. If caching must be 
utilized for deltaQuery and parentDeltaQuery also, then change the 
DIHCacheSupport class to peform a simple cache lookup rather than id cache 
lookup if its a find_delta process.

Another issue around the same feature is that, in solr community people suggest 
to make the cacheImpl attribute value a variable so that variable resolves to a 
real cache implementation class in full-import and on delta-import, we can make 
it empty ( e.g. controlling cache class name via custom request parameter). 
But, EntityProcessorBase doesn't check if cacheImpl class name is a non-empty 
string, just checks null and based on non-null value will try to instantiate an 
empty string class name and fails.

I am providing a patch. Please put them in next release as they are really 
simple fixes.

  was:
Solr Delta Import process will fail while running deltaQuery and 
parentDeltaQuery queries throwing Class cast exceptions if cacheImpl attribute 
is set on entity. The problem is if cacheImpl is set, then 
EntityProcessorBase's initCache will instantiate DIHCacheSupport instance and 
during  getNext() method, it will try to load from cacheLookUp based on 
cacheForeignKey. But, when in FIND_DELTA stage, we don't know the parent and we 
trying to find the parent. So, there is no reason to put/get the query results 
in cache as potentially the same deltaQuery is not run again. So, please fix 
the if condition to just go thru with rowiterator and get the deltaQuery 
results instead of the else condition with cache lookup etc.

Another issue around the same feature is that, in solr community people (e.g. 
James Dyer) suggest to make the cacheImpl attribute value a variable so that 
variable resolves to a nice and shiny cache implementation class in full-import 
and on delta-import, we can make it empty ( e.g. controlling cache class name 
via custom request parameter). But, EntityProcessorBase doesn't check if 
cacheImpl class name is a non-empty string, just checks null and based on 
non-null value will try to instantiate an empty string class name.

I am providing a patch. Please put them in next release as they are really 
simple fixes.


> Delta Import does not work if cacheImpl attribute is set on entity
> ------------------------------------------------------------------
>
>                 Key: SOLR-6439
>                 URL: https://issues.apache.org/jira/browse/SOLR-6439
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.9
>            Reporter: Ravikanth Gangarapu
>            Priority: Minor
>             Fix For: 5.0
>
>         Attachments: SOLR-6439-2.patch, SOLR-6439.patch
>
>
> Solr Delta Import process will fail while running deltaQuery and 
> parentDeltaQuery queries throwing Class cast exceptions if cacheImpl 
> attribute is set on entity. The problem is if cacheImpl is set, then 
> EntityProcessorBase's initCache will instantiate DIHCacheSupport instance and 
> during  getNext() method, it will try to load from cacheLookUp based on 
> cacheForeignKey. But, when in FIND_DELTA stage, we don't know the parent and 
> we trying to find the parent. So, there is no reason to put/get the query 
> results in cache as potentially the same deltaQuery is not run again. So, 
> please fix the if condition to just go thru with rowiterator and get the 
> deltaQuery results instead of the else condition with cache lookup etc. If 
> caching must be utilized for deltaQuery and parentDeltaQuery also, then 
> change the DIHCacheSupport class to peform a simple cache lookup rather than 
> id cache lookup if its a find_delta process.
> Another issue around the same feature is that, in solr community people 
> suggest to make the cacheImpl attribute value a variable so that variable 
> resolves to a real cache implementation class in full-import and on 
> delta-import, we can make it empty ( e.g. controlling cache class name via 
> custom request parameter). But, EntityProcessorBase doesn't check if 
> cacheImpl class name is a non-empty string, just checks null and based on 
> non-null value will try to instantiate an empty string class name and fails.
> I am providing a patch. Please put them in next release as they are really 
> simple fixes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to