On 7/6/22 04:32, Michał Świątkowski wrote:
I checked, and the collection data is erased only when I use clean=true
together with optimize=true (the first query below).
1. clean=true ; optimize=true
webapp=/solr path=/dataimport
params={core=example_collection&optimize=true&indent=on&commit=true&name=dataimport&clean=true&wt=json&command=full-import&_=1657098443936&verbose=true}
status=0 QTime=5
2. clean=true
webapp=/solr path=/dataimport
params={core=example_collection&indent=on&commit=true&name=dataimport&clean=true&wt=json&command=full-import&_=1657098443936&verbose=true}
status=0 QTime=4
3. clean=false ; optimize=true
webapp=/solr path=/dataimport
params={core=example_collection&optimize=true&indent=on&commit=true&name=dataimport&clean=false&wt=json&command=full-import&_=1657098443936&verbose=true}
status=0 QTime=5
If you send clean=true, then DIH should wipe the index data before it
begins importing. If you set optimize=true, then Solr should optimize
the index AFTER the import is done. It is very odd for the behavior to
change depending on the combination of parameters ... my guess is that
when both parameters are true, DIH does a commit BEFORE importing
begins, which makes the deletes from clean=true visible right away,
while without that combination the only commit happens after the import
finishes.
It might be better to set commit and optimize to false and do those
operations manually after the import completes. Just an FYI ...
optimizing is generally not recommended because of how long it can take
and how many system resources it uses.
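For example, a rough sketch of that sequence (the localhost:8983
host/port is an assumption about your install; the core name is taken
from your logs, and these use DIH's standard commands plus the regular
update handler):

  # start the import with no automatic commit or optimize
  http://localhost:8983/solr/example_collection/dataimport?command=full-import&clean=true&commit=false&optimize=false

  # poll until the import reports it is finished
  http://localhost:8983/solr/example_collection/dataimport?command=status

  # then commit to make the changes visible
  http://localhost:8983/solr/example_collection/update?commit=true

  # and only if you really need it:
  http://localhost:8983/solr/example_collection/update?optimize=true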
Note that DIH is no longer present in Solr 9.x. The feature was
deprecated and then removed because it has long-standing problems,
especially in cloud mode. You seem to have stumbled onto one of the
many bugs DIH has.
You may have greater luck with the separate version of DIH:
https://github.com/rohitbemax/dataimporthandler
You can also do the import into a brand new collection and then update
an alias so that the "true" collection name points to the new one after
indexing is complete. This is a good paradigm to use in general.
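A sketch of that alias swap with the Collections API (the alias and
collection names here are hypothetical):

  # index into a fresh collection, e.g. products_20220706, then repoint
  # the alias that your applications query:
  http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=products&collections=products_20220706

CREATEALIAS on an existing alias replaces it atomically, so queries
against "products" switch to the new index with no downtime, and the
old collection can be deleted once you have verified the new one.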
Thanks,
Shawn