[ https://issues.apache.org/jira/browse/SOLR-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shalin Shekhar Mangar resolved SOLR-1004.
-----------------------------------------
Resolution: Fixed
Committed revision 745742.
Thanks Marc!
> Optimizing the abort command in delta import
> --------------------------------------------
>
> Key: SOLR-1004
> URL: https://issues.apache.org/jira/browse/SOLR-1004
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 1.3
> Environment: Java - Lucene - Solr - DataImportHandler
> Reporter: Marc Sturlese
> Assignee: Shalin Shekhar Mangar
> Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1004.patch
>
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> I have seen that when the abort command is called during a deltaImport, the
> doDelta function in DocBuilder.java only checks for abortion at the
> beginning of collectDelta, after that function, and at the end of collectDelta.
> The problem I have found is that if there is a large number of documents to
> modify and abort is called in the middle of delta collection, it will not
> take effect until all data is collected.
> The same happens when we start deleting or updating documents. In the updating
> case, there is an abortion check inside buildDocument but, as it is called
> inside a "while" loop over all docs to update, it will keep going through
> all docs in the loop, skipping each of them.
> I propose to do an abortion check inside every loop of data collection and
> after calling buildDocument in the doDelta function.
> In the case of modifying documents, the code in DocBuilder.java would look
> like:
>   while (pkIter.hasNext()) {
>     Map<String, Object> map = pkIter.next();
>     vri.addNamespace(DataConfig.IMPORTER_NS + ".delta", map);
>     buildDocument(vri, null, map, root, true, null);
>     pkIter.remove();
>     // check for abortion
>     if (stop.get()) {
>       allPks = null;
>       pkIter = null;
>       return;
>     }
>   }
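
For readers following along, here is a minimal, self-contained sketch of the update-loop change described above. It does not use the real DocBuilder classes: the class name, the updateAll method, and the Consumer standing in for buildDocument are all hypothetical, and only the stop flag and the iterator handling mirror the snippet in the issue.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical stand-in for the update loop in doDelta; not the real DocBuilder.
public class AbortableUpdateLoop {

    /**
     * Builds one document per key and returns how many were built.
     * Returns as soon as the stop flag is set, leaving the remaining keys in the list.
     */
    static int updateAll(List<Map<String, Object>> allPks,
                         Consumer<Map<String, Object>> buildDocument,
                         AtomicBoolean stop) {
        int built = 0;
        Iterator<Map<String, Object>> pkIter = allPks.iterator();
        while (pkIter.hasNext()) {
            Map<String, Object> map = pkIter.next();
            buildDocument.accept(map);
            pkIter.remove();
            built++;
            // Check for abortion after every document instead of iterating over the rest.
            if (stop.get()) {
                return built;
            }
        }
        return built;
    }

    public static void main(String[] args) {
        AtomicBoolean stop = new AtomicBoolean(false);
        List<Map<String, Object>> pks = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            pks.add(Map.of("id", i));
        }
        // Simulate an abort request arriving while the third document is built.
        int built = updateAll(pks, row -> {
            if ((Integer) row.get("id") == 2) {
                stop.set(true);
            }
        }, stop);
        System.out.println(built + " built, " + pks.size() + " left"); // prints "3 built, 997 left"
    }
}
```

Without the check inside the loop, the iterator would still be walked to the end even though every remaining document is skipped, which is the behaviour the issue describes.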
> In the case of document deletion (the deleteAll function in DocBuilder), just
> add if (stop.get()) { break; } at the end of every loop iteration, and add
> the following just after deleteAll is called (in doDelta):
>   if (stop.get()) {
>     allPks = null;
>     deletedKeys = null;
>     return;
>   }
> Finally, in collectDelta:
>   while (true) {
>     // check for abortion
>     if (stop.get()) { return myModifiedPks; }
>     Map<String, Object> row = entityProcessor.nextModifiedRowKey();
>     if (row == null)
>       break;
>     ...
> The same applies to delete-query collection and parent-delta-query collection.
> I didn't attach the patch because this is the first time I have opened an
> issue and I don't know if you want to code it as I do. I just wanted to explain
> the idea and how I solved it; I think it can be useful for other users.
>
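
The collection-side check can be sketched in isolation as follows. This is only an illustration under assumptions: the Supplier stands in for entityProcessor.nextModifiedRowKey(), the class and method names are hypothetical, and myModifiedPks is modelled as a plain list, which may not match the type used in DocBuilder.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical stand-in for the key-collection loop in collectDelta.
public class AbortableCollect {

    /**
     * Pulls row keys from the source until it returns null or the stop flag is set.
     * On abort, whatever has been collected so far is returned immediately.
     */
    static List<Map<String, Object>> collectModifiedKeys(
            Supplier<Map<String, Object>> nextModifiedRowKey,
            AtomicBoolean stop) {
        List<Map<String, Object>> myModifiedPks = new ArrayList<>();
        while (true) {
            // Check for abortion before fetching each row.
            if (stop.get()) {
                return myModifiedPks;
            }
            Map<String, Object> row = nextModifiedRowKey.get();
            if (row == null) {
                break;
            }
            myModifiedPks.add(row);
        }
        return myModifiedPks;
    }

    public static void main(String[] args) {
        AtomicBoolean stop = new AtomicBoolean(false);
        int[] handedOut = {0};
        // A source with five rows; the abort request arrives once the second row is handed out.
        Supplier<Map<String, Object>> source = () -> {
            if (handedOut[0] >= 5) {
                return null;
            }
            Map<String, Object> row = new HashMap<>();
            row.put("id", handedOut[0]++);
            if (handedOut[0] == 2) {
                stop.set(true);
            }
            return row;
        };
        List<Map<String, Object>> keys = collectModifiedKeys(source, stop);
        System.out.println(keys.size() + " keys collected before abort"); // prints "2 keys collected before abort"
    }
}
```

The caller still has to check the flag after the method returns, as the issue suggests for doDelta, because a partial result is indistinguishable from a complete one here.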