Stuffer query will perform poorly under some conditions
-------------------------------------------------------

                 Key: CONNECTORS-290
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-290
             Project: ManifoldCF
          Issue Type: Bug
          Components: Framework agents process
    Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3, ManifoldCF 0.4
            Reporter: Karl Wright
            Assignee: Karl Wright
             Fix For: ManifoldCF 0.4


The stuffer query, which returns documents for processing in index order by 
docpriority, performs poorly when many queued documents have a good priority 
but cannot be taken because of the state of their job.  This can happen when:
(1) a large job is aborted, leaving many jobqueue records with docpriority 
values behind;
(2) a job is paused for an extended period of time while other jobs are running.
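
To make the cost concrete, here is a minimal sketch of the scan, not the actual 
stuffer query.  It assumes a lower docpriority value means a better priority, 
and the names QueuedDoc, stuff, and active_jobs are hypothetical.  It counts 
how many rows must be examined before enough usable documents are found.

```python
from typing import NamedTuple


class QueuedDoc(NamedTuple):
    """Hypothetical stand-in for one jobqueue record."""
    doc_id: str
    job_id: str
    docpriority: float  # assumption: lower value = better priority


def stuff(queue, active_jobs, limit):
    """Walk the queue in docpriority order, taking documents of active jobs.

    Returns the chosen documents and the number of rows examined.
    """
    chosen = []
    scanned = 0
    for doc in sorted(queue, key=lambda d: d.docpriority):
        if len(chosen) >= limit:
            break
        scanned += 1
        if doc.job_id in active_jobs:
            chosen.append(doc)
    return chosen, scanned


# 10,000 well-prioritized documents left behind by an aborted job,
# followed by five documents from a job that is still running.
queue = [QueuedDoc(f"a{i}", "aborted", float(i)) for i in range(10_000)]
queue += [QueuedDoc(f"r{i}", "running", 10_000.0 + i) for i in range(5)]

chosen, scanned = stuff(queue, active_jobs={"running"}, limit=5)
print(len(chosen), scanned)  # prints "5 10005"
```

In this toy model, every one of the aborted job's rows is examined and 
rejected before the five usable documents are reached.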

In the second case, when the paused job is resumed, there's an added problem 
because, for a while, only documents from the paused job will be processed.

The answer to (1) may well be to clear all docpriority values on job abort.  
Right now there is no logic that sets docpriority values to null, but there 
clearly needs to be; otherwise the docpriority index will remain polluted for 
an extended period of time with rows that must be scanned but cannot be used.
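
A minimal sketch of such a cleanup step follows, using an in-memory SQLite 
table purely for illustration.  Only the jobqueue and docpriority names come 
from this issue; the jobid column, the index name, and the function 
clear_priorities_on_abort are assumptions, and the real schema and database 
layer may differ.

```python
import sqlite3


def clear_priorities_on_abort(conn, job_id):
    """Null out docpriority for every queued document of the aborted job.

    Returns the number of rows that were changed.
    """
    cur = conn.execute(
        "UPDATE jobqueue SET docpriority = NULL "
        "WHERE jobid = ? AND docpriority IS NOT NULL",
        (job_id,),
    )
    conn.commit()
    return cur.rowcount


# Hypothetical, simplified schema for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobqueue ("
    "id INTEGER PRIMARY KEY, jobid INTEGER, docpriority REAL)"
)
conn.execute("CREATE INDEX jobqueue_docpriority ON jobqueue (docpriority)")
conn.executemany(
    "INSERT INTO jobqueue (jobid, docpriority) VALUES (?, ?)",
    [(1, float(i)) for i in range(1000)]
    + [(2, 1000.0 + i) for i in range(10)],
)

cleared = clear_priorities_on_abort(conn, job_id=1)
remaining = conn.execute(
    "SELECT COUNT(*) FROM jobqueue WHERE docpriority IS NOT NULL"
).fetchone()[0]
print(cleared, remaining)  # prints "1000 10"
```

Whether null values are actually kept out of the index scan depends on the 
database and on how the index is defined, so that part would need to be 
checked against the database in use.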

The "correct" answer to (2) is to clear out docpriority values when a job is 
paused, and then redo them all when the job is resumed.  Similarly, docpriority 
values should be set for all of a job's documents when a job is started, and 
should be nulled out when documents enter non-active states.  The former 
currently occurs, but not the latter.
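
The intended lifecycle can be sketched as follows.  This is an in-memory 
illustration under stated assumptions, not the framework's code: the Job class, 
the state names, and the counter-based prioritize function are all hypothetical, 
and None stands in for a null docpriority.

```python
import itertools


class Job:
    """Hypothetical job holding a docpriority per document (None = null)."""

    def __init__(self, job_id, doc_ids):
        self.job_id = job_id
        self.state = "inactive"
        self.docpriority = dict.fromkeys(doc_ids)  # all values start as None


def _assign_all(job, prioritize):
    # Give every document of the job a freshly computed priority.
    for doc_id in job.docpriority:
        job.docpriority[doc_id] = prioritize(doc_id)


def start_job(job, prioritize):
    job.state = "active"
    _assign_all(job, prioritize)


def pause_job(job):
    # Documents of a paused job should not occupy the priority index.
    job.state = "paused"
    for doc_id in job.docpriority:
        job.docpriority[doc_id] = None


def resume_job(job, prioritize):
    # Priorities are recomputed rather than restored from before the pause.
    job.state = "active"
    _assign_all(job, prioritize)


# Illustrative prioritizer: an increasing counter, so later calls yield
# larger values.  The real priority calculation is not described here.
counter = itertools.count()


def prioritize(doc_id):
    return float(next(counter))


job = Job("j1", ["a", "b", "c"])
start_job(job, prioritize)
started = dict(job.docpriority)
pause_job(job)
paused = dict(job.docpriority)
resume_job(job, prioritize)
resumed = dict(job.docpriority)
print(started, paused, resumed)
```

Because the values are recomputed on resume, the resumed job's documents do 
not carry over the priorities they held before the pause.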

