[
https://issues.apache.org/jira/browse/CONNECTORS-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244387#comment-15244387
]
Karl Wright commented on CONNECTORS-1299:
-----------------------------------------
Hi Konstantin,
What you are seeing is an issue with scheduling of documents. Documents are
allotted priority values at the time they are crawled. The priority values are
calculated with shared external resources in mind. That is, if you have two
jobs crawling the same resource (as far as the connector defines it), then the
job management code assigns document priorities with ALL users of that
resource under consideration.
This leads to some odd effects if you start one job well after you started
another. The first job will continue to make progress, and it will appear as
if the second job doesn't. What is actually happening is that the first
document from the second job won't be crawled until the first job gets through
the documents it had queued at the time the second job started.
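To make the ordering effect concrete, here is a toy model in Java. It is only a sketch under a strong assumption: a single shared bin hands out ever-increasing priority values to whatever is queued next, regardless of job. That is a simplification, not ManifoldCF's actual priority calculation, and the names (SharedBinQueueDemo, enqueue, crawlOrder) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: NOT ManifoldCF's real scheduler.
class SharedBinQueueDemo {
    // A queued document plus the priority it received when it was queued.
    static final class QueuedDoc {
        final String job;
        final int index;
        final long priority; // lower value = crawled sooner

        QueuedDoc(String job, int index, long priority) {
            this.job = job;
            this.index = index;
            this.priority = priority;
        }
    }

    // Assumption: one counter per bin, shared by every job that uses the bin.
    private long nextPriority = 0;
    private final PriorityQueue<QueuedDoc> queue =
        new PriorityQueue<>(Comparator.comparingLong((QueuedDoc d) -> d.priority));

    // Queue `count` documents for a job, taking priorities from the shared counter.
    void enqueue(String job, int count) {
        for (int i = 1; i <= count; i++) {
            queue.add(new QueuedDoc(job, i, nextPriority++));
        }
    }

    // Pull documents in priority order and report which job each belongs to.
    List<String> drain() {
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            QueuedDoc d = queue.poll();
            order.add(d.job + "-" + d.index);
        }
        return order;
    }

    static List<String> crawlOrder() {
        SharedBinQueueDemo bin = new SharedBinQueueDemo();
        bin.enqueue("A", 3); // job A starts first and queues its documents
        bin.enqueue("B", 2); // job B starts later against the same bin
        return bin.drain();
    }

    public static void main(String[] args) {
        System.out.println(crawlOrder()); // prints [A-1, A-2, A-3, B-1, B-2]
    }
}
```

In this model job B is "running" the whole time, but none of its documents come up until everything job A had already queued has been handled.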
The jcifs connector assigns document bins by server:
{code}
  @Override
  public String[] getBinNames(String documentIdentifier)
  {
    return new String[]{server};
  }
{code}
... so plan accordingly. Also, I believe this code was fixed only recently;
some connectors were not using proper bins and would therefore interfere with
each other unnecessarily.
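As a rough illustration of what binning by server implies, the sketch below uses a hypothetical stand-in class (not the real jcifs connector) whose getBinNames mirrors the snippet above. The server names, document identifiers, and the sharesBin helper are invented for the example.

```java
import java.util.Arrays;

// Hypothetical stand-in for a connection; only getBinNames mirrors the quoted snippet.
class ServerBinDemo {
    private final String server;

    ServerBinDemo(String server) {
        this.server = server;
    }

    // Every document of this connection lands in the bin named after its server.
    String[] getBinNames(String documentIdentifier) {
        return new String[]{server};
    }

    // True when the two documents would end up in the same bin(s).
    static boolean sharesBin(ServerBinDemo a, String docA, ServerBinDemo b, String docB) {
        return Arrays.equals(a.getBinNames(docA), b.getBinNames(docB));
    }

    public static void main(String[] args) {
        ServerBinDemo job1 = new ServerBinDemo("fileserver1");
        ServerBinDemo job2 = new ServerBinDemo("fileserver1"); // different share, same server
        ServerBinDemo job3 = new ServerBinDemo("fileserver2");

        System.out.println(sharesBin(job1, "shareA/doc.txt", job2, "shareB/doc.txt")); // true
        System.out.println(sharesBin(job1, "shareA/doc.txt", job3, "shareA/doc.txt")); // false
    }
}
```

Under this assumption, two jobs that crawl different shares on the same server still compete in one bin, while a job against a different server does not.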
> "Seeding" phase of a job prevents starting others?
> --------------------------------------------------
>
> Key: CONNECTORS-1299
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1299
> Project: ManifoldCF
> Issue Type: Bug
> Components: Framework crawler agent
> Environment: Windows
> Reporter: Konstantin Avdeev
>
> Hello Karl, could you please clarify if this is a bug or a feature? :)
> When I start an smb job for a share containing a lot of files (this can be
> reproduced with a \Windows directory :)) and then start a second job, the
> second one stays for some time (depending on the amount of data being
> processed by the first one) in the status "running", showing
> {{"Active=1"}}, and does not progress.
> Setting the log level to Debug did not shed any light on this, unfortunately.
> It would be great if you could elaborate on that a little!
> Thank you!