Hi Aeham,

I decided that it is safer to always set the prioritySet field to 0 when the docPriority is nullDocumentPriority. I've committed another revision for this and described it in the CONNECTORS-1091 ticket. The revision also includes upgrade code that updates the prioritySet field automatically upon initialization.
Karl

On Tue, Nov 4, 2014 at 1:23 PM, Karl Wright <[email protected]> wrote:

> Hi Aeham,
>
> I would be careful to set the "priorityset" field value to 0 only for
> documents that have state "G" and whose job is active.
>
> bq. I believe the priority should be set by
> ManifoldCF#resetAllDocumentPriorities but it's not, because
> JobManager#getNextNotYetProcessedReprioritizationDocuments returns no rows
> to update
>
> Where is resetAllDocumentPriorities being called from that you are seeing
> this? When a job is started, documents that are put into the "G" state all
> have prioritySet times set to 0, so that the subsequent
> resetAllDocumentPriorities() call will assign priorities to them. If there
> are other conditions where resetAllDocumentPriorities() is getting called
> after documents are put into the "G" state, where this ISN'T happening, I'd
> like to know about them.
>
> Thanks,
> Karl
>
>
> On Tue, Nov 4, 2014 at 1:15 PM, Aeham Abushwashi <
> [email protected]> wrote:
>
>> Hi Karl,
>>
>> After applying the 1.7.2 revisions for CONNECTORS-1090, -1091, -1092 and
>> -1093 to my 1.6.1 branch, if I create a new crawl, then its documents get
>> picked up by the next scan; however, that doesn't happen for an existing
>> crawl. The docpriority for documents in the existing crawl is still at
>> 1000000001.
>>
>> I believe the priority should be set by
>> ManifoldCF#resetAllDocumentPriorities but it's not, because
>> JobManager#getNextNotYetProcessedReprioritizationDocuments returns no rows
>> to update, which I think is due to the legacy job's docs having a
>> priorityset of NULL. Replacing the current priorityset condition in
>> JobManager#getNextNotYetProcessedReprioritizationDocuments with
>> (priorityset IS NULL OR priorityset<?) addresses this specific issue.
>> Is this a valid fix or do you see it introducing undesirable behaviour or
>> masking an issue elsewhere?
>>
>> Cheers,
>> Aeham
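For anyone following the thread, here is a minimal sketch of why legacy rows are skipped and how the two approaches discussed above differ. It is not ManifoldCF code: it uses an in-memory SQLite table, and the table name `jobqueue`, the three-column layout, the cutoff value, and the helper names are all assumptions made for illustration. It also assumes that 1000000001 is the nullDocumentPriority value, as the report of unprioritized documents suggests. The point it shows is standard SQL behaviour: a comparison such as `priorityset < ?` is never true for a NULL value, so such rows are not returned.

```python
import sqlite3

# Assumption: the value reported in the thread for documents without a priority.
NULL_DOCUMENT_PRIORITY = 1000000001


def build_queue(conn):
    """Create a hypothetical, simplified queue table with three sample rows."""
    conn.execute(
        "CREATE TABLE jobqueue ("
        "id INTEGER PRIMARY KEY, docpriority INTEGER, priorityset INTEGER)"
    )
    conn.executemany(
        "INSERT INTO jobqueue (id, docpriority, priorityset) VALUES (?, ?, ?)",
        [
            (1, NULL_DOCUMENT_PRIORITY, None),  # legacy row: priorityset is NULL
            (2, NULL_DOCUMENT_PRIORITY, 0),     # row from a newly started job
            (3, 42, 2000),                      # already prioritized after the cutoff
        ],
    )


def select_ids(conn, where, cutoff):
    """Return the ids matched by a WHERE clause that has one '?' placeholder."""
    rows = conn.execute(
        f"SELECT id FROM jobqueue WHERE {where} ORDER BY id", (cutoff,)
    )
    return [row[0] for row in rows]


def apply_upgrade(conn):
    """Upgrade-style fix: set priorityset to 0 wherever no priority is assigned."""
    conn.execute(
        "UPDATE jobqueue SET priorityset = 0 WHERE docpriority = ?",
        (NULL_DOCUMENT_PRIORITY,),
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_queue(conn)
    cutoff = 1000
    # Plain comparison: the NULL row is silently excluded.
    print(select_ids(conn, "priorityset < ?", cutoff))
    # Query-side workaround: match NULL explicitly.
    print(select_ids(conn, "(priorityset IS NULL OR priorityset < ?)", cutoff))
    # Data-side fix: after the upgrade the plain comparison matches the legacy row.
    apply_upgrade(conn)
    print(select_ids(conn, "priorityset < ?", cutoff))
```

In this sketch the query-side change and the data-side upgrade select the same rows. The difference is that the upgrade leaves the query untouched and removes the NULL values themselves, which matches the direction described at the top of this message.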
