---------------------------- First crawl -----------------------------------------

In the processDocument method, the following code is triggered on the 
parentIdentifier:

activities.addDocumentReference(childIdentifier, parentIdentifier, null,
    new String[] { "content" }, new String[][] { { "someContent" } });

Then the childIdentifier is processed and the following code is triggered in 
the processDocument method:

final String[] contentArray =
    activities.retrieveParentData(childIdentifier, "content");

At this point, the childIdentifier correctly retrieves a contentArray containing 
1 value, which is "someContent".

---------------------------- Second crawl ----------------------------------------

In the processDocument method, the following code is triggered on the 
parentIdentifier:

activities.addDocumentReference(childIdentifier, parentIdentifier, null,
    new String[] { "content" }, new String[][] { { "newContent" } });

Then the childIdentifier is processed and the following code is triggered in 
the processDocument method:

final String[] contentArray =
    activities.retrieveParentData(childIdentifier, "content");

At this point, the childIdentifier retrieves a contentArray containing 2 
values: the old one, "someContent", and the new one, "newContent".

I can guarantee that the parentIdentifier is the same in both crawls and 
that only "newContent" is added on the second crawl. I debugged the code 
to confirm this.
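
The two crawls above can be condensed into a small, self-contained sketch. 
AppendingCarrydownStore and its simplified method signatures are hypothetical 
stand-ins, not ManifoldCF classes: the toy store deliberately ignores the 
parent identifier so that it reproduces the accumulation described above, 
and it makes no claim about how the framework stores carry-down data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for the carry-down store (hypothetical, NOT ManifoldCF code).
class AppendingCarrydownStore {
    // "child|dataName" -> every value ever carried down to that child
    private final Map<String, List<String>> values = new HashMap<>();

    // Simplified counterpart of addDocumentReference: one data name, one value.
    void addDocumentReference(String child, String parent, String dataName, String value) {
        // The parent identifier is ignored on purpose, to mimic the reported stacking.
        values.computeIfAbsent(child + "|" + dataName, k -> new ArrayList<>()).add(value);
    }

    // Simplified counterpart of retrieveParentData.
    String[] retrieveParentData(String child, String dataName) {
        List<String> found = values.getOrDefault(child + "|" + dataName, new ArrayList<>());
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        AppendingCarrydownStore store = new AppendingCarrydownStore();

        // First crawl
        store.addDocumentReference("childIdentifier", "parentIdentifier", "content", "someContent");
        System.out.println(Arrays.toString(
            store.retrieveParentData("childIdentifier", "content"))); // [someContent]

        // Second crawl: same parent, same key, new value
        store.addDocumentReference("childIdentifier", "parentIdentifier", "content", "newContent");
        System.out.println(Arrays.toString(
            store.retrieveParentData("childIdentifier", "content"))); // [someContent, newContent]
    }
}
```

The second printed array is the result I would like to avoid: with the same 
parent and the same key, the expected value would be only "newContent".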



Julien


-----Original Message-----
From: Karl Wright <daddy...@gmail.com> 
Sent: Sunday, March 21, 2021 16:05
To: dev <dev@manifoldcf.apache.org>
Subject: Re: How to override carry down data

Can you give me a code example?
The carry-down information is set by the parent, as you say.  The specific 
information is keyed to the parent so when the child is added to the queue, all 
old carrydown information from the same parent is deleted at that time, and 
until that happens the carrydown information is preserved for every child.  As 
you say, it can be augmented by other parents that refer to the same child, but 
it is never *replaced* by carrydown info from a different parent, just 
augmented.

If it didn't work this way, MCF would have horrendous order dependencies in 
what documents got processed first.  As it is, when the carrydown information 
changes because another parent is discovered, the children are queued for 
processing to achieve stable results.
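
The semantics described above can be illustrated with a small in-memory 
model. This is only a sketch under stated assumptions: ParentKeyedCarrydownModel 
is a hypothetical class, its methods merely echo the names used in this 
thread, and it is not the MCF implementation. Values are keyed by child and 
then by parent, so a parent replaces only its own earlier values while 
other parents add to the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of parent-keyed carry-down data (NOT MCF code).
class ParentKeyedCarrydownModel {
    // child -> parent -> data name -> values
    private final Map<String, Map<String, Map<String, List<String>>>> store = new LinkedHashMap<>();

    void addDocumentReference(String child, String parent,
                              String[] dataNames, String[][] dataValues) {
        Map<String, List<String>> fresh = new LinkedHashMap<>();
        for (int i = 0; i < dataNames.length; i++) {
            fresh.put(dataNames[i], new ArrayList<>(Arrays.asList(dataValues[i])));
        }
        // put() discards whatever this same parent carried down earlier;
        // entries written by other parents are left untouched.
        store.computeIfAbsent(child, k -> new LinkedHashMap<>()).put(parent, fresh);
    }

    String[] retrieveParentData(String child, String dataName) {
        List<String> result = new ArrayList<>();
        Map<String, Map<String, List<String>>> byParent =
            store.getOrDefault(child, new LinkedHashMap<>());
        // Collect the values contributed by every parent of this child.
        for (Map<String, List<String>> perParent : byParent.values()) {
            result.addAll(perParent.getOrDefault(dataName, new ArrayList<>()));
        }
        return result.toArray(new String[0]);
    }
}
```

In this model, a second addDocumentReference call from the same parent with 
{ "newContent" } leaves the child with a single value, while a call from a 
second parent adds a value alongside it.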

Karl


On Sun, Mar 21, 2021 at 10:45 AM <julien.massi...@francelabs.com> wrote:

> Hi Karl,
>
>
>
> I am using carry-down data in a repository connector, but I have 
> found that I am unable to update/override a value that has already 
> been set.
> Indeed, even though I am using the same key and the same parent 
> identifier, the values are stacked. So, when I retrieve carry-down 
> data through the key, I get more and more values in the array instead of 
> a single updated one.
> It seems I misunderstood the documentation: I believed that 
> carry-down data values are stacked only if there are several parent 
> identifiers for the same key.
> What can I do to maintain only one carry-down data value for a given 
> key and a given parent identifier?
>
>
>
> Regards,
>
> Julien
>
>
>
>
