Hi Markus,

Your example looks correct.  I suspect there may be a bug.  I'll open a
ticket: CONNECTORS-1110.


Karl


On Fri, Nov 21, 2014 at 7:59 AM, Markus Schuch <[email protected]> wrote:

> Hi,
>
> Is there any example implementation of the new document component feature
> introduced with CONNECTORS-989?
>
> I read the section "Document components" in [0], but I still do not know
> how to actually write a repository connector that ingests multiple
> documents originating from a single document of a repository.
>
> My first guess was to call the method
> "activities.ingestDocumentWithException" multiple times with the same
> identifier but distinct component identifiers during document processing.
>
> I wrote a simple TestConnector. Its processDocuments method looks like
> this:
>
>     public void processDocuments(String[] documentIdentifiers,
>             String[] versions, IProcessActivity activities,
>             DocumentSpecification spec, boolean[] scanOnly)
>             throws ManifoldCFException, ServiceInterruption {
>
>         int i = 0;
>         for (String identifier : documentIdentifiers) {
>
>             byte[] content1 = "test content 1".getBytes();
>             byte[] content2 = "test content 2".getBytes();
>             byte[] content3 = "test content 3".getBytes();
>
>
>             RepositoryDocument rd1 = new RepositoryDocument();
>             rd1.setBinary(new ByteArrayInputStream(content1),
>                     content1.length);
>
>             RepositoryDocument rd2 = new RepositoryDocument();
>             rd2.setBinary(new ByteArrayInputStream(content2),
>                     content2.length);
>
>
>             RepositoryDocument rd3 = new RepositoryDocument();
>             rd3.setBinary(new ByteArrayInputStream(content3),
>                     content3.length);
>
>
>             System.out.println("process " + identifier);
>
>             try {
>                 activities.ingestDocumentWithException(identifier,
>                         "comp1", versions[i], identifier+"/comp1", rd1);
>                 activities.ingestDocumentWithException(identifier,
>                         "comp2", versions[i], identifier+"/comp2", rd2);
>                 activities.ingestDocumentWithException(identifier,
>                         "comp3", versions[i], identifier+"/comp3", rd3);
>             } catch (IOException e) {
>                 e.printStackTrace();
>             }
>
>             i++;
>         }
>
>     }
>
> For seeding, the method getDocumentIdentifiers() returns a stream with a
> single document identifier, "testidentifier1".
> The full code is available at [1].
>
> But subsequent calls of ingestDocumentWithException result in deletions of
> a previously added component.
>
> job end 1416573910333(copmtest1) 0      1
> document ingest         testidentifier1/comp3 OK 12     8
> document deletion       testidentifier1/comp1 OK 0      2
> document ingest         testidentifier1/comp2 OK 12     10
> document deletion       testidentifier1/comp1 OK 0      3
> document ingest         testidentifier1/comp1 OK 12     13
> job start       1416573910333(copmtest1) 0      1
>
> Only testidentifier1/comp2 and testidentifier1/comp3 exist in the output
> connection after the job is finished.
>
> I feel I might have misunderstood the concept...
>
> Any help is appreciated.
>
> Thanks in advance
> Markus
>
> --
> Using ManifoldCF 1.7.1
> [0]
> http://manifoldcf.apache.org/release/release-1.7.2/en_US/writing-repository-connectors.html
> [1] https://gist.github.com/schuch/43809594ad8f81ddc625
>
