Hi Markus,

Your example looks correct. I suspect there may be a bug. I'll open a ticket: CONNECTORS-1110.

Karl

On Fri, Nov 21, 2014 at 7:59 AM, Markus Schuch <[email protected]> wrote:

> Hi,
>
> is there any example implementation of the new document component feature
> introduced with CONNECTORS-989?
>
> I read the section "Document components" in [0], but I still do not know
> how to actually write a repository connector that ingests multiple
> documents originating from a single document of a repository.
>
> My first guess was to call the method
> "activities.ingestDocumentWithException" multiple times with the same
> identifier but distinct component identifiers during document processing.
>
> I wrote a simple TestConnector. The processDocuments method looks like:
>
>   public void processDocuments(String[] documentIdentifiers,
>       String[] versions, IProcessActivity activities,
>       DocumentSpecification spec, boolean[] scanOnly)
>       throws ManifoldCFException, ServiceInterruption {
>
>     int i = 0;
>     for (String identifier : documentIdentifiers) {
>
>       byte[] content1 = "test content 1".getBytes();
>       byte[] content2 = "test content 2".getBytes();
>       byte[] content3 = "test content 3".getBytes();
>
>       RepositoryDocument rd1 = new RepositoryDocument();
>       rd1.setBinary(new ByteArrayInputStream(content1), content1.length);
>
>       RepositoryDocument rd2 = new RepositoryDocument();
>       rd2.setBinary(new ByteArrayInputStream(content2), content2.length);
>
>       RepositoryDocument rd3 = new RepositoryDocument();
>       rd3.setBinary(new ByteArrayInputStream(content3), content3.length);
>
>       System.out.println("process " + identifier);
>
>       try {
>         activities.ingestDocumentWithException(identifier,
>             "comp1", versions[i], identifier+"/comp1", rd1);
>         activities.ingestDocumentWithException(identifier,
>             "comp2", versions[i], identifier+"/comp2", rd2);
>         activities.ingestDocumentWithException(identifier,
>             "comp3", versions[i], identifier+"/comp3", rd3);
>       } catch (IOException e) {
>         e.printStackTrace();
>       }
>
>       i++;
>     }
>
>   }
>
> For seeding, the method getDocumentIdentifiers() returns a stream with a
> single document identifier "testidentifier1".
> Full code is available at [1].
>
> But subsequent calls of ingestDocumentWithException result in deletions of
> a previously added component:
>
>   job end            1416573910333(copmtest1)  0   1
>   document ingest    testidentifier1/comp3     OK  12  8
>   document deletion  testidentifier1/comp1     OK  0   2
>   document ingest    testidentifier1/comp2     OK  12  10
>   document deletion  testidentifier1/comp1     OK  0   3
>   document ingest    testidentifier1/comp1     OK  12  13
>   job start          1416573910333(copmtest1)  0   1
>
> Only testidentifier1/comp2 and testidentifier1/comp3 exist in the output
> connection after the job is finished.
>
> I feel I might have a false understanding of the concept...
>
> Any help is appreciated.
>
> Thanks in advance
> Markus
>
> --
> Using ManifoldCF 1.7.1
> [0] http://manifoldcf.apache.org/release/release-1.7.2/en_US/writing-repository-connectors.html
> [1] https://gist.github.com/schuch/43809594ad8f81ddc625
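
For reference, the behavior the quoted example expects can be modeled roughly as below. This is only a sketch of the intended semantics, not ManifoldCF code: the class ComponentIndexSketch and its methods ingest, uriOf and size are hypothetical names made up for illustration. It assumes that each (document identifier, component identifier) pair is tracked as its own entry, so ingesting one component never deletes a sibling component of the same document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an output index. It is NOT part of ManifoldCF;
// it only models the semantics the quoted connector expects.
class ComponentIndexSketch {
    // documentIdentifier -> (componentIdentifier -> document URI)
    private final Map<String, Map<String, String>> entries = new LinkedHashMap<>();

    // Adds or replaces exactly one component; sibling components are untouched.
    void ingest(String documentIdentifier, String componentIdentifier, String uri) {
        entries.computeIfAbsent(documentIdentifier, k -> new LinkedHashMap<>())
               .put(componentIdentifier, uri);
    }

    // Returns the URI stored for one component, or null if it was never ingested.
    String uriOf(String documentIdentifier, String componentIdentifier) {
        Map<String, String> components = entries.get(documentIdentifier);
        return components == null ? null : components.get(componentIdentifier);
    }

    // Total number of components across all documents.
    int size() {
        int total = 0;
        for (Map<String, String> components : entries.values()) {
            total += components.size();
        }
        return total;
    }

    public static void main(String[] args) {
        ComponentIndexSketch index = new ComponentIndexSketch();
        String identifier = "testidentifier1";
        // Same document identifier, three distinct component identifiers,
        // mirroring the three ingestDocumentWithException calls in the quote.
        for (String component : new String[] {"comp1", "comp2", "comp3"}) {
            index.ingest(identifier, component, identifier + "/" + component);
        }
        System.out.println(index.size() + " components indexed");
    }
}
```

Under that assumption, all three components remain after the loop, which is the outcome the history in the quoted message fails to show for comp1.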
