[
https://issues.apache.org/jira/browse/CONNECTORS-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621019#comment-16621019
]
Karl Wright commented on CONNECTORS-1532:
-----------------------------------------
As I suspected, there is no code difference in the framework between MODEL_ADD
and MODEL_ADD_CHANGE:
{code}
./build/crawler-ui/java/org/apache/jsp/editjob_jsp.java: int model = IRepositoryConnector.MODEL_ADD_CHANGE_DELETE;
./build/crawler-ui/java/org/apache/jsp/editjob_jsp.java: if (model != -1 && model != IRepositoryConnector.MODEL_ADD_CHANGE_DELETE && model != IRepositoryConnector.MODEL_CHAINED_ADD_CHANGE_DELETE)
./build/crawler-ui/java/org/apache/jsp/viewjob_jsp.java: if (model != -1 && model != IRepositoryConnector.MODEL_ADD_CHANGE_DELETE)
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/interfaces/IRepositoryConnector.java:* is the most restrictive that is still accurate. For example, if MODEL_ADD_CHANGE_DELETE applies, you would
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/interfaces/IRepositoryConnector.java:* return that value rather than MODEL_ADD.
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/interfaces/IRepositoryConnector.java: public static final int MODEL_ADD = 1;
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/interfaces/IRepositoryConnector.java: public static final int MODEL_ADD_CHANGE = 2;
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/interfaces/IRepositoryConnector.java: public static final int MODEL_ADD_CHANGE_DELETE = 3;
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/interfaces/IRepositoryConnector.java: /** Like MODEL_ADD, except considering document discovery */
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/interfaces/IRepositoryConnector.java: /** Like MODEL_ADD_CHANGE, except considering document discovery */
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/interfaces/IRepositoryConnector.java: /** Like MODEL_ADD_CHANGE_DELETE, except considering document discovery */
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/jobs/JobManager.java: // (1) If the connector has MODEL_ADD_CHANGE_DELETE, then
./pull-agent/src/main/java/org/apache/manifoldcf/crawler/jobs/JobManager.java: if (connectorModel == IRepositoryConnector.MODEL_ADD_CHANGE_DELETE)
{code}
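To make the point concrete, here is a minimal sketch (not framework code) of why the two models end up indistinguishable. It assumes the only framework-side test is an equality check against MODEL_ADD_CHANGE_DELETE, as the JobManager match above suggests. The class and method names are hypothetical; only the three constant values are taken from the grep output.

```java
// Hypothetical illustration; not part of ManifoldCF.
final class ModelBranchSketch {
    // Values copied from the IRepositoryConnector matches in the grep output.
    static final int MODEL_ADD = 1;
    static final int MODEL_ADD_CHANGE = 2;
    static final int MODEL_ADD_CHANGE_DELETE = 3;

    /**
     * Same shape as the one check found in JobManager: an equality test
     * against MODEL_ADD_CHANGE_DELETE. What the framework does inside
     * that branch is not shown in the grep output and is not modelled here.
     */
    static boolean isAddChangeDelete(int connectorModel) {
        return connectorModel == MODEL_ADD_CHANGE_DELETE;
    }

    public static void main(String[] args) {
        // MODEL_ADD and MODEL_ADD_CHANGE fall on the same side of the check.
        System.out.println(isAddChangeDelete(MODEL_ADD));               // false
        System.out.println(isAddChangeDelete(MODEL_ADD_CHANGE));        // false
        System.out.println(isAddChangeDelete(MODEL_ADD_CHANGE_DELETE)); // true
    }
}
```

Any code path that branches only on this test cannot tell MODEL_ADD from MODEL_ADD_CHANGE, which is consistent with what the grep shows.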
I may have found the reason you see this behavior, though. If the folder
affinity is versioned information, and I believe it is, then the seeding query
will pick up the last version of the document that was in the right folder.
That's because the seeding query uses the chronicle_id, which is really a
specific document version:
{code}
String strDQLstart = "select for READ distinct i_chronicle_id from ";
{code}
Unfortunately, I don't know the DQL needed to check that the particular
version of the document that matched is also the latest one.
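The suspected effect can be sketched in plain Java without touching Documentum at all. This is only a model of the hypothesis above, namely that folder membership is stored per version and that seeding collects the distinct chronicle ID of any version found in the folder. The DocVersion class, its fields, and both seed methods are hypothetical stand-ins, not connector or DFC code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the suspected seeding behavior; not connector code.
final class ChronicleSeedingSketch {
    /** One stored version of a document (all fields are assumptions). */
    static final class DocVersion {
        final String chronicleId; // shared by every version of one document
        final String folder;      // folder this version is linked to
        final boolean current;    // true only for the latest version

        DocVersion(String chronicleId, String folder, boolean current) {
            this.chronicleId = chronicleId;
            this.folder = folder;
            this.current = current;
        }
    }

    /** Seeds from ANY version in the folder, like a distinct-chronicle-ID query over all versions. */
    static Set<String> seedAnyVersion(List<DocVersion> versions, String folder) {
        Set<String> ids = new LinkedHashSet<>();
        for (DocVersion v : versions) {
            if (v.folder.equals(folder)) {
                ids.add(v.chronicleId);
            }
        }
        return ids;
    }

    /** Seeds only from versions that are both in the folder and current. */
    static Set<String> seedCurrentOnly(List<DocVersion> versions, String folder) {
        Set<String> ids = new LinkedHashSet<>();
        for (DocVersion v : versions) {
            if (v.current && v.folder.equals(folder)) {
                ids.add(v.chronicleId);
            }
        }
        return ids;
    }

    /** Document A stays in F; document B was moved from F to G, leaving an old version in F. */
    static List<DocVersion> sample() {
        List<DocVersion> versions = new ArrayList<>();
        versions.add(new DocVersion("A", "F", true));
        versions.add(new DocVersion("B", "F", false));
        versions.add(new DocVersion("B", "G", true));
        return versions;
    }

    public static void main(String[] args) {
        System.out.println(seedAnyVersion(sample(), "F"));  // [A, B]
        System.out.println(seedCurrentOnly(sample(), "F")); // [A]
    }
}
```

If the hypothesis holds, the first method matches what the reporter sees (the moved document B is still seeded for F), while the second is the behavior a "latest version only" condition in the DQL would be expected to give.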
> Moving a file outside of the job's Paths is not the same as deleting it
> -----------------------------------------------------------------------
>
> Key: CONNECTORS-1532
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1532
> Project: ManifoldCF
> Issue Type: Bug
> Components: Documentum connector
> Affects Versions: ManifoldCF 2.10
> Environment: Manifold 2.10 patched for #1512, #1517
> Reporter: James Thomas
> Assignee: Karl Wright
> Priority: Major
> Attachments: 2018-09-19_1758.png
>
>
> If I have a MF job which is connecting a specific folder, F, in Documentum to
> a File System output then:
> 1. deleting files in Documentum shows them as zero size in the file system
> 2. moving files out of F does not remove them or zero them in the file system
> Note that moving a file from another folder (which the job is not looking at)
> to F has the same effect as adding it to F by e.g. importing it in DM or
> POSTing it to DM via the REST interface.
> Intuitively, I expect that moving a file out of the "view" of the Documentum
> connector would have the same effect on the File System as deleting it. (My
> model here is of MF synchronising content between the Paths (DM) and the
> Output Path (File System) that I have specified in the job.)
> As a starting point, I have run the MF job to fetch a bunch of files from a
> folder in DM, call it F (i.e. I have configured Paths in the job to be F).
> This is what 'ls -l' on the file system looks like:
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 85772 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf{code}
> In DM, I delete one of the files in F. On the file system it now shows as
> zero size, and its modification time has changed (07:23):
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7{code}
> In DM, I move a file from F to another folder (right click, add to
> clipboard, go to new folder, Edit > Move here).
> On the file system the file shows as modified (07:25), but it is still
> apparently in F (i.e. in the Path my MF job is looking at):
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25 drl?versionLabel=CURRENT&objectId=090000018000f7c4{code}
> In DM, I move a file from another folder to F and it shows up with the
> timestamp of the move (07:28):
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:28 drl?versionLabel=CURRENT&objectId=09000001800045b9{code}
> But if I immediately move it out in DM then, again, the timestamp (07:30)
> alters but the file apparently remains:
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:30 drl?versionLabel=CURRENT&objectId=09000001800045b9{code}
> In DM, I now delete all visible content in F. The files that were moved out
> of F, and are not visible in F in DM, remain on the file system:
> {code:java}
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:30 drl?versionLabel=CURRENT&objectId=09000001800045b9
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7c3{code}