[
https://issues.apache.org/jira/browse/CONNECTORS-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633333#comment-16633333
]
James Thomas commented on CONNECTORS-1532:
------------------------------------------
{quote}I have attached a patch for you to try. Please let me know if it
addresses the folder issue
{quote}
[[email protected]] I don't see any significant change in behaviour using
the same repro steps as comment-16623888.
The file is still shown on the file system at full size after the second run of
the job. Here are the log and file system details of this run:
{code:java}
INFO 2018-09-30T09:34:38.276Z (Startup thread) - Preparing non-continuous non-partial, either MODEL_ALL or fromBeginningOfTime, 1538299958323 for run: prepareFullScan
INFO 2018-09-30T09:35:58.444Z (Startup thread) - Preparing incremental scan for 1538299958323: prepareIncrementalScan
-rw-r--r--. 1 root root 27 Sep 30 11:34 drl?versionLabel=CURRENT&objectId=090000018000fc6b
-rw-r--r--. 1 root root 27 Sep 30 11:36 drl?versionLabel=CURRENT&objectId=090000018000fc6b{code}
I then repeated the repro in the same state just to see what would happen
(the result was essentially the same), after which I reset seeding on the job
and ran it again. Here are the file system and logs for that:
{code:java}
INFO 2018-09-30T09:48:51.303Z (Startup thread) - Preparing incremental scan for 1538299958323: prepareIncrementalScan
INFO 2018-09-30T09:50:06.483Z (Startup thread) - Preparing incremental scan for 1538299958323: prepareIncrementalScan
INFO 2018-09-30T09:51:06.800Z (Startup thread) - Preparing non-continuous non-partial, either MODEL_ALL or fromBeginningOfTime, 1538299958323 for run: prepareFullScan
$ ### after adding another file
-rw-r--r--. 1 root root 27 Sep 30 11:36 drl?versionLabel=CURRENT&objectId=090000018000fc6b
-rw-r--r--. 1 root root 27 Sep 30 11:49 drl?versionLabel=CURRENT&objectId=090000018000fc6c
$ ### after running the job again
-rw-r--r--. 1 root root 27 Sep 30 11:36 drl?versionLabel=CURRENT&objectId=090000018000fc6b
-rw-r--r--. 1 root root 27 Sep 30 11:50 drl?versionLabel=CURRENT&objectId=090000018000fc6c
$ ### after resetting seeding and running the job
-rw-r--r--. 1 root root 0 Sep 30 11:51 drl?versionLabel=CURRENT&objectId=090000018000fc6b
-rw-r--r--. 1 root root 0 Sep 30 11:51 drl?versionLabel=CURRENT&objectId=090000018000fc6c
{code}
So it appears that reseeding can give the desired outcome.
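For anyone repeating the repro, the check I am doing by eye on the 'ls -l' output (which files have been zeroed and which are still at full size) could be scripted. This is only a rough sketch: the function name split_by_size and the idea of pointing it at the job's output directory are my own assumptions, not anything provided by ManifoldCF.

```python
import os


def split_by_size(output_dir):
    """Return (zeroed, non_empty): sorted file names in output_dir by size.

    Hypothetical helper: output_dir is assumed to be the directory the
    File System output connector writes the drl?... files into.
    """
    zeroed, non_empty = [], []
    for entry in sorted(os.scandir(output_dir), key=lambda e: e.name):
        if not entry.is_file():
            continue  # ignore subdirectories and other non-files
        if entry.stat().st_size == 0:
            zeroed.append(entry.name)      # what a deleted document looks like
        else:
            non_empty.append(entry.name)   # still present at full size
    return zeroed, non_empty
```

Running it before and after each job run and comparing the two lists would show whether a moved file ever ends up in the zeroed list without a reseed.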
FYI, I applied this patch on top of the logging patch from this ticket, which
is itself on top of 2.10 patched for #1512, #1517:
{code:java}
$ wget https://issues.apache.org/jira/secure/attachment/12940883/CONNECTORS-1532.patch
$ dos2unix CONNECTORS-1532.patch
$ dos2unix connectors/documentum/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/DCTM/DCTM.java
$ patch -p0 -i CONNECTORS-1532.patch
$ ant build
$ find dist -type f -exec ls -l {} \; > /tmp/diff{code}
Here's the set of files that changed after I applied your patch:
{code:java}
$ grep Sep\ 30 /tmp/diff | grep jar
-rw-rw-r-- 1 james staff 12544 Sep 30 10:01 dist/connector-lib/mcf-documentum-connector-rmistub.jar
-rw-rw-r-- 1 james staff 100864 Sep 30 10:01 dist/connector-lib/mcf-documentum-connector.jar
-rw-rw-r-- 1 james staff 6292 Sep 30 10:01 dist/connector-lib/mcf-filenet-connector-rmistub.jar
-rw-rw-r-- 1 james staff 3916082 Sep 30 10:02 dist/connector-lib/mcf-meridio-connector.jar
-rw-rw-r-- 1 james staff 838582 Sep 30 10:02 dist/connector-lib/mcf-sharepoint-connector.jar
-rw-rw-r-- 1 james staff 8567 Sep 30 10:01 dist/processes/documentum-server/lib/mcf-documentum-connector-rmiskel.jar
-rw-rw-r-- 1 james staff 4494 Sep 30 10:01 dist/processes/filenet-server/lib/mcf-filenet-connector-rmiskel.jar
{code}
I stopped my MCF instance and the DM processes, applied the changed files, and
restarted the DM processes and the MCF server. Then I attempted the repro I
described above.
It's possible that I haven't applied the patch correctly. Is there something I
can do to check?
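One thing I can think of doing myself is comparing checksums of the rebuilt jars against the copies in the running installation. A minimal sketch follows, assuming the deployed directory mirrors the layout of dist (which may not hold for every install); the function names and both directory arguments are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_jars(build_dir, deployed_dir):
    """List jars under build_dir that are missing or different in deployed_dir.

    Assumption: deployed_dir uses the same relative paths as build_dir,
    e.g. connector-lib/mcf-documentum-connector.jar in both trees.
    """
    build_dir, deployed_dir = Path(build_dir), Path(deployed_dir)
    problems = []
    for jar in sorted(build_dir.rglob("*.jar")):
        relative = jar.relative_to(build_dir)
        target = deployed_dir / relative
        if not target.is_file() or sha256_of(jar) != sha256_of(target):
            problems.append(relative.as_posix())
    return problems
```

An empty list from mismatched_jars("dist", deployed_dir) would at least confirm that the jars I built are the ones being run; it would not confirm that the patch itself applied cleanly to DCTM.java.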
> Moving a file outside of the job's Paths is not the same as deleting it
> -----------------------------------------------------------------------
>
> Key: CONNECTORS-1532
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1532
> Project: ManifoldCF
> Issue Type: Bug
> Components: Documentum connector
> Affects Versions: ManifoldCF 2.10
> Environment: Manifold 2.10 patched for #1512, #1517
> Reporter: James Thomas
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.12
>
> Attachments: 2018-09-19_1758.png, CONNECTORS-1532.patch,
> logging_patch.diff
>
>
> If I have a MF job which is connecting a specific folder, F, in Documentum to
> a File System output then:
> 1. deleting files in Documentum shows them as zero size in the file system
> 2. moving files out of F does not remove them or zero them in the file system
> Note that moving a file from another folder (which the job is not looking at)
> to F has the same effect as adding it to F by e.g. importing it in DM or
> POSTing it to DM via the REST interface.
> Intuitively, I expect that moving a file out of the "view" of the Documentum
> connector would have the same effect on the File System as deleting it. (My
> model here is of MF synchronising content between the Paths (DM) and the
> Output Path (File System) that I have specified in the job.)
> Starting point, I have run the MF job to fetch a bunch of files from a folder
> - call it F - in DM (i.e. I have configured Paths in the job to be F). This
> is what 'ls -l' on the file system looks like:
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 85772 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7bf{code}
> In DM, I delete one of the files in F and it shows as zero size, and the
> modification date has changed:
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23
> drl?versionLabel=CURRENT&objectId=090000018000f7c7{code}
> In DM, I move a file from F to another folder. (Right click, add to
> clipboard, go to new folder, Edit> Move here).
> The file shows as modified (07:25), but is still apparently in F (i.e. in the
> Path my MF job is looking at):
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23
> drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25
> drl?versionLabel=CURRENT&objectId=090000018000f7c4{code}
> In DM, I move a file from another folder to F and it shows up with the
> timestamp of the move (07:28):
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23
> drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25
> drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:28
> drl?versionLabel=CURRENT&objectId=09000001800045b9{code}
> But if I immediately move it out in DM then, again, the timestamp (07:30)
> alters but the file apparently remains:
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21
> drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22
> drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23
> drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25
> drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:30
> drl?versionLabel=CURRENT&objectId=09000001800045b9{code}
> In DM, I now delete all visible content in F. The files that were moved out
> of F, and are not visible in F in DM, remain on the file system:
> {code:java}
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23
> drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25
> drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:30
> drl?versionLabel=CURRENT&objectId=09000001800045b9
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31
> drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31
> drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31
> drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31
> drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31
> drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31
> drl?versionLabel=CURRENT&objectId=090000018000f7c3{code}