[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578357#comment-16578357
]
James Thomas commented on CONNECTORS-1517:
------------------------------------------
Hi Karl, I have now had a chance to try out the patches. I'll attach a
transcript which shows the queries executed (from manifoldcf.log) when I ran a
job with particular configuration in the Content Types tab of the Documentum
Connector.
My observations and thoughts:
* The core bug that I reported (that editing the Content Types tab and then
resetting it results in different semantics at search time) appears fixed.
* The default search is still unconstrained.
* It is surprising to be able to have both "No content type restriction" and
any other checkbox checked at the same time. I wonder whether "No content type
restriction" checked should disable all of the others?
* It is surprising to be able to submit with no checkboxes checked and get a
query constrained with "1<0", since that condition can never be true.
* Generalising the above, I think I'd prefer to see some more restriction on
which combinations of boxes can be checked at the same time.
* It would be convenient as a user to have a control for check all/uncheck all.
* I haven't been using ManifoldCF/Documentum long enough to know whether there
are likely to be backwards compatibility issues in changing the UI this way.
[^Notes.txt]
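
As a rough illustration of the checkbox semantics discussed above, here is a minimal Python sketch of how the a_content_type constraint could be derived purely from the final state of the tab. It is not the connector's actual code: the function name `content_type_clause` is hypothetical, the rule that "No content type restriction" overrides every other checkbox is only the suggestion made above, and the quote escaping assumes SQL-style doubling of single quotes.

```python
from typing import Iterable, Optional


def content_type_clause(no_restriction: bool, selected: Iterable[str]) -> Optional[str]:
    """Return a query fragment for a_content_type, or None for no constraint.

    Hypothetical helper for illustration only; it depends solely on the
    final checkbox state, never on how that state was reached.
    """
    if no_restriction:
        # Assumption: "No content type restriction" overrides all other boxes.
        return None
    types = sorted(set(selected))
    if not types:
        # Nothing checked: an always-false predicate, like the "1<0" observed.
        return "1<0"
    # Assumption: single quotes are escaped by doubling them.
    quoted = ", ".join("'" + t.replace("'", "''") + "'" for t in types)
    return "a_content_type IN (" + quoted + ")"


# Removing a type and re-adding it yields the same clause as never editing.
initial = {"123w", "acad", "zip_pub_html"}
restored = (initial - {"123w"}) | {"123w"}
same = content_type_clause(False, restored) == content_type_clause(False, initial)
```

Because the clause is a function of the current selection only, two jobs that look identical in the UI would issue identical queries, whatever edits led to that selection.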
> Documentum Connector uses different "unconstrained" a_content_type filters
> depending on whether the Content Types tab has been edited
> -------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CONNECTORS-1517
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1517
> Project: ManifoldCF
> Issue Type: Bug
> Components: Documentum connector
> Affects Versions: ManifoldCF 2.10
> Reporter: James Thomas
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.11
>
> Attachments: CONNECTORS-1517-2.patch, CONNECTORS-1517.patch, Notes.txt
>
>
> I am using ManifoldCF 2.10 patched for issue
> https://issues.apache.org/jira/browse/CONNECTORS-1512
> I find that the "unconstrained" query submitted to Documentum differs
> depending on whether the Content Types in the job have been edited or not.
> This can dramatically affect which files are fetched. After editing, there
> are likely to be fewer.
> For example, after simply creating a job connecting to DM and setting only
> the Paths value to Administrator/james, the following request is generated
> (taken from manifoldcf.log).
> Note that there are no a_content_type constraints (line breaks are mine, for
> readability):
> {code:java}
> DEBUG 2018-07-26T05:52:56,422 (Startup thread) - DCTM: About to execute
> query= (select for READ distinct i_chronicle_id from dm_document where
> r_modify_date >= date('01/01/1970 01:00:00','mm/dd/yyyy hh:mi:ss') and
> r_modify_date<=date('07/26/2018 05:52:56','mm/dd/yyyy hh:mi:ss') AND
> (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND
> r_content_size>0))
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> Once the Content Types tab has been edited (e.g. to remove the 123w type) it
> looks like this, i.e. the search is constrained to only the selected types
> (ellipsis is mine, for readability):
> {code:java}
> DEBUG 2018-07-26T05:58:36,755 (Startup thread) - DCTM: About to execute
> query= (select for READ distinct i_chronicle_id from dm_document where
> r_modify_date >= date('01/01/1970 01:00:00','mm/dd/yyyy hh:mi:ss') and
> r_modify_date<=date('07/26/2018 05:58:36','mm/dd/yyyy hh:mi:ss') AND
> (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND
> r_content_size>0
> AND a_content_type IN ('acad', ... 'zip_pub_html')))
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> If the 123w type is now reselected in the Content Types tab, the search adds
> it to the list of a_content_type entries, but doesn't return to the
> unconstrained initial search:
> {code:java}
> DEBUG 2018-07-26T05:59:16,863 (Startup thread) - DCTM: About to execute
> query= (select for READ distinct i_chronicle_id from dm_document where
> r_modify_date >= date('01/01/1970 01:00:00','mm/dd/yyyy hh:mi:ss') and
> r_modify_date<=date('07/26/2018 05:59:16','mm/dd/yyyy hh:mi:ss') AND
> (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND
> r_content_size>0
> AND a_content_type IN ('123w', ... 'zip_pub_html')))
> AND ( Folder('/Administrator/james', DESCEND) ))
> {code}
> This means that running what appears to be an equivalent job several times
> may not fetch the same set of documents from Documentum.
> I expect that the same configuration in the UI produces the same search to
> Documentum, regardless of how the configuration was arrived at.
> If the selected items in the Content Types list are treated as the only set of
> files to fetch (i.e. the initial unconstrained search is considered
> incorrect here) then I guess I might also like to have flexibility to fetch
> file types not on the checklist in the Content Types tab.