[ https://issues.apache.org/jira/browse/CONNECTORS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575661#comment-14575661 ]
Karl Wright commented on CONNECTORS-1203:
-----------------------------------------
Hi Dale,
In order to actually see what's going on, we will need to change a header in
the request that HttpClient is making to the server.
It turns out that HttpClient's current default behavior is to send an
"Accept-Encoding: gzip,deflate" header on all requests unless told otherwise.
So we will need to tell it otherwise.
In the file:
framework/connector-common/src/main/java/org/apache/manifoldcf/connectorcommon/common/CommonsHTTPSender.java
around line 312, you will see:
{code}
method.setHeader(new BasicHeader("Accept","*/*"));
{code}
Please add a line:
{code}
method.setHeader(new BasicHeader("Accept-Encoding",""));
{code}
You will then need to rebuild and repeat your data gathering. You should then
see plain XML going back and forth, rather than compressed gobbledegook.
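As a rough illustration of why the captured traffic is unreadable while compression is negotiated, here is a minimal JDK-only sketch. It is not ManifoldCF or HttpClient code: the class name GzipWireDemo, its helper methods, and the sample XML are all made up for the example. It gzips a small XML string, shows that the resulting bytes start with the gzip magic number instead of readable text, and then decompresses them again.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of ManifoldCF or HttpClient.
class GzipWireDemo {

    // Compress text the way a server may when the client advertises gzip support.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the gzip trailer has been written.
        return buffer.toByteArray();
    }

    // Reverse of gzip(): recover the original text from the compressed bytes.
    static String gunzip(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // A gzip stream always starts with the magic bytes 0x1f 0x8b.
    static boolean looksGzipped(byte[] body) {
        return body.length >= 2
            && (body[0] & 0xff) == 0x1f
            && (body[1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) {
        // Made-up sample payload, only loosely resembling a SharePoint response.
        String xml = "<?xml version=\"1.0\"?><Field Name=\"_ModerationStatus\"/>";
        byte[] onTheWire = gzip(xml);
        System.out.println("looks gzipped: " + looksGzipped(onTheWire));
        System.out.println("raw view:      " + new String(onTheWire, StandardCharsets.ISO_8859_1));
        System.out.println("decompressed:  " + gunzip(onTheWire));
    }
}
```

If a captured response body starts with those two magic bytes, the server is still compressing, which would suggest the header change has not taken effect in the build being run.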
Thanks!
> Erratic handling of Sharepoint 2010 _ModerationStatus metadata
> --------------------------------------------------------------
>
> Key: CONNECTORS-1203
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1203
> Project: ManifoldCF
> Issue Type: Bug
> Components: SharePoint connector
> Affects Versions: ManifoldCF 1.7.2
> Reporter: Dale Dreiske
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.10, ManifoldCF 2.2
>
> Attachments: debug7.log
>
>
> The ManifoldCF Sharepoint 2010 connector handles the Approval Status metadata
> inconsistently. In some cases it does not provide access to Approval Status
> at all.
> On /mcf-crawler-ui/execute.jsp#metapathwidget :
> * The field name appears in the drop down list as "Approval Status" when
> adding a new rule.
> * The field name is NOT available in the drop down list for top level sites.
> * The field name is listed as "_ModerationStatus" for existing rules.
> With connector debug turned on, the ManifoldCF logs show the field coming
> from Sharepoint as "ows__ModerationStatus". This is consistent across all
> pages, even when the field is not added to the metadata rules.
> When sent to Solr, it appears in one of these 4 forms:
> * "ows__ModerationStatus"
> * "_ModerationStatus"
> * "_moderationstatus"
> * In some cases it does not get passed at all.
> This issue is most troublesome when this field is not displayed for creating
> new metadata rules. It appears it is only available when creating rules for
> pages in low level sites. Example paths:
> * /abc - does not work for top level sharepoint sites
> * /abc/xyz - works, but passes the name inconsistently