[ 
https://issues.apache.org/jira/browse/CONNECTORS-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569211#comment-15569211
 ] 

Konstantin Avdeev commented on CONNECTORS-1325:
-----------------------------------------------

Hi Karl,

I think the issue can be reproduced easily by putting an emoji (e.g. 😀) into 
a field of a task list:

{code}
DEBUG 2016-10-12 18:32:47,521 (Worker thread '72') - SharePoint: getListItems 
FileRef value 'sites/test-team/Lists/Main Task List/5_.000', xml response: 
'<ns1:listitems xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" 
xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" 
xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" 
xmlns:ns1="http://schemas.microsoft.com/sharepoint/soap/">
<rs:data ItemCount="1">
   <z:row ows_Modified="2016-10-12 17:30:55" ows_Created="2016-10-12 17:30:55" 
ows_ID="5" ows_GUID="{E583E8D8-52A7-4CD8-8A5F-6354D57D1E40}" ows_MetaInfo="5;#" 
ows__ModerationStatus="0" ows__Level="1" ows_Title="Task emoji >>>😀<<<" 
ows_UniqueId="5;#{8F6DF977-9814-4AA0-B7AE-E29838C508CF}" 
ows_owshiddenversion="1" ows_FSObjType="5;#0" ows_PermMask="0x7fffffffffffffff" 
ows_FileRef="5;#sites/test-team/Lists/Main Task List/5_.000"/>
</rs:data>
</ns1:listitems>'
DEBUG 2016-10-12 18:32:47,522 (Worker thread '72') - SharePoint: Can't get 
version of '/Main Task List///5_.000' because of bad XML characters(?)
{code}
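
For reference, a small sketch of why this emoji lines up with the "&#xD83D" in the reported error. U+1F600 lies outside the Basic Multilingual Plane, so UTF-16 stores it as the surrogate pair D83D DE00, and a surrogate on its own is not a legal XML 1.0 character. The class and method names below are hypothetical and are not part of ManifoldCF.

```java
// Hypothetical demo class; nothing here is ManifoldCF code.
class SurrogateDemo {

    /** Returns the UTF-16 code units of s as upper-case hex, separated by spaces. */
    static String utf16Units(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /** Char production of XML 1.0: code points that may appear in a document. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    public static void main(String[] args) {
        // Build the emoji from its code point to avoid source-encoding issues.
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println(utf16Units(emoji));        // D83D DE00
        System.out.println(isValidXmlChar(0x1F600));  // true
        System.out.println(isValidXmlChar(0xD83D));   // false
    }
}
```

So a character reference to the full code point would be accepted by the parser, while a reference to the high surrogate alone is rejected.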

Thanks!

> Invalid XML character causing job to abort
> ------------------------------------------
>
>                 Key: CONNECTORS-1325
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1325
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: SharePoint connector
>    Affects Versions: ManifoldCF 2.3
>            Reporter: Phil
>            Assignee: Karl Wright
>            Priority: Blocker
>             Fix For: ManifoldCF 2.5
>
>         Attachments: CONNECTORS-1325-2.patch, CONNECTORS-1325-3.patch, 
> CONNECTORS-1325.patch
>
>
> The following error causes the ManifoldCF job to abort, so the job is never 
> able to finish.
> It would be good to have the crawler log this error without throwing an 
> exception that stops the entire job.
> {code}
> ERROR 2016-06-21 19:01:54,562 (Worker thread '6') system.WorkerThread - 
> Exception tossed: XML parsing error: Character reference "&#xD83D" is an 
> invalid XML character.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: XML parsing error: 
> Character reference "&#xD83D" is an invalid XML character.
>         at org.apache.manifoldcf.core.common.XMLDoc.init(XMLDoc.java:390)
>         at org.apache.manifoldcf.core.common.XMLDoc.<init>(XMLDoc.java:286)
>         at 
> org.apache.manifoldcf.crawler.connectors.sharepoint.SPSProxyHelper.getFieldValues(SPSProxyHelper.java:2039)
>         at 
> org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:974)
>         at 
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
> Caused by: org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 64; 
> Character reference "&#xD83D" is an invalid XML character.
>         at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>         at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>         at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
>         at org.apache.manifoldcf.core.common.XMLDoc.init(XMLDoc.java:359)
>         ... 4 more
> {code}
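
One possible workaround, sketched here under assumptions and not taken from the attached patches: clean up the response text before it reaches the XML parser. The sketch assumes the server emits each UTF-16 surrogate as its own hexadecimal character reference (for example "&#xD83D;&#xDE00;"), which the log above suggests but does not show directly. It merges such pairs into a single reference and drops any surrogate reference left unpaired. The class name CharRefSanitizer is hypothetical; decimal references and references with leading zeros are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of ManifoldCF or of the attached patches.
class CharRefSanitizer {

    // A high-surrogate reference immediately followed by a low-surrogate reference.
    private static final Pattern PAIR = Pattern.compile(
        "&#x(D[89AB][0-9A-F]{2});&#x(D[C-F][0-9A-F]{2});",
        Pattern.CASE_INSENSITIVE);

    // Any surrogate reference that is still present after pairs were merged.
    private static final Pattern LONE = Pattern.compile(
        "&#xD[89A-F][0-9A-F]{2};",
        Pattern.CASE_INSENSITIVE);

    /** Merges surrogate-pair references and removes unpaired surrogate references. */
    static String sanitize(String xml) {
        Matcher m = PAIR.matcher(xml);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            char high = (char) Integer.parseInt(m.group(1), 16);
            char low = (char) Integer.parseInt(m.group(2), 16);
            int codePoint = Character.toCodePoint(high, low);
            m.appendReplacement(sb,
                "&#x" + Integer.toHexString(codePoint).toUpperCase() + ";");
        }
        m.appendTail(sb);
        return LONE.matcher(sb).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("Task emoji &#xD83D;&#xDE00;")); // Task emoji &#x1F600;
        System.out.println(sanitize("Task emoji &#xD83D;"));         // Task emoji 
    }
}
```

Dropping an unpaired surrogate loses that character, so a caller would probably want to log the document identifier whenever the output differs from the input.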



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
