Hi Marisol,

Would you mind creating a ticket and providing a patch? That way we can
test it on our end and include it in the next Manifold release.

Thanks,
Regards

2017-06-01 16:28 GMT+02:00 Marisol Redondo <[email protected]>:

> I fixed the problem.
>
> The problem is that the Confluence connector is getting the entity of the
> request with the default encoding ("ISO-8859-1"), and not UTF-8.
>
> To fix that, I made a change in the Confluence connector: each time it
> reads the request's entity, I use EntityUtils.toString(entity, "UTF-8").
>
> Thanks
>
>
> On 31 May 2017 at 10:13, Marisol Redondo <[email protected]> wrote:
>
>> Hi.
>>
>> I'm having problems with the encoding when injecting into Solr 6 in
>> standalone mode from a Confluence wiki.
>>
>> I have Manifold 2.5 with Tomcat 8.
>>
>> The repository connector of the job takes the information from a
>> Confluence wiki, and the output connector is Solr, using the Tika
>> transformation, a custom transformation and a Metadata adjuster.
>>
>> When the document is injected into Solr, the content of the document has
>> some characters that shouldn't be there because they are not in the
>> Confluence page, mainly a  character.
>>
>> I have checked Confluence and the Tomcat server where Manifold is
>> running; the HTTP request to Confluence has the Accept-Charset header set
>> to UTF-8, and the Solr server is accepting UTF-8.
>>
>> In the log, I have seen that when retrieving the information from
>> Confluence, the content is fine, and when it's sending the information to
>> Solr, it has the character. I have tried without using any transformer
>> and I get the same log entry.
>>
>> Is this a bug, or how can I resolve this?
>>
>> Thanks for your help
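
For anyone who wants to reproduce the effect described in the quoted
message, here is a minimal sketch using only JDK classes. It is not the
connector code: the class name EncodingDemo, the decode helper and the sample
text are made up for illustration, and the non-breaking space is only an
assumption about what kind of character triggers the issue. It shows that
UTF-8 bytes decoded as ISO-8859-1 gain an extra character, while decoding them
as UTF-8 returns the original text. As far as I understand,
EntityUtils.toString(entity, "UTF-8") has the same effect when the entity does
not declare a charset of its own.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: decode raw body bytes with an explicit charset.
    public static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Assumed sample text containing a non-breaking space (U+00A0).
        String original = "wiki\u00A0page";

        // U+00A0 is sent as the two bytes 0xC2 0xA0 in UTF-8.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // Wrong: each byte becomes one character, so 0xC2 shows up as U+00C2.
        String wrong = decode(wire, StandardCharsets.ISO_8859_1);

        // Right: the two bytes are read back as the single character U+00A0.
        String right = decode(wire, StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 length: " + wrong.length());
        System.out.println("UTF-8 length:      " + right.length());
        System.out.println("contains U+00C2:   " + (wrong.indexOf('\u00C2') >= 0));
        System.out.println("round trip ok:     " + right.equals(original));
    }
}
```

The lengths and booleans are printed instead of the strings themselves so the
result does not depend on the console encoding.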
