There's already a ticket for this, assigned to me. CONNECTORS-1251. I'll freshen it up.
Karl

On Mon, Jun 12, 2017 at 2:52 PM, Furkan KAMACI <[email protected]> wrote:

> Hi Marisol,
>
> You can create a ticket from here:
> https://issues.apache.org/jira/projects/CONNECTORS
>
> Kind Regards,
> Furkan KAMACI
>
> On Mon, 12 Jun 2017 at 18:25, Marisol Redondo <[email protected]> wrote:
>
>> How can I do that?
>>
>> On 1 June 2017 at 16:43, Antonio David Pérez Morales
>> <[email protected]> wrote:
>>
>>> Hi Marisol
>>>
>>> Would you mind creating a ticket and providing a patch?
>>>
>>> This way we can test it on our end and include it in the next
>>> Manifold release.
>>>
>>> Thanks
>>>
>>> Regards
>>>
>>> 2017-06-01 16:28 GMT+02:00 Marisol Redondo <[email protected]>:
>>>
>>>> I fixed the problem.
>>>>
>>>> The problem is that the Confluence connector is getting the entity
>>>> of the request with the default encoding ("ISO-8859-1"), and not
>>>> UTF-8.
>>>>
>>>> To fix that, I made a change in the Confluence connector: each
>>>> time it reads the request's entity, I use
>>>> EntityUtils.toString(entity, "UTF-8")
>>>>
>>>> Thanks
>>>>
>>>> On 31 May 2017 at 10:13, Marisol Redondo
>>>> <marisol.redondo.garcia@gmail.com> wrote:
>>>>
>>>>> Hi.
>>>>>
>>>>> I'm having problems with the encoding when injecting into Solr 6
>>>>> in standalone mode from a Confluence wiki.
>>>>>
>>>>> I have Manifold 2.5 with Tomcat-8.
>>>>>
>>>>> The job's repository connector takes the information from a
>>>>> Confluence wiki and the output connector is Solr, using the Tika
>>>>> transformation, a custom transformation and a Metadata adjuster.
>>>>>
>>>>> When the document is injected into Solr, the content of the
>>>>> document has some characters that shouldn't be there because
>>>>> they are not in the Confluence page, mainly a Â character.
>>>>>
>>>>> I have checked that Confluence and the Tomcat server where
>>>>> Manifold is running use UTF-8, that the HTTP request to
>>>>> Confluence has the Accept-Charset header set to UTF-8, and that
>>>>> the Solr server is accepting UTF-8.
>>>>>
>>>>> In the log, I have seen that when retrieving the information
>>>>> from Confluence, the content is fine, and when it's sending the
>>>>> information to Solr, it has the extra character. I have tried
>>>>> without using any transformer and I get the same log entry.
>>>>>
>>>>> Is this a bug or how can I resolve this?
>>>>>
>>>>> Thanks for your help
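
For anyone hitting the same symptom, here is a minimal, standard-library-only sketch of the mismatch described in the thread: bytes that were written as UTF-8 but decoded as ISO-8859-1 turn a non-breaking space into "Â" followed by the space. The class name, the decode helper and the sample string are made up for illustration; this is not the connector's code. The change quoted in the thread passes "UTF-8" to EntityUtils.toString instead; as far as I know, in HttpClient 4.x that argument acts as a default that is used only when the response does not declare a charset itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of ManifoldCF or HttpClient.
class EncodingDemo {

    // Decode raw response bytes with an explicitly chosen charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Sample page text with an accented letter and a non-breaking space (U+00A0).
        String original = "caf\u00e9\u00a0wiki";

        // What travels over the wire when the server sends UTF-8.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the wrong charset: each UTF-8 byte becomes its own character,
        // so the two-byte non-breaking space shows up as "Â" plus the space.
        String wrong = decode(wire, StandardCharsets.ISO_8859_1);

        // Decoding with the charset the bytes were written in restores the text.
        String right = decode(wire, StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1: " + wrong);
        System.out.println("UTF-8:      " + right);
        System.out.println("round trip ok: " + right.equals(original));
    }
}
```

The stray "Â" appears wherever the page contains a character whose UTF-8 encoding starts with the byte 0xC2, which fits the report that the extra character only shows up after the content has been read from Confluence.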
