Committed a fix.

Karl

On Mon, Jun 12, 2017 at 7:27 PM, Karl Wright <[email protected]> wrote:

> There's already a ticket for this, assigned to me: CONNECTORS-1251. I'll
> freshen it up.
>
> Karl
>
> On Mon, Jun 12, 2017 at 2:52 PM, Furkan KAMACI <[email protected]>
> wrote:
>
>> Hi Marisol,
>>
>> You can create a ticket from here:
>> https://issues.apache.org/jira/projects/CONNECTORS
>>
>> Kind Regards,
>> Furkan KAMACI
>>
>> On Mon, 12 Jun 2017 at 18:25, Marisol Redondo <
>> [email protected]> wrote:
>>
>>> How can I do that?
>>>
>>> On 1 June 2017 at 16:43, Antonio David Pérez Morales <
>>> [email protected]> wrote:
>>>
>>>> Hi Marisol
>>>>
>>>> Would you mind creating a ticket and providing a patch?
>>>>
>>>> This way we can test it on our end and include it in the next
>>>> Manifold release.
>>>>
>>>> Thanks
>>>>
>>>> Regards
>>>>
>>>> 2017-06-01 16:28 GMT+02:00 Marisol Redondo <
>>>> [email protected]>:
>>>>
>>>>> I fixed the problem.
>>>>>
>>>>> The problem is that the Confluence connector is getting the entity of
>>>>> the request with the default encoding ("ISO-8859-1"), and not UTF-8.
>>>>>
>>>>> To fix that, I made a change in the Confluence connector, and each
>>>>> time it reads the request's entity I use
>>>>> EntityUtils.toString(entity, "UTF-8")
>>>>>
>>>>> Thanks
>>>>>
>>>>> On 31 May 2017 at 10:13, Marisol Redondo <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi.
>>>>>>
>>>>>> I'm having problems with the encoding when injecting into Solr 6 in
>>>>>> standalone mode from a Confluence wiki.
>>>>>>
>>>>>> I have Manifold 2.5 with Tomcat-8.
>>>>>>
>>>>>> The job's repository connector takes the information from a
>>>>>> Confluence wiki and the output connector is Solr, using the Tika
>>>>>> transformation, a custom transformation and a Metadata adjuster.
>>>>>>
>>>>>> When the document is injected into Solr, the content of the document
>>>>>> has some characters that shouldn't be there because they are not in the
>>>>>> Confluence page, mainly a  character.
>>>>>>
>>>>>> I have checked Confluence and the Tomcat server where Manifold is
>>>>>> running; the HTTP request to Confluence has the Accept-Charset header
>>>>>> set to UTF-8, and the Solr server is accepting UTF-8.
>>>>>>
>>>>>> In the log, I have seen that when retrieving the information from
>>>>>> Confluence, the content is fine, and when it's sending the information to
>>>>>> Solr, it has the character. I have tried without using any transformer
>>>>>> and get the same log entry.
>>>>>>
>>>>>> Is this a bug or how can I resolve this?
>>>>>>
>>>>>> Thanks for your help
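
The symptom discussed in this thread (stray characters appearing when UTF-8 content is read as ISO-8859-1) can be reproduced with a minimal sketch that uses only the Java standard library. This is not the connector code or the committed fix. The class name `EncodingSketch` and the helper `decode` are hypothetical, and the sample string with a non-breaking space is an assumption about the kind of content a wiki page might contain.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of ManifoldCF.
class EncodingSketch {

    // Hypothetical helper: turn raw response bytes into text
    // using an explicitly chosen charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Assumed sample content: "a", a non-breaking space (U+00A0), "b".
        String original = "a\u00A0b";

        // What a server sending UTF-8 would put on the wire.
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with ISO-8859-1 maps each byte to one character, so the
        // two-byte UTF-8 sequence C2 A0 becomes U+00C2 followed by U+00A0.
        String wrong = decode(body, StandardCharsets.ISO_8859_1);

        // Decoding with UTF-8 restores the original text.
        String right = decode(body, StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 length: " + wrong.length());
        System.out.println("UTF-8 length:      " + right.length());
        System.out.println("UTF-8 round trip:  " + right.equals(original));
    }
}
```

In Apache HttpClient, the two-argument `EntityUtils.toString(entity, "UTF-8")` mentioned above supplies the charset to fall back on when the response entity does not declare one, which is the same idea as passing `StandardCharsets.UTF_8` to `decode` here.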
