Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-16 Thread Alcides Carlos de Moraes Neto
A successful execution of update-discovery-index -b with the proper LANG environment variable (pt_BR, UTF-8) seems to have fixed the issue. Ats, Alcides Carlos de Moraes Neto Sometimes I think we're alone. Sometimes I think we're not. In either case, the thought is staggering. - R. Buckminster
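The fix reported above depends on the locale that the indexing process inherits. A minimal sketch of pinning it explicitly when launching the rebuild follows. The launcher path `/dspace/bin/dspace` and the exact locale string `pt_BR.UTF-8` are assumptions for illustration; the message only names the command `update-discovery-index -b` and the locale "pt_BR, UTF-8".

```python
import os
import subprocess

# Hypothetical install location; adjust to the real [dspace] directory.
DSPACE_BIN = "/dspace/bin/dspace"


def build_reindex_call(locale="pt_BR.UTF-8"):
    """Return (command, environment) for a Discovery rebuild with LANG pinned.

    The locale string is an assumption based on the "pt_BR, UTF-8" wording
    in the message. Note that LC_ALL, if set, takes precedence over LANG.
    """
    env = dict(os.environ)  # copy, so the parent process is left untouched
    env["LANG"] = locale
    cmd = [DSPACE_BIN, "update-discovery-index", "-b"]
    return cmd, env


cmd, env = build_reindex_call()

# Only run when the (assumed) launcher actually exists on this machine.
if __name__ == "__main__" and os.path.exists(DSPACE_BIN):
    subprocess.run(cmd, env=env, check=True)
```

Passing the environment explicitly avoids depending on whatever locale a cron job or service account happens to have.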

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-11 Thread Alcides Carlos de Moraes Neto
As I suspected, it's the SOLR index that's messed up. Executing this SOLR query:

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-10 Thread Alcides Carlos de Moraes Neto
I ran update-discovery-index -f, but the results still show encoding issues. http://www2.senado.leg.br/bdsf/discover?filtertype_0=type&filter_relational_operator_0=notequals&filter_0=not%C3%ADcia+de+jornal&submit_apply_filter=Aplicar&query=andr%C3%A9+luiz+lopes+de+alcantara I'm stumped right now,

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-06 Thread Alcides Carlos de Moraes Neto
Just a follow-up. filter-media -f seems to have fixed the issue with the OCR txt. But some search results still show encoding issues. I believe I need to regenerate the Solr index. Ats, Alcides Carlos de Moraes Neto Sometimes I think we're alone. Sometimes I think we're not. In either case, the

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-03 Thread Tiago Rodrigo Marçal Murakami
Hi Alcides, We edit the Tomcat file server.xml to force UTF-8: <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="2" URIEncoding="UTF-8" redirectPort="8443" /> Att, Tiago R. M. Murakami Comunicação Científica e Acadêmica Departamento Técnico -
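To illustrate why the URIEncoding setting matters, here is a small sketch of what happens when a percent-encoded UTF-8 query parameter is decoded with the wrong charset. It uses only Python's standard library and does not involve Tomcat itself. The sample value is taken from the search URL posted later in this thread, and ISO-8859-1 is used as the "wrong" charset on the assumption that the connector falls back to it when URIEncoding is not set (the default in older Tomcat versions).

```python
from urllib.parse import unquote_plus

# Query fragment as the browser sends it: UTF-8 bytes, percent-encoded.
raw = "andr%C3%A9+luiz"

# Decoded with the charset the browser actually used.
as_utf8 = unquote_plus(raw, encoding="utf-8")

# Decoded as ISO-8859-1: each UTF-8 byte becomes its own character.
as_latin1 = unquote_plus(raw, encoding="latin-1")

print(as_utf8)    # andré luiz
print(as_latin1)  # andrÃ© luiz
```

The second result is the kind of garbling a servlet would receive if the container decoded the URI with a single-byte charset.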

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-03 Thread Alcides Carlos de Moraes Neto
Hello helix, thank you for your input. Indeed, it is a problem with the filter-media generated txt. A filter-media -f resolved the issue for this specific item. I scheduled a full filter-media -f of the repository tonight. Ats, Alcides Carlos de Moraes Neto 2013/9/3 helix84

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-03 Thread Alcides Carlos de Moraes Neto
Thank you all, Tomcat is set to URIEncoding=UTF-8, so that's not the issue. I'm suspecting that filter-media is generating invalid .txt but haven't found anything yet. Ats, Alcides Carlos de Moraes Neto 2013/9/3 Tiago Rodrigo Marçal Murakami tiago.murak...@dt.sibi.usp.br Hi Alcides, We

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-03 Thread helix84
On Tue, Sep 3, 2013 at 1:24 AM, Alcides Carlos de Moraes Neto alcides.n...@gmail.com wrote: I have checked the .txt media-filter generates, they are all UTF-8. What I see (see attachment) looks like double-encoded UTF-8 (it happens when a charset converter is told that a file is to be encoded
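The double-encoding effect described above can be reproduced in a few lines. This is a sketch under the assumption that the intermediate misreading is ISO-8859-1; in practice it may be another single-byte charset such as windows-1252, in which case the repair step needs that charset instead. The function names are illustrative only and are not part of DSpace.

```python
def double_encode(text: str) -> bytes:
    """Simulate the bug: UTF-8 bytes are misread as Latin-1, then re-encoded as UTF-8."""
    utf8_bytes = text.encode("utf-8")
    misread = utf8_bytes.decode("latin-1")  # wrong source charset assumed here
    return misread.encode("utf-8")


def repair(data: bytes) -> str:
    """Undo one round of double encoding (valid only under the Latin-1 assumption)."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


original = "andré luiz"
garbled = double_encode(original)

print(garbled.decode("utf-8"))  # andrÃ© luiz
print(repair(garbled))          # andré luiz
```

A file damaged this way is still valid UTF-8, which is why a plain "is it UTF-8?" check on the generated .txt files does not catch the problem.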

[Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-02 Thread Alcides Carlos de Moraes Neto
Hello all, We have this problem with our current dspace 3.1 installation. Discovery search results show some invalid characters due to encoding issues. Only the full text search/highlight portion of the results has this problem. Example:

Re: [Dspace-tech] Encoding problem in discovery search results, xmlui

2013-09-02 Thread Keir Vaughan-Taylor
Is it a problem that the extension is pdf.txt ? On Mon, 2013-09-02 at 20:24 -0300, Alcides Carlos de Moraes Neto wrote: Hello all, We have this problem with our current dspace 3.1 installation. Discovery search results show some invalid characters due to encoding issues. Only the full