I ran update-discovery-index -f, but the results still show encoding issues.

http://www2.senado.leg.br/bdsf/discover?filtertype_0=type&filter_relational_operator_0=notequals&filter_0=not%C3%ADcia+de+jornal&submit_apply_filter=Aplicar&query=andr%C3%A9+luiz+lopes+de+alcantara

I'm stumped right now. What else can I do?

Regards,

Alcides Carlos de Moraes Neto
"Sometimes I think we're alone. Sometimes I think we're not. In either
case, the thought is staggering."
- R. Buckminster Fuller


2013/9/6 Alcides Carlos de Moraes Neto <[email protected]>

> Just a follow-up.
> filter-media -f seems to have fixed the issue with the OCR text files,
> but some search results still show encoding issues.
>
> I believe I need to regenerate the Solr index.
>
> Regards,
>
> Alcides Carlos de Moraes Neto
> "Sometimes I think we're alone. Sometimes I think we're not. In either
> case, the thought is staggering."
> - R. Buckminster Fuller
>
>
> 2013/9/3 Alcides Carlos de Moraes Neto <[email protected]>
>
>> Hello helix, thank you for your input.
>>
>> Indeed, the problem is in the text files generated by filter-media.
>> Running filter-media -f resolved the issue for this specific item. I have
>> scheduled a full filter-media -f of the repository for tonight.
>>
>>
>> Regards,
>>
>>
>> Alcides Carlos de Moraes Neto
>>
>>
>> 2013/9/3 helix84 <[email protected]>
>>
>>> On Tue, Sep 3, 2013 at 1:24 AM, Alcides Carlos de Moraes Neto
>>> <[email protected]> wrote:
>>> > I have checked the .txt files that media-filter generates; they are all UTF-8.
>>>
>>> What I see (see attachment) looks like double-encoded UTF-8. This
>>> happens when a charset converter is told to convert a file from some
>>> other character set to UTF-8 when the file was in fact already UTF-8.
>>> The result is valid UTF-8 to a machine, but it contains nonsensical
>>> characters. What do you see?
>>>
>>> I don't have a solution yet, though.
>>>
>>> On Tue, Sep 3, 2013 at 2:28 AM, Keir Vaughan-Taylor <[email protected]>
>>> wrote:
>>> > Is it a problem that the extension is pdf.txt   ?
>>>
>>> No, that's normal.
>>>
>>>
>>> Regards,
>>> ~~helix84
>>>
>>> Compulsory reading: DSpace Mailing List Etiquette
>>> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>>>
>>
>>
>
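
The double encoding helix84 describes in the quoted thread can be reproduced
in a few lines of Python. This is only a sketch for illustration, not
something DSpace runs: the function names are hypothetical, and it assumes
the converter misread the UTF-8 bytes as ISO-8859-1 (Latin-1). A real
converter may have assumed a different charset, such as Windows-1252.

```python
# Sketch: how double-encoded UTF-8 arises and one way to detect/undo it.
# Assumption: the wrongly assumed source charset is ISO-8859-1 (Latin-1).


def double_encode(text: str, wrong_charset: str = "latin-1") -> str:
    """Simulate a converter that misreads UTF-8 bytes as `wrong_charset`."""
    return text.encode("utf-8").decode(wrong_charset)


def repair(text: str, wrong_charset: str = "latin-1") -> str:
    """Undo one round of double encoding.

    Returns the input unchanged when the reverse conversion fails,
    which is what happens for text that was not double-encoded.
    """
    try:
        return text.encode(wrong_charset).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


original = "andré"
garbled = double_encode(original)

# The garbled string still encodes to valid UTF-8, so a byte-level
# "is this UTF-8?" check passes even though the text is nonsense.
garbled_bytes = garbled.encode("utf-8")

print(repr(garbled))        # 'andrÃ©'
print(garbled_bytes)        # b'andr\xc3\x83\xc2\xa9'
print(repair(garbled))      # andré
print(repair(original))     # andré (already clean, left unchanged)
```

Running a check like repair() over a sample pdf.txt file and comparing the
result with the input is one way to tell whether a given file is
double-encoded, which a plain "is it UTF-8?" check cannot reveal.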
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
