Just a follow-up.
Running filter-media -f seems to have fixed the issue with the OCR text.
However, some search results still show encoding issues.

I believe I need to regenerate the Solr index.
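
For anyone reading this later, here is a rough Python sketch of the double encoding helix84 describes in the quoted message below, with one way to reverse it. It is only an illustration: the function names are made up, and I am assuming the wrongly assumed source charset was Latin-1, which may not be what actually happened to these files.

```python
# Illustration only: simulates and reverses "double-encoded UTF-8".
# Assumption: the converter wrongly treated UTF-8 input as Latin-1.
# The function names here are hypothetical, not part of DSpace.

def double_encode(text: str, wrong_charset: str = "latin-1") -> str:
    """Simulate a converter that reads UTF-8 bytes as `wrong_charset`.

    Writing the returned string out as UTF-8 gives double-encoded text.
    """
    return text.encode("utf-8").decode(wrong_charset)


def repair(text: str, wrong_charset: str = "latin-1") -> str:
    """Try to undo one round of double encoding.

    If the round trip fails, the text was probably not double-encoded
    with `wrong_charset`, so it is returned unchanged.
    """
    try:
        return text.encode(wrong_charset).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    original = "ação"
    broken = double_encode(original)
    print(broken)          # valid text, but with nonsensical characters
    print(repair(broken))  # back to the original word
```

The garbled string is still perfectly valid UTF-8 when written out, which is why checking that the .txt files "are all UTF-8" does not catch the problem.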

Regards,

Alcides Carlos de Moraes Neto
"Sometimes I think we're alone. Sometimes I think we're not. In either
case, the thought is staggering."
- R. Buckminster Fuller


2013/9/3 Alcides Carlos de Moraes Neto <[email protected]>

> Hello helix, thank you for your input.
>
> Indeed, the problem is with the txt that filter-media generates.
> Running filter-media -f resolved the issue for this specific item. I have
> scheduled a full filter-media -f run over the repository for tonight.
>
>
> Regards,
>
>
> Alcides Carlos de Moraes Neto
>
>
> 2013/9/3 helix84 <[email protected]>
>
>> On Tue, Sep 3, 2013 at 1:24 AM, Alcides Carlos de Moraes Neto
>> <[email protected]> wrote:
>> > I have checked the .txt files that media-filter generates; they are all UTF-8.
>>
>> What I see (see attachment) looks like double-encoded UTF-8. This
>> happens when a charset converter is told to convert a file from some
>> other character set to UTF-8 when the file was in fact already UTF-8.
>> The result is valid UTF-8 as far as a machine is concerned, but it
>> contains nonsensical characters. What do you see?
>>
>> I don't have a solution yet, though.
>>
>> On Tue, Sep 3, 2013 at 2:28 AM, Keir Vaughan-Taylor <[email protected]>
>> wrote:
>> > Is it a problem that the extension is pdf.txt?
>>
>> No, that's normal.
>>
>>
>> Regards,
>> ~~helix84
>>
>> Compulsory reading: DSpace Mailing List Etiquette
>> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>>
>
>
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette