Hi,

We have an Oracle database where we store RTF content in a CLOB column.
We are now trying to index those records, but we only want the plain text,
as Tika would extract it. I tried to use the TikaEntityProcessor, but I'm
getting the following error message:

ClassCastException: java.io.StringReader cannot be cast to java.io.InputStream
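
As far as I can tell, the exception itself is an ordinary Java cast failure: a StringReader delivers characters, while an InputStream delivers bytes, so one cannot be cast to the other. My assumption is that the FieldReaderDataSource hands over a Reader where the TikaEntityProcessor expects an InputStream. Below is a minimal sketch, independent of Solr, that reproduces the same kind of failure and shows one way to turn a string into a stream. The class name, method names, and sample text are made up for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

public class ReaderVsStream {

    // Returns true if treating the object as an InputStream
    // fails with a ClassCastException.
    public static boolean castFails(Object source) {
        try {
            InputStream in = (InputStream) source;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Wraps the text in a byte stream. UTF-8 is an assumption here;
    // the real CLOB content may need a different encoding.
    public static InputStream toStream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String rtf = "{\\rtf1 Hello}";           // hypothetical RTF snippet
        Reader reader = new StringReader(rtf);   // character-based source

        System.out.println(castFails(reader));         // true
        System.out.println(castFails(toStream(rtf)));  // false
    }
}
```

So the question is really which data source or transformer would give the TikaEntityProcessor a byte stream instead of a Reader.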

The configuration looks like this:

<dataSource name="f1" type="FieldReaderDataSource"/>

<entity name="SV_SOLVE_TXT" onError="continue" transformer="ClobTransformer"
        query="select SOLUTION_ID, SOLUTION_TXT SOLUTION_TXT from IT_SOLUTION where SOLUTION_ID = '${ts3_it_solution_text_search.SOLUTION_ID}'">
    <field name="text_4" column="SOLUTION_TXT" clob="true" />
    <entity name="tika_SOLUTION_TXT" onError="continue"
            processor="TikaEntityProcessor" url="${SV_SOLVE_TXT.text_4}"
            dataField="SV_SOLVE_TXT.text_4" dataSource="f1">
        <field name="text_1" column="text"/>
    </entity>
</entity>

Thanks and regards,
Torsten



