Re: DIH render html entities

2011-06-01 Thread Alexey Serba
Maybe HTMLStripTransformer is what you are looking for.

* http://wiki.apache.org/solr/DataImportHandler#HTMLStripTransformer
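
To make the distinction discussed in this thread concrete, here is a minimal sketch in Python (standard library only, run outside Solr) of the two things "convert HTML special chars" could mean: decoding entities such as &amp;amp; into the characters they stand for, or stripping the markup and keeping only the text. The helper names decode_entities and strip_markup and the sample row are hypothetical, and this is not how HTMLStripTransformer itself is implemented; it only illustrates the expected result for a field value.

```python
import html
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and drops tags (hypothetical helper, not part of Solr)."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities in text nodes.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def decode_entities(value: str) -> str:
    """Decode character references (&amp;, &eacute;, &#233;) and leave tags alone."""
    return html.unescape(value)


def strip_markup(value: str) -> str:
    """Remove tags and keep the text; entities in the text are decoded too."""
    parser = _TextOnly()
    parser.feed(value)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


# Hypothetical database row with an HTML-flavoured column value.
row = {"title": "Caf&eacute; &amp; <b>Bar</b>"}

print(decode_entities(row["title"]))  # entities decoded, <b> tags kept
print(strip_markup(row["title"]))     # tags removed, entities decoded
```

If only the first behaviour is wanted, stripping tags may remove more than intended, so it is worth checking a few real field values before choosing.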

On Tue, May 31, 2011 at 5:35 PM, Erick Erickson erickerick...@gmail.com wrote:
 Convert them to what? Individual fields in your docs? Text?

 If the former, you might get some joy from the XPathEntityProcessor.
 If you want to just strip the markup and index all the content you
 might get some joy from the various *html* analyzers listed here:
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

 Best
 Erick

 On Fri, May 27, 2011 at 5:19 AM, anass talby anass.ta...@gmail.com wrote:
 Sorry, my question was not clear.
 When I get data from the database, some fields contain HTML special characters,
 and what I want to do is just convert them automatically.

 On Fri, May 27, 2011 at 1:00 PM, Gora Mohanty g...@mimirtech.com wrote:

 On Fri, May 27, 2011 at 3:50 PM, anass talby anass.ta...@gmail.com
 wrote:
  Is there any way to render html entities in DIH for a specific field?
 [...]

 This does not make too much sense: What do you mean by
 rendering HTML entities? DIH just indexes, so where would
 it render HTML to, even if it could?

 Please take a look at http://wiki.apache.org/solr/UsingMailingLists

 Regards,
 Gora




 --
       Anass




Re: DIH render html entities

2011-05-31 Thread Erick Erickson
Convert them to what? Individual fields in your docs? Text?

If the former, you might get some joy from the XPathEntityProcessor.
If you want to just strip the markup and index all the content you
might get some joy from the various *html* analyzers listed here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

Best
Erick




DIH render html entities

2011-05-27 Thread anass talby
Is there any way to render html entities in DIH for a specific field?

Thanks
-- 
   Anass


Re: DIH render html entities

2011-05-27 Thread anass talby
Sorry, my question was not clear.
When I get data from the database, some fields contain HTML special characters,
and what I want to do is just convert them automatically.

-- 
   Anass