Re: How to index correctly a text save with tinyMCE

2011-06-23 Thread Ariel
I'm sorry to bother you again, but this doesn't work. I have written this configuration in my schema.xml file: <charFilter class="solr.HTMLStripCharFilterFactory"/> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/> <filter

RE: How to index correctly a text save with tinyMCE

2011-06-23 Thread Steven A Rowe
Hi Ariel, On 6/23/2011 at 12:34 PM, Ariel wrote: But it still doesn't convert the code to the correct character, for instance: Espa&amp;ntilde;a must be converted to España but it still remains as Espa&amp;ntilde;a. So it looks like your text processing tool(s) escape markup meta-characters

Re: How to index correctly a text save with tinyMCE

2011-06-23 Thread Marek Tichy
Or fix the problem at its source; I think you need to google for entity_encoding : raw on tinyMCE. Hi Ariel, On 6/23/2011 at 12:34 PM, Ariel wrote: But it still doesn't convert the code to the correct character, for instance: Espa&amp;ntilde;a must be converted to España but it still

Re: How to index correctly a text save with tinyMCE

2011-06-23 Thread Ariel
Steven A Rowe, the solution you proposed doesn't work; thanks anyway. Regards. On 6/23/11, Steven A Rowe sar...@syr.edu wrote: Hi Ariel, On 6/23/2011 at 12:34 PM, Ariel wrote: But it still doesn't convert the code to the correct character, for instance: Espa&amp;ntilde;a must be converted

Re: How to index correctly a text save with tinyMCE

2011-06-16 Thread Ariel
I have the following problem: I am using the Spanish analyzer to index and query, but because I am using TinyMCE, some characters of the text are encoded as HTML entities. For example, the text: En españa ... is changed to En espa&ntilde;a, so I need a way to decode that text to make queries

RE: How to index correctly a text save with tinyMCE

2011-06-16 Thread Steven A Rowe
Hi Ariel, On 6/16/2011 at 10:45 AM, Ariel wrote: I have the following problem: I am using the Spanish analyzer to index and query, but because I am using TinyMCE, some characters of the text are encoded as HTML entities. For example, the text: En españa ... is changed to En espa&ntilde;a, so I

Re: How to index correctly a text save with tinyMCE

2011-06-16 Thread Ariel
Thanks for your answer. I have just put the filter in my schema.xml but it doesn't work. I am using Solr 1.4 and my conf is: <code> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/> <filter

Re: How to index correctly a text save with tinyMCE

2011-06-16 Thread Shawn Heisey
On 6/16/2011 11:12 AM, Ariel wrote: Thanks for your answer. I have just put the filter in my schema.xml but it doesn't work. I am using Solr 1.4 and my conf is: <code> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true"

RE: How to index correctly a text save with tinyMCE

2011-06-16 Thread Steven A Rowe
Hi Ariel, As Shawn says, char filters come before tokenizers. You need to use a <charFilter> tag instead of a <filter> tag. I've updated the HTMLStripCharFilter documentation on the Solr wiki to include this information:
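
As an illustration of the ordering described above (an editor's sketch, not the wiki text referred to in the message), an index-time analyzer for Solr 1.4 might look like the fragment below. The field type name text_html and the positionIncrementGap value are assumptions made for the example; the three factory classes and the stopwords.txt file are the ones quoted in this thread.

```xml
<!-- Hypothetical field type; name and positionIncrementGap are illustrative only. -->
<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Char filters work on the raw character stream, before tokenization,
         and are declared with <charFilter>, not <filter>. -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- Exactly one tokenizer splits the filtered text into tokens. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Token filters then run on the tokens the tokenizer produced. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>
```

Note that this only addresses the tag name and position. If the stored text has been escaped twice (for example &amp;ntilde; rather than &ntilde;), a single pass of the char filter may still leave an entity behind, which matches the behaviour reported later in the thread.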

Re: How to index correctly a text save with tinyMCE

2011-06-15 Thread Erick Erickson
Please review this page: http://wiki.apache.org/solr/UsingMailingLists You haven't stated what your problem is. Some examples of your inputs and desired outputs would be helpful. Meanwhile, see this page: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters but that's a wild