Hi All,
I have a problem when indexing documents into a Zend Search Lucene index.
Sometimes I get these notices:
Notice: iconv() [function.iconv]: Detected an illegal character in input string in xxx\library\Zend\Search\Lucene\Analysis\Analyzer\Common\Utf8Num.php on line 77
Notice: iconv() [function.iconv]: Detected an illegal character in input string in xxx\library\Zend\Search\Lucene\Field.php on line 222
My database charset is UTF-8 (and so are the table fields), I run the query
"SET NAMES 'utf8'" before any other, and I use the analyzer
Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num()
If I have a long text extracted from my database and I want to store it both
as an UnStored and as an UnIndexed field (just to show both cases), what is the
correct way to add it? Do I have to pass 'utf-8' as the third parameter (as in
the example below)? Do I have to do any other kind of encoding or filtering
first?
$this->addField( Zend_Search_Lucene_Field::UnStored( 'contents', $data['page_contents'], 'utf-8' ) );
$this->addField( Zend_Search_Lucene_Field::UnIndexed( 'abstract', $data['page_contents'], 'utf-8' ) );
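To narrow this down, here is a minimal sketch of a check I could run on the text before it reaches the index. I am only guessing that some rows contain bytes that are not valid UTF-8, and I don't know yet whether this is the right fix. It assumes the mbstring extension is available, and cleanUtf8() is a hypothetical helper name of my own, not part of Zend Framework.

```php
<?php
// Hypothetical helper (not a Zend Framework function): returns the input
// unchanged when it is valid UTF-8, otherwise a copy with the invalid byte
// sequences dropped. Assumes the mbstring extension is loaded.
function cleanUtf8($text)
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid, nothing to do
    }

    // Remember the current substitute character so it can be restored.
    $previous = mb_substitute_character();

    // 'none' tells mbstring to drop invalid sequences instead of writing '?'.
    mb_substitute_character('none');
    $clean = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);

    return $clean;
}
```

I could then pass cleanUtf8( $data['page_contents'] ) instead of $data['page_contents'] in the two addField() calls above, and log the rows where the cleaned text differs from the original to find the bad data.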
Any help is much appreciated!
Sergio