Hi! I am developing an application using the Zend_Search_Lucene framework.
It consists of some XML files that I index with Lucene.
Part of the code I use for indexing is this:
//Field path to the XML file
$this->addField(Zend_Search_Lucene_Field::Text('path',$fileName/*,'utf-8'*/));
//Name of the XML file
$this->addField(Zend_Search_Lucene_Field::Text('nombre',$nombre_libro/*,'utf-8'*/));
//Category of the XML file
$this->addField(Zend_Search_Lucene_Field::Text('categoria',$categoria/*,'utf-8'*/));
...
(Here there is some code that extracts the text of the XML file using SimpleXML.)
...
//UnStored field holding the words of the XML file (indexed but not stored)
$this->addField(Zend_Search_Lucene_Field::UnStored('contents',$palabras,'utf-8'));
The structure is simple: I index the words of the XML file and store the
path to the file, so when I run a search I know which file to open.
After indexing, I use Lucene Index Toolbox 0.7.1 to inspect the index. It is
a Java app that lets me browse my indexes and run searches on them.
The problem is that some words (those with a low rank) are not returned by
my search code in PHP, which uses the Zend_Search_Lucene framework.
I don't know if it helps, but my index has 29684 words.
Part of the code I use is this (simple code that tries to find a single
word):
setlocale(LC_CTYPE, 'es_ES.utf-8');
$index = new Zend_Search_Lucene($pathToTheIndex);
$hits = $index->find(strtolower($buscar));
For example, in my app (developed in PHP 5), when I use that code to search
for the word "lucas" I get no results, but when I run the same search in the
Lucene Index Toolbox I get 7 hits.
I don't know if it is an encoding problem (I think not) or if my search code
is wrong.
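
One way to narrow this down is to restrict the query to the 'contents'
field explicitly, using the field:term query syntax, and to lowercase the
word in an encoding-aware way. The following is only a minimal sketch under
assumptions: buildFieldQuery is a hypothetical helper (not a Zend Framework
function), it assumes a single plain word with no query-syntax characters,
and it may well not change the result for an ASCII word like "lucas".

```php
<?php
// Hypothetical helper (not part of Zend Framework): builds a field-qualified
// query string such as "contents:lucas" for a single plain word.
function buildFieldQuery($field, $word)
{
    // Drop surrounding whitespace so it does not end up in the query.
    $word = trim($word);

    // mb_strtolower is encoding-aware (useful for accented Spanish letters);
    // fall back to the byte-based strtolower if mbstring is not available.
    $word = function_exists('mb_strtolower')
        ? mb_strtolower($word, 'UTF-8')
        : strtolower($word);

    return $field . ':' . $word;
}

// Usage with the $index and $buscar variables from the search code above
// (needs Zend Framework loaded, so it is left commented out here):
// $hits = $index->find(buildFieldQuery('contents', $buscar));
```

If the field-qualified search returns the 7 hits, the difference is probably
in which field is searched by default; if it still returns nothing, the
analyzer or the encoding used at indexing time would be the next suspects.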
Greetings from Argentina, Gustavo.
Sorry for my English!!
--
View this message in context:
http://www.nabble.com/Problem-searching-low-rank-words-with-zend-lucene-tf4627557s16154.html#a13212923
Sent from the Zend Framework mailing list archive at Nabble.com.