Partout wrote:
Alexander,
My script section is below:
......
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive());
Zend_Search_Lucene_Search_QueryParser::setDefaultOperator(
    Zend_Search_Lucene_Search_QueryParser::B_AND);
$search = "R\&D"; // I have tried "R\\&D", '"R\&D"', "R&D", '"R&D"'
$query = Zend_Search_Lucene_Search_QueryParser::parse($search);
echo $query->__toString() . "\n";
$hits = $index->find($query);
......
Result output:
"R\&D" --> +(+r +d)
"R\\&D" --> +(+r +d)
'"R\&D"' --> +("r d")
"R&D" --> 'Zend_Search_Lucene_Search_QueryParserException' with message 'Two chars lexeme expected. Position is 2.'
'"R&D"' --> +("r d")
That's the correct behavior. '"R&D"' should give you the result you need.
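As an illustration of the escaping side of this, here is a minimal sketch of a helper that backslash-escapes query syntax characters before user input is handed to the query parser. The name escapeQueryTerm() is hypothetical (it is not part of Zend_Search_Lucene), and the character list is an assumption based on the usual Lucene query syntax. The exception for the unquoted "R&D" appears to come from the lone "&" being read as the start of the two-character "&&" operator; in the results quoted above, the backslash-escaped forms parsed without an exception. Escaping only affects parsing: with the default analyzer the term is still split into r and d.

```php
<?php
// Hypothetical helper, not part of Zend_Search_Lucene.
function escapeQueryTerm($term)
{
    // Assumed list of query syntax characters:
    // + - & | ! ( ) { } [ ] ^ " ~ * ? : \
    return addcslashes($term, '+-&|!(){}[]^"~*?:\\');
}

echo escapeQueryTerm('R&D'), "\n";   // prints R\&D
echo escapeQueryTerm('J2EE'), "\n";  // prints J2EE
```

The escaped string would then be passed to Zend_Search_Lucene_Search_QueryParser::parse() in place of the raw input.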
If you prefer to treat R&D as one word, write your own analyzer. Take
Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive and
Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum as examples.
Here is what you need:
------------------------------------------------------------------
class MyAnalyzer extends Zend_Search_Lucene_Analysis_Analyzer_Common
{
    private $_position;

    // Characters kept inside a token in addition to letters and digits
    const NONBROKE_CHARS = '&!|\/';

    public function __construct()
    {
        $this->addFilter(new Zend_Search_Lucene_Analysis_TokenFilter_LowerCase());
    }

    public function reset()
    {
        $this->_position = 0;

        if ($this->_input === null) {
            return;
        }

        // convert input into ascii
        $this->_input = iconv($this->_encoding, 'ASCII//TRANSLIT', $this->_input);
        $this->_encoding = 'ASCII';
    }

    /**
     * Tokenization stream API
     * Get next token
     * Returns null at the end of stream
     *
     * @return Zend_Search_Lucene_Analysis_Token|null
     */
    public function nextToken()
    {
        if ($this->_input === null) {
            return null;
        }

        while ($this->_position < strlen($this->_input)) {
            // skip characters that cannot be part of a token
            while ($this->_position < strlen($this->_input) &&
                   !ctype_alnum($this->_input[$this->_position]) &&
                   (strpos(self::NONBROKE_CHARS, $this->_input[$this->_position]) === false)) {
                $this->_position++;
            }

            $termStartPosition = $this->_position;

            // read token
            while ($this->_position < strlen($this->_input) &&
                   (ctype_alnum($this->_input[$this->_position]) ||
                    (strpos(self::NONBROKE_CHARS, $this->_input[$this->_position]) !== false))) {
                $this->_position++;
            }

            // Empty token, end of stream.
            if ($this->_position == $termStartPosition) {
                return null;
            }

            $token = new Zend_Search_Lucene_Analysis_Token(
                substr($this->_input,
                       $termStartPosition,
                       $this->_position - $termStartPosition),
                $termStartPosition,
                $this->_position);
            $token = $this->normalize($token);
            if ($token !== null) {
                return $token;
            }
            // Continue if token is skipped
        }

        return null;
    }
}
-----------------------------------------
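To see what the tokenizing loop in the analyzer above does to input such as "R&D", the following standalone sketch reproduces the same skip/read logic in plain PHP, without any Zend classes. The function names sketchTokenize() and isTokenChar() are made up for this illustration, lowercasing stands in for the LowerCase filter, and the iconv transliteration step is left out. Passing an empty $keep string approximates an analyzer that only accepts letters and digits.

```php
<?php
// Standalone sketch of the tokenizing loop; no Zend classes are used.
// $keep lists the extra characters treated as part of a token
// (assumption: same role as NONBROKE_CHARS in the analyzer).
function sketchTokenize($input, $keep = '&!|\/')
{
    $tokens = array();
    $pos = 0;
    $len = strlen($input);

    while ($pos < $len) {
        // Skip characters that cannot be part of a token.
        while ($pos < $len && !isTokenChar($input[$pos], $keep)) {
            $pos++;
        }
        $start = $pos;

        // Read the token.
        while ($pos < $len && isTokenChar($input[$pos], $keep)) {
            $pos++;
        }

        if ($pos > $start) {
            // Lowercasing stands in for the LowerCase token filter.
            $tokens[] = strtolower(substr($input, $start, $pos - $start));
        }
    }

    return $tokens;
}

function isTokenChar($ch, $keep)
{
    return ctype_alnum($ch) || ($keep !== '' && strpos($keep, $ch) !== false);
}

print_r(sketchTokenize('R&D and J2EE'));      // r&d, and, j2ee
print_r(sketchTokenize('R&D and J2EE', ''));  // r, d, and, j2ee
```

With the extra characters kept, "R&D" survives as the single term r&d; without them it falls apart into r and d, which matches the phrase "r d" seen in the query output.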
The same analyzer should be used for indexing and searching.
P.S. With this analyzer you will also be able to search the index in Luke
using the "R\&D" query.
With best regards,
Alexander Veremyev.
Also, I built the index with the
Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive()
analyzer.
Is there any other way to solve my problem? Thank you very much.
Best regards,
David
Alexander Veremyev wrote:
Hi,
Are you sure that the escape sequence "\&" wasn't translated into '&'
before it was sent to the query parser?
Please try "R\\&D".
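For reference, a small sketch of how PHP itself treats these literals. This relies only on standard PHP string rules: in a double-quoted string "\&" is not a recognized escape sequence, so the backslash is kept, while "\\" is an escaped backslash. Both spellings therefore hand the same four characters to the query parser. The variable names are just for illustration.

```php
<?php
// "\&" is not a recognized escape in double quotes, so the backslash stays.
$a = "R\&D";
// "\\" is an escaped backslash, giving the same string.
$b = "R\\&D";
// Single quotes keep the backslash as well.
$c = 'R\&D';

var_dump($a === $b);    // bool(true)
var_dump($a === $c);    // bool(true)
echo strlen($a), "\n";  // 4
echo $a, "\n";          // R\&D
```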
In addition to this, you need a special analyzer to treat R&D as one
word (the default analyzer translates it into the phrase "r d").
Take a look at the
Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive
class and the Zend_Search_Lucene_Analysis_Analyzer::setDefault() method.
http://framework.zend.com/manual/en/zend.search.extending.html#zend.search.extending.analysis
With best regards,
Alexander Veremyev.
Partout wrote:
Hi All,
I am using Zend_Search and am glad to see that many new enhancements
have been added. Many thanks to all of you. I still have one question:
I need to search for words like "R&D", "J2EE", etc. Can anyone tell me
how to do that? I have tried "R\&D", but it throws the exception below:
Fatal error: Uncaught exception
'Zend_Search_Lucene_Search_QueryParserException' with message 'Two chars
lexeme expected. Position is 4.' in
/opt/system/Zend/Search/Lucene/Search/QueryLexer.php:397
Stack trace:
#0 /opt/system/Zend/Search/Lucene/FSMAction.php(62):
Zend_Search_Lucene_Search_QueryLexer->addQuerySyntaxLexeme() .....
Thanks in advance.
David
------------------------------------------------------------------------
View this message in context: How could Zend_Search be used to search
word like "R&D"?
<http://www.nabble.com/How-could-Zend_Search-be-used-to-search-word-like-%22R-D%22--tf3734766s16154.html#a10454147>
Sent from the Zend Framework mailing list archive
<http://www.nabble.com/Zend-Framework-f15440.html> at Nabble.com.