Partout wrote:
Alexander,
My script section is below:

                ......
                Zend_Search_Lucene_Analysis_Analyzer::setDefault(
                    new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive());
                Zend_Search_Lucene_Search_QueryParser::setDefaultOperator(
                    Zend_Search_Lucene_Search_QueryParser::B_AND);
                $search = "R\&D";   // I have also tried "R\\&D", '"R\&D"', "R&D", '"R&D"'
                $query = Zend_Search_Lucene_Search_QueryParser::parse($search);
                echo $query->__toString() . "\n";
                $hits = $index->find($query);
                ......

Result output:

"R\&D"    --> +(+r +d)
"R\\&D"   --> +(+r +d)
'"R\&D"'  --> +("r d")
"R&D"     --> 'Zend_Search_Lucene_Search_QueryParserException' with message 'Two chars lexeme expected. Position is 2.'
'"R&D"'   --> +("r d")

That's correct behavior. '"R&D"' should give you the result you need.

If you prefer to treat R&D as one word, write your own analyzer. Take Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive and Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum as examples.

Actually you need:
------------------------------------------------------------------
class MyAnalyzer extends Zend_Search_Lucene_Analysis_Analyzer_Common
{
    private $_position;

    const NONBROKE_CHARS = '&!|\/';

    public function __construct()
    {
        $this->addFilter(new Zend_Search_Lucene_Analysis_TokenFilter_LowerCase());
    }

    public function reset()
    {
        $this->_position = 0;

        if ($this->_input === null) {
            return;
        }

        // convert input into ascii
        $this->_input = iconv($this->_encoding, 'ASCII//TRANSLIT', $this->_input);
        $this->_encoding = 'ASCII';
    }

    /**
     * Tokenization stream API
     * Get next token
     * Returns null at the end of stream
     *
     * @return Zend_Search_Lucene_Analysis_Token|null
     */
    public function nextToken()
    {
        if ($this->_input === null) {
            return null;
        }

        while ($this->_position < strlen($this->_input)) {
            // skip every character that cannot be part of a token
            while ($this->_position < strlen($this->_input) &&
                   !ctype_alnum( $this->_input[$this->_position] ) &&
                   (strpos( self::NONBROKE_CHARS, $this->_input[$this->_position] ) === false) ) {
                $this->_position++;
            }

            $termStartPosition = $this->_position;

            // read token
            while ($this->_position < strlen($this->_input) &&
                   (ctype_alnum( $this->_input[$this->_position] ) ||
                    (strpos( self::NONBROKE_CHARS, $this->_input[$this->_position] ) !== false)) ) {
                $this->_position++;
            }

            // Empty token, end of stream.
            if ($this->_position == $termStartPosition) {
                return null;
            }

            $token = new Zend_Search_Lucene_Analysis_Token(
                                      substr($this->_input,
                                             $termStartPosition,
                                             $this->_position - $termStartPosition),
                                      $termStartPosition,
                                      $this->_position);
            $token = $this->normalize($token);
            if ($token !== null) {
                return $token;
            }
            // Continue if token is skipped
        }

        return null;
    }
}
-----------------------------------------
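
To see which token boundaries this scanning rule produces without loading the framework, here is a minimal standalone sketch. The function name tokenizeKeeping() is hypothetical and not part of Zend_Search_Lucene; it only mirrors the two inner loops of nextToken() above on a plain ASCII string and lowercases each token, as the LowerCase filter would.

```php
<?php
// Hypothetical helper for illustration only (not a Zend_Search_Lucene API).
// It mirrors the scanning loops of MyAnalyzer::nextToken() on an ASCII string.
function tokenizeKeeping(string $input, string $keep = '&!|\/'): array
{
    $tokens = [];
    $pos    = 0;
    $len    = strlen($input);

    // A character belongs to a token if it is alphanumeric
    // or one of the "non-breaking" characters.
    $isTokenChar = function (string $ch) use ($keep): bool {
        return ctype_alnum($ch) || strpos($keep, $ch) !== false;
    };

    while ($pos < $len) {
        // Skip every character that cannot be part of a token.
        while ($pos < $len && !$isTokenChar($input[$pos])) {
            $pos++;
        }

        // Read the token.
        $start = $pos;
        while ($pos < $len && $isTokenChar($input[$pos])) {
            $pos++;
        }

        // Empty token means end of input.
        if ($pos === $start) {
            break;
        }

        $tokens[] = strtolower(substr($input, $start, $pos - $start));
    }

    return $tokens;
}

print_r(tokenizeKeeping('R&D and J2EE'));
```

With this rule "R&D" stays together as the single token "r&d", while the space still separates it from "and" and "j2ee".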

The same analyzer should be used for indexing and searching.

PS: This analyzer also makes it possible to search with Luke using the query "R\&D".

With best regards,
   Alexander Veremyev.



Besides, I also built the index with the
Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive
analyzer.

Is there any other way to solve my problem? Thank you very much.

Best regards,

David


Alexander Veremyev wrote:
Hi,

Are you sure that the escape sequence "\&" wasn't translated into '&' before the string was sent to the query parser?

Please try "R\\&D".
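
A quick way to check what the query parser actually receives is to compare the PHP string literals directly. The sketch below uses plain PHP only (no framework calls); the variable names are just for illustration. In PHP, "\&" is not a recognized escape sequence in a double-quoted string, so the backslash is kept.

```php
<?php
// Three ways of writing the same search string in PHP source code.
$doubleQuoted  = "R\&D";   // "\&" is not an escape sequence, backslash is kept
$doubleEscaped = "R\\&D";  // "\\" collapses to a single backslash
$singleQuoted  = 'R\&D';   // single quotes keep the backslash as well

var_dump($doubleQuoted === $doubleEscaped); // bool(true)
var_dump($doubleQuoted === $singleQuoted);  // bool(true)
var_dump(strlen($doubleQuoted));            // int(4)
```

All three literals produce the same four-character string, so they reach the query parser as identical input.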


In addition to this, you need a special analyzer to treat R&D as one word (the default analyzer translates it into the phrase "r d").
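
As a rough illustration of why "R&D" ends up as the two terms "r" and "d", the sketch below assumes that the text/number analyzer treats only ASCII letters and digits as token characters and then lowercases them. The function tokenizeAlnum() is hypothetical and is not the library's implementation.

```php
<?php
// Hypothetical stand-in for a letters-and-digits tokenizer with lowercasing.
// Assumption: anything that is not [a-zA-Z0-9] separates tokens.
function tokenizeAlnum(string $input): array
{
    preg_match_all('/[a-zA-Z0-9]+/', $input, $matches);
    return array_map('strtolower', $matches[0]);
}

print_r(tokenizeAlnum('R&D'));   // two tokens: "r" and "d"
print_r(tokenizeAlnum('J2EE'));  // one token: "j2ee"
```

Under that assumption "J2EE" already survives as one term, and only words containing characters such as '&' need a custom analyzer.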

Take a look at the Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive class and the Zend_Search_Lucene_Analysis_Analyzer::setDefault() method.
http://framework.zend.com/manual/en/zend.search.extending.html#zend.search.extending.analysis


With best regards,
    Alexander Veremyev.



Partout wrote:
Hi All,

I am using Zend_Search, and I am glad to see that many new enhancements have been added. Many thanks to all of you.

I still have one question: I need to search for words such as "R&D" and "J2EE". Can anyone tell me how to do this? I have tried "R\&D", but it throws the exception below:

Fatal error: Uncaught exception 'Zend_Search_Lucene_Search_QueryParserException' with message 'Two chars lexeme expected. Position is 4.' in /opt/system/Zend/Search/Lucene/Search/QueryLexer.php:397
Stack trace:
#0 /opt/system/Zend/Search/Lucene/FSMAction.php(62): Zend_Search_Lucene_Search_QueryLexer->addQuerySyntaxLexeme()
.....

Thanks in advance.

David
------------------------------------------------------------------------
View this message in context: How could Zend_Search be used to search word like "R&D"? <http://www.nabble.com/How-could-Zend_Search-be-used-to-search-word-like-%22R-D%22--tf3734766s16154.html#a10454147> Sent from the Zend Framework mailing list archive <http://www.nabble.com/Zend-Framework-f15440.html> at Nabble.com.



