Re: [fw-general] Zend_Search_Lucene More Questions

Alexander Veremyev Tue, 09 Jan 2007 12:40:15 -0800

Hi,

Sebi wrote:

Does anyone read this message? Can anyone give an answer?


Sorry for the delay. I had the flu and didn't have a chance to answer you.

I have some new questions:

1. I have problems when I search with phrase queries, either through the query parser or by
passing the query string directly to the $index->find function. But if I construct the phrase
query with the Zend_Search_Lucene API, $index->find returns a result.

The following will return no results:

    $result = $index->find('title:"kirchner kurt"');  // identical to the following query
    or
    $query = Zend_Search_Lucene_Search_QueryParser::parse('title:"kirchner kurt"');
    $result = $index->find($query);
    print $query->__toString(); // outputs (title:"kirchner kurt")

But with API construction the search works fine:

    $query = new Zend_Search_Lucene_Search_Query_Phrase(array('kirchner', 'kurt'), null, 'title');
    $result = $index->find($query);
    print $query->__toString(); // outputs title:"kirchner kurt"

Please, Alexander, can you check whether this is a bug? Thank you.


I just checked it and found a bug in phrase query construction.

A 'true' parameter was passed where the optional integer "term position" was expected. I have just fixed this and committed the fix to SVN.

2. What is the difference between a Binary field and an UnIndexed field? Does the UnIndexed type change the stored text? I ask because I saved a text with some Latin non-ASCII characters as an UnIndexed field and the text was changed. Maybe the iconv() function was applied to the stored text? What is the difference after all?

Yes. Only Binary fields are stored "as is".
An UnIndexed non-binary field may be transformed by iconv().

Text is currently transformed to "ASCII//TRANSLIT" and will be transformed to "UTF-8" in the future.
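
To illustrate why a stored non-binary field can come back changed, here is a minimal sketch that uses only PHP's built-in iconv(). The helper name toAsciiTranslit and the sample string are assumptions made for this example and are not taken from Zend_Search_Lucene. The exact replacement characters depend on the iconv implementation and locale, so no output is shown.

```php
<?php
// Hypothetical helper: mimics the conversion described above by
// transliterating UTF-8 text to ASCII with PHP's built-in iconv().
function toAsciiTranslit(string $text)
{
    // Returns the converted string, or false if the conversion fails.
    return iconv('UTF-8', 'ASCII//TRANSLIT', $text);
}

// Assumed sample text with non-ASCII Latin characters.
$original = "Kürschner café";
$stored   = toAsciiTranslit($original);

// The converted value no longer matches the original bytes,
// which is the change Sebi observed in the UnIndexed field.
var_dump($stored === $original);
```

A Binary field skips this step, so it is the safer choice whenever the exact bytes must survive.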

3. The documentation mentions that, by default, the search looks through all fields. What types must these fields have? Keyword? Text? UnStored?

Search is performed through all indexed fields (Keyword, Text, UnStored).

4. I understand how the automatic optimization process merges segments into new, larger ones. But I'm curious about calling the optimize function. I read that it does not use the same algorithm as auto-optimization: it just merges all segments into a new one, no matter their size or group. Is that true? In that case the first danger is reaching the maximum size for the segment file.


An optimize() call merges all segments into a new one.

Yes, if it exceeds 2GB, the index will be damaged. I reopened the "index size" issue some time ago (http://framework.zend.com/issues/browse/ZF-527), so it's in the queue.
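
Until that issue is resolved, one cautious approach is to estimate the size of the merged segment before calling optimize(). The sketch below is plain PHP and not part of Zend_Search_Lucene. The names MAX_SEGMENT_BYTES, indexDirectorySize and isSafeToOptimize are hypothetical. It assumes the index lives in a single flat directory and treats the sum of all file sizes as a rough upper bound. On a 32-bit PHP build filesize() itself is unreliable for very large files.

```php
<?php
// Assumed limit based on the 2GB figure mentioned above (2^31 - 1 bytes).
const MAX_SEGMENT_BYTES = 2147483647;

// Hypothetical helper: sums the sizes of all regular files in the index directory.
function indexDirectorySize(string $dir): int
{
    $total = 0;
    foreach (scandir($dir) as $name) {
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        if (is_file($path)) {
            $total += filesize($path);
        }
    }
    return $total;
}

// Hypothetical check: true when the summed size stays below the assumed limit.
function isSafeToOptimize(string $dir): bool
{
    return indexDirectorySize($dir) < MAX_SEGMENT_BYTES;
}
```

A caller would run $index->optimize() only when isSafeToOptimize() returns true for the index directory.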

5. I came across some classes which I don't understand because of the lack of
explanation. I need a very short explanation for each:
        - Zend_Search_Lucene_Search_QueryEntry


Query entry :)
It may be a phrase, a term or a subquery.
Actually, these are the elements of a boolean query.

Each boolean query/subquery has its own context.

        - Zend_Search_Lucene_Search_Lexer


It splits query into lexemes (query syntax lexemes, words and phrases).



6. Does Zend_Search_Lucene support stemming like Java Lucene? Or maybe this
option will be added in the future?

Yes. Filters are intended for this (take a look at the Zend_Search_Lucene_Analysis_TokenFilter class).
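
As a rough idea of the per-token work such a filter would do, here is a self-contained sketch of a naive suffix stripper. It is neither the Porter algorithm nor code from Zend_Search_Lucene, and the function name naiveStem is an assumption for illustration. In practice this logic would presumably sit inside a subclass of Zend_Search_Lucene_Analysis_TokenFilter. How that subclass is wired into an analyzer is not shown here and should be checked against the framework documentation.

```php
<?php
// Naive suffix stripper, for illustration only (not the Porter algorithm).
function naiveStem(string $word): string
{
    $word = strtolower($word);
    foreach (array('ing', 'ed', 's') as $suffix) {
        $stemLength = strlen($word) - strlen($suffix);
        // Strip the suffix only when at least three characters of stem remain.
        if ($stemLength >= 3 && substr($word, -strlen($suffix)) === $suffix) {
            return substr($word, 0, $stemLength);
        }
    }
    return $word;
}
```

With this sketch, "searching", "indexed" and "segments" reduce to "search", "index" and "segment", so a query for one form can match documents that contain another.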



With best regards,
   Alexander Veremyev.

Thank you for all your answers. All my respect,

Sebi.



