I'm trying to build a concordance engine and would like to use Zend Lucene to
get all the terms in the index, and then the document frequency for each of them.
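
To be concrete about what I mean by "document frequency", here is a minimal sketch in plain PHP. It does not touch Zend Lucene at all: the sample documents, the `documentFrequencies()` name, and the naive tokenizer are all made up for illustration. It only shows the kind of term => count mapping I would like to read out of the index.

```php
<?php
// Hypothetical sample data standing in for the indexed documents.
$docs = array(
    1 => 'the cat sat on the mat',
    2 => 'the dog sat',
    3 => 'a cat and a dog',
);

// For each term, count the number of documents that contain it.
// (Illustrative helper only; not part of Zend_Search_Lucene.)
function documentFrequencies(array $docs)
{
    $freq = array();
    foreach ($docs as $text) {
        // Naive tokenizer: lowercase, split on non-word characters.
        $tokens = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        // array_unique() so a term is counted once per document.
        foreach (array_unique($tokens) as $term) {
            $freq[$term] = isset($freq[$term]) ? $freq[$term] + 1 : 1;
        }
    }
    ksort($freq); // alphabetical order, as in a concordance
    return $freq;
}

print_r(documentFrequencies($docs));
```

Here "the" occurs twice in the first document but still counts only once for it, which is the difference between document frequency and raw term frequency.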

I've hacked around in the source to no avail.  For example, in Lucene.php, I
have put this:

    public function terms()
    {
        $result = array();
        foreach ($this->_segmentInfos as $segmentInfo) {
            $result = array_merge($result, $segmentInfo->getTerms());
        }
        return $result;
    }

and in SegmentInfo.php, I put this:

    public function getTerms()
    {
        $this->_loadDictionary();
        return $this->_termDictionary;
    }


but if I instantiate an index and try to get the terms, like this:

    <?php
    require_once('Zend/Search/Lucene.php');
    $indexPath = '/DATA/sites/langdata.potowski.org/indexes';
    $index = new Zend_Search_Lucene($indexPath);
    ?><pre>
    <?php
    print_r($index->terms());
    ?>
    </pre>



All I get is the following output (there should be many more terms in the
index, because I indexed some sample docs and can successfully return hits on
a number of terms):

 Array
(
    [0] => Zend_Search_Lucene_Index_Term Object
        (
            [field] => -1
            [text] => 
        )

    [1] => Zend_Search_Lucene_Index_Term Object
        (
            [field] => 2
            [text] => nac
        )

)


Anyone have any hints to help me out?



-- 
View this message in context: 
http://www.nabble.com/get-all-index-terms-in-Zend-Lucene-tf2335312.html#a6498005
Sent from the Zend Framework mailing list archive at Nabble.com.
