Yes, it's an "undocumented" function for now.
It's only mentioned in the API for future use.
If you or someone else needs it, it may be implemented earlier.
$_termDictionary is only an index of the full dictionary (the .tii index file).
It contains every X-th entry of the full dictionary (the .tis file).
Usually that is every 128th entry.
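
To make the sampling concrete, here is a minimal sketch of how a sparse index over a sorted dictionary behaves. It is an illustration only, not the Zend_Search_Lucene implementation: buildSparseIndex() and the generated term names are hypothetical, and the real .tii/.tis files are binary structures that carry more than the term text.

```php
<?php
// Hypothetical helper, not part of Zend_Search_Lucene.
// Keeps every $interval-th entry of a sorted term list, which is
// roughly what a sparse dictionary index holds in memory.
function buildSparseIndex(array $sortedTerms, int $interval = 128): array
{
    $index = array();
    foreach ($sortedTerms as $position => $term) {
        if ($position % $interval === 0) {
            // Remember the term together with its position in the full list.
            $index[$position] = $term;
        }
    }
    return $index;
}

// A made-up "full dictionary" of 300 sorted terms.
$fullDictionary = array();
for ($i = 0; $i < 300; $i++) {
    $fullDictionary[] = sprintf('term%04d', $i);
}

// Only the entries at positions 0, 128 and 256 end up in the index.
$sparse = buildSparseIndex($fullDictionary);
print_r($sparse);
```

With 300 terms and an interval of 128, only three entries are kept. This is why dumping the index alone shows just a handful of terms, even when the full dictionary is much larger.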
Index optimization, which I'm working on now, uses term retrieval, so
it will be included in the SegmentInfo class API and can be used in
the Lucene::terms() method.
With best regards,
Alexander Veremyev.
meecect wrote:
I'm trying to build a concordance engine and would like to use Zend Lucene to
get all the terms in the index, and then the document frequency for each of them.
I've hacked around in the source to no avail. For example, in Lucene.php, I
have put this:
public function terms()
{
    $result = array();
    foreach( $this->_segmentInfos as $segmentInfo ) {
        $result = array_merge($result, $segmentInfo->getTerms());
    }
    return $result;
}
and in SegmentInfo.php, I put this:
public function getTerms()
{
    $this->_loadDictionary();
    return $this->_termDictionary;
}
but if I instantiate an index and try to get the terms, like this:
<?php
require_once('Zend/Search/Lucene.php');
$indexPath = '/DATA/sites/langdata.potowski.org/indexes';
$index = new Zend_Search_Lucene($indexPath);
?><pre>
<?php
print_r($index->terms());
?>
</pre>
All I get is the following output (there should be many more terms in the
index, because I indexed some sample docs and can successfully return hits on
a number of terms):
Array
(
    [0] => Zend_Search_Lucene_Index_Term Object
        (
            [field] => -1
            [text] =>
        )
    [1] => Zend_Search_Lucene_Index_Term Object
        (
            [field] => 2
            [text] => nac
        )
)
Does anyone have any hints to help me out?