I am trying to convert some Lucene 3 code to Lucene 4.
I want to use termEnums.docs(ir.getLiveDocs()) to return only docs that
have not been deleted for a particular term. However, getLiveDocs() is
only available for AtomicReaders, and although I just have a single
index, it is file-based and uses
Hi there,
Looking at my index (about 1M docs), I see a lot of unique terms, more
than 8M, which is a significant part of my total term count. These are very
likely useless terms, binaries or other meaningless numbers that come with
a few of my docs.
I am totally fine with deleting them so these terms
Hi Manuel,
On Thu, Apr 25, 2013 at 12:29 AM, Manuel LeNormand
manuel.lenorm...@gmail.com wrote:
Hi there,
Looking at my index (about 1M docs), I see a lot of unique terms, more
than 8M, which is a significant part of my total term count. These are very
likely useless terms, binaries or other
Hi Paul
On Wed, Apr 24, 2013 at 1:35 PM, Paul Taylor paul_t...@fastmail.fm wrote:
I am trying to convert some Lucene 3 code to Lucene 4.
I want to use termEnums.docs(ir.getLiveDocs()) to return only docs that have
not been deleted for a particular term. However, getLiveDocs() is only
available for
Hi Alexey,
On Tue, Apr 23, 2013 at 3:28 PM, Alexey Anatolevitch
alexeyl...@gmail.com wrote:
I was trying it with 4.2.1, and SimpleNaiveBayesClassifier seems to have a
bug: the local copy of BytesRef referenced by foundClass is affected by
subsequent TermsEnum.iterator.next() calls as the