Hi,
I am currently using the demo class IndexFiles to index a corpus. I have
replaced the StandardAnalyzer with a GermanAnalyzer, and indexing works fine.
But if I specify a different stopword list to be used, the tokenization
doesn't seem to work properly: mostly, some letters are missing at
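For reference, a minimal sketch of how I'm wiring the custom stop word list into the demo, assuming the Lucene 2.x-era API; the `stopwords.txt` path and the index directory name are placeholders:

```java
import java.io.File;
import org.apache.lucene.analysis.de.GermanAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class IndexFilesGerman {
    public static void main(String[] args) throws Exception {
        // GermanAnalyzer accepts a stop word file (one word per line);
        // a String[] of stop words works as well. Path is hypothetical.
        File stopwords = new File("stopwords.txt");
        GermanAnalyzer analyzer = new GermanAnalyzer(stopwords);

        // Same as the unmodified IndexFiles demo, but with the
        // GermanAnalyzer instead of the StandardAnalyzer.
        IndexWriter writer = new IndexWriter("index", analyzer, true);
        // ... addDocument(...) calls as in the demo ...
        writer.close();
    }
}
```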
Hi all,
I am currently using a (slightly modified) version of the IndexFiles demo class
of Lucene to index a corpus. As I understand it, the index lists for each term
the documents it occurs in.
My question now is whether this is in terms of frequency counts, i.e. the
term occurs x times within the document.
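To check this myself, I tried dumping what the index actually stores per term, again assuming the Lucene 2.x-era API; the index path, field name, and term are hypothetical. For each posting the index appears to record both the document id and the within-document frequency:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;

public class TermFreqDump {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open("index");   // hypothetical index dir
        Term term = new Term("contents", "haus");         // hypothetical field/term

        // termDocs() iterates the postings list for this term:
        // doc() is the document id, freq() the number of occurrences
        // of the term within that document.
        TermDocs docs = reader.termDocs(term);
        while (docs.next()) {
            System.out.println("doc=" + docs.doc() + " freq=" + docs.freq());
        }
        reader.close();
    }
}
```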