Hi

While migrating code to Lucene 4.0, I noticed that I have an assert that
DocsEnum.freq() == 1 on a field that is indexed with DOCS_ONLY. This got me
thinking ... why?

If you index with DOCS_ONLY, or ask for a DocsEnum with FLAG_NONE, why do we
"lie" to the consumer? Couldn't we just return 0 or -1 instead?

I personally don't mind if we continue to return 1, if there's a real
reason to. I don't think that anyone should call freq() after asking for a
DocsEnum with FLAG_NONE. But if we do keep the current behavior, can we at
least document it?

E.g., something like this patch:

Index: lucene/core/src/java/org/apache/lucene/index/DocsEnum.java
===================================================================
--- lucene/core/src/java/org/apache/lucene/index/DocsEnum.java  (revision 1422804)
+++ lucene/core/src/java/org/apache/lucene/index/DocsEnum.java  (working copy)
@@ -47,10 +47,16 @@
   protected DocsEnum() {
   }

-  /** Returns term frequency in the current document.  Do
-   *  not call this before {@link #nextDoc} is first called,
-   *  nor after {@link #nextDoc} returns NO_MORE_DOCS.
-   **/
+  /**
+   * Returns term frequency in the current document, or 1 if the
+   * {@link DocsEnum} was obtained with {@link #FLAG_NONE}. Do not call this
+   * before {@link #nextDoc} is first called, nor after {@link #nextDoc} returns
+   * {@link DocIdSetIterator#NO_MORE_DOCS}.
+   *
+   * <p>
+   * <b>NOTE:</b> if the {@link DocsEnum} was obtained with {@link #FLAG_NONE},
+   * this method returns 1.
+   */
   public abstract int freq() throws IOException;

   /** Returns the related attributes. */

Shai
