Hi Hoss,

Thanks for the reply. I've created a JIRA issue to track this -- https://issues.apache.org/jira/browse/LUCENE-1494

the initial thought was that just removing the term1.field=term2.field assertion would allow something like this to work, but i don't think anyone ever tried creating a patch w/tests to verify it.

I think it would be a great idea.

Great. I've implemented this in the first patch attached to the JIRA issue, including a test case. Rather than removing the assertion, I've brought in a specialized (very lightweight) subclass of SpanNearQuery -- I think the Javadoc should make it clear why (supporting multiple fields does screw with the semantics a little).

couldn't this be solved by an Analyzer that counts the token per fieldname and implements getPositionIncrementGap as..

        int result = SOME_BIG_NUM - tokensSeenMap.get(fieldname);
        tokensSeenMap.put(fieldname, 0);
        return result;

It could, and we could always fall back to this. I've taken my own approach and also attached it as a patch against LUCENE-1494. If you're not happy with the implementation (it's quite lightweight, and shouldn't break Analyzer implementors) then we can do this in our analyzer, as you suggest above.

The question is, though (I can't find any Javadoc etc. on this) -- is there an implicit assumption that, once set up, Analyzers are (or should be) thread-safe? Your suggestion would be hard to do in a thread-safe fashion without ThreadLocal maps or some such fun. Most Analyzers seem to be 'semi-threadsafe' or better -- i.e. Analyzer itself uses a ThreadLocal for the tokenStreams, KeywordAnalyzer keeps no state, StandardAnalyzer has state but once set up it stays static (though there are no publication guarantees around it... hmm), etc. Bringing that level of mutable, per-field state into an Analyzer seems risky.
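
To make the ThreadLocal point concrete, here is a rough sketch of what the per-thread bookkeeping might look like. It is plain Java with no Lucene types: the class name PerThreadTokenCounter, the methods tokenSeen and positionIncrementGap, and the value picked for SOME_BIG_NUM are all made up for illustration, and a real Analyzer would still have to wire the counting into its token streams somehow.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Lucene): per-thread, per-field token counts. */
class PerThreadTokenCounter {
    // Illustrative stand-in for SOME_BIG_NUM; the real value is an assumption.
    static final int SOME_BIG_NUM = 10_000;

    // Each thread gets its own map, so no locking is needed.
    private final ThreadLocal<Map<String, Integer>> tokensSeen =
            ThreadLocal.withInitial(HashMap::new);

    /** Record one token for the given field on the calling thread. */
    void tokenSeen(String fieldName) {
        tokensSeen.get().merge(fieldName, 1, Integer::sum);
    }

    /** Compute the gap as in the suggestion above, then reset the field's count. */
    int positionIncrementGap(String fieldName) {
        Map<String, Integer> counts = tokensSeen.get();
        int seen = counts.getOrDefault(fieldName, 0);
        counts.put(fieldName, 0);
        return SOME_BIG_NUM - seen;
    }
}
```

Each thread only ever sees its own counts, which avoids locking, but it also quietly assumes that all values of a field within one document are analyzed on the same thread. That is the sort of hidden assumption that makes me nervous.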

Anyway, please do check out the JIRA issue and let me know what you think. I think both issues are addressed relatively cleanly.

Cheers,

Paul
