[jira] Resolved: (LUCENE-1887) o.a.l.messages should be moved to core

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-1887. --- Resolution: Fixed Revision: 813268 > o.a.l.messages should be moved to core > -

Re: Question regarding the index files

2009-09-10 Thread Michael McCandless
I answered on java-user. I think it should be able to be done w/o source code changes to Lucene. Mike On Thu, Sep 10, 2009 at 2:39 AM, Dvora wrote: > > Hello, > > I'm copying a question I've asked in the Users lists, but I think it requires > some patching effort, so maybe that list will be more

September 2009 Hadoop/Lucene/Solr/UIMA/katta/Mahout Get Together Berlin

2009-09-10 Thread Uwe Schindler
Hi, I am cross-posting this here; Isabel Drost is managing the meetup. This time it is more about Hadoop, but there is also a talk about the new Lucene 2.9 release (presented by me). As far as I know, Simon Willnauer will also be there: --

Re: svn commit: r813268 - /lucene/java/trunk/build.xml

2009-09-10 Thread Robert Muir
uwe, thanks for catching this! can we do this too? otherwise, analysis.tokenattributes shows up as being in contrib/analyzers :) Index: build.xml === --- build.xml (revision 813385) +++ build.xml (working copy) @@ -337,7 +337,7 @

RE: September 2009 Hadoop/Lucene/Solr/UIMA/katta/Mahout Get Together Berlin

2009-09-10 Thread Uwe Schindler
Hi again, By the way, if any of the other involved developers wants to provide me with some PPT slides about the other new features in Lucene 2.9 (NRT, future Flexible Indexing), I would be happy! Uwe > Uwe Schindler, Lucene 2.9 Developments: Numeric Search, Per-Segment- and > Near-Real-Time Sear

RE: svn commit: r813268 - /lucene/java/trunk/build.xml

2009-09-10 Thread Uwe Schindler
Yes, correct, I will commit! - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Robert Muir [mailto:rcm...@gmail.com] > Sent: Thursday, September 10, 2009 1:58 PM > To: java-dev@lucene.apache.org > Subject:

[jira] Commented: (LUCENE-1370) Patch to make ShingleFilter output a unigram if no ngrams can be generated

2009-09-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753598#action_12753598 ] Robert Muir commented on LUCENE-1370: - Chris, just a small comment: {code} if (outp

[jira] Commented: (LUCENE-1370) Patch to make ShingleFilter output a unigram if no ngrams can be generated

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753621#action_12753621 ] Uwe Schindler commented on LUCENE-1370: --- bq. even though it doesnt have to recompute

[jira] Commented: (LUCENE-1370) Patch to make ShingleFilter output a unigram if no ngrams can be generated

2009-09-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753623#action_12753623 ] Robert Muir commented on LUCENE-1370: - Uwe, I see... i had not looked far enough yet t

LowerCaseFilter, is there a reason why the class is final?

2009-09-10 Thread Daniel Shane
Hi all, I was wondering why LowerCaseFilter is declared final. In my code, I would like to extend it, but apparently it's not possible. I'm just wondering why extending this type of class is considered evil. Daniel Shane -

RE: LowerCaseFilter, is there a reason why the class is final?

2009-09-10 Thread Uwe Schindler
See https://issues.apache.org/jira/browse/LUCENE-1753 In general, if you want to add functionality, plug another filter into the chain. At least the implementations should be final (next/incrementToken). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@th

RE: LowerCaseFilter, is there a reason why the class is final?

2009-09-10 Thread Uwe Schindler
I forgot: this is known as the "Decorator Pattern": http://en.wikipedia.org/wiki/Decorator_pattern - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Thursday, Septe
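
A minimal sketch of the "plug another filter into the chain" advice from this thread. It assumes hypothetical stand-in types (SimpleTokenStream, ListTokenStream, LowerCaseStep, TrimTrailingDotStep) and is not Lucene's TokenStream/TokenFilter API. The point is only that a final lower-casing step can still be combined with custom behaviour by wrapping it rather than subclassing it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a token stream; NOT Lucene's TokenStream.
interface SimpleTokenStream {
    /** Returns the next token, or null when the stream is exhausted. */
    String next();
}

// Source of tokens at the bottom of the chain.
final class ListTokenStream implements SimpleTokenStream {
    private final Iterator<String> it;
    ListTokenStream(List<String> tokens) { this.it = tokens.iterator(); }
    public String next() { return it.hasNext() ? it.next() : null; }
}

// Final, like the filter discussed in the thread: it cannot be extended.
final class LowerCaseStep implements SimpleTokenStream {
    private final SimpleTokenStream input;
    LowerCaseStep(SimpleTokenStream input) { this.input = input; }
    public String next() {
        String t = input.next();
        return t == null ? null : t.toLowerCase(Locale.ROOT);
    }
}

// Custom behaviour added by decoration: wraps any stream, including LowerCaseStep.
final class TrimTrailingDotStep implements SimpleTokenStream {
    private final SimpleTokenStream input;
    TrimTrailingDotStep(SimpleTokenStream input) { this.input = input; }
    public String next() {
        String t = input.next();
        if (t != null && t.endsWith(".")) {
            return t.substring(0, t.length() - 1);
        }
        return t;
    }
}

final class DecoratorSketch {
    /** Pulls every token out of the stream into a list. */
    static List<String> drain(SimpleTokenStream s) {
        List<String> out = new ArrayList<>();
        for (String t = s.next(); t != null; t = s.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        SimpleTokenStream chain = new TrimTrailingDotStep(
            new LowerCaseStep(new ListTokenStream(Arrays.asList("Hello.", "WORLD"))));
        System.out.println(drain(chain)); // prints [hello, world]
    }
}
```

Each step only depends on the SimpleTokenStream interface, so steps can be reordered or replaced without touching the final class.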

Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler
When reviewing the new CharStream code added to Tokenizers, I found a serious problem with backwards compatibility and other Tokenizers that do not override reset(CharStream). The problem is that, e.g., CharTokenizer only overrides reset(Reader): public void reset(Reader input) throws IOExcepti
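
A self-contained sketch of the general Java mechanism behind this kind of problem. It assumes hypothetical classes (SketchCharStream, BaseTokenizer, CountingTokenizer) and is not the Lucene source or the patch discussed in this thread. When a base class offers both reset(Reader) and a more specific overload, a subclass that overrides only reset(Reader) is silently bypassed whenever the caller's argument has the more specific static type.

```java
import java.io.FilterReader;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical CharStream stand-in; like the one in the thread, it is a Reader.
class SketchCharStream extends FilterReader {
    SketchCharStream(Reader in) { super(in); }
}

// Hypothetical base class with two overloads of reset.
class BaseTokenizer {
    protected Reader input;
    public void reset(Reader input) { this.input = input; }
    public void reset(SketchCharStream input) { this.input = input; }
}

// An "old" subclass that only knows about reset(Reader).
class CountingTokenizer extends BaseTokenizer {
    int resets = 0; // counts how often the subclass logic actually ran

    @Override
    public void reset(Reader input) {
        super.reset(input);
        resets++;
    }
}

final class OverloadSketch {
    public static void main(String[] args) {
        CountingTokenizer t = new CountingTokenizer();
        SketchCharStream cs = new SketchCharStream(new StringReader("abc"));

        // Overloads are chosen at compile time from the static argument type,
        // so this binds to BaseTokenizer.reset(SketchCharStream).
        t.reset(cs);
        System.out.println(t.resets); // prints 0: the override was skipped

        // With static type Reader, the overridden method is selected.
        t.reset((Reader) cs);
        System.out.println(t.resets); // prints 1
    }
}
```

The subclass compiles and appears to work, which is what makes this kind of break hard to spot without a test that exercises the more specific overload.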

RE: Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler
I tested the attached patch, all tests still compile and work as expected (as CharStream extends Reader). I think I should open an issue? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schind

Re: Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Mark Miller
Yeah, let's open an issue and mark it blocker - I'll hold RC4 for it (was just about to push it when I caught this email). Uwe Schindler wrote: > I tested the attached patch, all tests still compile and work as expected > (as CharStream extends Reader). > > I think I should open an issue? > > Uwe

[jira] Created: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
Problem with CharStream and Tokenizers with custom reset(Reader) method --- Key: LUCENE-1906 URL: https://issues.apache.org/jira/browse/LUCENE-1906 Project: Lucene - Java Is

[jira] Updated: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Description: When reviewing the new CharStream code added to Tokenizers, I found a serious pro

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753664#action_12753664 ] Uwe Schindler commented on LUCENE-1906: --- I will now check, if the change of the "inp

[jira] Updated: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Attachment: LUCENE-1906.patch > Problem with CharStream and Tokenizers with custom reset(Reade

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753666#action_12753666 ] Yonik Seeley commented on LUCENE-1906: -- +1, this looks like the best fix. > Problem

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753679#action_12753679 ] Michael McCandless commented on LUCENE-1906: +1, good catch Uwe! > Problem wi

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753687#action_12753687 ] Mark Miller commented on LUCENE-1906: - Ready Uwe? > Problem with CharStream and Token

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753689#action_12753689 ] Uwe Schindler commented on LUCENE-1906: --- bq. I will now check, if the change of the

[jira] Updated: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Attachment: backwards-break.patch Here is the patch for backwards-branch, that fails. It rever

[jira] Updated: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Attachment: (was: backwards-break.patch) > Problem with CharStream and Tokenizers with cus

[jira] Updated: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Attachment: backwards-break.patch Sorry, wrong patch, this one is correct. Other one was a eve

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753709#action_12753709 ] Uwe Schindler commented on LUCENE-1906: --- One possibility to prevent this break would

[jira] Commented: (LUCENE-1887) o.a.l.messages should be moved to core

2009-09-10 Thread Luis Alves (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753710#action_12753710 ] Luis Alves commented on LUCENE-1887: I was not able to read the thread late yesterday

[jira] Issue Comment Edited: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753709#action_12753709 ] Uwe Schindler edited comment on LUCENE-1906 at 9/10/09 10:15 AM: ---

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753713#action_12753713 ] Mark Miller commented on LUCENE-1906: - What about using an introspection cache again?

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753715#action_12753715 ] Robert Muir commented on LUCENE-1906: - bq. (only some old Tokenizers not calling corre

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753718#action_12753718 ] Uwe Schindler commented on LUCENE-1906: --- bq. What about using an introspection cache

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753720#action_12753720 ] Michael McCandless commented on LUCENE-1906: bq. We also have a not-in-CHANGES

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753722#action_12753722 ] Robert Muir commented on LUCENE-1906: - bq. Correct, this is always the problem with th

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753724#action_12753724 ] Michael McCandless commented on LUCENE-1906: bq. Correct, this is always the p

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753725#action_12753725 ] Mark Miller commented on LUCENE-1906: - bq. A cache for what? I do not understand The

[jira] Updated: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Attachment: LUCENE-1906.patch Here is the patch for core. Contrib is unchanged. In principle jus

Using attributes after end of stream

2009-09-10 Thread David Kaelbling
Hi, After incrementToken() returns false, what are attribute getters supposed to return? I thought that any such calls would be erroneous (the equivalent of NullPointerExceptions), but org.apache.lucene.index.DocInverterPerField.processFields() makes just such a call at line 194. Thanks,

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753733#action_12753733 ] Michael McCandless commented on LUCENE-1906: I'm still nervous about inserting

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753735#action_12753735 ] Uwe Schindler commented on LUCENE-1906: --- "instanceof" is one of the operators direct

Re: Using attributes after end of stream

2009-09-10 Thread Michael McCandless
I think it depends on the attribute? In that case (a call to offsetAttribute.endOffset(), after TokenStream.end() has been called), that method returns the final offset increment. This is necessary so multi-valued fields keep the right offsets (see https://issues.apache.org/jira/browse/LUCENE-144
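
A rough sketch of why a final offset matters for multi-valued fields. It assumes a hypothetical whitespace tokenizer (OffsetSketch) and assumes that "final offset" means the total number of characters consumed from a value, including trailing characters that produce no token. Neither is taken from Lucene's code. If the running base were advanced by the last token's end offset instead, tokens of the second value would be reported too early.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not Lucene's TokenStream or OffsetAttribute.
final class OffsetSketch {
    /** Returns {start, end} offsets of each token, relative to the whole field. */
    static List<int[]> offsets(String[] values) {
        List<int[]> out = new ArrayList<>();
        int base = 0; // running offset carried from one value to the next
        for (String value : values) {
            int i = 0;
            while (i < value.length()) {
                // Skip separators.
                while (i < value.length() && value.charAt(i) == ' ') i++;
                int start = i;
                // Consume one token.
                while (i < value.length() && value.charAt(i) != ' ') i++;
                if (i > start) out.add(new int[] {base + start, base + i});
            }
            // Assumed "final offset": everything consumed, trailing spaces included.
            base += value.length();
        }
        return out;
    }

    public static void main(String[] args) {
        // First value ends in two spaces, so its last token ends at 2 but its
        // final offset is 4.
        for (int[] o : offsets(new String[] {"ab  ", "cd"})) {
            System.out.println(o[0] + "-" + o[1]); // prints 0-2 then 4-6
        }
    }
}
```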

[jira] Updated: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1906: Attachment: LUCENE-1906_contrib.patch contrib changes. > Problem with CharStream and Tokenizers w

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753738#action_12753738 ] Mark Miller commented on LUCENE-1906: - bq. I think breaking back compat here is OK?

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753742#action_12753742 ] Yonik Seeley commented on LUCENE-1906: -- bq. "instanceof" is one of the operators dire

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753746#action_12753746 ] Mark Miller commented on LUCENE-1906: - bq. Hmmm, I had missed that 2.9 required a reco

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753750#action_12753750 ] Uwe Schindler commented on LUCENE-1906: --- bq. Yes, it's relatively fast, but it's per

[jira] Issue Comment Edited: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753750#action_12753750 ] Uwe Schindler edited comment on LUCENE-1906 at 9/10/09 11:24 AM: ---

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753751#action_12753751 ] Mark Miller commented on LUCENE-1906: - bq. In my opinion, e.g. external language Token

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753762#action_12753762 ] Michael McCandless commented on LUCENE-1906: bq. A recompile is only needed is

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753766#action_12753766 ] Uwe Schindler commented on LUCENE-1906: --- bq. Maybe for 3.0 we can declare that this

[jira] Commented: (LUCENE-1370) Patch to make ShingleFilter output a unigram if no ngrams can be generated

2009-09-10 Thread Chris Harris (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753789#action_12753789 ] Chris Harris commented on LUCENE-1370: -- {quote} here i think you could save a clone()

[jira] Commented: (LUCENE-1370) Patch to make ShingleFilter output a unigram if no ngrams can be generated

2009-09-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753795#action_12753795 ] Robert Muir commented on LUCENE-1370: - Chris, yeah I was thinking something like your

[jira] Commented: (LUCENE-1370) Patch to make ShingleFilter output a unigram if no ngrams can be generated

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753798#action_12753798 ] Uwe Schindler commented on LUCENE-1370: --- AttributeSource.State objects are unmodifia

[jira] Updated: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Attachment: LUCENE-1906-bw.patch LUCENE-1906.patch Here are the updated patches fo

Issue with multifieldqueryparser and searching multiple texts

2009-09-10 Thread theDude_2
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/queryParser/MultiFieldQueryParser.html#parse(java.lang.String[], java.lang.String[], org.apache.lucene.analysis.Analyzer) if I call the parse method inherited from QueryParser, multiple queries don't seem supported. I'm trying to do a sea

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753830#action_12753830 ] Robert Muir commented on LUCENE-1906: - uwe, i like your patch. what was that Standard

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753834#action_12753834 ] Uwe Schindler commented on LUCENE-1906: --- It was never used and seems to be a relic

Re: Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Jason Rutherglen
I've been seeing strange behavior perhaps related to this? Where sometimes a query is parsed and analyzed using Solr analyzers to its first clause fairly randomly, and other times the same exact query is parsed and analyzed to the full correct query with all clauses. It's so baffling I haven't rea

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753875#action_12753875 ] Mark Miller commented on LUCENE-1906: - I say we go with it - 'instance of' will have n

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753877#action_12753877 ] Michael McCandless commented on LUCENE-1906: Patch looks good Uwe! > Problem

[jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753881#action_12753881 ] Michael McCandless commented on LUCENE-1781: So the anti-meridian test is expe

[jira] Updated: (LUCENE-1370) Patch to make ShingleFilter output a unigram if no ngrams can be generated

2009-09-10 Thread Chris Harris (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Harris updated LUCENE-1370: - Attachment: LUCENE-1370.patch Update patch to avoid an unnecessary State.clone(), as suggested b

[jira] Commented: (LUCENE-1896) Modify confusing javadoc for queryNorm

2009-09-10 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753892#action_12753892 ] Doron Cohen commented on LUCENE-1896: - I kinda like the modification of queryNorm to g

[jira] Created: (LUCENE-1907) sumOfSquared weights should be calculated as part of queryNorm

2009-09-10 Thread Mark Miller (JIRA)
sumOfSquared weights should be calculated as part of queryNorm -- Key: LUCENE-1907 URL: https://issues.apache.org/jira/browse/LUCENE-1907 Project: Lucene - Java Issue Type: Bug

[jira] Updated: (LUCENE-1907) sumOfSquared weights should be calculated as part of queryNorm

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1907: Attachment: LUCENE-1907.patch > sumOfSquared weights should be calculated as part of queryNorm > -

[jira] Commented: (LUCENE-1906) Problem with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753906#action_12753906 ] Koji Sekiguchi commented on LUCENE-1906: +1, patch looks good, thanks Uwe! > Prob

[jira] Updated: (LUCENE-1907) sumOfSquared weights should be calculated as part of queryNorm

2009-09-10 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1907: Attachment: LUCENE-1907.patch fix a test and improve a bit > sumOfSquared weights should be calcu

[jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753956#action_12753956 ] Bill Bell commented on LUCENE-1781: --- Not exactly. testPrimeM() should be fixed by the n

Question bout I/O monitoring

2009-09-10 Thread edwardyf
Is there any means to monitor the I/O of Lucene? For example, say I am searching in an FSDirectory; I want to know the number of pages read from disk.

Re: Question bout I/O monitoring

2009-09-10 Thread Ted Dunning
Try this: http://lucene.grantingersoll.com/2009/08/25/lucid-imagination-%C2%BB-lucid-gaze-for-lucene/ On Thu, Sep 10, 2009 at 8:39 PM, edwardyf wrote: > > Is there any means to monitor the I/O of lucene? > i.e. say i am searching in a FSDirectory, i wanna know the number of pages > read from di

[jira] Issue Comment Edited: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753956#action_12753956 ] Bill Bell edited comment on LUCENE-1781 at 9/10/09 9:03 PM: No

[jira] Issue Comment Edited: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753956#action_12753956 ] Bill Bell edited comment on LUCENE-1781 at 9/10/09 9:18 PM: No

[jira] Updated: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated LUCENE-1781: -- Attachment: TestCartesian.java.patch TestCartesian.java This tests box prime meridien

[jira] Updated: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated LUCENE-1781: -- Attachment: (was: TestCartesian.java) > Large distances in Spatial go beyond Prime MEridian >

[jira] Issue Comment Edited: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753992#action_12753992 ] Bill Bell edited comment on LUCENE-1781 at 9/10/09 9:34 PM: Po

[jira] Updated: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated LUCENE-1781: -- Attachment: (was: TestCases.diff) > Large distances in Spatial go beyond Prime MEridian >

[jira] Updated: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-09-10 Thread Bill Bell (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated LUCENE-1781: -- Attachment: (was: LLRect.java) > Large distances in Spatial go beyond Prime MEridian > ---

Re: Question bout I/O monitoring

2009-09-10 Thread edwardyf
Thanks for the reply, just checked the Lucid Gaze package, it only collects the stats at function call level, no I/O stats Ted Dunning wrote: > > Try this: > > http://lucene.grantingersoll.com/2009/08/25/lucid-imagination-%C2%BB-lucid-gaze-for-lucene/ > > On Thu, Sep 10, 2009 at 8:39 PM, ed

Re: Question bout I/O monitoring

2009-09-10 Thread Brian Pinkerton
If you're on a Mac or Solaris, dtrace will tell you everything you want to know (and more.) If you're not familiar with dtrace, iosnoop.d is a good start for this kind of measurement. At Technorati, I used dtrace to build a trace file of all the read requests made by a big lucene app, reco

Re: Question bout I/O monitoring

2009-09-10 Thread edwardyf
Unfortunately I am running a Red Hat Enterprise version; I am doing an academic experiment which needs the number of disk I/Os. I am now looking at SystemTap, which is said to be the iosnoop for Linux. Thanks for the guide. Brian Pinkerton-2 wrote: > > If you're on a Mac or Solaris, dtrace will tel

[jira] Updated: (LUCENE-1906) Backwards problems with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Summary: Backwards problems with CharStream and Tokenizers with custom reset(Reader) method (

[jira] Updated: (LUCENE-1906) Backwards problems with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1906: -- Attachment: LUCENE-1906.patch Some tweaks in JavaDocs. I committed this patch. > Backwards pr

[jira] Resolved: (LUCENE-1906) Backwards problems with CharStream and Tokenizers with custom reset(Reader) method

2009-09-10 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-1906. --- Resolution: Fixed Committed revision: 813671 There may be some changes needed in Solr (corr

Re: Question bout I/O monitoring

2009-09-10 Thread Ted Dunning
You can also integrate the results from iostat on an otherwise idle machine. On Thu, Sep 10, 2009 at 10:24 PM, edwardyf wrote: > > unfortunately i am running a Red hat enterprise version, i am doing an > academic experiment which need number of disk I/Os. > > I am now looking at SystemTap, which