Re: losing history

2009-06-16 Thread Simon Willnauer
Uwe is right! As long as you use diffs (patches), any svn cp / svn mv done to your repository will not be reflected in the diff. I don't think there is currently any way of doing this except for the committer doing it by hand (again) when applying the patch. This is related

RE: losing history

2009-06-16 Thread Uwe Schindler
The problem is, when you applied the patch, the files were already deleted/created by "patch" and the SVN client loses the move operation (it only sees a new unversioned file and one missing file). As your link notes, you cannot replay the changes already done (by the patch command). So the com

[jira] Updated: (LUCENE-1630) Mating Collector and Scorer on doc Id orderness

2009-06-16 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1630: --- Attachment: LUCENE-1630.patch Changed Query.createQueryWeight to public, as was suggested by Yonik.

[jira] Assigned: (LUCENE-1504) SerialChainFilter should use DocSet API rather then deprecated BitSet API

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-1504: - Assignee: Uwe Schindler Hello Ryan, I will try to get this into 2.9, but before some comm

[jira] Commented: (LUCENE-1504) SerialChainFilter should use DocSet API rather then deprecated BitSet API

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720014#action_12720014 ] Uwe Schindler commented on LUCENE-1504: --- And other things: - Use a o.a.l.util.Parame

[jira] Created: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
AttributeSource/TokenStream API improvements Key: LUCENE-1693 URL: https://issues.apache.org/jira/browse/LUCENE-1693 Project: Lucene - Java Issue Type: Improvement Components: Analysis

[jira] Updated: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-1693: -- Attachment: lucene-1693.patch Patch that includes all mentioned improvements, but needs cleanu

Re: losing history

2009-06-16 Thread Michael McCandless
I'm afraid this was my bad -- I blindly applied the patch and svn deleted the 0 byte files and failed to manually do the svn move instead. I believe the trunk version of svn includes an "svn patch" command (that is sorely needed). It'd fix this as well as eg forgetting to svn add new files in a p

[jira] Assigned: (LUCENE-1630) Mating Collector and Scorer on doc Id orderness

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1630: -- Assignee: Michael McCandless > Mating Collector and Scorer on doc Id orderness

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720031#action_12720031 ] Uwe Schindler commented on LUCENE-1693: --- Why do you add a new class "SmallToken"? I

Re: Field.tokenStreamValue

2009-06-16 Thread Michael McCandless
Seems reasonable? So you're saying that if a Field has both TokenStream and some other value, the TokenStream gets indexed into postings & term vectors, but the other value gets stored? Mike On Mon, Jun 15, 2009 at 9:48 PM, Yonik Seeley wrote: > The JavaDoc suggests that one can't have a tokenSt

RE: Field.tokenStreamValue

2009-06-16 Thread Uwe Schindler
Yes, I need exactly this for NumericField! The numeric value gets indexed using the tokenStream, but an optional stored field value (e.g. the number as plain text or even prefixEncoded) would also be good. Currently the user must index both types separately (but can use the same field name). As far a

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720046#action_12720046 ] Michael Busch commented on LUCENE-1693: --- {quote} Why do you add a new class "SmallTo

Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the incredibly long recent thread about backwards-compatibility, so maybe what I'm about to propose has been proposed already. In that case my apologies in advance. Rather than discussing our current backward

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720052#action_12720052 ] Uwe Schindler commented on LUCENE-1693: --- {quote} bq. What was your conclusion about m

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720054#action_12720054 ] Michael McCandless commented on LUCENE-1673: Patch looks good Uwe! The only t

[jira] Issue Comment Edited: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720052#action_12720052 ] Uwe Schindler edited comment on LUCENE-1693 at 6/16/09 3:43 AM:

Re: Field.tokenStreamValue

2009-06-16 Thread Michael McCandless
OK let's do it then... Yonik do you want to open issue, patch, etc.? We should spell this out clearly in the javadocs that this case (tokenStream + string/binary value) is handled "specially", because this does break from Field's "normal" semantics. Mike On Tue, Jun 16, 2009 at 6:18 AM, Uwe Schi

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720056#action_12720056 ] Uwe Schindler commented on LUCENE-1673: --- What do you think about deprecating DateToo

RE: Field.tokenStreamValue

2009-06-16 Thread Uwe Schindler
Maybe we should also add ctors to Field, with TokenStream and String/binary that set Field.Store.YES (compress is deprecated, so no need to support). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Micha

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720060#action_12720060 ] Michael Busch commented on LUCENE-1693: --- {quote} - If somebody implements the new AP

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720066#action_12720066 ] Uwe Schindler commented on LUCENE-1693: --- {quote} What if you currently have a filter

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Shai Erera
Since I proposed the same changes ( http://www.nabble.com/Re%3A-Lucene%27s-default-settings---back-compatibility-p23792927.html), I can only give my +1 to all 4 :). On the other thread I also proposed to change the policy around changing default settings. But maybe we should take it one step at a

[jira] Created: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Simon Willnauer (JIRA)
Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[] -- Key: LUCENE-1694 URL: https://issues.apache.org/jira/browse/LUCENE-1694 Proj

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720076#action_12720076 ] Shai Erera commented on LUCENE-1693: I have a couple of TokenFilters that work that wa

[jira] Updated: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-1694: Priority: Minor (was: Major) > Query#mergeBooleanQueries argument should be of type Boole

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720081#action_12720081 ] Uwe Schindler commented on LUCENE-1693: --- If you clone, you would not fall into the m

Re: Field.tokenStreamValue

2009-06-16 Thread Michael McCandless
That sounds good. Mike On Tue, Jun 16, 2009 at 6:53 AM, Uwe Schindler wrote: > Maybe we should also add ctors to Field, with TokenStream and String/binary > that set Field.Store.YES (compress is deprecated, so no need to support). > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen >

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720082#action_12720082 ] Michael McCandless commented on LUCENE-1673: I think deprecating DateTools mak

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720084#action_12720084 ] Michael McCandless commented on LUCENE-1673: bq. NumericField would only work

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael McCandless
+1 to all 4. Mike On Tue, Jun 16, 2009 at 6:37 AM, Michael Busch wrote: > Probably everyone is thinking right now "Oh no! Not again!". I admit I > didn't fully read the incredibly long recent thread about > backwards-compatibility, so maybe what I'm about to propose has been > proposed already. I

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Simon Willnauer
+1 to all 4. On Tue, Jun 16, 2009 at 2:07 PM, Michael McCandless wrote: > +1 to all 4. > > Mike > > On Tue, Jun 16, 2009 at 6:37 AM, Michael Busch wrote: >> Probably everyone is thinking right now "Oh no! Not again!". I admit I >> didn't fully read the incredibly long recent thread about >> backwa

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720089#action_12720089 ] Uwe Schindler commented on LUCENE-1673: --- bq. Actually, this need not be a limitation

[jira] Updated: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-1694: Attachment: Query_mergeBooleanQueries.patch Attached patch + testcase. The patch passes al

[jira] Created: (LUCENE-1695) Update the Highlighter to use the new TokenStream API

2009-06-16 Thread Mark Miller (JIRA)
Update the Highlighter to use the new TokenStream API - Key: LUCENE-1695 URL: https://issues.apache.org/jira/browse/LUCENE-1695 Project: Lucene - Java Issue Type: Improvement Comp

[jira] Updated: (LUCENE-1695) Update the Highlighter to use the new TokenStream API

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1695: Attachment: LUCENE-1695.patch Rough, non backward compat patch. There is still an issue with test

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
Just to cat call from the corner over here: So unless you update on *every* minor release, from a user's perspective, this is the same as tossing out API back compat (though still with the option to keep what we want around as long as we want)? Michael Busch wrote: Probably everyone is think

Re: Field.tokenStreamValue

2009-06-16 Thread Yonik Seeley
Yep, it's also useful for pre-analyzing text. Wish I had it way back when I started Solr (to avoid an unnecessary pass through the analyzer, I actually stored and indexed the number in transformed but untokenized form... not great for Luke :-) -Yonik http://www.lucidimagination.com On Tue, Jun 1

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Grant Ingersoll
+1 on everything. This is the sanity we need, especially #2. Thanks for bringing this up again. I'd add a slight mod to #2 that I think helps further communicate to users our expectations (marked by my initials GSI) by employing some convention in our @deprecated comments: 2. Deprecated

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
I'd be interested in what the users list has to say. With this many +1's, seems reasonable to take it over there. - Mark Grant Ingersoll wrote: +1 on everything. This is the sanity we need, especially #2. Thanks for bringing this up again. I'd add a slight mod to #2 that I think helps fur

[jira] Assigned: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1694: -- Assignee: Michael McCandless > Query#mergeBooleanQueries argument should be of

[jira] Commented: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720139#action_12720139 ] Michael McCandless commented on LUCENE-1694: Patch looks good, thanks Simon.

[jira] Resolved: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1694. Resolution: Fixed Thanks Simon! > Query#mergeBooleanQueries argument should be of

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720143#action_12720143 ] Michael McCandless commented on LUCENE-1673: bq. I only wanted to hear one mor

[jira] Assigned: (LUCENE-1692) Contrib analyzers need tests

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1692: -- Assignee: Michael McCandless > Contrib analyzers need tests >

[jira] Updated: (LUCENE-1692) Contrib analyzers need tests

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1692: --- Fix Version/s: 2.9 > Contrib analyzers need tests > > >

[jira] Commented: (LUCENE-1692) Contrib analyzers need tests

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720144#action_12720144 ] Michael McCandless commented on LUCENE-1692: These are much needed... thanks R

Re: madvise(ptr, len, MADV_SEQUENTIAL)

2009-06-16 Thread Michael McCandless
Lucene could really make use of this method. When a segment merge takes place, we can read & write many GB of data, which without madvise on many OSs would effectively flush the IO cache (thus hurting our search performance). Mike On Mon, Jun 15, 2009 at 6:01 PM, Jason Rutherglen wrote: > Thanks

[jira] Commented: (LUCENE-1313) Near Realtime Search

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720147#action_12720147 ] Michael McCandless commented on LUCENE-1313: {quote} I think this is highlight

[jira] Commented: (LUCENE-1313) Near Realtime Search

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720148#action_12720148 ] Michael McCandless commented on LUCENE-1313: bq. conditionalize them to run on

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720152#action_12720152 ] Uwe Schindler commented on LUCENE-1673: --- With a NumericTermQuery you would only hit

RE: madvise(ptr, len, MADV_SEQUENTIAL)

2009-06-16 Thread Uwe Schindler
But to use it, we should change MMapDirectory to also use the mapping when writing to files. I thought about it, it is very simple to implement (just copy the IndexInput and change all gets() to sets()) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@the

[jira] Commented: (LUCENE-1692) Contrib analyzers need tests

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720154#action_12720154 ] Robert Muir commented on LUCENE-1692: - Michael: LUCENE-973 would save me from having t

[jira] Updated: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-1696: Attachment: ASCIIFoldingFilter._newTokenAPI.patch all tests pass > Added New Token API im

[jira] Created: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
Added New Token API impl for ASCIIFoldingFilter --- Key: LUCENE-1696 URL: https://issues.apache.org/jira/browse/LUCENE-1696 Project: Lucene - Java Issue Type: Improvement Components: Anal

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720163#action_12720163 ] Yonik Seeley commented on LUCENE-1673: -- bq. We could easily add "numeric"; then Fiel

[jira] Resolved: (LUCENE-1681) DocValues infinite loop caused by - a call to getMinValue | getMaxValue | getAverageValue

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1681. - Resolution: Fixed Thanks Simon! > DocValues infinite loop caused by - a call to getMinValue | g

[jira] Commented: (LUCENE-973) Token of "" returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720169#action_12720169 ] Mark Miller commented on LUCENE-973: So the latest patch is ready to go in? I guess I c

[jira] Commented: (LUCENE-1377) Add HTMLStripReader and WordDelimiterFilter from SOLR

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720171#action_12720171 ] Michael McCandless commented on LUCENE-1377: Robert, would ICUTokenizer (LUCNE

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720173#action_12720173 ] Robert Muir commented on LUCENE-1696: - Simon, I think if you want to handle accents in

[jira] Commented: (LUCENE-1377) Add HTMLStripReader and WordDelimiterFilter from SOLR

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720175#action_12720175 ] Robert Muir commented on LUCENE-1377: - they are a bit different. for example: wordde

[jira] Commented: (LUCENE-973) Token of "" returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720177#action_12720177 ] Michael McCandless commented on LUCENE-973: --- I'll take it Mark! Fixes a bug and

[jira] Assigned: (LUCENE-973) Token of "" returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-973: - Assignee: Michael McCandless > Token of "" returns in CJKTokenizer + new TestCJK

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720183#action_12720183 ] Robert Muir commented on LUCENE-1696: - i uploaded a testcase under LUCENE-1581 showing

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720187#action_12720187 ] Michael McCandless commented on LUCENE-1673: bq. With a NumericTermQuery you w

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720189#action_12720189 ] Simon Willnauer commented on LUCENE-1696: - bq. i don't see an alternative, otherwi

[jira] Assigned: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned LUCENE-1696: --- Assignee: Mark Miller > Added New Token API impl for ASCIIFoldingFilter > --

[jira] Assigned: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned LUCENE-1486: --- Assignee: Mark Miller > Wildcards, ORs etc inside Phrase queries > -

[jira] Updated: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1696: Attachment: TestGermanCollation.java Shows how to do this with German... it's a bit more involved si

[jira] Created: (LUCENE-1697) MoreLikeThis should use the new Token API

2009-06-16 Thread Grant Ingersoll (JIRA)
MoreLikeThis should use the new Token API - Key: LUCENE-1697 URL: https://issues.apache.org/jira/browse/LUCENE-1697 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720192#action_12720192 ] Simon Willnauer commented on LUCENE-1696: - Thanks robert, I did know about collati

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720191#action_12720191 ] Michael McCandless commented on LUCENE-1673: {quote} bq. We could easily add "

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720193#action_12720193 ] Robert Muir commented on LUCENE-1696: - simon, actually i think its documented you can

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720196#action_12720196 ] Michael Busch commented on LUCENE-1693: --- But, the additional copying would affect pe

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720197#action_12720197 ] Simon Willnauer commented on LUCENE-1696: - bq. simon, actually i think its docume

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Wow this is *very* similar! :) On 6/16/09 4:29 AM, Shai Erera wrote: Since I proposed the same changes (http://www.nabble.com/Re%3A-Lucene%27s-default-settings---back-compatibility-p23792927.html), I can only give my +1 to all 4 :). On the other thread I also proposed to change the policy aro

Re: madvise(ptr, len, MADV_SEQUENTIAL)

2009-06-16 Thread Earwin Burrfoot
Except, you don't know the size of the file to be written upfront. One possible solution is to map the output file in pages. As a complementary solution you can map a huge area of the file, and hope little real memory is allocated by the OS unless you actually write all over that area. Dunno. The idea of usin
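
[Editor's illustration, not part of the original message.] The "map the output file in pages" idea can be sketched as follows. This is a minimal Python sketch rather than Lucene's Java; the name write_in_windows and the window size are hypothetical, and it assumes the file may be grown with truncate() just before each window is mapped, so the final size need not be known upfront.

```python
import mmap

# Hypothetical window size; mmap offsets must be a multiple of
# mmap.ALLOCATIONGRANULARITY, so the window is built from it.
WINDOW = mmap.ALLOCATIONGRANULARITY * 16


def write_in_windows(path, data, window=WINDOW):
    """Write `data` to `path` through a series of fixed-size mmap windows.

    Returns the number of bytes written.
    """
    if window <= 0 or window % mmap.ALLOCATIONGRANULARITY != 0:
        raise ValueError("window must be a positive multiple of ALLOCATIONGRANULARITY")
    offset = 0
    with open(path, "w+b") as f:
        while offset < len(data):
            chunk = data[offset:offset + window]
            # Grow the file so the region about to be mapped exists.
            f.truncate(offset + len(chunk))
            # Map only the current window, copy into it, then unmap.
            with mmap.mmap(f.fileno(), len(chunk), offset=offset) as m:
                m[:] = chunk
                m.flush()
            offset += len(chunk)
    return offset
```

Only one window is mapped at a time, so address-space use stays bounded regardless of how large the output ends up being.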

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Sounds good, Grant. I'll open a task to change the policy with target release=3.0. Michael On 6/16/09 6:53 AM, Grant Ingersoll wrote: +1 on everything. This is the sanity we need, especially #2. Thanks for bringing this up again. I'd add a slight mod to #2 that I think helps further comm

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Earwin Burrfoot
Oh yes! Again! +1 One point is missing. What about incompatible behavioral changes that do not touch API and file format? Like posIncr=0 at the first token in stream, or analyzer fixes, or something along these lines. Are we free to introduce them in a minor release without warning, or are we goi

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Fair enough. We certainly want our users to understand our reasons for these changes, and keep their trust that we're making our best efforts to keep upgrading as effortless as possible. However, there will always be someone who is not happy with such a change. But if the vast majority of the

[jira] Commented: (LUCENE-973) Token of "" returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720200#action_12720200 ] Michael McCandless commented on LUCENE-973: --- Does anyone know if the added recurs

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720201#action_12720201 ] Robert Muir commented on LUCENE-1696: - since this seems to be a recurring theme maybe

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Yonik Seeley
So under this proposal, what's the difference between a major and minor release? -Yonik http://www.lucidimagination.com On Tue, Jun 16, 2009 at 6:37 AM, Michael Busch wrote: > Probably everyone is thinking right now "Oh no! Not again!". I admit I > didn't fully read the incredibly long recent t

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
I'd suggest to treat a runtime change like an API change (unless it's fixing a bug of course), i.e. giving a warning, providing a switch, switching the default behavior only after a major or minor release was around that had the warning/switch. Michael On 6/16/09 8:54 AM, Earwin Burrfoot wro

[jira] Commented: (LUCENE-973) Token of "" returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720204#action_12720204 ] Robert Muir commented on LUCENE-973: sounds like another good test case, add a few thou

Re: madvise(ptr, len, MADV_SEQUENTIAL)

2009-06-16 Thread Michael McCandless
Hmm... posix_fadvise lets you do this with a file descriptor; this would be better for Lucene (per descriptor not per mapped region of RAM) since we could "advise" independent of which FSDir impl is in use... Mike On Tue, Jun 16, 2009 at 10:32 AM, Uwe Schindler wrote: > But to use it, we should c
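
[Editor's illustration, not part of the original message.] A minimal sketch of advising per file descriptor, using Python's os.posix_fadvise wrapper rather than anything in Lucene. The function name read_sequentially and the chunk size are hypothetical. Whether the hint actually keeps a large merge from evicting search data out of the IO cache depends on the OS, so treat this as a demonstration of the call, not of the benefit.

```python
import os


def read_sequentially(path, chunk_size=1 << 20):
    """Read a file front to back after hinting sequential access.

    Returns the total number of bytes read.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # os.posix_fadvise exists only on some Unix platforms, so guard it.
        if hasattr(os, "posix_fadvise"):
            # Offset 0 with length 0 applies the advice to the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        total = 0
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
        return total
    finally:
        os.close(fd)
```

Because the advice is attached to the descriptor and not to a mapped region, the same call works no matter how the file is subsequently read.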

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Shai Erera
Index back-compat is guaranteed to hold within minor releases. On Tue, Jun 16, 2009 at 6:59 PM, Yonik Seeley wrote: > So under this proposal, what's the difference between a major and minor > release? > > -Yonik > http://www.lucidimagination.com > > > > On Tue, Jun 16, 2009 at 6:37 AM, Michael Bu

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
Right - I'm not saying that the users should trump the devs, just curious what the response will be, if any. I also think that when we update the back compat policy, there should be wording that stresses where we should use our new powers carefully (e.g. common APIs and such). And we should u

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
From a backwards-compatibility point of view, nothing really. Michael On 6/16/09 8:59 AM, Yonik Seeley wrote: So under this proposal, what's the difference between a major and minor release? -Yonik http://www.lucidimagination.com On Tue, Jun 16, 2009 at 6:37 AM, Michael Busch wrote:

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Except regarding file format compatibility, see 1. On 6/16/09 9:04 AM, Michael Busch wrote: From a backwards-compatibility point of view, nothing really. Michael On 6/16/09 8:59 AM, Yonik Seeley wrote: So under this proposal, what's the difference between a major and minor release? -Yonik

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Shai Erera
Ahh ... I wish I had finished http://www.nabble.com/Re%3A-Lucene%27s-default-settings---back-compatibility-p23792927.html with +1 of my own. Guess that's what was missing to get it to closure :). Shai On Tue, Jun 16, 2009 at 7:03 PM, Michael Busch wrote: > I'd suggest to treat a runtime change

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
Yeah, the only difference now is that we can remove deprecated APIs. And I guess we add nothing. Which, as Michael has said, is goofy. 3.0 will be 2.9 like 1.9 was 2.0. Without deprecations. Not a big deal at all, but I find it goofy too. - Mark Michael Busch wrote: From a backwards-comp

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Well I'd actually hope that there will be significantly less need to do these "tricks" to get around the new policy. I'll open a JIRA issue and we can use it to work on the exact wording. Michael On 6/16/09 9:03 AM, Mark Miller wrote: Right - I'm not saying that the users should trump the dev

[jira] Commented: (LUCENE-973) Token of "" returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720207#action_12720207 ] Michael McCandless commented on LUCENE-973: --- Well, my question is: is there any i

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
I would guess you hit what I call "thread fatigue" by the time you summed that up :) Michael hasn't been around for a bit - perhaps it was easier for him to spawn a new thread. Also, much shorter text to read :) Shai Erera wrote: Ahh ... I wish I had finished http://www.nabble.com/Re%3A-Luc

[jira] Created: (LUCENE-1698) Change backwards-compatibility policy

2009-06-16 Thread Michael Busch (JIRA)
Change backwards-compatibility policy - Key: LUCENE-1698 URL: https://issues.apache.org/jira/browse/LUCENE-1698 Project: Lucene - Java Issue Type: Task Reporter: Michael Busch Assig

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Shai Erera
Also, much shorter text to read :) You're right, Michael's is 484 words, mine was 691. But in my defense, I did offer two more changes, that were later brought up on this thread (summing to 563 words) :). Anyway, I'm glad it's kept alive and hopefully things will change. Shai On Tue, Jun 16, 20

[jira] Commented: (LUCENE-973) Token of "" returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720212#action_12720212 ] Robert Muir commented on LUCENE-973: Michael i don't see anything obvious, but a test c

[jira] Updated: (LUCENE-973) Token of "" returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-973: -- Attachment: LUCENE-973.patch Or... how about we just switch to iteration not recursion?

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720222#action_12720222 ] Michael McCandless commented on LUCENE-1693: bq. What do you or others think a
