[jira] Updated: (LUCENE-1084) increase default maxFieldLength?

2008-01-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1084: --- Fix Version/s: 2.4 > increase default maxFieldLength? >

[jira] Commented: (LUCENE-1084) increase default maxFieldLength?

2008-01-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559440#action_12559440 ] Michael McCandless commented on LUCENE-1084: +1 Users frequently trip up on t

[jira] Assigned: (LUCENE-1084) increase default maxFieldLength?

2008-01-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1084: -- Assignee: Michael McCandless > increase default maxFieldLength? >

[jira] Commented: (LUCENE-205) [PATCH] Patches for RussianAnalyzer

2008-01-16 Thread Vladimir Yuryev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559443#action_12559443 ] Vladimir Yuryev commented on LUCENE-205: Thank you for the attentiveness to so sma

counting sub tasks in contrib/benchmark

2008-01-16 Thread Michael McCandless
I'd like to run an alg like this: ResetSystemErase { "BuildIndex" CreateIndex { "AddDocs" AddDoc > : 20 CloseIndex } RepSumByPrefRound BuildIndex But in the report, for rec/s, I'd like to see the total BuildIndex time divided by 200,000, ie, the net time per document to

Re: counting sub tasks in contrib/benchmark

2008-01-16 Thread Grant Ingersoll
I think you can do: RepSumByPref AddDocs And it will report on just that; for instance, in the standard.alg, this is done inside the round to report out info on that round's AddDocs. I think you could even do it outside the round, just by substituting BuildIndex for "AddDocs". In general,

Re: counting sub tasks in contrib/benchmark

2008-01-16 Thread Doron Cohen
On Jan 16, 2008 1:25 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: > I'd like to run an alg like this: > > ResetSystemErase > { "BuildIndex" > CreateIndex > { "AddDocs" AddDoc > : 20 > CloseIndex > } > > RepSumByPrefRound BuildIndex > > But in the report, for rec/s, I'd

[jira] Commented: (LUCENE-1084) increase default maxFieldLength?

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559490#action_12559490 ] Grant Ingersoll commented on LUCENE-1084: - Does this break back-compatibility if w

Re: counting sub tasks in contrib/benchmark

2008-01-16 Thread Doron Cohen
On Jan 16, 2008 3:29 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > I think you can do: > > RepSumByPref AddDocs > > And it will report on just that, for instance, in the standard.alg, > this is done inside the round to report out info on that rounds AddDocs. > > I think you could even do it out

Re: counting sub tasks in contrib/benchmark

2008-01-16 Thread Grant Ingersoll
On Jan 16, 2008, at 8:33 AM, Doron Cohen wrote: On Jan 16, 2008 3:29 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: I think you can do: RepSumByPref AddDocs And it will report on just that, for instance, in the standard.alg, this is done inside the round to report out info on that rounds

[jira] Assigned: (LUCENE-1133) WikipediaTokenizer needs a way of not tokenizing certain parts of the text

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned LUCENE-1133: --- Assignee: Grant Ingersoll > WikipediaTokenizer needs a way of not tokenizing certain

[jira] Updated: (LUCENE-1133) WikipediaTokenizer needs a way of not tokenizing certain parts of the text

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1133: Attachment: LUCENE-1133.patch Here's a completely back-compatible (if there is such a thin

Re: counting sub tasks in contrib/benchmark

2008-01-16 Thread Michael McCandless
Doron Cohen wrote: On Jan 16, 2008 1:25 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: I'd like to run an alg like this: ResetSystemErase { "BuildIndex" CreateIndex { "AddDocs" AddDoc > : 20 CloseIndex } RepSumByPrefRound BuildIndex But in the report, for rec/s

[jira] Created: (LUCENE-1136) add ability to not count sub-task doLogic increment to contri/benchmark

2008-01-16 Thread Michael McCandless (JIRA)
add ability to not count sub-task doLogic increment to contri/benchmark --- Key: LUCENE-1136 URL: https://issues.apache.org/jira/browse/LUCENE-1136 Project: Lucene - Java Is

[jira] Commented: (LUCENE-1084) increase default maxFieldLength?

2008-01-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559549#action_12559549 ] Michael McCandless commented on LUCENE-1084: {quote} Does this break back-com

[jira] Created: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Grant Ingersoll (JIRA)
Token type as BitSet: typeBits() Key: LUCENE-1137 URL: https://issues.apache.org/jira/browse/LUCENE-1137 Project: Lucene - Java Issue Type: New Feature Components: Analysis Reporter: Gra

[jira] Updated: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1137: Attachment: LUCENE-1137.patch Added get/setTypeBits() method and underlying storage and co

[jira] Commented: (LUCENE-1084) increase default maxFieldLength?

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559583#action_12559583 ] Grant Ingersoll commented on LUCENE-1084: - I don't know the answer to those questi

[jira] Commented: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559588#action_12559588 ] Yonik Seeley commented on LUCENE-1137: -- Gack! I recommended a bitset on Token previo

[jira] Commented: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559589#action_12559589 ] Steven Rowe commented on LUCENE-1137: - I see two problems with this patch: 1. Althoug

[jira] Commented: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559595#action_12559595 ] Grant Ingersoll commented on LUCENE-1137: - {quote} The information encoded by BitS

[jira] Commented: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559606#action_12559606 ] Grant Ingersoll commented on LUCENE-1137: - Never mind on the isClaimed() idea, I d

[jira] Commented: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559607#action_12559607 ] Yonik Seeley commented on LUCENE-1137: -- If we go with the bitset (int or long!!!), "t

[jira] Updated: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1137: Attachment: LUCENE-1137.patch Per feedback from Yonik, changes this to use an int. The cl

[jira] Commented: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559637#action_12559637 ] Steven Rowe commented on LUCENE-1137: - Looks like the constructors still take a BitSet

[jira] Updated: (LUCENE-1137) Token type as BitSet: typeBits()

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1137: Attachment: LUCENE-1137.patch Let's try a patch that actually compiles > Token type as Bi

[jira] Updated: (LUCENE-1133) WikipediaTokenizer needs a way of not tokenizing certain parts of the text

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1133: Attachment: LUCENE_1133_1137.patch Here's a patch that also includes LUCENE-1137 and sets

[jira] Commented: (LUCENE-390) Contribution: LuceneIndexAccessor

2008-01-16 Thread vivek (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559694#action_12559694 ] vivek commented on LUCENE-390: -- Is there any plan to incorporate this IndexAccessor package in

[jira] Issue Comment Edited: (LUCENE-390) Contribution: LuceneIndexAccessor

2008-01-16 Thread vivek (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559694#action_12559694 ] vivash edited comment on LUCENE-390 at 1/16/08 2:33 PM: --- Is there any

[jira] Assigned: (LUCENE-1136) add ability to not count sub-task doLogic increment to contri/benchmark

2008-01-16 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen reassigned LUCENE-1136: --- Assignee: Doron Cohen > add ability to not count sub-task doLogic increment to contri/benchm

[jira] Updated: (LUCENE-1136) add ability to not count sub-task doLogic increment to contri/benchmark

2008-01-16 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-1136: Attachment: lucene-1136.patch This patch should do it. Give it a try Mike? (patch also fixing a m

[jira] Commented: (LUCENE-1084) increase default maxFieldLength?

2008-01-16 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559730#action_12559730 ] Doug Cutting commented on LUCENE-1084: -- This kind of limit is common on web search en

[jira] Commented: (LUCENE-1084) increase default maxFieldLength?

2008-01-16 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559748#action_12559748 ] Steven Rowe commented on LUCENE-1084: - An alternative to changing the default setting

[jira] Commented: (LUCENE-1050) SimpleFSLockFactory ignores error on deleting the lock file

2008-01-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559780#action_12559780 ] Grant Ingersoll commented on LUCENE-1050: - I'm getting an exception here, when usi

[jira] Commented: (LUCENE-1050) SimpleFSLockFactory ignores error on deleting the lock file

2008-01-16 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559795#action_12559795 ] Hoss Man commented on LUCENE-1050: -- Grant: my take on this is that SpellChecker.clearInde

[jira] Created: (LUCENE-1138) SpellChecker.clearIndex calls unlock inappropriately

2008-01-16 Thread Hoss Man (JIRA)
SpellChecker.clearIndex calls unlock inappropriately Key: LUCENE-1138 URL: https://issues.apache.org/jira/browse/LUCENE-1138 Project: Lucene - Java Issue Type: Bug Components: co

[jira] Commented: (LUCENE-1050) SimpleFSLockFactory ignores error on deleting the lock file

2008-01-16 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559803#action_12559803 ] Yonik Seeley commented on LUCENE-1050: -- FYI, I just verified that Solr does this corr

Re: A bit of planning

2008-01-16 Thread Chris Hostetter
: If I remember right, the file format changed in 2.1, such that 2.0 could not : read a 2.1 index. that is totally within the bounds of the compatibility statement... http://wiki.apache.org/lucene-java/BackwardsCompatibility >>Note that older releases are never guaranteed to be able to read inde