Re: Proposal about Version API "relaxation"

2010-04-14 Thread Marvin Humphrey
s in the library to users via version numbers more accurately... * There should not be a Lucene 3.9. * Lucene 4.0 should do more than remove deprecations. Marvin Humphrey [1] Thanks to Robert and Mark Miller for reminding me just what the Solr/Lucene-

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Marvin Humphrey
rm my recollection, but I seem to remember back compat for Analyzers coming up every once in a while -- say, in the context of modifying StandardAnalyzer's stoplist -- and changes not being made because they would change search results. Marvin Humphrey --

Re: Proposal about Version API "relaxation"

2010-04-13 Thread Marvin Humphrey
ng is crying out for a rethink in the Lucene back compat policy, IMO that's it: make major version breaks act like major version breaks and change stuff that needs changin'. Marvin Humphrey - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org

Re: Proposal about Version API "relaxation"

2010-04-13 Thread Marvin Humphrey
ncrement individual Analyzer version numbers instead. This wouldn't solve the problem for good defaults elsewhere in the library. For that, I see no remedy other than more frequent major version increments. Marvin Humphrey

Re: Changing the subject for a JIRA-issue (Was: [jira] Created: (LUCENE-2335) optimization: when sorting by field, if index has one segment and field values are not needed, do not load String[] into f

2010-04-06 Thread Marvin Humphrey
posts become harder to discover via search. In this case, that would mean either closing this issue and opening a new one, or taking the discussion to the mailing list where subject headers may be modified as the conversation evol

[jira] Commented: (LUCENE-2345) Make it possible to subclass SegmentReader

2010-03-26 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850356#action_12850356 ] Marvin Humphrey commented on LUCENE-2345: - > Is there a ticket or wiki pa

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-26 Thread Marvin Humphrey
ty (aggressive index-time data reduction) by putting Lucy's Similarity under Lucy::Index instead of Lucy::Search. Marvin Humphrey

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-25 Thread Marvin Humphrey
It's a nice > > feature, > > but I don't think we've worked out all the problems yet. If we can, I might > > switch to +1 (FWIW). > > What problems remain, for Lucene? Storage, formatting, and compression of boosts. I'm also concerned about making significant changes to the file format when you've indicated they're "for starters". IMO, file format changes ought to clear a higher bar than that. But I expect to to dissent on that point. Marvin Humphrey

Re: #lucene IRC log [was: RE: lucene and solr trunk]

2010-03-23 Thread Marvin Humphrey
eparing a summary. Marvin Humphrey

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-22 Thread Marvin Humphrey
nderstand low level stuff" when > setting their field types during indexing. > > You don't think flexible scoring is that important ("just reindex") > and that's it's not great to have users understand low level stats for > indexing. I'

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-15 Thread Marvin Humphrey
, and you don't need scores. You've gotta be able to fake it at least. > (Ie I can't change the field to MatchOnlySim, but, I have a some workaround > that lets me achieve the same functionality...?). It's not a workaround. Things just work that way. Without get

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-14 Thread Marvin Humphrey
MatchOnlySim knows how to pull a > docID only postings iterator from that field. You seem to be fixated on the notion of swapping in a MatchOnlySim object at search time. You can't do that in KS/Lucy, because you can'

[jira] Commented: (LUCENE-2316) Define clear semantics for Directory.fileLength

2010-03-13 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844906#action_12844906 ] Marvin Humphrey commented on LUCENE-2316: - Is it really necessary to obtain

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-12 Thread Marvin Humphrey
> "congruent" with what's stored in the index ie that downgrading to > MatchOnlySim is allowed, but swapping to a different scoring model is > not (because norms are committed at indexing time). I'm not sure that e.g. TermScorer would even know what S

[jira] Commented: (LUCENE-2308) Separately specify a field's type

2010-03-12 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844688#action_12844688 ] Marvin Humphrey commented on LUCENE-2308: - > Also creating a FieldType wi

Re: [jira] Commented: (LUCENE-2308) Separately specify a field's type

2010-03-12 Thread Marvin Humphrey
out in the "baby steps" thread, omitTFAP() is often misunderstood. Marvin Humphrey

[jira] Commented: (LUCENE-2308) Separately specify a field's type

2010-03-12 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844659#action_12844659 ] Marvin Humphrey commented on LUCENE-2308: - I'm simply suggesting

[jira] Commented: (LUCENE-2308) Separately specify a field's type

2010-03-12 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844637#action_12844637 ] Marvin Humphrey commented on LUCENE-2308: - If you disable term freq, you

[jira] Commented: (LUCENE-2308) Separately specify a field's type

2010-03-12 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844626#action_12844626 ] Marvin Humphrey commented on LUCENE-2308: - I think we might consider match

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-11 Thread Marvin Humphrey
ing in advance so that core can find it. If that's this object's main role, I'd suggest "CodecRegistry". > Naming is the hardest part!! For me, the hardest parts of API design are... A) Designing public abstract classes / interfaces. B) Compensating for the curse of

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-09 Thread Marvin Humphrey
want to use the stronger, more constrictive check, right? > > You mean single inheritance? No. Because then we hardwire the attrs > to the Codec. Standard codec should encode whatever attrs the app > hands us... I think. I might approach things the same way if Clownfish supported interface method dispatch. :) As it is, though, I'm not sure that the single inheritance requirement is an important liability. Marvin Humphrey

Re: Multi-node stats within individual nodes (was "Baby steps...")

2010-03-09 Thread Marvin Humphrey
ing with vbyte, PFOR or whatever... then divide at search time to get average term frequency. That way, you also avoid committing to a float encoding, which I don't think Lucene has standardized yet. Marvin Humphrey
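
A minimal Java sketch of the idea in the snippet above: persist only integer totals, which any integer codec can compress, and do the division at search time. The class and field names are hypothetical, and the choice of numerator and denominator is an assumption, since the message is truncated.

```java
final class AvgTermFreqSketch {
    /** Integer totals as they might be persisted at index time (hypothetical layout). */
    static final class FieldTotals {
        final long totalTermFreq; // sum of per-document term frequencies
        final long docCount;      // number of documents contributing to that sum

        FieldTotals(long totalTermFreq, long docCount) {
            this.totalTermFreq = totalTermFreq;
            this.docCount = docCount;
        }
    }

    /** Computes the average at search time, so no float is ever written to the index. */
    static double averageTermFreq(FieldTotals totals) {
        if (totals.docCount == 0) {
            return 0.0; // avoid dividing by zero for an empty field
        }
        return (double) totals.totalTermFreq / totals.docCount;
    }

    public static void main(String[] args) {
        System.out.println(averageTermFreq(new FieldTotals(30, 12)));
    }
}
```

Because only the two integers are stored, the float representation stays an implementation detail of the searcher and can change without touching the file format.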

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-09 Thread Marvin Humphrey
he positions value as a member variable (direct struct access) rather than invoking a method. By default, struct definitions are opaque and thus member vars are inaccessible (to encourage loose coupling), but we override that in certain cases for performance. However, direct struct access requires

Re: Multi-node stats within individual nodes (was "Baby steps...")

2010-03-08 Thread Marvin Humphrey
d in later calculations. BTW, I think we should refer to these bytes as "boost bytes" rather than "norms". Their purpose is not simply to convey length normalization; they also include document boost and field boost. And the length normalization multiplier is a kind of

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-08 Thread Marvin Humphrey
to use new compression techniques to write new segments. > > Just a thought: why not make positions an attribute on a DocsEnum? > > Maybe... though I think the double method call (enum.next() then > posAttr.get()) is too much added cost. Why wouldn't it work to hav

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-07 Thread Marvin Humphrey
via a method, that's up in the air. Hmm. From a class-design perspective, it would probably be best to go with an attribute, since Lucy has only single-inheritance and no interfaces. A rigid class hierarchy is going to cause problems when you need an iterator that combines unrelated conc

Multi-node stats within individual nodes (was "Baby steps...")

2010-03-07 Thread Marvin Humphrey
"5" as a shortcut, or the consolidated segment's averages will be wrong from the get-go. That's what I was getting at earlier. However, I'd thought that we could get around the problem by fudging with maxDoc(), and I no longer believe that. I think full regeneration is the only way. Marvin Humphrey

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-05 Thread Marvin Humphrey
s > > situation than a subclass situation. > > match-only decoder is handled on flex now by asking for the DocsEnum > and then while iterating only using the .doc() (even if underlyingly > the codec spent effort decoding freq and maybe other things). > > If you want positions

Composing posts for both JIRA and email (was a JIRA post)

2010-03-04 Thread Marvin Humphrey
1; int bar = 2; {code} (Gag. I vastly prefer wikis that automatically apply fixed-width styling to any indented text.) One last tip for Lucy developers (and other non-Java devs). JIRA has limited syntax highlighting support -- Java, JavaScript, ActionScript, XML and SQL only -- and default

Re: Baby steps towards making Lucene's scoring more flexible...

2010-03-02 Thread Marvin Humphrey
se stats. Not sufficient, but it's probably a prerequisite. Since it's a common feature request anyway, I think it's a great place to start: http://lucene.markmail.org/message/ln2xkesici6aksbi http://lucene.markmail.org/thread/46vxibpubogtcy3g http://lucene.markmail.org/message/56bk6wrbwallyjvr https://issues.apache.org/jira/browse/LUCENE-2236 Marvin Humphrey

Re: Baby steps towards making Lucene's scoring more flexible...

2010-02-28 Thread Marvin Humphrey
hink adding half-baked features that require index-format changes is bad policy. If you're looking for small steps, my suggestion would be to focus on per-field Similarity support. Marvin Humphrey

[jira] Commented: (LUCENE-2282) Expose IndexFileNames as public, and make use of its methods in the code

2010-02-24 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837988#action_12837988 ] Marvin Humphrey commented on LUCENE-2282: - > As the API is now

[jira] Commented: (LUCENE-2282) Expose IndexFileNames as public, and make use of its methods in the code

2010-02-23 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837499#action_12837499 ] Marvin Humphrey commented on LUCENE-2282: - > Any application that extends

[jira] Commented: (LUCENE-2282) Expose IndexFileNames as public, and make use of its methods in the code

2010-02-23 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837446#action_12837446 ] Marvin Humphrey commented on LUCENE-2282: - It seems to me that identifying

[jira] Commented: (LUCENE-2271) Function queries producing scores of -inf or NaN (e.g. 1/x) return incorrect results with TopScoreDocCollector

2010-02-20 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836197#action_12836197 ] Marvin Humphrey commented on LUCENE-2271: - An awful lot of thought went

[jira] Commented: (LUCENE-1941) MinPayloadFunction returns 0 when only one payload is present

2010-02-12 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832989#action_12832989 ] Marvin Humphrey commented on LUCENE-1941: - > off on "vacation"

Re: Having a default constructor in Analyzers

2010-02-07 Thread Marvin Humphrey
the deployment of Version plays out with the user base. However, Lucy's approach won't work for Lucene because Lucene allows you to have fields with the same name and completely different semantics. Marvin Humphrey

[jira] Commented: (LUCENE-2213) Small improvements to ArrayUtil.getNextSize

2010-01-17 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801433#action_12801433 ] Marvin Humphrey commented on LUCENE-2213: - > if it starts getting used ver

[jira] Commented: (LUCENE-2213) Small improvements to ArrayUtil.getNextSize

2010-01-17 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801432#action_12801432 ] Marvin Humphrey commented on LUCENE-2213: - Seems like the one permutatio

[jira] Commented: (LUCENE-2213) Small improvements to ArrayUtil.getNextSize

2010-01-15 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800828#action_12800828 ] Marvin Humphrey commented on LUCENE-2213: - Algorithm looks good. The additio

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Marvin Humphrey
nto getNextSize() would be less of a problem if Lucene's implementation was less convoluted. It's only one line and one comment, but it's deceptively difficult to grok. Looks like some Perl golfer wrote it. ;) Marvin Humphrey

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Marvin Humphrey
sible to ask for a specific array size and get that exact array size. (I think this is a bigger problem in Lucy than in Lucene, because we have to simulate bounded arrays with classes). Marvin Humphrey

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Marvin Humphrey
t is that important if in most cases where it will grow incrementally, you've already overallocated manually? Marvin Humphrey

Re: Dynamic array reallocation algorithms

2010-01-12 Thread Marvin Humphrey
If you compile Perl with -DUSE_MY_MALLOC it uses its own allocator, otherwise it uses the system's malloc. KinoSearch actually has a dedicated allocator it uses for a very targeted purpose, and this allocator has its own strategy for avoiding fragmentation. The golden mean issue is rele

Dynamic array reallocation algorithms

2010-01-12 Thread Marvin Humphrey
ently makes more sense to me than the current behavior. IMO, just overallocating by some multiplier between 1.125 and 1.5 achieves our primary goal of avoiding pathological reallocation behavior, and that's enough. How about simplifying ArrayUtil.getNextSize() down to this? r
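
For illustration only, a Java sketch of plain multiplicative overallocation at the low end of the range mentioned above (one eighth extra, i.e. x1.125). This is not the code proposed in the message, which is cut off, and it is not Lucene's actual ArrayUtil.getNextSize(); the name nextSize is hypothetical.

```java
final class GrowthSketch {
    private GrowthSketch() {}

    /**
     * Returns a capacity of at least minSize, padded by one eighth (x1.125).
     * The arithmetic is done in long and clamped so it cannot overflow int.
     */
    static int nextSize(int minSize) {
        if (minSize < 0) {
            throw new IllegalArgumentException("minSize must be >= 0: " + minSize);
        }
        long grown = (long) minSize + (minSize >> 3); // minSize + minSize / 8
        return (int) Math.min(grown, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(nextSize(8));    // 8 + 1
        System.out.println(nextSize(1000)); // 1000 + 125
    }
}
```

With a constant multiplier, repeated appends trigger a number of reallocations that grows only logarithmically with the final size, which is the pathological case the snippet is concerned with avoiding.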

Re: Compound File Default

2010-01-12 Thread Marvin Humphrey
e like hardware stats. Go ahead and change the default, but I've got a feeling you're about to relearn old lessons. Marvin Humphrey

Re: Compound File Default

2010-01-12 Thread Marvin Humphrey
re all Mac Book Pros. The exception was our DBA, who had high numbers (thousands) on both his MBP and his desktop. Marvin Humphrey

Re: Compound File Default

2010-01-11 Thread Marvin Humphrey
-n 256 > If so, why not have them turn it on, instead of everyone else having to turn > it off. Can you up the file descriptor limit from within a running JVM? If not, you're setting yourself up with a non-portable def

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-23 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794137#action_12794137 ] Marvin Humphrey commented on LUCENE-2026: - > we can't give hints

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-22 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793918#action_12793918 ] Marvin Humphrey commented on LUCENE-2026: - > Very interesting - thanks

[jira] Issue Comment Edited: (LUCENE-2026) Refactoring of IndexWriter

2009-12-22 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793431#action_12793431 ] Marvin Humphrey edited comment on LUCENE-2026 at 12/23/09 3:5

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-21 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793431#action_12793431 ] Marvin Humphrey commented on LUCENE-2026: - > I guess my confusion is what

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-19 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792939#action_12792939 ] Marvin Humphrey commented on LUCENE-2026: - > But, that's where Luc

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-18 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792638#action_12792638 ] Marvin Humphrey commented on LUCENE-2026: - Yes, this is using the sort c

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-18 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792625#action_12792625 ] Marvin Humphrey commented on LUCENE-2026: - > Well, autoCommit jus

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-16 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791549#action_12791549 ] Marvin Humphrey commented on LUCENE-2026: - >> Wasn't that a po

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-13 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789905#action_12789905 ] Marvin Humphrey commented on LUCENE-2026: - > I think that's a

[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2009-12-13 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789895#action_12789895 ] Marvin Humphrey commented on LUCENE-2126: - > I disagree with y

[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter

2009-12-11 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789614#action_12789614 ] Marvin Humphrey commented on LUCENE-2026: - > I say it's better to sac

[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2009-12-09 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788098#action_12788098 ] Marvin Humphrey commented on LUCENE-2126: - > These methods should only be

[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2009-12-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787876#action_12787876 ] Marvin Humphrey commented on LUCENE-2126: - I spent a long time today tryin

[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2009-12-07 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786959#action_12786959 ] Marvin Humphrey commented on LUCENE-2126: - FWIW, this approach is sort of

Re: [jira] Resolved: (LUCENE-2119) If you pass Integer.MAX_VALUE as 2nd param to search(Query, int) you hit unexpected NegativeArraySizeException

2009-12-06 Thread Marvin Humphrey
ded to provide a more accurate error message in the event somebody specifies that they want Integer.MAX_VALUE elements, not realizing that they will be allocated up front rather than lazily -- they'll get an OOME rather than a NegativeArraySizeE
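
The failure mode can be reproduced with a short Java sketch. It assumes a heap that allocates maxSize + 1 slots up front (a hypothetical sizing rule chosen for illustration). With Integer.MAX_VALUE the addition overflows to a negative number, so the array allocation fails with NegativeArraySizeException before any memory is requested.

```java
final class HeapSizeSketch {
    private HeapSizeSketch() {}

    /** Hypothetical sizing rule: one extra slot beyond the requested maximum. */
    static int heapSlots(int maxSize) {
        return maxSize + 1; // wraps to Integer.MIN_VALUE when maxSize is Integer.MAX_VALUE
    }

    /** Allocates the heap eagerly and reports what happened. */
    static String allocate(int maxSize) {
        try {
            Object[] heap = new Object[heapSlots(maxSize)];
            return "allocated " + heap.length;
        } catch (NegativeArraySizeException e) {
            return "NegativeArraySizeException";
        }
    }

    public static void main(String[] args) {
        System.out.println(allocate(10));
        System.out.println(allocate(Integer.MAX_VALUE));
    }
}
```

A caller who passes Integer.MAX_VALUE to mean "no limit" therefore sees an exception about array sizes rather than one about memory, which says nothing about the eager allocation behind it.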

[jira] Commented: (LUCENE-1877) Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)

2009-11-23 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781531#action_12781531 ] Marvin Humphrey commented on LUCENE-1877: - >> take it somewhere other

Socket and file locks

2009-11-23 Thread Marvin Humphrey
ly possible for the link() call to return false incorrectly when the hard link has actually been created, for instance because a network problem prevents the "success" packet from getting back to the client from the server. However, this is failsafe,

[jira] Commented: (LUCENE-1877) Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)

2009-11-20 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780647#action_12780647 ] Marvin Humphrey commented on LUCENE-1877: - > http://www.h2database.c

[jira] Commented: (LUCENE-2073) Document issues involved in building your index with one jdk version and then searching/updating with another

2009-11-17 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779015#action_12779015 ] Marvin Humphrey commented on LUCENE-2073: - > I am pretty sure StandardAnal

[jira] Commented: (LUCENE-2073) Document issues involved in building your index with one jdk version and then searching/updating with another

2009-11-17 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779006#action_12779006 ] Marvin Humphrey commented on LUCENE-2073: - > are you sure? StandardAnalyz

[jira] Commented: (LUCENE-2073) Document issues involved in building your index with one jdk version and then searching/updating with another

2009-11-17 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778998#action_12778998 ] Marvin Humphrey commented on LUCENE-2073: - I like this: > some parts of

[jira] Commented: (LUCENE-2073) Document issues involved in building your index with one jdk version and then searching/updating with another

2009-11-17 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778875#action_12778875 ] Marvin Humphrey commented on LUCENE-2073: - Which components are affected by

[jira] Commented: (LUCENE-1997) Explore performance of multi-PQ vs single-PQ sorting API

2009-10-28 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771201#action_12771201 ] Marvin Humphrey commented on LUCENE-1997: - > What kind of comparator ca

Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Marvin Humphrey
becomes easier to swap in the name of a class where most of the data can reside in nice, reliable, structured code. Marvin Humphrey

Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Marvin Humphrey
marish notification system to propagate settings down into your subcomponents, which may or may not be prepared to handle the value modifications. Marvin Humphrey

Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Marvin Humphrey
Directory dir, Analyzer analyzer) { return arch.buildIndexWriter(manager, dir, analyzer); } IMO, it's important not to force first-time users to grok builder classes in order to perform basic indexing or searching. Marvin Humphrey --

Re: custom segment files

2009-09-17 Thread Marvin Humphrey
inosearch/docs/devel/KSx/Index/ByteBufDocWriter.html http://www.rectangular.com/svn/kinosearch/trunk/perl/lib/KSx/Index/ByteBufDocWriter.pm > Hopefully I am describing it clearly. Sure, I understand exactly what you mean.

[jira] Commented: (LUCENE-1908) Similarity javadocs for scoring function to relate more tightly to scoring models in effect

2009-09-14 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755062#action_12755062 ] Marvin Humphrey commented on LUCENE-1908: - The rationale behind the coarsenes

[jira] Commented: (LUCENE-1900) Confusing Javadoc in Searchable.java

2009-09-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752675#action_12752675 ] Marvin Humphrey commented on LUCENE-1900: - IMO, maxDoc(), docFreq(), and docF

[jira] Commented: (LUCENE-1900) Confusing Javadoc in Searchable.java

2009-09-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752612#action_12752612 ] Marvin Humphrey commented on LUCENE-1900: - maxDoc() isn't just

[jira] Commented: (LUCENE-1896) Modify confusing javadoc for queryNorm

2009-09-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752591#action_12752591 ] Marvin Humphrey commented on LUCENE-1896: - > at what I am trusting is esse

[jira] Commented: (LUCENE-1896) Modify confusing javadoc for queryNorm

2009-09-07 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752311#action_12752311 ] Marvin Humphrey commented on LUCENE-1896: - FWIW, after all that [fuss|

[jira] Commented: (LUCENE-1877) Improve IndexWriter javadoc on locking

2009-08-30 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749363#action_12749363 ] Marvin Humphrey commented on LUCENE-1877: - > I can see how this is not ide

[jira] Commented: (LUCENE-1877) Improve IndexWriter javadoc on locking

2009-08-30 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749330#action_12749330 ] Marvin Humphrey commented on LUCENE-1877: - > Anyone remem

[jira] Commented: (LUCENE-1859) TermAttributeImpl's buffer will never "shrink" if it grows too big

2009-08-26 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748109#action_12748109 ] Marvin Humphrey commented on LUCENE-1859: - > I don't believe there

[jira] Commented: (LUCENE-1859) TermAttributeImpl's buffer will never "shrink" if it grows too big

2009-08-26 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748102#action_12748102 ] Marvin Humphrey commented on LUCENE-1859: - > i fail to see the comple

[jira] Commented: (LUCENE-1859) TermAttributeImpl's buffer will never "shrink" if it grows too big

2009-08-26 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748089#action_12748089 ] Marvin Humphrey commented on LUCENE-1859: - IMO, the benefit of adding t

[jira] Commented: (LUCENE-1859) TermAttributeImpl's buffer will never "shrink" if it grows too big

2009-08-26 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748064#action_12748064 ] Marvin Humphrey commented on LUCENE-1859: - The worst-case scenario seems kin

Re: Finishing Lucene 2.9

2009-08-24 Thread Marvin Humphrey
rying to shoehorn what ought to be disruptive changes into an artificially continuous release cycle. It's a lot of work, results in a lot of inelegant compatibility APIs, and seems not to have been successfully implemented yet for 2.9. Ma

Re: Finishing Lucene 2.9

2009-08-24 Thread Marvin Humphrey
s like breaking the promise would be disruptive now. But you have an opportunity to change the policy at 3.0, affecting 3.9 and 4.0. That's a 3.0 issue, though -- not a 2.9 issue. Marvin Humphrey

Re: Finishing Lucene 2.9

2009-08-24 Thread Marvin Humphrey
edom to make the kind of API changes people normally expect to accompany a major version change, everything would be a lot simpler. Marvin Humphrey

[jira] Commented: (LUCENE-1684) Add matchVersion to StandardAnalyzer

2009-06-11 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718485#action_12718485 ] Marvin Humphrey commented on LUCENE-1684: - +1 This approach addresses all o

Re: Lucene's default settings & back compatibility

2009-05-22 Thread Marvin Humphrey
ns of the class data off of the version number constructor argument, do that. If not and an index was built with a version of the Analyzer that is no longer supported, either throw an exception or intentionally ignore the mismatch and serve screwed up search results. Your call. Marvin Humphr

Re: Lucene's default settings & back compatibility

2009-05-22 Thread Marvin Humphrey
eparate machine to minimize GC pauses, or tag docs by running a > heap of queries against MemoryIndex. No problem. Distribute a Schema subclass among several machines. These are all solved problems under the per-index field semantics serialized Schema model. That's why I sai

Re: Lucene's default settings & back compatibility

2009-05-22 Thread Marvin Humphrey
ltiple apps with Lucene dependencies to coexist. > Will Lucy do scoring when sorting by field, by default? Nope. Why would we do that? The only reason you're doing it in Lucene is to preserve back compat, and Lucy doesn't have that constraint. Marvin Humphrey

Re: Lucene's default settings & back compatibility

2009-05-22 Thread Marvin Humphrey
my preference for keeping defaults intact: // haha eat it luser StandardAnalyzer analyzer = new StandardAnalyzer(); It's either make the arg mandatory when changing default behavior and recommend that new users pass a fixed argument, or make it optional but keep defaults i

Re: Lucene's default settings & back compatibility

2009-05-22 Thread Marvin Humphrey
or analyzers that have changed across > releases. So things like WhitespaceAnalyzer would likely never need > an actsAsVersion arg. Hmm, this is kind of hard. I'd prefer that the argument remain optional, so that new users don't have to think about it. But unlike in KS/Lu

Re: Lucene's default settings & back compatibility

2009-05-22 Thread Marvin Humphrey
ve to think about such things during the initial learning phase. This approach is reasonably close to how Architecture and IndexManager are used to hide away settings for the KS/Lucy Indexer class. Marvin Humphrey

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Marvin Humphrey
nd feeding that to your analyzers would be a decent start. Lastly, I think a major java Lucene release is justified already. Won't this discussion die down somewhat if you can get 3.0 out? If there are issues that are half done, how about rolling back w

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Marvin Humphrey
, inexplicable flakiness. Is that what you want for Lucene? Marvin Humphrey

Re: Lucene's default settings & back compatibility

2009-05-20 Thread Marvin Humphrey
ed differently depending on who "won"). Marvin Humphrey

Re: Lucene's default settings & back compatibility

2009-05-20 Thread Marvin Humphrey
ens when two libraries loaded in the same VM have Lucene as a dependency and set actsAsVersion to conflicting numbers? Marvin Humphrey
