Discard search results based on a doc field comparison

2009-05-13 Thread walki2
Hi, I have a website in 3 languages. The language management is done with Apache Struts MessageResources, so all the text is stored within .properties files. These .properties files are indexed, and for each doc there is a field called language which is set to either en, fr or nl. At the moment if

[jira] Assigned: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1634: -- Assignee: Michael McCandless LogMergePolicy should use the number of deleted

[jira] Commented: (LUCENE-1629) contrib intelligent Analyzer for Chinese

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708878#action_12708878 ] Michael McCandless commented on LUCENE-1629: (Shooting in the dark, here,

[jira] Commented: (LUCENE-1629) contrib intelligent Analyzer for Chinese

2009-05-13 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708887#action_12708887 ] Uwe Schindler commented on LUCENE-1629: --- I wonder, why this build fragment did not

[jira] Commented: (LUCENE-1629) contrib intelligent Analyzer for Chinese

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708889#action_12708889 ] Michael McCandless commented on LUCENE-1629: That fragment is under

[jira] Commented: (LUCENE-1629) contrib intelligent Analyzer for Chinese

2009-05-13 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708892#action_12708892 ] Uwe Schindler commented on LUCENE-1629: --- I will look into it this evening and

[jira] Commented: (LUCENE-1629) contrib intelligent Analyzer for Chinese

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708894#action_12708894 ] Michael McCandless commented on LUCENE-1629: OK, I agree, separation of

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708898#action_12708898 ] Michael McCandless commented on LUCENE-1634: Actually, optimize() always

[jira] Commented: (LUCENE-1629) contrib intelligent Analyzer for Chinese

2009-05-13 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708909#action_12708909 ] Uwe Schindler commented on LUCENE-1629: --- It's only needed to have the src/resources

Re: Discard search results based on a doc field comparison

2009-05-13 Thread Erick Erickson
I don't quite understand what your code snippet is about, but to your larger problem, why not just index the language along with the doc and add a language:whatever to your query? But using StandardAnalyzer probably isn't as useful as it could be, since accented characters wouldn't be found when

Re: Discard search results based on a doc field comparison

2009-05-13 Thread Erick Erickson
P.S. This would also be better posted on the user list, not the dev list. (Forgot which list I was on) Best Erick On Wed, May 13, 2009 at 8:53 AM, Erick Erickson erickerick...@gmail.com wrote: I don't quite understand what your code snippet is about, but to your larger problem, why not just

Re: Discard search results based on a doc field comparison

2009-05-13 Thread Matthew Hall
Well, perhaps this is a bit.. simplistic.. but couldn't you simply automatically add a clause to your queries based on the detected language of the session? Basically you could make three ready to go BooleanQueries where you have something like this (pseudocode) enClause =language: en
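The idea raised in this thread (require a language clause alongside whatever the user typed) can be sketched without touching the Lucene API at all, by composing a query string in Lucene's classic query parser syntax, where a leading + marks a clause as required. This is only an illustration under assumptions: the class and method names (LanguageClause, restrictToLanguage) are hypothetical, the field name language and the codes en, fr, nl come from the original post, and it assumes the language field was indexed as a single untokenized term so that language:en matches exactly.

```java
import java.util.Set;

// Hypothetical helper, not part of Lucene: it only builds a query string.
class LanguageClause {
    // Language codes mentioned in the original post.
    private static final Set<String> LANGUAGES = Set.of("en", "fr", "nl");

    // Wraps the user's query so that both it and the language term are required.
    // The caller is assumed to have escaped or validated userQuery already.
    static String restrictToLanguage(String userQuery, String language) {
        if (!LANGUAGES.contains(language)) {
            throw new IllegalArgumentException("unsupported language: " + language);
        }
        return "+(" + userQuery + ") +language:" + language;
    }

    public static void main(String[] args) {
        // The language would normally come from the user's session or locale.
        System.out.println(restrictToLanguage("title:struts OR body:struts", "fr"));
    }
}
```

The resulting string would then be handed to the query parser in place of the raw user query. Building the equivalent BooleanQuery programmatically, as suggested above, avoids re-parsing and sidesteps escaping issues in the user's input.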

[jira] Issue Comment Edited: (LUCENE-1629) contrib intelligent Analyzer for Chinese

2009-05-13 Thread Erik Hatcher (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708912#action_12708912 ] Erik Hatcher edited comment on LUCENE-1629 at 5/13/09 5:58 AM:

[jira] Commented: (LUCENE-1629) contrib intelligent Analyzer for Chinese

2009-05-13 Thread Erik Hatcher (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708912#action_12708912 ] Erik Hatcher commented on LUCENE-1629: -- My initial thought is to move the copy

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread John Wang (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708989#action_12708989 ] John Wang commented on LUCENE-1634: --- Comment on implementing a custom merge policy: As

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708991#action_12708991 ] Michael McCandless commented on LUCENE-1634: bq. This is actually referring to

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708993#action_12708993 ] Michael McCandless commented on LUCENE-1634: bq. As the API currently stands, I

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread John Wang (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708995#action_12708995 ] John Wang commented on LUCENE-1634: --- The current lucene implementation, optimize(int)

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708999#action_12708999 ] Michael McCandless commented on LUCENE-1634: bq. say the index has 10

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread John Wang (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709003#action_12709003 ] John Wang commented on LUCENE-1634: --- So let's proceed with this patch, once you've

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709004#action_12709004 ] Michael McCandless commented on LUCENE-1634: I mean a setter/getter to turn

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709015#action_12709015 ] Michael McCandless commented on LUCENE-1634: bq. So to implement your own

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread John Wang (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709024#action_12709024 ] John Wang commented on LUCENE-1634: --- I mean a setter/getter to turn on/off taking

Re: InstantiatedIndex Memory required

2009-05-13 Thread Karl Wettin
Hi Ravichandra, this is a question better suited to the java-users mailing list. On this list we talk about the development of the Lucene API rather than how to use it. To answer your question, there is no simple formula that says how much RAM an InstantiatedIndex will consume given the

[jira] Commented: (LUCENE-1608) CustomScoreQuery should support arbitrary Queries

2009-05-13 Thread Steven Bethard (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709068#action_12709068 ] Steven Bethard commented on LUCENE-1608: Sorry it took so long for me to try this

Re: Discard search results based on a doc field comparison

2009-05-13 Thread walki2
Hi, This seems a good idea to me, I'll try it tomorrow. -- View this message in context: http://www.nabble.com/Discard-search-results-based-on-a-doc-field-comparison-tp23518565p23527968.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

[jira] Commented: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709080#action_12709080 ] Michael McCandless commented on LUCENE-1634: bq. What do you suggest the

[jira] Updated: (LUCENE-1634) LogMergePolicy should use the number of deleted docs when deciding which segments to merge

2009-05-13 Thread Yasuhiro Matsuda (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuhiro Matsuda updated LUCENE-1634: - Attachment: LUCENE-1634.patch I added

[jira] Updated: (LUCENE-1607) String.intern() faster alternative

2009-05-13 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated LUCENE-1607: - Attachment: LUCENE-1607.patch Here's a slightly updated patch (javadoc and

[jira] Commented: (LUCENE-1596) optimize MultiTermEnum/MultiTermDocs

2009-05-13 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709244#action_12709244 ] Yonik Seeley commented on LUCENE-1596: -- Getting back to this... although this

[jira] Created: (LUCENE-1635) Handle Escape character

2009-05-13 Thread rimi (JIRA)
Handle Escape character --- Key: LUCENE-1635 URL: https://issues.apache.org/jira/browse/LUCENE-1635 Project: Lucene - Java Issue Type: Bug Components: QueryParser Affects Versions: 2.0.0