[jira] Commented: (LUCENENET-54) ArgumentOurOfRangeException caused by SF.Snowball.Ext.DanishStemmer

2010-05-18 Thread Jason Fitzharris (JIRA)
[ https://issues.apache.org/jira/browse/LUCENENET-54?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868685#action_12868685 ] Jason Fitzharris commented on LUCENENET-54: --- I encountered the same issue when

RE: lucene performance questions

2010-05-18 Thread Digy
Whether you tokenize them or not, there shouldn't be any performance change (ignoring the parsing of a few words of the user's query). Is this some kind of XY problem (http://dictionary.babylon.com/xy%20problem/)? DIGY -----Original Message----- From: Ravi Patel [mailto:rpat...@live.com] Sent:

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Andi Vajda
On Tue, 18 May 2010, Christian Heimes wrote: Setup = I'm running the tests on an Ubuntu 10.04 X86_64 box with Python 2.6.5, Sun Java 1.6.0_20, GCC 4.4.3 and patched setuptools 0.6c11 (all 64bit). JCC: $ svn co http://svn.apache.org/repos/asf/lucene/pylucene/branches/branch_3x/jcc

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Christian Heimes
The Makefile has about a dozen sets of defaults for a variety of platforms. Pick the one closest to your setup, uncomment it (and change it if needed). Then you don't have to enter these on the command line. My point is that you may want to remove most examples and replace them with sensible

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Andi Vajda
On Tue, 18 May 2010, Christian Heimes wrote: Adding the --import'ed .so libraries to the link line on Linux solves several of the problems you reported: - import order doesn't matter anymore - initVM() order doesn't matter anymore - class_ properties are now correct This is now

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Andi Vajda
On Tue, 18 May 2010, Christian Heimes wrote: The Makefile has about a dozen sets of defaults for a variety of platforms. Pick the one closest to your setup, uncomment it (and change it if needed). Then you don't have to enter these on the command line. My point is that you may want to remove

ReadTask and its hierarchy needs some house cleaning

2010-05-18 Thread Shai Erera
Hi I wanted to run a benchmark .alg which will take a Filter into account. However, ReadTask, which is the base for a variety of search related tasks, does not support a Filter. When I reviewed the class, to understand how I can easily add such Filter support, I discovered a whole set of classes

Re: ReadTask and its hierarchy needs some house cleaning

2010-05-18 Thread Doron Cohen
How would this affect for example current micro-standard.alg? In particular this part of it: {code} ... { WarmNewRdr Warm : 50 { SrchNewRdr Search : 500 { SrchTrvNewRdr SearchTrav(1000) : 300 { SrchTrvRetNewRdr SearchTravRet(2000) : 100 ...

Re: ReadTask and its hierarchy needs some house cleaning

2010-05-18 Thread Shai Erera
Yes, such algorithms will be affected, but not necessarily deleted. So if a WarmReader task is required, one can write it, but it doesn't need to extend SearchTask, or it can, but hard-code all the other properties to false. Though in most cases you can run SearchTask, w/ warm set to true and

[jira] Commented: (LUCENE-2257) relax the per-segment max unique term limit

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868569#action_12868569 ] Michael McCandless commented on LUCENE-2257: Yes, the limit is number of

[jira] Assigned: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-2468: -- Assignee: Michael McCandless reopen on NRT reader should share readers w/

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868571#action_12868571 ] Earwin Burrfoot commented on LUCENE-2468: - Or, you do it so various caches are

Re: ReadTask and its hierarchy needs some house cleaning

2010-05-18 Thread Doron Cohen
Yes, such algorithms will be affected, but not necessarily deleted. So if a WarmReader task is required, one can write it, but it doesn't need to extend SearchTask, or it can, but hard-code all the other properties to false. Though in most cases you can run SearchTask, w/ warm set to true and

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868585#action_12868585 ] Michael McCandless commented on LUCENE-2468: Indeed, right now the newly

Re: ReadTask and its hierarchy needs some house cleaning

2010-05-18 Thread Michael McCandless
I agree we should do some house cleaning here... Can't we make warm, search, trav separate tasks? In fact what is now done by warm (just calling .document on all non-deleted docs) is not usually how warming is done. I would rather rename this to a LoadAllDocsTask. We could add other specific

Re: ReadTask and its hierarchy needs some house cleaning

2010-05-18 Thread Shai Erera
Doron, I like the idea of using params as properties, but I'm not sure I want to make it generic for all tasks. Some tasks, like AddDoc, receive a size as parameter. Moving from AddDoc(100) to AddDoc(size=100) seems redundant to me, although it's not the end of the world if we'll do that. In

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868604#action_12868604 ] Earwin Burrfoot commented on LUCENE-2468: - Reusing fieldCacheKey is probably a

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868617#action_12868617 ] Shay Banon commented on LUCENE-2468: Sounds like a good solution for me. I just

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868630#action_12868630 ] Yonik Seeley commented on LUCENE-2468: -- bq. I would lean towards letting the caches

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868647#action_12868647 ] Shay Banon commented on LUCENE-2468: bq. Shay, as far as CachingWrapperFilter and

Re: [jira] Commented: (LUCENE-2257) relax the per-segment max unique term limit

2010-05-18 Thread Koji Sekiguchi
but in trunk, the limit is across all fields Got it. Thanks, Mike! Koji -- http://www.rondhuit.com/en/ (10/05/18 18:21), Michael McCandless (JIRA) wrote: [

Re: (LUCENE-2257) relax the per-segment max unique term limit

2010-05-18 Thread Michael McCandless
Duh, sorry, that should have been but on stable (3x) the limit is across all fields. On trunk (= flex) the limit is per-field. Mike On Tue, May 18, 2010 at 10:12 AM, Koji Sekiguchi k...@r.email.ne.jp wrote:   but in trunk, the limit is across all fields Got it. Thanks, Mike! Koji --

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868659#action_12868659 ] Shay Banon commented on LUCENE-2468: I think that the solution suggested, to use the

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868660#action_12868660 ] Michael McCandless commented on LUCENE-2468: Renaming to cacheKey makes me a

[jira] Updated: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shay Banon updated LUCENE-2468: --- Attachment: CacheTest.java reopen on NRT reader should share readers w/ unchanged segments

Re: lucene performance questions

2010-05-18 Thread Robert Jordan
On 18.05.2010 15:33, Ravi Patel wrote: Two Questions: 1. Is there a cost at search-time in making fields Tokenized that don't need to be? I assume there's a cost at Index time, but I'm not too worried about the Index cost. Which analyzer are you using at index time? 2. Should fields

Re: ReadTask and its hierarchy needs some house cleaning

2010-05-18 Thread Doron Cohen
On Tue, May 18, 2010 at 1:39 PM, Shai Erera ser...@gmail.com wrote: Doron, I like the idea of using params as properties, but I'm not sure I want to make it generic for all tasks. Some tasks, like AddDoc, receive a size as parameter. Moving from AddDoc(100) to AddDoc(size=100) seems redundant

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868673#action_12868673 ] Michael McCandless commented on LUCENE-2468: bq. This is problematic, since a

[jira] Updated: (LUCENE-1812) Static index pruning by in-document term frequency (Carmel pruning)

2010-05-18 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated LUCENE-1812: -- Attachment: pruning.patch Updated patch relative to branch_3x. Static index pruning

[jira] Updated: (LUCENE-1812) Static index pruning by in-document term frequency (Carmel pruning)

2010-05-18 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated LUCENE-1812: -- Affects Version/s: 3.1 Lucene Fields: [New, Patch Available] (was: [New])

[jira] Created: (SOLR-1918) Bit-wise scoring field type

2010-05-18 Thread Andrzej Bialecki (JIRA)
Bit-wise scoring field type --- Key: SOLR-1918 URL: https://issues.apache.org/jira/browse/SOLR-1918 Project: Solr Issue Type: New Feature Components: Schema and Analysis Affects Versions: 3.1

[jira] Commented: (LUCENE-1812) Static index pruning by in-document term frequency (Carmel pruning)

2010-05-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868719#action_12868719 ] Robert Muir commented on LUCENE-1812: - Hi Andrzej, thanks for updating the patch. I

[jira] Commented: (LUCENE-1812) Static index pruning by in-document term frequency (Carmel pruning)

2010-05-18 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868730#action_12868730 ] Andrzej Bialecki commented on LUCENE-1812: --- I'm fine with reorganizing it - I

[jira] Updated: (LUCENE-2380) Add FieldCache.getTermBytes, to load term data as byte[]

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2380: --- Attachment: LUCENE-2380.patch Very rough first cut patch attached. I removed

[jira] Commented: (SOLR-1900) move Solr to flex APIs

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868748#action_12868748 ] Michael McCandless commented on SOLR-1900: -- bq. and adds support to FieldType for

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Andi Vajda
On Tue, 18 May 2010, Andi Vajda wrote: I don't have this problem with my simple program. My SillyAnalyzer class extends Lucene's Analyzer. foo.Analyzer is the one imported from lucene. import lucene, foo lucene.initVM() jcc.JCCEnv object at 0x1004100d8 foo.initVM(.) jcc.JCCEnv object

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Andi Vajda
On Tue, 18 May 2010, Andi Vajda wrote: On Tue, 18 May 2010, Andi Vajda wrote: I don't have this problem with my simple program. My SillyAnalyzer class extends Lucene's Analyzer. foo.Analyzer is the one imported from lucene. import lucene, foo lucene.initVM() jcc.JCCEnv object at

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Andi Vajda
On Tue, 18 May 2010, Christian Heimes wrote: Setup = I'm running the tests on an Ubuntu 10.04 X86_64 box with Python 2.6.5, Sun Java 1.6.0_20, GCC 4.4.3 and patched setuptools 0.6c11 (all 64bit). JCC: $ svn co http://svn.apache.org/repos/asf/lucene/pylucene/branches/branch_3x/jcc

[jira] Updated: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2468: --- Attachment: LUCENE-2468.patch Attached patch -- renames IR.getFieldCacheKey -

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Christian Heimes
Adding the --import'ed .so libraries to the link line on Linux solves several of the problems you reported: - import order doesn't matter anymore - initVM() order doesn't matter anymore - class_ properties are now correct This is now checked into the branch_3x pylucene's branch.

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868816#action_12868816 ] Shay Banon commented on LUCENE-2468: Thanks for the work, Michael! Is this issue going

FYI: Solr now Requires Java 6

2010-05-18 Thread Chris Hostetter
It's been mentioned in passing on a few issues (mainly Solr Cloud) as something that should probably happen, but recent commits (JSON Parsing via Noggit) have made it a reality, so I wanted to send out an explicit message pointing this out to folks if you haven't noticed already: Compiling

[jira] Resolved: (SOLR-1846) Remove support for (broken) abortOnConfigurationError

2010-05-18 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1846. Assignee: Hoss Man Fix Version/s: 4.0 Resolution: Fixed Committed revision 945886. Remove

[jira] Resolved: (SOLR-1824) partial field types created on error

2010-05-18 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-1824. Assignee: Hoss Man Fix Version/s: 4.0 Resolution: Fixed fixed via SOLR-1846, IndexSchema

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868855#action_12868855 ] Michael McCandless commented on LUCENE-2468: bq. Is this issue going to

RE: Solr now Requires Java 6

2010-05-18 Thread Uwe Schindler
I already fixed the Hudson build to use the latest 1.6 for the Solr nightly! We should now fix the main build.xml in the root folder to not overlap the -target settings, so Lucene is still built with 1.5, so we need separate properties in both build files, else Solr overrides the build with 1.6 when Lucene is built

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868869#action_12868869 ] Shay Banon commented on LUCENE-2468: Check two comments above :), we discussed it.

RE: Solr now Requires Java 6

2010-05-18 Thread Uwe Schindler
No comments... - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -----Original Message----- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Wednesday, May 19, 2010 12:17 AM To: dev@lucene.apache.org Subject: Re: Solr now Requires Java

[jira] Commented: (SOLR-788) MoreLikeThis should support distributed search

2010-05-18 Thread Shawn Heisey (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868874#action_12868874 ] Shawn Heisey commented on SOLR-788: --- I couldn't get the original patch to work on the 4.0

Re: Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-18 Thread Andi Vajda
On Tue, 18 May 2010, Christian Heimes wrote: Can you suggest a patch for this ? It is not immediately obvious to me how to do this. I've attached a quick patch. Sorry for all the noise, my editor is configured to reformat Python code. The patch provides a custom build_py command class for

[jira] Updated: (SOLR-1556) TermVectorComponents should provide good error messages when fieldtype isn't compatible with requested options

2010-05-18 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1556: --- Summary: TermVectorComponents should provide good error messages when fieldtype isn't compatible with

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868909#action_12868909 ] Michael McCandless commented on LUCENE-2468: bq. Basically, it does not work

[jira] Commented: (SOLR-1556) TermVectorComponents should provide good error messages when fieldtype isn't compatible with requested options

2010-05-18 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868910#action_12868910 ] Hoss Man commented on SOLR-1556: skimming the code a bit, this doesn't seem easy to be a

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868923#action_12868923 ] Shay Banon commented on LUCENE-2468: Ahh, now I see that, sorry I missed it. But,

[jira] Updated: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shay Banon updated LUCENE-2468: --- Attachment: DeletionAwareConstantScoreQuery.java Here is a go at making ConstantScoreQuery deletion

[jira] Commented: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-18 Thread Shay Banon (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868959#action_12868959 ] Shay Banon commented on LUCENE-2468: Another quick question Mike, what do you think

[jira] Commented: (SOLR-1918) Bit-wise scoring field type

2010-05-18 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868962#action_12868962 ] Lance Norskog commented on SOLR-1918: - I can't quite follow the patch. Does this do