[jira] [Resolved] (PYLUCENE-24) Shared JCC object on Linux requires setuptools patch
[ https://issues.apache.org/jira/browse/PYLUCENE-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andi Vajda resolved PYLUCENE-24. Resolution: Fixed Thank you, Caleb, I applied your patch. I simplified the setup.py logic a bit as require('setuptools') seems to work also on distribute. Changes are checked into trunk rev 1396894. Shared JCC object on Linux requires setuptools patch Key: PYLUCENE-24 URL: https://issues.apache.org/jira/browse/PYLUCENE-24 Project: PyLucene Issue Type: Bug Environment: Linux Reporter: Caleb Burns Labels: build, jcc, linux, pylucene, python Attachments: jcc-linux.patch Original Estimate: 0h Remaining Estimate: 0h The current method to build JCC as a shared object on Linux requires patching the setuptools package. Here's a patch to JCC that monkey-patches the setuptools Library and Extension classes to avoid the manual patch. It works with setuptools-0.6c7-11 and distribute-0.6.1+ without the need to manually patch setuptools. These are the same versions that the current manual patches work with: patch.43.0.6c7 works with setuptools-0.6c7-10 and distribute-0.6.1+ while patch.43.0.6.c11 works with setuptools-0.6c11. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
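The monkey-patching approach in the attached patch can be sketched in plain Python. This is an illustrative stand-in, not the actual jcc-linux.patch: the `thirdparty` module and the class names below are invented for the sketch, while the real patch targets setuptools' Library and Extension classes.

```python
import types

# Stand-in for a third-party module we cannot modify on disk
# (in the real patch this is setuptools, with its Library and
# Extension classes).
thirdparty = types.ModuleType("thirdparty")

class Extension:
    def __init__(self, name, sources):
        self.name = name
        self.sources = sources

thirdparty.Extension = Extension

# The "monkey patch": a subclass that adds the behavior the build
# needs, installed over the original class before any other code
# looks it up through the module.
class SharedExtension(thirdparty.Extension):
    def __init__(self, name, sources, shared=True):
        super().__init__(name, sources)
        self.shared = shared  # flag a shared-object build

thirdparty.Extension = SharedExtension

# Code that later asks the module for Extension transparently gets
# the patched class, with no edits to the installed package.
ext = thirdparty.Extension("jcc", ["jcc.cpp"])
print(ext.shared)
```

The advantage over a manual patch is exactly what the issue describes: users install a stock setuptools/distribute, and JCC's setup.py swaps in the extended classes at import time.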
[jira] [Commented] (SOLR-3377) eDismax: A fielded query wrapped by parens is not recognized
[ https://issues.apache.org/jira/browse/SOLR-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473024#comment-13473024 ] Shawn Heisey commented on SOLR-3377: I confirmed (using the solr example) that 4.0-BETA and lucene_solr_4_0 can parse a query similar to my test query perfectly. Query URL created by filling out the admin query interface: http://server:8983/solr/collection1/select?q=((cat%3Astring1)+OR+(cat%3Astring2)+OR+(cat%3Astring3)+(Kitchen+Sink))&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text%5E2 Just for completeness, I also tried it in the lucene_solr_3_6 example. It behaves just like 3.5.0 does, including working when the spaces are added. Solr Implementation Version: 3.6.2-SNAPSHOT 1396476 - root - 2012-10-09 23:48:08 I will file a separate bug on 3.6.1 related to this one. eDismax: A fielded query wrapped by parens is not recognized Key: SOLR-3377 URL: https://issues.apache.org/jira/browse/SOLR-3377 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 3.6 Reporter: Jan Høydahl Assignee: Yonik Seeley Priority: Critical Fix For: 4.0-BETA Attachments: SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch As reported by bernd on the user list, a query like this {{q=(name:test)}} will yield 0 hits in 3.6 while it worked in 3.5. It works without the parens. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: what's happening with jenkins?
No, this looks like something else. The at: unknown suggests there are no more tests or suites on the stack for that JVM. The suite was terminated properly. The runner does a System.exit (and is permitted to do so) so any threads would be terminated. Looks like a bug in my code somewhere. I added saving of the *.events artifact and am downloading the workspace right now to inspect what's going on. Dawid On Tue, Oct 9, 2012 at 11:10 PM, Robert Muir rcm...@gmail.com wrote: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-java7/534/console has been running for 24 hours... it looks like a test got hung and was timed out after two hours, but maybe spawned some zombies and the test runner allows their hearts to beat forever??? now we see: [junit4:junit4] HEARTBEAT J0: 2012-10-09T21:09:39, stalled for 74767s at: unknown
[jira] [Comment Edited] (SOLR-3377) eDismax: A fielded query wrapped by parens is not recognized
[ https://issues.apache.org/jira/browse/SOLR-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473024#comment-13473024 ] Shawn Heisey edited comment on SOLR-3377 at 10/10/12 6:51 AM: -- I confirmed (using the solr example) that 4.0-BETA and lucene_solr_4_0 can parse a query similar to my test query perfectly. Query URL created by filling out the admin query interface: http://server:8983/solr/collection1/select?q=((cat%3Astring1)+OR+(cat%3Astring2)+OR+(cat%3Astring3)+(Kitchen+Sink))&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text%5E2 was (Author: elyograg): I confirmed (using the solr example) that 4.0-BETA and lucene_solr_4_0 can parse a query similar to my test query perfectly. Query URL created by filling out the admin query interface: http://server:8983/solr/collection1/select?q=((cat%3Astring1)+OR+(cat%3Astring2)+OR+(cat%3Astring3)+(Kitchen+Sink))&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text%5E2 Just for completeness, I also tried it in the lucene_solr_3_6 example. It behaves just like 3.5.0 does, including working when the spaces are added. Solr Implementation Version: 3.6.2-SNAPSHOT 1396476 - root - 2012-10-09 23:48:08 I will file a separate bug on 3.6.1 related to this one. eDismax: A fielded query wrapped by parens is not recognized Key: SOLR-3377 URL: https://issues.apache.org/jira/browse/SOLR-3377 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 3.6 Reporter: Jan Høydahl Assignee: Yonik Seeley Priority: Critical Fix For: 4.0-BETA Attachments: SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch As reported by bernd on the user list, a query like this {{q=(name:test)}} will yield 0 hits in 3.6 while it worked in 3.5. It works without the parens.
Re: what's happening with jenkins?
From what I can tell, both forked JVMs completed their runs (that is, emitted a QUIT event back to the runner). Aborted by uschindler Uwe, next time don't be so fast (or take a stack trace so that I can see what's going on :). Evidently something stalled but I've no idea where. Dawid On Wed, Oct 10, 2012 at 8:37 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: No, this looks like something else. The at: unknown suggests there are no more tests or suites on the stack for that JVM. The suite was terminated properly. The runner does a System.exit (and is permitted to do so) so any threads would be terminated. Looks like a bug in my code somewhere. I added saving of the *.events artifact and am downloading the workspace right now to inspect what's going on. Dawid On Tue, Oct 9, 2012 at 11:10 PM, Robert Muir rcm...@gmail.com wrote: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-java7/534/console has been running for 24 hours... it looks like a test got hung and was timed out after two hours, but maybe spawned some zombies and the test runner allows their hearts to beat forever??? now we see: [junit4:junit4] HEARTBEAT J0: 2012-10-09T21:09:39, stalled for 74767s at: unknown
[jira] [Created] (SOLR-3923) eDismax: complex fielded query with parens is not recognized
Shawn Heisey created SOLR-3923: -- Summary: eDismax: complex fielded query with parens is not recognized Key: SOLR-3923 URL: https://issues.apache.org/jira/browse/SOLR-3923 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 3.5 Reporter: Shawn Heisey Fix For: 3.6.2 This is similar to SOLR-3377. That bug appears to have fixed this problem for 4.x. I can see the effects of SOLR-3377 when I test a query similar to below on the Solr 3.6 example, which is expected because SOLR-3377 was found in 3.6 but only fixed in 4.0. This bug is a little different, and exists in 3.5.0 for sure, possibly earlier. The first part of the parsed query looks right, but then something weird happens and it gets interpreted as a very strange phrase query. query URL sent to solr 3.5.0 example: {code}http://localhost:8983/solr/collection1/select?q=%28%28cat%3Astring1%29+%28Kitchen+Sink%29%29&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text^2.0 {code} parsedquery_toString: {code}+((cat:string1 ((text:kitchen) (text:sink)))~2) (text:cat:string1 kitchen sink^2.0) {code} Adding some spaces before and after cat:string1 fixes it: {code}http://localhost:8983/solr/collection1/select?q=%28%28%20cat%3Astring1%20%29+%28Kitchen+Sink%29%29&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text^2.0 {code} {code}+((cat:string1 ((text:kitchen) (text:sink)))~2) (text:kitchen sink^2.0) {code}
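For reference, query URLs like the ones above are easier to build (and harder to mis-escape) with a URL-encoding helper. A hypothetical reconstruction of the workaround request, with the extra spaces around the fielded clause; the host and core name are taken from the example above:

```python
from urllib.parse import urlencode

# Parameters from the report, with the workaround applied:
# spaces before and after the fielded clause cat:string1.
params = {
    "q": "(( cat:string1 ) (Kitchen Sink))",
    "wt": "xml",
    "debugQuery": "true",
    "defType": "edismax",
    "qf": "text",
    "pf": "text^2.0",
}

# urlencode percent-escapes the parens, colon, caret, and spaces
# and joins the parameters with '&' separators.
url = "http://localhost:8983/solr/collection1/select?" + urlencode(params)
print(url)
```

Building the query string this way avoids hand-escaping mistakes when reproducing parser bugs like this one.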
[jira] [Updated] (LUCENE-4462) Publishing flushed segments is single threaded and too costly
[ https://issues.apache.org/jira/browse/LUCENE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4462: Attachment: LUCENE-4462.patch here is a new patch adding back the safety forcePurge. I will commit this to trunk and let it bake in a bit before I backport. I will keep this issue open until it's ported. Publishing flushed segments is single threaded and too costly - Key: LUCENE-4462 URL: https://issues.apache.org/jira/browse/LUCENE-4462 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Reporter: Michael McCandless Assignee: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4462.patch, LUCENE-4462.patch Spinoff from http://lucene.markmail.org/thread/4li6bbomru35qn7w The new TestBagOfPostings failed the build because it timed out after 2 hours ... but in digging I found that it was a starvation issue: the 4 threads were flushing segments much faster than the 1 thread could publish them. I think this is because publishing segments (DocumentsWriter.publishFlushedSegment) is actually rather costly (creates CFS file if necessary, writes .si, etc.). I committed a workaround for now, to prevent starvation (see svn diff -c 1394704 https://svn.apache.org/repos/asf/lucene/dev/trunk), but we really should address the root cause by moving these costly ops into flush() so that publishing is a low cost operation.
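The starvation the description talks about has a generic producer/consumer shape: several flushing threads hand finished segments to a single publisher. A minimal sketch (a Python stand-in, not Lucene code) of the bounded-queue back-pressure idea, where producers block instead of running unboundedly ahead of the publisher:

```python
import queue
import threading

# Bounding the queue is the back-pressure: producers block in put()
# when the single publisher falls behind, instead of starving it.
backlog = queue.Queue(maxsize=8)

def producer(n):
    # Stand-in for a flushing thread handing off n finished segments.
    for i in range(n):
        backlog.put(i)

def publisher(total, out):
    # Stand-in for the single publish thread draining the backlog.
    for _ in range(total):
        out.append(backlog.get())
        backlog.task_done()

published = []
producers = [threading.Thread(target=producer, args=(10,)) for _ in range(4)]
pub = threading.Thread(target=publisher, args=(40, published))

for t in producers:
    t.start()
pub.start()
for t in producers:
    t.join()
pub.join()

print(len(published))
```

The issue's proposed root-cause fix goes further than back-pressure: move the expensive work (CFS creation, writing .si) into the parallel flush step so that the serialized publish step stays cheap.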
[jira] [Updated] (LUCENE-4462) Publishing flushed segments is single threaded and too costly
[ https://issues.apache.org/jira/browse/LUCENE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4462: Component/s: core/index Lucene Fields: New, Patch Available (was: New) Affects Version/s: 4.0 4.0-ALPHA 4.0-BETA Fix Version/s: 5.0 4.1
[jira] [Commented] (LUCENE-4462) Publishing flushed segments is single threaded and too costly
[ https://issues.apache.org/jira/browse/LUCENE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473069#comment-13473069 ] Simon Willnauer commented on LUCENE-4462: - Committed to trunk in revision 1396500
[jira] [Created] (SOLR-3924) The solr weakness fault tolerance on ext4
shou aoki created SOLR-3924: --- Summary: The solr weakness fault tolerance on ext4 Key: SOLR-3924 URL: https://issues.apache.org/jira/browse/SOLR-3924 Project: Solr Issue Type: Bug Affects Versions: 3.5 Environment: Ubuntu 12.04 LTS, Filesystem is ext4. Reporter: shou aoki A few days ago our machine (running Solr) crashed. We rebooted the machine and Solr, and Solr appeared to behave normally. However, Solr was not actually working, even though: - The Solr core exists. - The /tmp/solr directory and the /tmp/solr/data/solr_core directory exist. - /tmp/solr/solr.xml exists and we wrote a solr/cores/core tag in it. - We can start the Solr server without an exception. Yet $ curl "http://localhost:8983/solr/solr_core/select?" *returns 404 Not Found.* I also found that the file /tmp/solr/data/solr_core/index/segments.gen is empty. So I think Solr has no (or weak) fault tolerance here. I hope Solr can grow into a crash-tolerant server, for example by handling files as atomically as possible.
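The atomic file handling the reporter asks for usually means write-to-temp, fsync, then rename, so a crash can never leave a half-written or empty file behind. A minimal sketch of that pattern (an illustration only, not Lucene's actual segments.gen handling; atomic_write is an invented helper):

```python
import os
import tempfile

def atomic_write(path, data):
    # Write to a temp file in the same directory so the final rename
    # stays on one filesystem and is atomic.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before the rename
        os.replace(tmp, path)     # atomic: readers see old or new, never partial
    except BaseException:
        os.unlink(tmp)
        raise

target = os.path.join(tempfile.mkdtemp(), "segments.gen")
atomic_write(target, b"generation 42")
print(open(target, "rb").read())
```

Without the fsync-before-rename step, a crash on ext4 (with delayed allocation) can leave exactly the zero-length file the reporter observed.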
[jira] [Updated] (SOLR-3924) The solr weakness fault tolerance on ext4
[ https://issues.apache.org/jira/browse/SOLR-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shou aoki updated SOLR-3924: Description: A few days ago our machine (running Solr) crashed. We rebooted the machine and Solr, and Solr appeared to behave normally. However, Solr was not actually working, even though: - The Solr core exists. - The /tmp/solr directory and the /tmp/solr/data/solr_core directory exist. - /tmp/solr/solr.xml exists and the solr/cores/core tag exists in it. - We can start the Solr server without an exception. Yet $ curl "http://localhost:8983/solr/solr_core/select?" *returns 404 Not Found.* I also found that the file /tmp/solr/data/solr_core/index/segments.gen is empty. So I think Solr has no (or weak) fault tolerance here. I hope Solr can grow into a crash-tolerant server, for example by handling files as atomically as possible. was: (the same text, except the solr.xml item previously read: /tmp/solr/solr.xml exists and we wrote a solr/cores/core tag in it.)
[jira] [Resolved] (SOLR-3924) The solr weakness fault tolerance on ext4
[ https://issues.apache.org/jira/browse/SOLR-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-3924. - Resolution: Duplicate Fix Version/s: 3.6 This is a duplicate of LUCENE-3627 and was already fixed in Lucene 3.6.0.
[jira] [Closed] (SOLR-3924) The solr weakness fault tolerance on ext4
[ https://issues.apache.org/jira/browse/SOLR-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler closed SOLR-3924. ---
[jira] [Commented] (SOLR-3924) The solr weakness fault tolerance on ext4
[ https://issues.apache.org/jira/browse/SOLR-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473110#comment-13473110 ] shou aoki commented on SOLR-3924: - Thank you for the information, Schindler!
[jira] [Created] (LUCENE-4470) Expose SpanFirst in eDismax
Markus Jelsma created LUCENE-4470: - Summary: Expose SpanFirst in eDismax Key: LUCENE-4470 URL: https://issues.apache.org/jira/browse/LUCENE-4470 Project: Lucene - Core Issue Type: Improvement Components: modules/queryparser Affects Versions: 4.0-BETA Environment: solr-spec 5.0.0.2012.10.09.19.29.59 solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 Reporter: Markus Jelsma Fix For: 4.1, 5.0 Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. This issue adds the SF-parameter (SpanFirst) and takes a FIELD~DISTANCE^BOOST formatted value. For example, sf=title~5^2 will give a boost of 2 if one of the normal clauses, originally generated for automatic phrase queries, is located within five positions from the field's start. Unit test is included and all tests pass.
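The FIELD~DISTANCE^BOOST value format described above could be parsed along these lines. This is a hypothetical sketch, not the parser from the attached patch; parse_sf and its regex are invented for illustration:

```python
import re

# field up to the first '~', an integer distance, and an optional
# '^boost' suffix defaulting to 1.0 -- e.g. sf=title~5^2
SF_PATTERN = re.compile(r"^(?P<field>[^~^]+)~(?P<distance>\d+)(?:\^(?P<boost>[\d.]+))?$")

def parse_sf(value):
    """Parse a FIELD~DISTANCE^BOOST string into (field, distance, boost)."""
    m = SF_PATTERN.match(value)
    if not m:
        raise ValueError(f"bad sf value: {value!r}")
    return (m.group("field"),
            int(m.group("distance")),
            float(m.group("boost") or 1.0))

print(parse_sf("title~5^2"))  # ('title', 5, 2.0)
```

A parser like this would feed a SpanFirstQuery-style clause: match within DISTANCE positions of the field's start, weighted by BOOST.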
[jira] [Updated] (LUCENE-4470) Expose SpanFirst in eDismax
[ https://issues.apache.org/jira/browse/LUCENE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-4470: -- Attachment: SOLR-4470-trunk-1.patch
[jira] [Closed] (LUCENE-4470) Expose SpanFirst in eDismax
[ https://issues.apache.org/jira/browse/LUCENE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma closed LUCENE-4470. - Resolution: Invalid Fix Version/s: (was: 5.0) (was: 4.1) Accidentally added to Lucene. I'll close this and open it in the Solr project. Sorry.
[jira] [Created] (SOLR-3925) Expose SpanFirst in eDismax
Markus Jelsma created SOLR-3925: --- Summary: Expose SpanFirst in eDismax Key: SOLR-3925 URL: https://issues.apache.org/jira/browse/SOLR-3925 Project: Solr Issue Type: Improvement Components: query parsers Affects Versions: 4.0-BETA Environment: solr-spec 5.0.0.2012.10.09.19.29.59 solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 Reporter: Markus Jelsma Fix For: 4.1, 5.0 Attachments: SOLR-3925-trunk-1.patch Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. This issue adds the SF-parameter (SpanFirst) and takes a FIELD~DISTANCE^BOOST formatted value. For example, sf=title~5^2 will give a boost of 2 if one of the normal clauses, originally generated for automatic phrase queries, is located within five positions from the field's start. Unit test is included and all tests pass.
[jira] [Updated] (SOLR-3925) Expose SpanFirst in eDismax
[ https://issues.apache.org/jira/browse/SOLR-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated SOLR-3925: Attachment: SOLR-3925-trunk-1.patch
Re: VOTE: release 4.0 (RC2)
+1 On 10 October 2012 05:23, Yonik Seeley yo...@lucidworks.com wrote: +1 -Yonik http://lucidworks.com On Sat, Oct 6, 2012 at 4:10 AM, Robert Muir rcm...@gmail.com wrote: artifacts here: http://s.apache.org/lusolr40rc2 Thanks for the good inspection of rc#1 and the bugs you found, which turned up test bugs and other bugs! I am happy this was all discovered and sorted out before release. The vote stays open until Wednesday; the weekend is just extra time for evaluating the RC. -- Kind regards, Martijn van Groningen
[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9490 - Failure!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/9490/ 1 tests failed. REGRESSION: org.apache.lucene.util.TestPagedBytes.testDataInputOutput Error Message: n must be positive Stack Trace: java.lang.IllegalArgumentException: n must be positive at __randomizedtesting.SeedInfo.seed([E2AD98D7834D0534:B9E34FD3E8763D47]:0) at java.util.Random.nextInt(Random.java:300) at com.carrotsearch.randomizedtesting.AssertingRandom.nextInt(AssertingRandom.java:81) at org.apache.lucene.util.TestPagedBytes.testDataInputOutput(TestPagedBytes.java:68) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:722) Build Log: [...truncated 484 lines...] [junit4:junit4] Suite: org.apache.lucene.util.TestPagedBytes [junit4:junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestPagedBytes -Dtests.method=testDataInputOutput -Dtests.seed=E2AD98D7834D0534 -Dtests.slow=true -Dtests.locale=fr_CA -Dtests.timezone=Pacific/Samoa -Dtests.file.encoding=ISO-8859-1 [junit4:junit4] ERROR 0.05s J5 | TestPagedBytes.testDataInputOutput [junit4:junit4] Throwable #1: java.lang.IllegalArgumentException: n must be positive [junit4:junit4]
Re: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9490 - Failure!
I committed a fix ... test bug. Mike McCandless http://blog.mikemccandless.com On Wed, Oct 10, 2012 at 8:35 AM, buil...@flonkings.com wrote: Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/9490/ 1 tests failed. REGRESSION: org.apache.lucene.util.TestPagedBytes.testDataInputOutput Error Message: n must be positive
[jira] [Created] (SOLR-3926) solrj should support better way of finding active sorts
Eirik Lygre created SOLR-3926: - Summary: solrj should support better way of finding active sorts Key: SOLR-3926 URL: https://issues.apache.org/jira/browse/SOLR-3926 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 4.0-BETA Reporter: Eirik Lygre Priority: Minor The Solrj api uses orthogonal concepts for setting/removing and getting sort information. Setting/removing uses a combination of (name, order), while the getters return a String of the form "name order":
{code}
public SolrQuery setSortField(String field, ORDER order);
public SolrQuery addSortField(String field, ORDER order);
public SolrQuery removeSortField(String field, ORDER order);
public String[] getSortFields();
public String getSortField();
{code}
If you want to use the current sort information to present a list of active sorts, with the possibility to remove them, you need to manually parse the string(s) returned from getSortFields() to recreate the information required by removeSortField(). Not difficult, but not convenient either :-) Therefore this suggestion: add a new method {{public Map<String, ORDER> getSortFieldMap();}} which returns an ordered map of active sort fields.
An example implementation is shown below (here as a utility method living outside SolrQuery; the rewrite should be trivial):
{code}
public Map<String, ORDER> getSortFieldMap(SolrQuery query) {
  String[] actualSortFields = query.getSortFields();
  if (actualSortFields == null || actualSortFields.length == 0)
    return Collections.emptyMap();
  Map<String, ORDER> sortFieldMap = new LinkedHashMap<String, ORDER>();
  for (String sortField : actualSortFields) {
    String[] fieldSpec = sortField.split(" ");
    sortFieldMap.put(fieldSpec[0], ORDER.valueOf(fieldSpec[1]));
  }
  return sortFieldMap;
}
{code}
For what it's worth, this is possible client code:
{code}
System.out.println("Active sorts");
Map<String, ORDER> fieldMap = getSortFieldMap(query);
for (String field : fieldMap.keySet()) {
  System.out.println("- " + field + "; dir=" + fieldMap.get(field));
}
{code}
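The proposed parsing logic can be exercised standalone. The sketch below substitutes a plain String[] for SolrQuery.getSortFields() and a local ORDER enum for SolrJ's; both are stand-ins, since the real types live in SolrJ.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class SortFieldMapDemo {
    // Stand-in for SolrQuery.ORDER (asc/desc); the real enum lives in SolrJ.
    enum ORDER { asc, desc }

    // Parses "field order" strings (the format getSortFields() returns) into
    // an ordered map; LinkedHashMap preserves the sort priority.
    static Map<String, ORDER> getSortFieldMap(String[] actualSortFields) {
        if (actualSortFields == null || actualSortFields.length == 0)
            return Collections.emptyMap();
        Map<String, ORDER> sortFieldMap = new LinkedHashMap<String, ORDER>();
        for (String sortField : actualSortFields) {
            String[] fieldSpec = sortField.split(" ");
            sortFieldMap.put(fieldSpec[0], ORDER.valueOf(fieldSpec[1]));
        }
        return sortFieldMap;
    }

    public static void main(String[] args) {
        Map<String, ORDER> m = getSortFieldMap(new String[] {"price asc", "score desc"});
        System.out.println(m); // {price=asc, score=desc}
    }
}
```

The map's keys can then be fed straight back into removeSortField(), which is exactly the round trip the issue asks for.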
Re: Q: dataset with full text + judgements for IR eval
TREC? On Oct 4, 2012, at 10:49 AM, Otis Gospodnetic wrote: Hi, I checked the Wiki, but couldn't find any references to dataset that have: * full document content * queries with relevance judgements Are there any such datasets available? Thanks, Otis Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm Grant Ingersoll http://www.lucidworks.com
[jira] [Commented] (SOLR-3923) eDismax: complex fielded query with parens is not recognized
[ https://issues.apache.org/jira/browse/SOLR-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473194#comment-13473194 ] Jack Krupansky commented on SOLR-3923: -- It looks like the pf phrase boosting is not ignoring fielded terms. eDismax: complex fielded query with parens is not recognized Key: SOLR-3923 URL: https://issues.apache.org/jira/browse/SOLR-3923 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 3.5 Reporter: Shawn Heisey Fix For: 3.6.2 This is similar to SOLR-3377. The fix for that bug appears to have fixed this problem for 4.x. I can see the effects of SOLR-3377 when I test a query similar to the one below on the Solr 3.6 example, which is expected because SOLR-3377 was found in 3.6 but only fixed in 4.0. This bug is a little different, and exists in 3.5.0 for sure, possibly earlier. The first part of the parsed query looks right, but then something weird happens and it gets interpreted as a very strange phrase query. Query URL sent to the Solr 3.5.0 example:
{code}
http://localhost:8983/solr/collection1/select?q=%28%28cat%3Astring1%29+%28Kitchen+Sink%29%29&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text^2.0
{code}
parsedquery_toString:
{code}
+((cat:string1 ((text:kitchen) (text:sink)))~2) (text:"cat:string1 kitchen sink"^2.0)
{code}
Adding some spaces before and after cat:string1 fixes it:
{code}
http://localhost:8983/solr/collection1/select?q=%28%28%20cat%3Astring1%20%29+%28Kitchen+Sink%29%29&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text^2.0
{code}
{code}
+((cat:string1 ((text:kitchen) (text:sink)))~2) (text:"kitchen sink"^2.0)
{code}
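The difference between the failing and working queries is easier to see once the q parameter is percent-decoded. A stdlib-only sketch; the encoded strings are copied from the report above, and URLDecoder's '+'-as-space handling matches how a servlet container reads query parameters:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class EdismaxQueryDemo {
    // Decodes a percent-encoded q parameter the way a servlet container would:
    // '+' becomes a space, %28/%29/%3A become '(', ')' and ':'.
    static String qParam(String encoded) throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Failing and working q parameters from the report above
        System.out.println(qParam("%28%28cat%3Astring1%29+%28Kitchen+Sink%29%29"));
        // ((cat:string1) (Kitchen Sink))
        System.out.println(qParam("%28%28%20cat%3Astring1%20%29+%28Kitchen+Sink%29%29"));
        // (( cat:string1 ) (Kitchen Sink))
    }
}
```

Decoded, the only difference is the whitespace around the fielded clause, which is what flips the 3.5.0 parser between the broken phrase interpretation and the correct one.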
[jira] [Created] (LUCENE-4471) Test4GBStoredFields
Adrien Grand created LUCENE-4471: Summary: Test4GBStoredFields Key: LUCENE-4471 URL: https://issues.apache.org/jira/browse/LUCENE-4471 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Priority: Minor Fix For: 4.1, 5.0 Yesterday I fixed a bug (integer overflow) that only happens when a fields data (.fdt) file grows larger than 4GB. We should have a test for that.
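The failure mode behind this class of bug is plain Java int truncation: once a file pointer into the .fdt file exceeds 2^32, casting it to int silently drops the high bits. A minimal sketch of just the arithmetic, unrelated to Lucene's actual code paths:

```java
public class IntOverflowDemo {
    // Truncating a long file pointer to int, as an overflow bug would.
    static int truncate(long filePointer) {
        return (int) filePointer;
    }

    public static void main(String[] args) {
        long fourGB = 1L << 32;               // 4 GiB boundary
        long filePointer = fourGB + 42;       // a position just past it
        System.out.println(truncate(filePointer)); // 42 -- the high 32 bits are silently lost
    }
}
```

This is why the bug only shows up with files over 4GB, and why a test has to actually cross that boundary to catch it.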
[jira] [Created] (LUCENE-4472) Add setting that prevents merging on updateDocument
Simon Willnauer created LUCENE-4472: --- Summary: Add setting that prevents merging on updateDocument Key: LUCENE-4472 URL: https://issues.apache.org/jira/browse/LUCENE-4472 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4472.patch Currently we always call maybeMerge if a segment was flushed after updateDocument. Some apps, and in particular ElasticSearch, use some hacky workarounds to disable that, i.e. for merge throttling. It should be easier to enable this kind of behavior.
[jira] [Updated] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4472: Attachment: LUCENE-4472.patch Here is a patch that adds such an option to IWC via live settings.
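A stdlib-only sketch of the idea in the patch: a live, volatile flag consulted on the post-flush path before the implicit maybeMerge. All names here are hypothetical stand-ins, not the real IndexWriterConfig API:

```java
public class FlushMergeGateDemo {
    // Hypothetical stand-in for the proposed live IndexWriterConfig setting.
    private volatile boolean mergeAfterFlush = true;
    private int mergesConsidered = 0;

    void setMergeAfterFlush(boolean v) { mergeAfterFlush = v; }
    int mergesConsidered() { return mergesConsidered; }

    // Models IndexWriter's post-flush path: the implicit merge check only
    // runs while the gate is open; the app can still call maybeMerge itself.
    void onSegmentFlushed() {
        if (mergeAfterFlush) maybeMerge();
    }

    void maybeMerge() { mergesConsidered++; }

    public static void main(String[] args) {
        FlushMergeGateDemo w = new FlushMergeGateDemo();
        w.onSegmentFlushed();          // implicit merge check happens
        w.setMergeAfterFlush(false);   // flipped live, e.g. during bulk indexing
        w.onSegmentFlushed();          // suppressed
        w.maybeMerge();                // explicit, on the app's own schedule (as ES does)
        System.out.println(w.mergesConsidered()); // 2
    }
}
```

Because the flag is volatile, it can be flipped from another thread while indexing is in progress, which is what "live settings" buys over a construction-time option.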
[jira] [Updated] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4471: Attachment: Test4GBStoredFields.java Test case that finds the bug that I fixed yesterday. The only problem is that it is very slow (~100s with Lucene40, up to 250s with Compressing).
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473245#comment-13473245 ] Dawid Weiss commented on LUCENE-4471: - Make it a @Nightly and not worry? :)
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473249#comment-13473249 ] Robert Muir commented on LUCENE-4471: - +1 for just making it a nightly test.
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473252#comment-13473252 ] Michael McCandless commented on LUCENE-4471: +1
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473255#comment-13473255 ] Michael McCandless commented on LUCENE-4472: Neat :) How does ES customize / throttle its merging? Maybe the setting should mean we never call maybeMerge implicitly? (Ie, neither on close nor NRT reader or any other time), rather than just singling out updateDocument/addDocument? Eg if we add other methods in the future (like field updates), it should also prevent those from doing merges?
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473260#comment-13473260 ] Uwe Schindler commented on LUCENE-4471: --- Does it produce a file of this size?
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473264#comment-13473264 ] selckin commented on LUCENE-4472: - +1, if you have a lot of indexes and they all start merging at the same time it can be quite taxing. I think ES has a dedicated configurable thread pool where, for each index, a maybeMerge() is scheduled on an interval (the size of the thread pool limits the number of concurrent merges).
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473265#comment-13473265 ] Shai Erera commented on LUCENE-4472: Patch looks good. One minor comment -- LiveIWC.setMaybeMergeAfterFlush() contains a redundant 'a' in its javadocs -- ".. after a each segment". bq. Maybe the setting should mean we never call maybeMerge implicitly? Perhaps the issue's description should change to "Add setting to prevent merging on segment flush". Because as I understand the fix, the check is made only after a segment has been flushed, which already covers addDocument/updateDocument as well as future field updates?
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473268#comment-13473268 ] Adrien Grand commented on LUCENE-4471: -- Yes it does (a little larger actually). I didn't find a way to lie to stored fields.
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473270#comment-13473270 ] Robert Muir commented on LUCENE-4472: - Can we consider instead giving MergePolicy the proper context here instead of adding a boolean? This seems more flexible.
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473279#comment-13473279 ] Adrien Grand commented on LUCENE-4471: -- Uwe, is it a problem?
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473300#comment-13473300 ] Uwe Schindler commented on LUCENE-4471: --- Depends on the Jenkins server :-) The nightly tests are running only at Apache, and that one has enough space. The Windows one by SDDS might have disk space problems (a virtual Windows 7 machine with little disk space), but we don't run nightly on it.
[jira] [Created] (LUCENE-4473) BlockPF encodes offsets inefficiently
Robert Muir created LUCENE-4473: --- Summary: BlockPF encodes offsets inefficiently Key: LUCENE-4473 URL: https://issues.apache.org/jira/browse/LUCENE-4473 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir BlockPF encodes offsets inefficiently when writing a vint block. It should write these like Lucene40 does. Here is geonames (all 19 fields as text fields with offsets): trunk _68_Block_0.pos: 178700442 patch _68_Block_0.pos: 155929641
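The gain comes from delta-coding: absolute offsets grow without bound and soon cost two or more bytes per vInt, while gaps between tokens and token lengths stay small. The sketch below compares the two encodings with a toy vInt writer; the exact delta scheme shown is illustrative, not necessarily the one in the patch:

```java
import java.io.ByteArrayOutputStream;

public class OffsetVIntDemo {
    // Minimal vInt writer: 7 payload bits per byte, high bit means "more
    // bytes follow" -- the same idea as Lucene's DataOutput.writeVInt.
    static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    // Returns {absoluteBytes, deltaBytes} for a list of (start, end) offset pairs.
    static int[] encodedSizes(int[][] offsets) {
        ByteArrayOutputStream absolute = new ByteArrayOutputStream();
        ByteArrayOutputStream delta = new ByteArrayOutputStream();
        int lastEnd = 0;
        for (int[] o : offsets) {
            writeVInt(absolute, o[0]);        // absolute startOffset: keeps growing
            writeVInt(absolute, o[1]);        // absolute endOffset
            writeVInt(delta, o[0] - lastEnd); // gap since previous token: small
            writeVInt(delta, o[1] - o[0]);    // token length: small
            lastEnd = o[1];
        }
        return new int[] { absolute.size(), delta.size() };
    }

    public static void main(String[] args) {
        // Offsets for consecutive tokens deep inside a document
        int[][] offsets = { {1000, 1005}, {1006, 1012}, {1013, 1017}, {1018, 1029} };
        int[] sizes = encodedSizes(offsets);
        System.out.println(sizes[0] + " vs " + sizes[1]); // 16 vs 9
    }
}
```

The same effect at index scale is what the geonames numbers above show: ~178MB of .pos data on trunk against ~156MB with the patch.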
[jira] [Updated] (LUCENE-4473) BlockPF encodes offsets inefficiently
[ https://issues.apache.org/jira/browse/LUCENE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4473: Attachment: LUCENE-4473.patch Fix For: 4.1 Patch. We already bumped Block's version in 4.1 to fix other bugs, so we don't need to do it again.
[jira] [Commented] (LUCENE-4467) SegmentReader.loadDeletedDocs FileNotFoundException load _hko_7.del - corrupted index
[ https://issues.apache.org/jira/browse/LUCENE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473330#comment-13473330 ] B.Nicolotti commented on LUCENE-4467: - We deleted the index folder and started from an empty index; the system worked without problems all day indexing the xml produced today, until we had another problem, see below. We have 2 web applications in one Tomcat java process that write the index. The two web applications use the same version of Lucene, 3.6.0. May this be a problem? Shouldn't each web application obtain the write.lock before writing to the index? The index is small, so I can attach it. Many thanks. Best regards. Wed Oct 10 17:34:05 CEST 2012:com.siap.WebServices.Utility.UtiIndexerLucene caught an exception: 16801917 java.io.FileNotFoundException e.toString():java.io.FileNotFoundException: _42.fdt, e.getMessage():_42.fdt org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:284) org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:303) org.apache.lucene.index.TieredMergePolicy.size(TieredMergePolicy.java:635) org.apache.lucene.index.TieredMergePolicy.useCompoundFile(TieredMergePolicy.java:613) org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:593) org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3580) org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545) org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1852) org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1812) org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1776) com.siap.WebServices.Utility.UtiIndexerLucene.indexFile(UtiIndexerLucene.java:272) com.siap.WebServices.Utility.UtiLogPrintingThread.run(UtiLogPrintingThread.java:146) Somma controllo versione: Server info:Apache Tomcat/5.5.23@127.0.0.1(tomcatdemo) SegmentReader.loadDeletedDocs FileNotFoundException load _hko_7.del - corrupted index Key: LUCENE-4467 URL: 
https://issues.apache.org/jira/browse/LUCENE-4467 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.6 Environment: Currently using: java -version java version 1.5.0_13 Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05) Java HotSpot(TM) Client VM (build 1.5.0_13-b05, mixed mode, sharing) Tomcat 5.5 lucene 3.6.0 Reporter: B.Nicolotti Attachments: index.zip We're using lucene to index XML. We've had it in test on a server for some weeks with no problem, but today we've got the error below and the index seems no longer usable. Could you please tell us 1) is there a way to recover the index? 2) is there a way to avoid this error? I can supply the index if needed many thanks Tue Oct 09 17:41:02 CEST 2012:com.siap.WebServices.Utility.UtiIndexerLucene caught an exception: 32225010 java.io.FileNotFoundException e.toString():java.io.FileNotFoundException: /usr/local/WS_DynPkg/logs/index/_hko_7.del (No such file or directory), e.getMessage():/usr/local/WS_DynPkg/logs/index/_hko_7.del (No such file or directory) java.io.RandomAccessFile.open(Native Method) java.io.RandomAccessFile.init(RandomAccessFile.java:212) org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.init(SimpleFSDirectory.java:71) org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.init(SimpleFSDirectory.java:98) org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.init(NIOFSDirectory.java:92) org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:79) org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:345) org.apache.lucene.util.BitVector.init(BitVector.java:266) org.apache.lucene.index.SegmentReader.loadDeletedDocs(SegmentReader.java:160) org.apache.lucene.index.SegmentReader.get(SegmentReader.java:120) org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:696) org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:671) 
org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:244) org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3608) org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545) org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1852) org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1812) org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1776) com.siap.WebServices.Utility.UtiIndexerLucene.delete(UtiIndexerLucene.java:143) com.siap.WebServices.Utility.UtiIndexerLucene.indexFile(UtiIndexerLucene.java:221) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
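The single-writer question raised above can be illustrated outside Lucene: the write.lock file plays the same role as an OS-level file lock, which only one writer may hold at a time. The sketch below uses plain java.nio rather than Lucene's actual locking code, and the class and lock-file names are made up for illustration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteLockDemo {

    /** Tries to lock the same lock file from two writers; returns true only if the second also succeeds. */
    public static boolean secondWriterCanLock() {
        try {
            Path lockFile = Files.createTempFile("index-write", ".lock");
            try (FileChannel first = FileChannel.open(lockFile, StandardOpenOption.WRITE);
                 FileChannel second = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
                FileLock held = first.tryLock(); // first writer owns the lock now
                try {
                    FileLock other = second.tryLock(); // second writer's attempt
                    return other != null;
                } catch (OverlappingFileLockException e) {
                    return false; // within one JVM the overlapping request is rejected outright
                } finally {
                    if (held != null) held.release();
                }
            } finally {
                Files.deleteIfExists(lockFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second writer acquired lock: " + secondWriterCanLock());
    }
}
```

This mirrors why two web applications in one Tomcat process cannot each hold the index's write lock simultaneously; sharing one IndexWriter (or serializing access to it) avoids the conflict.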
[jira] [Updated] (LUCENE-4467) SegmentReader.loadDeletedDocs FileNotFoundException loading _hko_7.del - corrupted index
[ https://issues.apache.org/jira/browse/LUCENE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] B.Nicolotti updated LUCENE-4467: Attachment: index.zip
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-4474) CloseableThreadLocal maybePurge could be too expensive
Robert Muir created LUCENE-4474: --- Summary: CloseableThreadLocal maybePurge could be too expensive Key: LUCENE-4474 URL: https://issues.apache.org/jira/browse/LUCENE-4474 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir
Was doing some tests with the geonames database (19 fields, just using StandardAnalyzer), and noticed this in the profiler. It could be a ghost, but we should investigate anyway. It seems ridiculous for a situation like mine:
* indexing with one thread
* every 40 Analyzer.tokenStream() calls [basically every other doc], this thing is called
* it gets iterators over the map, checks threads, this and that. But of course there is only one thread!
Maybe it's a good idea if it checks size() first or something; at least don't do this stuff if size() == 1, as I bet a lot of people index with a single thread. Or maybe all this stuff is really cheap and it's just a ghost.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4474) CloseableThreadLocal maybePurge could be too expensive
[ https://issues.apache.org/jira/browse/LUCENE-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473336#comment-13473336 ] Uwe Schindler commented on LUCENE-4474: --- We should do this check in all cases! If there is only <= 1 entry in the map, we don't need to do anything (because this is the live thread!). CloseableThreadLocal maybePurge could be too expensive -- Key: LUCENE-4474 URL: https://issues.apache.org/jira/browse/LUCENE-4474 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Was doing some tests with the geonames database (19 fields, just using StandardAnalyzer), and noticed this in the profiler. It could be a ghost, but we should investigate anyway. It seems ridiculous for a situation like mine: * indexing with one thread * every 40 Analyzer.tokenStream() calls [basically every other doc], this thing is called * it gets iterators over the map, checks threads, this and that. But of course there is only one thread! Maybe it's a good idea if it checks size() first or something; at least don't do this stuff if size() == 1, as I bet a lot of people index with a single thread. Or maybe all this stuff is really cheap and it's just a ghost. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
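The proposed short-circuit can be sketched in plain Java. The names (maybePurge, the per-thread map layout) are stand-ins, not CloseableThreadLocal's actual internals: when the map holds at most one entry, that entry must belong to the live calling thread, so the purge scan can be skipped.

```java
/** Hypothetical sketch of the size() short-circuit proposed for maybePurge. */
class PurgeSketch {
    static int scans = 0; // counts how often the expensive scan actually runs

    static void maybePurge(java.util.Map<Thread, Object> perThreadValues) {
        if (perThreadValues.size() <= 1) {
            return; // single-threaded indexing: the one entry is the live caller
        }
        scans++; // expensive path: iterate the map and drop entries for dead threads
        perThreadValues.keySet().removeIf(t -> !t.isAlive());
    }
}
```

With one thread indexing, the scan never runs; only once a second (possibly dead) thread's entry appears does the iteration happen.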
[jira] [Commented] (SOLR-3922) AbstractSolrTestCase duplicates a lot from SolrTestCaseJ4 and is one of the few lines of Solr test classes that do not inherit from SolrTestCaseJ4.
[ https://issues.apache.org/jira/browse/SOLR-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473343#comment-13473343 ] Mark Miller commented on SOLR-3922: --- Moving these tests over to SolrTestCaseJ4 should also bring some speed gains since the SolrTestCaseJ4 tests generally use the same CoreContainer/SolrCore across test methods. AbstractSolrTestCase duplicates a lot from SolrTestCaseJ4 and is one of the few lines of Solr test classes that do not inherit from SolrTestCaseJ4. --- Key: SOLR-3922 URL: https://issues.apache.org/jira/browse/SOLR-3922 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.1, 5.0 I plan on fixing both of these issues as part of my work on SOLR-3911. Most of AbstractSolrTestCase can go away. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-4471. -- Resolution: Fixed Assignee: Adrien Grand Committed (r1396656 on trunk and r1396671 on branch 4.x). Test4GBStoredFields --- Key: LUCENE-4471 URL: https://issues.apache.org/jira/browse/LUCENE-4471 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1, 5.0 Attachments: Test4GBStoredFields.java Yesterday I fixed a bug (integer overflow) that only happens when a field's data (.fdt) file grows larger than 4GB. We should have a test for that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
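The class of bug this test guards against can be reproduced in a few lines: accumulating file sizes in a 32-bit int silently wraps once the total passes 2^31-1 bytes, while a long accumulator does not. The method names below are illustrative, not Lucene's actual code:

```java
/** Sketch of an integer-overflow bug of the kind LUCENE-4471's test is meant to catch. */
class OverflowSketch {
    static int sumAsInt(long[] fileSizes) {
        int total = 0;
        for (long s : fileSizes) total += (int) s; // buggy: wraps past 2 GB
        return total;
    }

    static long sumAsLong(long[] fileSizes) {
        long total = 0;
        for (long s : fileSizes) total += s; // correct: 64-bit accumulator
        return total;
    }
}
```

Two files of 3 GB and 2 GB sum to 5 GB with the long accumulator, but the int version wraps to a wrong (here positive but far smaller) value.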
[jira] [Created] (SOLR-3927) Ability to use CompressingStoredFieldsFormat
Adrien Grand created SOLR-3927: -- Summary: Ability to use CompressingStoredFieldsFormat Key: SOLR-3927 URL: https://issues.apache.org/jira/browse/SOLR-3927 Project: Solr Issue Type: Task Reporter: Adrien Grand Priority: Trivial It would be nice to let Solr users use {{CompressingStoredFieldsFormat}} to compress their stored fields (with warnings given that this feature is experimental and that we don't guarantee backwards compat for it). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3927) Ability to use CompressingStoredFieldsFormat
[ https://issues.apache.org/jira/browse/SOLR-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated SOLR-3927: --- Assignee: Adrien Grand Ability to use CompressingStoredFieldsFormat Key: SOLR-3927 URL: https://issues.apache.org/jira/browse/SOLR-3927 Project: Solr Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial It would be nice to let Solr users use {{CompressingStoredFieldsFormat}} to compress their stored fields (with warnings given that this feature is experimental and that we don't guarantee backwards compat for it). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3911) Make Directory and DirectoryFactory first class so that the majority of Solr's features work with any custom implementations.
[ https://issues.apache.org/jira/browse/SOLR-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-3911: -- Attachment: SOLR-3911.patch This issue has been very satisfying. All tests passing. I still force an fs directory for 2 solrcloud tests due to the recovery issue mentioned above. We can probably fix that in another issue. Make Directory and DirectoryFactory first class so that the majority of Solr's features work with any custom implementations. - Key: SOLR-3911 URL: https://issues.apache.org/jira/browse/SOLR-3911 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.1, 5.0 Attachments: SOLR-3911.patch, SOLR-3911.patch, SOLR-3911.patch The biggest issue is that many parts of Solr rely on a local file system based Directory implementation - most notably, replication. This should all be changed to use the Directory and DirectoryFactory abstractions. Other parts of the code that count on the local file system for making paths and getting file sizes should also be changed to use Directory and/or DirectoryFactory. Original title: Replication should work with any Directory impl, not just local filesystem based Directories. I've wanted to do this for a long time - there is no reason replication should not support any directory impl. This will let us use the mockdir for replication tests rather than having to force an FSDir and lose all the extra test checks and simulations. This will improve our testing around replication a lot, and allow custom Directory impls to be used on multi node Solr. Expanded scope - full first class support for DirectoryFactory and Directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3928) A PropertiesEntityProcessor for DIH
Tricia Jenkins created SOLR-3928: Summary: A PropertiesEntityProcessor for DIH Key: SOLR-3928 URL: https://issues.apache.org/jira/browse/SOLR-3928 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Tricia Jenkins Priority: Minor Fix For: 4.0 Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3928) A PropertiesEntityProcessor for DIH
[ https://issues.apache.org/jira/browse/SOLR-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tricia Jenkins updated SOLR-3928: - Attachment: SOLR-3928.patch PropertiesEntityProcessor with test. It's in the dataimporthandler-extras directory, but dataimporthandler might make more sense. A PropertiesEntityProcessor for DIH --- Key: SOLR-3928 URL: https://issues.apache.org/jira/browse/SOLR-3928 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Tricia Jenkins Priority: Minor Labels: dih, patch, test Fix For: 4.0 Attachments: SOLR-3928.patch Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VOTE: release 4.0 (RC2)
+1 smoketest succeeded on macos 10.7.4. Michael On 10/6/12 1:10 AM, Robert Muir wrote: artifacts here: http://s.apache.org/lusolr40rc2 Thanks for the good inspection of rc#1 and finding bugs, which found test bugs and other bugs! I am happy this was all discovered and sorted out before release. vote stays open until wednesday, the weekend is just extra time for evaluating the RC. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3928) A PropertiesEntityProcessor for DIH
[ https://issues.apache.org/jira/browse/SOLR-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tricia Jenkins updated SOLR-3928: - Description: Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. (was: Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the a href=http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html;properties/a file key/value pairs.) A PropertiesEntityProcessor for DIH --- Key: SOLR-3928 URL: https://issues.apache.org/jira/browse/SOLR-3928 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Tricia Jenkins Priority: Minor Labels: dih, patch, test Fix For: 4.0 Attachments: SOLR-3928.patch Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3928) A PropertiesEntityProcessor for DIH
[ https://issues.apache.org/jira/browse/SOLR-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tricia Jenkins updated SOLR-3928: - Fix Version/s: (was: 4.0) 4.1 A PropertiesEntityProcessor for DIH --- Key: SOLR-3928 URL: https://issues.apache.org/jira/browse/SOLR-3928 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Tricia Jenkins Priority: Minor Labels: dih, patch, test Fix For: 4.1 Attachments: SOLR-3928.patch Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4473) BlockPF encodes offsets inefficiently
[ https://issues.apache.org/jira/browse/LUCENE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473554#comment-13473554 ] Michael McCandless commented on LUCENE-4473: +1 BlockPF encodes offsets inefficiently - Key: LUCENE-4473 URL: https://issues.apache.org/jira/browse/LUCENE-4473 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir Fix For: 4.1 Attachments: LUCENE-4473.patch when writing a vint block. It should write these like Lucene40 does. Here is geonames (all 19 fields as textfields with offsets): trunk _68_Block_0.pos: 178700442 patch _68_Block_0.pos: 155929641 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473617#comment-13473617 ] Shay Banon commented on LUCENE-4472: Agree with Robert on the additional context flag; that would make things most flexible. A flag on IW makes things simpler from the user perspective though, because then there is no need to customize the built-in merge policies. Add setting that prevents merging on updateDocument --- Key: LUCENE-4472 URL: https://issues.apache.org/jira/browse/LUCENE-4472 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4472.patch Currently we always call maybeMerge if a segment was flushed after updateDocument. Some apps and in particular ElasticSearch uses some hacky workarounds to disable that ie for merge throttling. It should be easier to enable this kind of behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473624#comment-13473624 ] Robert Muir commented on LUCENE-4472: - I don't think you need to customize anything built in. Just delegate and forward findMerges()? The problem is this doesn't really have the necessary context today: I think we should fix that. But I'd like policies around merging to stay in ... MergePolicy :) Otherwise IWC could easily get cluttered with conflicting options which makes it complex. Add setting that prevents merging on updateDocument --- Key: LUCENE-4472 URL: https://issues.apache.org/jira/browse/LUCENE-4472 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4472.patch Currently we always call maybeMerge if a segment was flushed after updateDocument. Some apps and in particular ElasticSearch uses some hacky workarounds to disable that ie for merge throttling. It should be easier to enable this kind of behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473626#comment-13473626 ] Simon Willnauer commented on LUCENE-4472: - The MergePolicy is tricky here: since we clone the MP in IW, you need to actually pull and cast the MP from IW to change the setting if you want to do this in real time. Maybe we can add something like this to MP so we can change merge settings in real time too. Otherwise you need to build a special MP, but we can certainly do that. Add setting that prevents merging on updateDocument --- Key: LUCENE-4472 URL: https://issues.apache.org/jira/browse/LUCENE-4472 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4472.patch Currently we always call maybeMerge if a segment was flushed after updateDocument. Some apps and in particular ElasticSearch uses some hacky workarounds to disable that ie for merge throttling. It should be easier to enable this kind of behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
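The "just delegate and forward findMerges()" idea discussed in this thread can be sketched with a toy stand-in for the MergePolicy contract. MergeSource, NoMergeToggle, and the String-based segment lists are invented for illustration; Lucene's real MergePolicy API differs. The wrapper returns no merges while the flag is off and forwards to the wrapped policy otherwise:

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the part of MergePolicy that picks merges. */
interface MergeSource {
    List<String> findMerges(List<String> segments);
}

/** Delegating policy that can veto merges at runtime, e.g. during bulk indexing. */
class NoMergeToggle implements MergeSource {
    private final MergeSource delegate;
    volatile boolean mergesEnabled = true; // flipped from another thread in "real time"

    NoMergeToggle(MergeSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public List<String> findMerges(List<String> segments) {
        if (!mergesEnabled) {
            return Collections.emptyList(); // suppress merges without touching the delegate
        }
        return delegate.findMerges(segments);
    }
}
```

This keeps merge behavior inside the policy, as Robert suggests, while still letting an application flip the switch at runtime, which is the part Simon notes is awkward once IW has cloned the policy.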
Re: VOTE: release 4.0 (RC2)
if buschmi votes we are good :D simon On Wed, Oct 10, 2012 at 9:30 PM, Michael Busch busch...@gmail.com wrote: +1 smoketest succeeded on macos 10.7.4. Michael On 10/6/12 1:10 AM, Robert Muir wrote: artifacts here: http://s.apache.org/lusolr40rc2 Thanks for the good inspection of rc#1 and finding bugs, which found test bugs and other bugs! I am happy this was all discovered and sorted out before release. vote stays open until wednesday, the weekend is just extra time for evaluating the RC. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3929) support configuring IndexWriter max thread count in solrconfig
Patrick Hunt created SOLR-3929: -- Summary: support configuring IndexWriter max thread count in solrconfig Key: SOLR-3929 URL: https://issues.apache.org/jira/browse/SOLR-3929 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 4.1, 5.0 Lucene 3.1.0 added the ability to configure the IndexWriter's previously fixed internal thread limit by calling setMaxThreadStates. This parameter should be exposed through Solr configuration. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3929) support configuring IndexWriter max thread count in solrconfig
[ https://issues.apache.org/jira/browse/SOLR-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated SOLR-3929: --- Attachment: SOLR-3929.patch Added configuration parameter as indexConfig/maxIndexingThreads. Also added a test and example solrconfig.xml. support configuring IndexWriter max thread count in solrconfig -- Key: SOLR-3929 URL: https://issues.apache.org/jira/browse/SOLR-3929 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 4.1, 5.0 Attachments: SOLR-3929.patch Lucene 3.1.0 added the ability to configure the IndexWriter's previously fixed internal thread limit by calling setMaxThreadStates. This parameter should be exposed through Solr configuration. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
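Based on the comment's description of the patch (a parameter at indexConfig/maxIndexingThreads), the setting would presumably be declared along these lines in solrconfig.xml. The element name comes from the patch description above; the value shown is an arbitrary example, so treat this as a sketch rather than the committed syntax:

```xml
<indexConfig>
  <!-- cap IndexWriter's internal indexing thread states
       (maps to IndexWriterConfig.setMaxThreadStates) -->
  <maxIndexingThreads>8</maxIndexingThreads>
</indexConfig>
```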
[jira] [Resolved] (LUCENE-4473) BlockPF encodes offsets inefficiently
[ https://issues.apache.org/jira/browse/LUCENE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-4473. - Resolution: Fixed Fix Version/s: 5.0 BlockPF encodes offsets inefficiently - Key: LUCENE-4473 URL: https://issues.apache.org/jira/browse/LUCENE-4473 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir Fix For: 4.1, 5.0 Attachments: LUCENE-4473.patch when writing a vint block. It should write these like Lucene40 does. Here is geonames (all 19 fields as textfields with offsets): trunk _68_Block_0.pos: 178700442 patch _68_Block_0.pos: 155929641 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473765#comment-13473765 ] Ron Mayer commented on SOLR-2058: - I just tried them both (the committed one and my original patch); and at least they both produce much better relevancy on my test data than I was able to get without the patch. However, I agree with you that the change was probably unintentional and seems different from the way I think normal dismax queries work. TL;DR: I'm not sure. Anyone else care to either test and compare them, or just look at the code and see which is more reasonable? Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax Key: SOLR-2058 URL: https://issues.apache.org/jira/browse/SOLR-2058 Project: Solr Issue Type: Improvement Components: query parsers Environment: n/a Reporter: Ron Mayer Assignee: James Dyer Priority: Minor Fix For: 4.0-ALPHA Attachments: edismax_pf_with_slop_v2.1.patch, edismax_pf_with_slop_v2.patch, pf2_with_slop.patch, SOLR-2058-and-3351-not-finished.patch, SOLR-2058.patch http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3E {quote} From Ron Mayer r...@0ape.com ... my results might be even better if I had a couple different pf2s with different ps's at the same time. In particular: one with ps=0 to put a high boost on ones that have the right ordering of words. For example, ensuring that [the query]: red hat black jacket boosts only documents with red hats and not black hats. And another pf2 with a more modest boost with ps=5 or so to handle the query above, also boosting docs with red baseball hat. {quote} [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3E] {quote} From Yonik Seeley yo...@lucidimagination.com Perhaps fold it into the pf/pf2 syntax? pf=text^2 // current syntax...
makes phrases with a boost of 2 pf=text~1^2 // proposed syntax: makes phrases with a slop of 1 and a boost of 2 That actually seems pretty natural given the Lucene query syntax - an actual boosted sloppy phrase query already looks like {{text:foo bar~1^2}} -Yonik {quote} [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3E] {quote} From Chris Hostetter hossman_luc...@fucit.org Big +1 to this idea ... the existing ps param can stick around as the default for any field that doesn't specify its own slop in the pf/pf2/pf3 fields using the ~ syntax. -Hoss {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
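The proposed field~slop^boost syntax can be sketched with a small parser. This is a hypothetical illustration of how such a pf entry decomposes, not the actual Solr parsing code; the regex and the function name are assumptions:

```python
import re

def parse_pf(spec):
    """Split a pf entry of the form field~slop^boost into its parts.

    Both slop and boost are optional; slop defaults to 0 and boost to 1.0,
    mirroring the proposal in this thread (plain pf=field still parses).
    """
    m = re.fullmatch(r'([^~^]+)(?:~(\d+))?(?:\^([\d.]+))?', spec)
    field, slop, boost = m.group(1), m.group(2), m.group(3)
    return field, int(slop) if slop else 0, float(boost) if boost else 1.0

print(parse_pf("text~1^2"))  # ('text', 1, 2.0) - slop 1, boost 2
print(parse_pf("text^2"))    # ('text', 0, 2.0) - current syntax still works
```

Note how the ~ for slop and ^ for boost line up with Lucene's own query syntax for boosted sloppy phrases, which is what makes the proposal feel natural.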
[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable
[ https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473791#comment-13473791 ] David Smiley commented on SOLR-3221: Greg, I heard you intend to add a small patch to flip Solr 4's default on this feature? I was picking on Erick since forever and he pawned it off on you. Make Shard handler threadpool configurable -- Key: SOLR-3221 URL: https://issues.apache.org/jira/browse/SOLR-3221 Project: Solr Issue Type: Improvement Affects Versions: 3.6, 4.0-ALPHA Reporter: Greg Bowyer Assignee: Erick Erickson Labels: distributed, http, shard Fix For: 3.6, 4.0-ALPHA Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch From profiling of monitor contention, as well as observations of the 95th and 99th percentile response times for nodes that perform distributed search (or "aggregator" nodes), it would appear that the HttpShardHandler code currently does a suboptimal job of managing outgoing shard-level requests. Presently the code contained within Lucene 3.5's SearchHandler and Lucene trunk / 3x's ShardHandlerFactory creates arbitrary threads in order to service distributed search requests. This is done to limit the size of the threadpool such that it does not consume resources in deployment configurations that do not use distributed search. This unfortunately has two impacts on the response time if the node coordinating the distribution is under high load. The usage of the MaxConnectionsPerHost configuration option results in aggressive activity on semaphores within HttpCommons; it has been observed that the aggregator can have a response time far greater than that of the searchers. 
The above monitor contention would appear to suggest that in some cases it's possible for liveness issues to occur and for simple queries to be starved of resources simply due to a lack of attention from the viewpoint of context switching, with, as mentioned above, the HttpCommons connections being hotly contended. The fair, queue-based configuration eliminates this, at the cost of throughput. This patch aims to make the threadpool largely configurable, allowing those using Solr to choose the throughput vs. latency balance they desire. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
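For context, the configurable shard-handler pool described in this issue is expressed in solrconfig.xml roughly as follows in Solr 4. This is a sketch based on the later reference documentation; parameter names and defaults may differ from individual patch revisions attached here:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- Per-handler shard handler with an explicitly bounded thread pool -->
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <int name="maxConnectionsPerHost">20</int>
    <int name="corePoolSize">0</int>         <!-- threads kept alive while idle -->
    <int name="maximumPoolSize">64</int>     <!-- cap on outgoing request threads -->
    <int name="maxThreadIdleTime">5</int>    <!-- seconds before an idle thread is reaped -->
    <int name="sizeOfQueue">-1</int>         <!-- -1: direct hand-off; >0: bounded queue -->
    <bool name="fairnessPolicy">false</bool> <!-- true favors latency fairness over throughput -->
  </shardHandlerFactory>
</requestHandler>
```

The sizeOfQueue and fairnessPolicy knobs correspond to the throughput-vs-latency trade-off the description mentions: a fair, queued pool smooths latency spikes at some cost in raw throughput.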
[jira] [Created] (LUCENE-4475) eDismax boost on multiValued fields
Bill Bell created LUCENE-4475: - Summary: eDismax boost on multiValued fields Key: LUCENE-4475 URL: https://issues.apache.org/jira/browse/LUCENE-4475 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.0 Reporter: Bill Bell We want to replace bq with boost, but we get the multi-valued field issue when we try to do the equivalent queries: HTTP ERROR 400 Problem accessing /solr/providersearch/select. Reason: can not use FieldCache on multivalued field: specialties_ids q=*:*&bq=multi_field:87^2&defType=dismax How do you do this using boost? q=*:*&boost=multi_field:87&defType=edismax -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3930) eDismax Multivalued boost
Bill Bell created SOLR-3930: --- Summary: eDismax Multivalued boost Key: SOLR-3930 URL: https://issues.apache.org/jira/browse/SOLR-3930 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: Bill Bell We want to replace bq with boost, but we get the multi-valued field issue when we try to do the equivalent queries: HTTP ERROR 400 Problem accessing /solr/providersearch/select. Reason: can not use FieldCache on multivalued field: specialties_ids q=*:*&bq=multi_field:87^2&defType=dismax How do you do this using boost? q=*:*&boost=multi_field:87&defType=edismax We know we can use bq with edismax, but we like the multiply feature of boost. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
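The "multiply feature" mentioned here is the key difference between the two parameters: bq adds the boost query's score on top of the main score, while the edismax boost parameter multiplies the main score by a function's value. A toy arithmetic sketch of the difference (the numbers are made up for illustration):

```python
def with_bq(main_score, bq_score):
    # bq: the boost query's score is added on top of the main score.
    return main_score + bq_score

def with_boost(main_score, boost_value):
    # boost: the main score is multiplied by the boost function's value.
    return main_score * boost_value

# Two docs with different main scores, each getting a boost of 2.0:
print(with_bq(1.0, 2.0), with_bq(4.0, 2.0))        # 3.0 6.0 - additive shift
print(with_boost(1.0, 2.0), with_boost(4.0, 2.0))  # 2.0 8.0 - proportional scaling
```

The multiplicative form preserves the relative ordering produced by the main query while scaling it, which is why it is often preferred over the additive bq.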
[jira] [Closed] (LUCENE-4475) eDismax boost on multiValued fields
[ https://issues.apache.org/jira/browse/LUCENE-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell closed LUCENE-4475. - Resolution: Fixed eDismax boost on multiValued fields --- Key: LUCENE-4475 URL: https://issues.apache.org/jira/browse/LUCENE-4475 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.0 Reporter: Bill Bell We want to replace bq with boost, but we get the multi-valued field issue when we try to do the equivalent queries: HTTP ERROR 400 Problem accessing /solr/providersearch/select. Reason: can not use FieldCache on multivalued field: specialties_ids q=*:*&bq=multi_field:87^2&defType=dismax How do you do this using boost? q=*:*&boost=multi_field:87&defType=edismax -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3931) Turn off coord() factor for scoring
Bill Bell created SOLR-3931: --- Summary: Turn off coord() factor for scoring Key: SOLR-3931 URL: https://issues.apache.org/jira/browse/SOLR-3931 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: Bill Bell We would like to remove the coordination factor from scoring. For small fields (like the name of a doctor), we do not want to score higher when the same term appears in the field more than once. That makes sense for books, not so much for formal names. /solr/select?q=*:*&coordFactor=false Default is true. (Note: we might want to make each of these optional - tf, idf, coord, queryNorm.) coord(q,d) is a score factor based on how many of the query terms are found in the specified document. Typically, a document that contains more of the query's terms will receive a higher score than another document with fewer query terms. This is a search-time factor computed in coord(q,d) by the Similarity in effect at search time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
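In Lucene's DefaultSimilarity, coord(q,d) is simply the fraction of query terms that match the document, so "turning it off" means replacing that fraction with 1.0 (in Lucene 4 Java code this corresponds to overriding Similarity's coord(int overlap, int maxOverlap) to return 1.0f). A toy sketch of the effect, not the Lucene code itself; the function names here are illustrative:

```python
def coord(overlap, max_overlap):
    # DefaultSimilarity-style coord: fraction of query terms present in the doc.
    return overlap / max_overlap

def doc_score(term_scores, num_query_terms, use_coord=True):
    # Toy disjunction score: sum of per-term scores, optionally scaled by coord.
    total = sum(term_scores)
    if use_coord:
        total *= coord(len(term_scores), num_query_terms)
    return total

# Three-term query; the doc matches two of the terms at 1.0 each:
print(doc_score([1.0, 1.0], 3))                   # 2.0 * 2/3 = 1.333...
print(doc_score([1.0, 1.0], 3, use_coord=False))  # 2.0 - coord disabled
```

With coord disabled, documents matching fewer query terms are no longer penalized relative to documents matching more of them, which is the behavior requested here for short name fields.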