[jira] Resolved: (LUCENE-1540) Improvements to contrib.benchmark for TREC collections

2011-02-06 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen resolved LUCENE-1540. - Resolution: Fixed ok no new failures, closing as fixed, Thanks Shai and Robert for your help her

[jira] Commented: (SOLR-1395) Integrate Katta

2011-02-06 Thread JohnWu (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991279#comment-12991279 ] JohnWu commented on SOLR-1395: -- TomLiu: as you said:QueryComponent returns DocSlice, but XMLW

[jira] Assigned: (LUCENE-2909) NGramTokenFilter may generate offsets that exceed the length of original text

2011-02-06 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reassigned LUCENE-2909: -- Assignee: Koji Sekiguchi > NGramTokenFilter may generate offsets that exceed the lengt

[jira] Updated: (LUCENE-2909) NGramTokenFilter may generate offsets that exceed the length of original text

2011-02-06 Thread Shinya Kasatani (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinya Kasatani updated LUCENE-2909: Attachment: TokenFilterOffset.patch The patch that fixes the problem, including tests. >

[jira] Created: (LUCENE-2909) NGramTokenFilter may generate offsets that exceed the length of original text

2011-02-06 Thread Shinya Kasatani (JIRA)
NGramTokenFilter may generate offsets that exceed the length of original text - Key: LUCENE-2909 URL: https://issues.apache.org/jira/browse/LUCENE-2909 Project: Lucene - Java

Keyword - search statistics

2011-02-06 Thread Selvaraj Varadharajan
Hi Is there any way i can get 'no of times' a key word searched in SOLR ? *Here is my solr package details* Solr Specification Version: 1.4.0 Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40 Lucene Specification Version: 2.9.1 Lucene Implement

Re: Distributed Indexing

2011-02-06 Thread Alex Cowell
Hey, We're making good progress, but our DistributedUpdateRequestHandler is having a bit of an identity crisis, so we thought we'd ask what other people's opinions are. The current situation is as follows: We've added a method to ContentStreamHandlerBase to check if an update request is distribut

[jira] Issue Comment Edited: (SOLR-2341) Shard distribution policy

2011-02-06 Thread William Mayor (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991225#comment-12991225 ] William Mayor edited comment on SOLR-2341 at 2/6/11 10:00 PM: --

RE: Arabic Analyzer

2011-02-06 Thread Digy
Here is a port of lucene.java's arabic analyzer ( https://issues.apache.org/jira/browse/LUCENENET-392 ) You can safely remove nunit dependency and test cases from the project. DIGY -Original Message- From: Ben Foster [mailto:b...@planetcloud.co.uk] Sent: Sunday, February 06, 2011 5:47 P

Re: Distributed Indexing

2011-02-06 Thread William Mayor
Hi Good call about the policies being deterministic, should've thought of that earlier. We've changed the patch to include this and I've removed the random assignment one (for obvious reasons). Take a look and let me know what's to do. ( https://issues.apache.org/jira/browse/SOLR-2341) Cheers

[jira] Updated: (SOLR-2341) Shard distribution policy

2011-02-06 Thread William Mayor (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] William Mayor updated SOLR-2341: Attachment: SOLR-2341.patch This patch makes the implemented policy deterministic. This is missing f

[jira] Commented: (LUCENE-1540) Improvements to contrib.benchmark for TREC collections

2011-02-06 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991224#comment-12991224 ] Doron Cohen commented on LUCENE-1540: - Following suggestions by Robert, brought back

[jira] Commented: (LUCENE-2903) Improvement of PForDelta Codec

2011-02-06 Thread hao yan (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991221#comment-12991221 ] hao yan commented on LUCENE-2903: - Hi, Paul I tested ByteBuffer->IntBuffer, it is not fa

[jira] Commented: (LUCENE-2903) Improvement of PForDelta Codec

2011-02-06 Thread hao yan (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991222#comment-12991222 ] hao yan commented on LUCENE-2903: - And it sure complicates the PForDelta algorithm a lot b

[jira] Commented: (LUCENE-2903) Improvement of PForDelta Codec

2011-02-06 Thread hao yan (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991220#comment-12991220 ] hao yan commented on LUCENE-2903: - HI, Michael Did u try FrameOfRef and PatchedFrameOfRe

Re: svn commit: r1067699 - in /lucene/dev/branches/branch_3x/lucene/contrib/benchmark/src: java/org/apache/lucene/benchmark/byTask/feeds/TrecDocParser.java test/org/apache/lucene/benchmark/byTask/feed

2011-02-06 Thread Doron Cohen
Interesting... Thanks Robert for pointing this out! > "To obtain correct results for locale insensitive strings, use toUpperCase(Locale.ENGLISH)" Actually this is one of the things I tried and did solve it - with toUpperCase(Locale.US) - not exactly Locale.ENGLISH but quite similar I assume - an

[jira] Commented: (LUCENE-1540) Improvements to contrib.benchmark for TREC collections

2011-02-06 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991210#comment-12991210 ] Doron Cohen commented on LUCENE-1540: - bq. I wish we knew of a good solution, because

[jira] Resolved: (LUCENE-2609) Generate jar containing test classes.

2011-02-06 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2609. Resolution: Fixed Committed revision 1067738. Thanks all for your comments and help ! > Generate

[jira] Commented: (LUCENE-2907) automaton termsenum bug when running with multithreaded search

2011-02-06 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991198#comment-12991198 ] Uwe Schindler commented on LUCENE-2907: --- Thanks, really nice now :-) > automaton t

[jira] Resolved: (LUCENE-2907) automaton termsenum bug when running with multithreaded search

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2907. - Resolution: Fixed Assignee: Robert Muir Committed revision 1067720. > automaton termsenum

[jira] Commented: (SOLR-2256) CommonsHttpSolrServer.deleteById(emptyList) causes SolrException: missing_content_stream

2011-02-06 Thread Stevo Slavic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991191#comment-12991191 ] Stevo Slavic commented on SOLR-2256: I've experienced similar behavior with SolrJ 1.4.1

[jira] Commented: (LUCENE-2908) clean up serialization in the codebase

2011-02-06 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991187#comment-12991187 ] Uwe Schindler commented on LUCENE-2908: --- +1 > clean up serialization in the codeba

[jira] Commented: (LUCENE-2906) Filter to process output of ICUTokenizer and create overlapping bigrams for CJK

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991186#comment-12991186 ] Robert Muir commented on LUCENE-2906: - {quote} How will this differ from the SmartChi

[jira] Commented: (LUCENE-1540) Improvements to contrib.benchmark for TREC collections

2011-02-06 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991181#comment-12991181 ] Doron Cohen commented on LUCENE-1540: - Fix for the locale issue merged to trunk at r1

[jira] Commented: (LUCENE-2907) automaton termsenum bug when running with multithreaded search

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991180#comment-12991180 ] Robert Muir commented on LUCENE-2907: - bq. I am not sure if CompiledAutomation is a g

[jira] Commented: (LUCENE-1540) Improvements to contrib.benchmark for TREC collections

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991179#comment-12991179 ] Robert Muir commented on LUCENE-1540: - Hi Doron, about the test random seeds: It is

[jira] Commented: (LUCENE-2907) automaton termsenum bug when running with multithreaded search

2011-02-06 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991178#comment-12991178 ] Simon Willnauer commented on LUCENE-2907: - patch looks good - just being super pi

[jira] Commented: (LUCENE-2908) clean up serialization in the codebase

2011-02-06 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991177#comment-12991177 ] Simon Willnauer commented on LUCENE-2908: - big +1 to get rid of Serializable its

[jira] Commented: (LUCENE-1540) Improvements to contrib.benchmark for TREC collections

2011-02-06 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991176#comment-12991176 ] Doron Cohen commented on LUCENE-1540: - I am able to reproduce this on Linux. The test

[jira] Updated: (LUCENE-2907) automaton termsenum bug when running with multithreaded search

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2907: Attachment: LUCENE-2907.patch here's the same patch, but cleaned up a bit (e.g. making some things

[jira] Commented: (LUCENE-2894) Use of google-code-prettify for Lucene/Solr Javadoc

2011-02-06 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991173#comment-12991173 ] Steven Rowe commented on LUCENE-2894: - Both of the nightly Hudson Maven builds failed

Re: svn commit: r1067699 - in /lucene/dev/branches/branch_3x/lucene/contrib/benchmark/src: java/org/apache/lucene/benchmark/byTask/feeds/TrecDocParser.java test/org/apache/lucene/benchmark/byTask/feed

2011-02-06 Thread Robert Muir
Thanks for catching this Doron. Another option if you want to keep the case-insensitive feature here would be to use toUpperCase(Locale.ENGLISH) It might look bad, but its actually recommended by the JDK for locale-insensitive strings: http://download.oracle.com/javase/6/docs/api/java/lang/String.
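
The locale point in this message can be illustrated with a minimal sketch. It assumes a Turkish default locale as the failing case (the thread does not say which locale triggered the test failure), and the class and helper names are hypothetical, not taken from TrecDocParser:

```java
import java.util.Locale;

public class LocaleUpperCaseDemo {

    // Hypothetical helper: upper-cases a locale-insensitive string (e.g. a tag
    // or type name) using a fixed locale instead of the JVM default.
    public static String upperInsensitive(String s) {
        return s.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        String tag = "title";

        // Assumed failing case: under Turkish rules, 'i' upper-cases to the
        // dotted capital I (U+0130), so the result is not "TITLE".
        Locale turkish = Locale.forLanguageTag("tr-TR");
        System.out.println(tag.toUpperCase(turkish).equals("TITLE"));

        // With a fixed locale the result does not depend on the JVM default.
        System.out.println(upperInsensitive(tag).equals("TITLE"));
    }
}
```

A no-argument toUpperCase() uses the JVM default locale, which is why a test can pass on one machine and fail on another; passing Locale.ENGLISH (or Locale.US, as Doron mentions) removes that dependency.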

[jira] Commented: (LUCENE-1799) Unicode compression

2011-02-06 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991170#comment-12991170 ] DM Smith commented on LUCENE-1799: -- Any idea as to when this will be released? > Unicod

[jira] Commented: (LUCENE-2906) Filter to process output of ICUTokenizer and create overlapping bigrams for CJK

2011-02-06 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991169#comment-12991169 ] DM Smith commented on LUCENE-2906: -- Two questions: How will this differ from the SmartCh

[jira] Created: (LUCENE-2908) clean up serialization in the codebase

2011-02-06 Thread Robert Muir (JIRA)
clean up serialization in the codebase -- Key: LUCENE-2908 URL: https://issues.apache.org/jira/browse/LUCENE-2908 Project: Lucene - Java Issue Type: Task Reporter: Robert Muir Fix

[jira] Updated: (LUCENE-2908) clean up serialization in the codebase

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2908: Attachment: LUCENE-2908.patch attached is a patch. all tests pass. > clean up serialization in th

[jira] Commented: (LUCENE-2894) Use of google-code-prettify for Lucene/Solr Javadoc

2011-02-06 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991165#comment-12991165 ] Koji Sekiguchi commented on LUCENE-2894: On my mac, there is prettify correctly u

[jira] Reopened: (LUCENE-2894) Use of google-code-prettify for Lucene/Solr Javadoc

2011-02-06 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi reopened LUCENE-2894: Reopening the issue. Lucene javadoc on hudson looks fine (syntax highlighting works correctly

[HUDSON-MAVEN] Lucene-Solr-Maven-trunk #17: POMs out of sync

2011-02-06 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-trunk/17/ No tests ran. Build Log (for compile errors): [...truncated 7757 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional comm

[HUDSON] Lucene-Solr-tests-only-3.x - Build # 4561 - Failure

2011-02-06 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/4561/ 1 tests failed. REGRESSION: org.apache.solr.client.solrj.TestLBHttpSolrServer.testReliability Error Message: No live SolrServers available to handle this request Stack Trace: org.apache.solr.client.solrj.SolrServerExce

[jira] Updated: (LUCENE-2906) Filter to process output of ICUTokenizer and create overlapping bigrams for CJK

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2906: Attachment: LUCENE-2906.patch here's a patch going in a slightly different direction (though we ca

Re: [HUDSON] Lucene-Solr-tests-only-3.x - Build # 4555 - Failure

2011-02-06 Thread Doron Cohen
checking... On Sun, Feb 6, 2011 at 2:19 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > I think this is happening because of LUCENE-1540... > > Mike > > On Sun, Feb 6, 2011 at 5:25 AM, Apache Hudson Server > wrote: > > Build: > https://hudson.apache.org/hudson/job/Lucene-Solr-tests-

[jira] Updated: (LUCENE-2907) automaton termsenum bug when running with multithreaded search

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2907: Attachment: LUCENE-2907.patch attached is a patch. I removed all the transient/synchronized stuff

[jira] Updated: (LUCENE-2907) automaton termsenum bug when running with multithreaded search

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2907: Summary: automaton termsenum bug when running with multithreaded search (was: termsenum bug when

Re: [HUDSON] Lucene-Solr-tests-only-3.x - Build # 4555 - Failure

2011-02-06 Thread Michael McCandless
I think this is happening because of LUCENE-1540... Mike On Sun, Feb 6, 2011 at 5:25 AM, Apache Hudson Server wrote: > Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/4555/ > > 1 tests failed. > REGRESSION:   > org.apache.lucene.benchmark.byTask.feeds.TrecContentSourceTest

[jira] Commented: (LUCENE-1540) Improvements to contrib.benchmark for TREC collections

2011-02-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991145#comment-12991145 ] Michael McCandless commented on LUCENE-1540: I think this commit has caused a

[HUDSON-MAVEN] Lucene-Solr-Maven-3.x #16: POMs out of sync

2011-02-06 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-3.x/16/ No tests ran. Build Log (for compile errors): [...truncated 8390 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional comman

[HUDSON] Lucene-Solr-tests-only-3.x - Build # 4555 - Failure

2011-02-06 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/4555/ 1 tests failed. REGRESSION: org.apache.lucene.benchmark.byTask.feeds.TrecContentSourceTest.testTrecFeedDirAllTypes Error Message: expected: but was: Stack Trace: at org.apache.lucene.benchmark.byTask.feeds.Tr

[jira] Commented: (LUCENE-2907) termsenum bug when running with multithreaded search

2011-02-06 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991141#comment-12991141 ] Uwe Schindler commented on LUCENE-2907: --- Yes the numbered states cache was always b

[jira] Commented: (LUCENE-2907) termsenum bug when running with multithreaded search

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991140#comment-12991140 ] Robert Muir commented on LUCENE-2907: - in combination with other things. in my opinio

[jira] Commented: (LUCENE-2907) termsenum bug when running with multithreaded search

2011-02-06 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991137#comment-12991137 ] Uwe Schindler commented on LUCENE-2907: --- A bug in automaton that only happens in mu

[jira] Commented: (LUCENE-2907) termsenum bug when running with multithreaded search

2011-02-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991133#comment-12991133 ] Robert Muir commented on LUCENE-2907: - bq. Have you found out what happens or where a

[jira] Commented: (LUCENE-2609) Generate jar containing test classes.

2011-02-06 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991132#comment-12991132 ] Shai Erera commented on LUCENE-2609: Thanks Steven ! Committed revision 1067623 (3x)

[jira] Commented: (LUCENE-2907) termsenum bug when running with multithreaded search

2011-02-06 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991130#comment-12991130 ] Uwe Schindler commented on LUCENE-2907: --- Have you found out what happens or where a