[jira] [Commented] (SOLR-3433) binary field returns differently when do the distribute search

2012-05-25 Thread Sami Siren (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283161#comment-13283161
 ] 

Sami Siren commented on SOLR-3433:
--

It seems this has been fixed in trunk by SOLR-3035. Alex, can you please give me 
some more details on how you tested this and which versions you used; in 
particular, did you still see this happen on trunk?

 binary field returns differently when do the distribute search
 --

 Key: SOLR-3433
 URL: https://issues.apache.org/jira/browse/SOLR-3433
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.5, 3.6, 4.0
 Environment: linux, amazon ec2
Reporter: Alex Liu

 When multiple nodes are installed (more than one node), repeated searches 
 through Solr return the binary data differently each time:
 <lst name="responseHeader"><int name="status">0</int><int 
 name="QTime">26</int><lst name="params"><str 
 name="q">text_col:woodman</str></lst></lst><result name="response" 
 numFound="1" start="0" maxScore="0.13258252"><doc><str 
 name="binary_col">[B:[B@714fef9f</str>
 <lst name="responseHeader"><int name="status">0</int><int 
 name="QTime">11</int><lst name="params"><str 
 name="q">text_col:woodman</str></lst></lst><result name="response" 
 numFound="1" start="0" maxScore="0.13258252"><doc><str 
 name="binary_col">[B:[B@4be22114</str>
 See this link; someone reported the same issue: 
 http://grokbase.com/t/lucene/solr-user/11beyhmxjw/distributed-search-and-binary-fields-w-solr-3-4
 It works on a single node but fails with multiple nodes; it appears to be 
 related to distributed search.
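The "[B:[B@714fef9f" values in the response are the default Object.toString() of a Java byte[], which prints the object's identity hash rather than its contents, so every node (and every request) yields a different string. A minimal sketch of the symptom, with a stable content-based encoding for contrast (Base64 via the Java 8 java.util.Base64 class is an illustrative choice here, not necessarily what Solr's response writer uses):

```java
import java.util.Arrays;
import java.util.Base64;

public class BinaryFieldToString {

    // Stable, content-based text form of a byte[] (illustrative choice).
    public static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] payload = {0x77, 0x6f, 0x6f, 0x64}; // the bytes of "wood"

        // Arrays inherit Object.toString(): "[B@" + identity hash code.
        // The value depends on the JVM object, not on the bytes, which
        // matches the differing "[B@..." strings in the responses above.
        System.out.println(payload.toString());

        // A content-based encoding is identical wherever it is computed:
        String encoded = encode(payload);
        System.out.println(encoded);

        // Round trip restores the original bytes.
        byte[] decoded = Base64.getDecoder().decode(encoded);
        System.out.println(Arrays.equals(payload, decoded));
    }
}
```

Two distributed shards serializing the same byte[] through toString() therefore can never agree, while any content-based encoding would.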

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-3487) XMLResponseParser does not handle named lists in doc fields

2012-05-25 Thread Sami Siren (JIRA)
Sami Siren created SOLR-3487:


 Summary: XMLResponseParser does not handle named lists in doc 
fields
 Key: SOLR-3487
 URL: https://issues.apache.org/jira/browse/SOLR-3487
 Project: Solr
  Issue Type: Bug
Reporter: Sami Siren
Priority: Minor
 Fix For: 4.0


For example, when one uses XML and specifies fl to contain [explain style=nl], 
the parser currently cannot handle the response.

I also noticed that the example tests are not run with XML (that would have 
caught this earlier).
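To make concrete what a named list inside a doc field looks like on the wire, here is a small self-contained sketch (plain StAX, not the actual SolrJ XMLResponseParser; field names are hypothetical) that parses a <doc> whose fields include a nested <lst>, the shape that fl=[explain style=nl] produces:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class NamedListInDoc {

    // Parse a <doc> element whose children may include nested <lst>
    // (named list) fields, mapping each name attribute to its value.
    public static Map<String, Object> parseDoc(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            r.nextTag(); // advance to the opening <doc>
            return readContainer(r);
        } catch (XMLStreamException e) {
            throw new RuntimeException(e);
        }
    }

    // Read children of the current container until its end tag; a <lst>
    // child recurses, any other child is read as its text content.
    private static Map<String, Object> readContainer(XMLStreamReader r)
            throws XMLStreamException {
        Map<String, Object> out = new LinkedHashMap<String, Object>();
        while (r.nextTag() == XMLStreamConstants.START_ELEMENT) {
            String name = r.getAttributeValue(null, "name");
            if ("lst".equals(r.getLocalName())) {
                out.put(name, readContainer(r)); // nested named list
            } else {
                out.put(name, r.getElementText()); // leaf: <str>, <float>, ...
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<doc><str name=\"id\">1</str>"
                + "<lst name=\"[explain]\"><str name=\"match\">true</str>"
                + "<float name=\"value\">0.5</float></lst></doc>";
        System.out.println(parseDoc(xml));
    }
}
```

A parser that only expects flat leaf fields inside <doc> fails exactly on the recursive <lst> case this sketch handles.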







[jira] [Updated] (SOLR-3487) XMLResponseParser does not handle named lists in doc fields

2012-05-25 Thread Sami Siren (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-3487:
-

Attachment: SOLR-3487.patch

Here's a proposed fix. I also added a test class that runs the example tests 
using the XML format.

I will commit shortly unless someone stops me...

 XMLResponseParser does not handle named lists in doc fields
 ---

 Key: SOLR-3487
 URL: https://issues.apache.org/jira/browse/SOLR-3487
 Project: Solr
  Issue Type: Bug
Reporter: Sami Siren
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3487.patch









[jira] [Commented] (SOLR-3433) binary field returns differently when do the distribute search

2012-05-25 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283178#comment-13283178
 ] 

Alex Liu commented on SOLR-3433:


Sami, I think SOLR-3035 fixed the issue for a single node; this ticket is only 
about multiple nodes. To reproduce it, set up a three-node cluster, upload 
solrconfig.xml and schema.xml with binary fields plus some test data, and then 
search on any node.

 binary field returns differently when do the distribute search
 --

 Key: SOLR-3433
 URL: https://issues.apache.org/jira/browse/SOLR-3433
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.5, 3.6, 4.0
 Environment: linux, amazon ec2
Reporter: Alex Liu








Re: Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #175

2012-05-25 Thread Dawid Weiss
Let me know if you need any help (or if you have questions) with regard to the
new test infrastructure. I am busy with other things at the moment and
there are rough edges... I plan to jump into it again once we ship a
release (I don't know when that is going to happen at the moment).

Dawid

On Thu, May 24, 2012 at 11:59 PM, Mark Miller markrmil...@gmail.com wrote:
 Just noticed this seems to happen fairly frequently in the Java 7 Windows 
 build, but I don't seem to see it in the Java 6 Windows build.

 I'll try to use Java 7 on my Windows machine when I get a chance - it should 
 make it easier to experiment with fixes if I can get the same results locally.





[jira] [Updated] (SOLR-3487) XMLResponseParser does not handle named lists in doc fields

2012-05-25 Thread Sami Siren (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-3487:
-

Component/s: clients - java

 XMLResponseParser does not handle named lists in doc fields
 ---

 Key: SOLR-3487
 URL: https://issues.apache.org/jira/browse/SOLR-3487
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Reporter: Sami Siren
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3487.patch









[jira] [Commented] (SOLR-3433) binary field returns differently when do the distribute search

2012-05-25 Thread Sami Siren (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283191#comment-13283191
 ] 

Sami Siren commented on SOLR-3433:
--

From what I understand, SOLR-3035 was not only about a single node. I also did 
some tests with multiple shards and did not see this problem on trunk. Perhaps 
I am missing something important. Could you provide a test case that 
demonstrates the problem on trunk?

 binary field returns differently when do the distribute search
 --

 Key: SOLR-3433
 URL: https://issues.apache.org/jira/browse/SOLR-3433
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.5, 3.6, 4.0
 Environment: linux, amazon ec2
Reporter: Alex Liu








Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #119

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/119/

--
[...truncated 14684 lines...]
   [junit4]   2> 16087 T3282 oass.SolrIndexSearcher.init WARNING WARNING: 
Directory impl does not support setting indexDir: 
org.apache.lucene.store.MockDirectoryWrapper
   [junit4]   2> 16088 T3282 oasu.CommitTracker.init Hard AutoCommit: disabled
   [junit4]   2> 16088 T3282 oasu.CommitTracker.init Soft AutoCommit: disabled
   [junit4]   2> 16088 T3282 oashc.SpellCheckComponent.inform Initializing 
spell checkers
   [junit4]   2> 16096 T3282 oass.DirectSolrSpellChecker.init init: 
{name=direct,classname=DirectSolrSpellChecker,field=lowerfilt,minQueryLength=3}
   [junit4]   2> 16142 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
socketTimeout to: 0
   [junit4]   2> 16142 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
urlScheme to: http://
   [junit4]   2> 16142 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
connTimeout to: 0
   [junit4]   2> 16142 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
maxConnectionsPerHost to: 20
   [junit4]   2> 16143 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
corePoolSize to: 0
   [junit4]   2> 16143 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
maximumPoolSize to: 2147483647
   [junit4]   2> 16143 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
maxThreadIdleTime to: 5
   [junit4]   2> 16144 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
sizeOfQueue to: -1
   [junit4]   2> 16144 T3282 oashc.HttpShardHandlerFactory.getParameter Setting 
fairnessPolicy to: false
   [junit4]   2> 16154 T3285 oasc.SolrCore.registerSearcher [collection1] 
Registered new searcher Searcher@3d102489 
main{StandardDirectoryReader(segments_1:1)}
   [junit4]   2> 16154 T3282 oasc.CoreContainer.register registering core: 
collection1
   [junit4]   2> 16155 T3282 oasu.AbstractSolrTestCase.setUp SETUP_END 
testSoftAndHardCommitMaxTimeDelete
   [junit4]   2> 16156 T3282 oasu.AbstractSolrTestCase.tearDown 
TEARDOWN_START testSoftAndHardCommitMaxTimeDelete
   [junit4]   2> 16156 T3282 oasc.CoreContainer.shutdown Shutting down 
CoreContainer instance=883130242
   [junit4]   2> 16156 T3282 oasc.SolrCore.close [collection1]  CLOSING 
SolrCore org.apache.solr.core.SolrCore@5a084acd
   [junit4]   2> 16160 T3282 oasc.SolrCore.closeSearcher [collection1] Closing 
main searcher on request.
   [junit4]   2> 16160 T3282 oasu.DirectUpdateHandler2.close closing 
DirectUpdateHandler2{commits=0,autocommits=0,soft 
autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
   [junit4]   2>
   [junit4] Completed in 3.27s, 3 tests, 3 skipped
   [junit4]  
   [junit4] Suite: 
org.apache.solr.handler.component.DistributedTermsComponentTest
   [junit4] Completed in 14.90s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.TestGroupingSearch
   [junit4] Completed in 6.62s, 12 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.component.SpellCheckComponentTest
   [junit4] Completed in 12.90s, 9 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.TestMultiCoreConfBootstrap
   [junit4] Completed in 5.82s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.request.SimpleFacetsTest
   [junit4] Completed in 9.93s, 29 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.update.DirectUpdateHandlerTest
   [junit4] Completed in 5.39s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.MoreLikeThisHandlerTest
   [junit4] Completed in 1.72s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestCoreContainer
   [junit4] Completed in 5.34s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.update.TestIndexingPerformance
   [junit4] Completed in 1.57s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.search.similarities.TestLMDirichletSimilarityFactory
   [junit4] Completed in 0.32s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.component.TermsComponentTest
   [junit4] Completed in 1.65s, 13 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.SortByFunctionTest
   [junit4] Completed in 3.15s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterTSTTest
   [junit4] Completed in 2.36s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterWFSTTest
   [junit4] Completed in 2.30s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestFoldingMultitermQuery
   [junit4] Completed in 2.06s, 18 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.CurrencyFieldTest
   [junit4] IGNORED 0.00s | CurrencyFieldTest.testPerformance
   [junit4] Cause: Annotated @Ignore()
   [junit4] Completed in 1.99s, 8 tests, 1 skipped
   [junit4]  
   [junit4] Suite: 

Re: Welcome Simon Svensson as a new committer

2012-05-25 Thread zoolette
Welcome, Simon!

2012/5/24 Prescott Nasser geobmx...@hotmail.com





  Hey All, our roster is growing a bit; I'd like to welcome Simon as a new
 committer. Simon has been quite active on the user mailing list helping
 answer community questions. He also maintains a C# port of the
 lucene-hunspell project (Java: http://code.google.com/p/lucene-hunspell/,
 Simon's C# port: https://github.com/sisve/Lucene.Net.Analysis.Hunspell),
 which is commonly used for spell checking (but has a wide array of
 purposes). Please join me in welcoming Simon to the team. ~Prescott



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14323 - Failure

2012-05-25 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14323/

1 tests failed.
REGRESSION:  
org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDistributed

Error Message:
Server at http://localhost:18961/example/core0 returned non ok status:500, 
message:Server Error

Stack Trace:
org.apache.solr.common.SolrException: Server at 
http://localhost:18961/example/core0 returned non ok status:500, message:Server 
Error
at 
__randomizedtesting.SeedInfo.seed([7A452B0B4F6909F6:15779CE33ED15799]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:403)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:209)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at 
org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDistributed(MultiCoreExampleJettyTest.java:117)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at 
org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at 
org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log (for compile errors):
[...truncated 11071 lines...]




[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2012-05-25 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2878:


Attachment: LUCENE-2878.patch

Hey Alan,

Great job, you are getting up to speed. I fixed that test case (the boolean 
one), since in the conjunction case you have to consume the conjunction 
positions/offsets, i.e. the intervals given by the term matches. I also fixed 
the license header in that file and brought the highlighter prototype test 
back. I will commit this to the branch now.

Wow, man, this makes me happy! Good job.
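The "intervals given by the term matches" that a conjunction must consume can be illustrated with a small standalone sketch (this is an illustration of the idea, not the branch's actual API): given the sorted position lists of each term in a document, emit for each alignment the interval spanning one position from every term, then advance the term at the smallest position.

```java
import java.util.ArrayList;
import java.util.List;

public class ConjunctionIntervals {

    // termPositions[t] is the sorted list of positions of term t in a doc.
    // Each emitted int[]{min, max} covers one position from every term;
    // after emitting, the term at the smallest position is advanced.
    public static List<int[]> intervals(int[][] termPositions) {
        int k = termPositions.length;
        List<int[]> out = new ArrayList<int[]>();
        if (k == 0) {
            return out; // no terms, no intervals
        }
        int[] idx = new int[k];
        while (true) {
            int min = Integer.MAX_VALUE, max = Integer.MIN_VALUE, argMin = -1;
            for (int t = 0; t < k; t++) {
                if (idx[t] >= termPositions[t].length) {
                    return out; // one term is exhausted: no more alignments
                }
                int p = termPositions[t][idx[t]];
                if (p < min) { min = p; argMin = t; }
                if (p > max) { max = p; }
            }
            out.add(new int[]{min, max});
            idx[argMin]++; // advance the leftmost term to find the next interval
        }
    }

    public static void main(String[] args) {
        // Term A at positions {1, 5}, term B at {3}.
        for (int[] iv : intervals(new int[][]{{1, 5}, {3}})) {
            System.out.println(iv[0] + ".." + iv[1]);
        }
    }
}
```

A scorer consuming these intervals in order is what lets a conjunction rank documents by term proximity, the capability the positions branch is after.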

 Allow Scorer to expose positions and payloads aka. nuke spans 
 --

 Key: LUCENE-2878
 URL: https://issues.apache.org/jira/browse/LUCENE-2878
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: Positions Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
 mentor
 Fix For: Positions Branch

 Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, 
 LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch


 Currently we have two somewhat separate types of queries: those that can 
 make use of positions (mainly spans) and payloads (spans). Yet Span*Query 
 doesn't really do scoring comparable to what other queries do, and at the 
 end of the day they duplicate a lot of code all over Lucene. Span*Queries 
 are also limited to other Span*Query instances, such that you cannot use a 
 TermQuery or a BooleanQuery with SpanNear or anything like that. 
 Besides the Span*Query limitation, other queries lack a quite interesting 
 feature: they cannot score based on term proximity, since scorers don't 
 expose any positional information. All those problems bugged me for a while, 
 so I started working on this using the bulk postings API. I would have done 
 the first cut on trunk, but TermScorer there works on a BlockReader that 
 does not expose positions, while the one in this branch does. I started 
 adding a new Positions class which users can pull from a scorer; to prevent 
 unnecessary positions enums I added ScorerContext#needsPositions and 
 eventually Scorer#needsPayloads to create the corresponding enum on demand. 
 Yet currently only TermQuery / TermScorer implements this API, and the 
 others simply return null instead. 
 To show that the API really works, and that our BulkPostings work fine with 
 positions too, I cut TermSpanQuery over to use a TermScorer under the hood 
 and nuked TermSpans entirely. A nice side effect of this was that the 
 positions bulk-reading implementation got some exercise, which now :) works 
 with positions, while payloads for bulk reading are kind of experimental in 
 the patch and only work with the Standard codec. 
 So all spans now work on top of TermScorer (I truly hate spans since today), 
 including the ones that need payloads (StandardCodec ONLY)!! I didn't bother 
 to implement the other codecs yet, since I want to get feedback on the API 
 and on this first cut before I go on with it. I will upload the 
 corresponding patch in a minute. 
 I also had to cut SpanQuery.getSpans(IR) over to 
 SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk 
 first, but after the pain today I need a break first :).
 The patch passes all core tests 
 (org.apache.lucene.search.highlight.HighlighterTest still fails, but I 
 haven't looked into the MemoryIndex BulkPostings API yet).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2012-05-25 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283217#comment-13283217
 ] 

Simon Willnauer commented on LUCENE-2878:
-

oh btw. All tests on the branch pass now :)


 Allow Scorer to expose positions and payloads aka. nuke spans 
 --

 Key: LUCENE-2878
 URL: https://issues.apache.org/jira/browse/LUCENE-2878
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: Positions Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
 mentor
 Fix For: Positions Branch

 Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, 
 PosHighlighter.patch









[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2012-05-25 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2878:


Attachment: LUCENE-2878.patch

I messed up the last patch - here is the actual patch.

 Allow Scorer to expose positions and payloads aka. nuke spans 
 --

 Key: LUCENE-2878
 URL: https://issues.apache.org/jira/browse/LUCENE-2878
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: Positions Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
 mentor
 Fix For: Positions Branch

 Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, 
 PosHighlighter.patch


 Currently we have two somewhat separate types of queries: the ones which can 
 make use of positions (mainly spans) and payloads (spans). Yet Span*Query 
 doesn't really do scoring comparable to what other queries do, and at the end 
 of the day they duplicate a lot of code all over Lucene. Span*Queries are 
 also limited to other Span*Query instances, such that you cannot use a 
 TermQuery or a BooleanQuery with SpanNear or anything like that. 
 Besides the Span*Query limitation, other queries lack a quite interesting 
 feature: they cannot score based on term proximity, since Scorers don't 
 expose any positional information. All those problems bugged me for a while 
 now, so I started working on this using the bulk postings API. I would have 
 done that first cut on trunk, but TermScorer there works on a BlockReader that 
 does not expose positions, while the one in this branch does. I started adding 
 a new Positions class which users can pull from a scorer; to prevent 
 unnecessary positions enums I added ScorerContext#needsPositions and eventually 
 Scorer#needsPayloads to create the corresponding enum on demand. Yet, 
 currently only TermQuery / TermScorer implements this API and others simply 
 return null instead. 
 To show that the API really works, and that our BulkPostings work fine with 
 positions too, I cut TermSpanQuery over to use a TermScorer under the hood and 
 nuked TermSpans entirely. A nice side effect of this was that the Position 
 BulkReading implementations got some exercise and now :) all work with 
 positions, while payloads for bulk reading are kind of experimental in the 
 patch and only work with the Standard codec. 
 So all spans now work on top of TermScorer (I truly hate spans since today), 
 including the ones that need payloads (StandardCodec ONLY)!! I didn't bother 
 to implement the other codecs yet since I want to get feedback on the API and 
 on this first cut before I go on with it. I will upload the corresponding 
 patch in a minute. 
 I also had to cut SpanQuery.getSpans(IR) over to 
 SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk 
 first, but after that pain today I need a break first :).
 The patch passes all core tests 
 (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't 
 look into the MemoryIndex BulkPostings API yet)
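As a rough illustration of the on-demand idea described above (the names here follow the description's needsPositions / positions-enum wording, but the actual branch API may differ):

```java
// Rough illustration only; not the actual positions-branch API. The point:
// a scorer only materializes a positions enum when the consumer declared
// up front (ScorerContext#needsPositions in the description) that it needs
// one, and returns null otherwise.
public class OnDemandPositions {

    // Stand-in for the positions enum a scorer can expose.
    interface PositionsEnum {
        int nextPosition(); // returns -1 when exhausted
    }

    static class TermScorerSketch {
        private final int[] positions;
        private final boolean needsPositions;

        TermScorerSketch(int[] positions, boolean needsPositions) {
            this.positions = positions;
            this.needsPositions = needsPositions;
        }

        // Null when positions were not requested, so no enum is created
        // unnecessarily; queries without positional info also return null.
        PositionsEnum positions() {
            if (!needsPositions) {
                return null;
            }
            return new PositionsEnum() {
                private int i = 0;
                public int nextPosition() {
                    return i < positions.length ? positions[i++] : -1;
                }
            };
        }
    }
}
```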

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #120

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/120/


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Solr-trunk/1865/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.RecoveryZkTest.testDistribSearch

Error Message:
Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,]

Stack Trace:
java.lang.RuntimeException: Thread threw an uncaught exception, thread: 
Thread[Lucene Merge Thread #2,6,]
at 
com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at 
org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at 
org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
Caused by: org.apache.lucene.index.MergePolicy$MergeException: 
org.apache.lucene.store.AlreadyClosedException: this Directory is closed
at __randomizedtesting.SeedInfo.seed([8B4A827F28B6F16]:0)
at 
org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507)
at 
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480)
Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
closed
at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244)
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241)
at 
org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:345)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3031)
at 
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
at 
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)




Build Log (for compile errors):
[...truncated 41930 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Reopened] (LUCENE-2566) + - operators allow any amount of whitespace

2012-05-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reopened LUCENE-2566:
-

  Assignee: Jan Høydahl

Re-opening for backport

 + - operators allow any amount of whitespace
 

 Key: LUCENE-2566
 URL: https://issues.apache.org/jira/browse/LUCENE-2566
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/queryparser
Reporter: Yonik Seeley
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-2566.patch


 As an example, (foo - bar) is treated like (foo -bar).
 It seems like for +- to be treated as unary operators, they should be 
 immediately followed by the operand.
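A minimal sketch of the intended rule (purely illustrative, not the actual QueryParser grammar): a + or - should only count as a unary operator when the operand follows it with no intervening whitespace.

```java
// Illustrative only; the real fix lives in Lucene's QueryParser grammar.
// Decides whether the character at pos should be treated as a unary
// operator: it must be '+' or '-' AND be immediately followed by a
// non-whitespace character that starts the operand.
public class UnaryOperatorRule {
    static boolean isUnaryOperator(String query, int pos) {
        char c = query.charAt(pos);
        if (c != '+' && c != '-') {
            return false;
        }
        // A trailing operator, or one followed by whitespace, is not unary.
        return pos + 1 < query.length()
            && !Character.isWhitespace(query.charAt(pos + 1));
    }
}
```

Under this rule, the '-' in "foo -bar" negates "bar", while the '-' in "foo - bar" does not.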




[jira] [Updated] (LUCENE-2566) + - operators allow any amount of whitespace

2012-05-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-2566:


Attachment: LUCENE-2566-3x.patch

Backport to 3.6 branch. All tests pass. Committing soon.

 + - operators allow any amount of whitespace
 

 Key: LUCENE-2566
 URL: https://issues.apache.org/jira/browse/LUCENE-2566
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 3.6
Reporter: Yonik Seeley
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 4.0, 3.6.1

 Attachments: LUCENE-2566-3x.patch, LUCENE-2566.patch


 As an example, (foo - bar) is treated like (foo -bar).
 It seems like for +- to be treated as unary operators, they should be 
 immediately followed by the operand.




[jira] [Created] (LUCENE-4076) When doing nested (index-time) joins, ToParentBlockJoinCollector delivers incomplete information on the grand-children

2012-05-25 Thread Christoph Kaser (JIRA)
Christoph Kaser created LUCENE-4076:
---

 Summary: When doing nested (index-time) joins, 
ToParentBlockJoinCollector delivers incomplete information on the grand-children
 Key: LUCENE-4076
 URL: https://issues.apache.org/jira/browse/LUCENE-4076
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/join
Affects Versions: 3.6, 3.5, 3.4
Reporter: Christoph Kaser


ToParentBlockJoinCollector.getTopGroups does not provide the correct answer 
when a query with nested ToParentBlockJoinQueries is performed.

Given the following example query:
{code}
Query grandChildQuery = new TermQuery(new Term("color", "red"));
Filter childFilter = new CachingWrapperFilter(
    new RawTermFilter(new Term("type", "child")), DeletesMode.IGNORE);
ToParentBlockJoinQuery grandchildJoinQuery =
    new ToParentBlockJoinQuery(grandChildQuery, childFilter, ScoreMode.Max);

BooleanQuery childQuery = new BooleanQuery();
childQuery.add(grandchildJoinQuery, Occur.MUST);
childQuery.add(new TermQuery(new Term("shape", "round")), Occur.MUST);

Filter parentFilter = new CachingWrapperFilter(
    new RawTermFilter(new Term("type", "parent")), DeletesMode.IGNORE);
ToParentBlockJoinQuery childJoinQuery =
    new ToParentBlockJoinQuery(childQuery, parentFilter, ScoreMode.Max);

parentQuery = new BooleanQuery();
parentQuery.add(childJoinQuery, Occur.MUST);
parentQuery.add(new TermQuery(new Term("name", "test")), Occur.MUST);

ToParentBlockJoinCollector parentCollector =
    new ToParentBlockJoinCollector(Sort.RELEVANCE, 30, true, true);
searcher.search(parentQuery, null, parentCollector);
{code}

This produces the correct results:
{code}
TopGroups<Integer> childGroups = parentCollector.getTopGroups(childJoinQuery, null, 0, 20, 0, false);
{code}

However, this does not:
{code}
TopGroups<Integer> grandChildGroups = parentCollector.getTopGroups(grandchildJoinQuery, null, 0, 20, 0, false);
{code}

The content of grandChildGroups is broken in the following ways:
* The groupValue is not the document id of the child document (which is the 
parent of a grandchild document), but the document id of the _previous_ 
matching parent document
* There are only as many GroupDocs as there are parent documents (not child 
documents), and they only contain the children of the last child document (but, 
as mentioned before, with the wrong groupValue). 





[jira] [Updated] (LUCENE-2566) + - operators allow any amount of whitespace

2012-05-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-2566:


Affects Version/s: 3.6
Fix Version/s: 3.6.1

 + - operators allow any amount of whitespace
 

 Key: LUCENE-2566
 URL: https://issues.apache.org/jira/browse/LUCENE-2566
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 3.6
Reporter: Yonik Seeley
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 4.0, 3.6.1

 Attachments: LUCENE-2566-3x.patch, LUCENE-2566.patch


 As an example, (foo - bar) is treated like (foo -bar).
 It seems like for +- to be treated as unary operators, they should be 
 immediately followed by the operand.




[jira] [Commented] (LUCENE-3131) XA Resource/Transaction support

2012-05-25 Thread Lajos Kesik (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283255#comment-13283255
 ] 

Lajos Kesik commented on LUCENE-3131:
-

Is there really no plan to support XA transactions? Without it, it is quite hard 
to keep consistency between a database and a Lucene index.

 XA Resource/Transaction  support
 

 Key: LUCENE-3131
 URL: https://issues.apache.org/jira/browse/LUCENE-3131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.1
Reporter: Magnus
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.1.1


 Please add XAResoure/XATransaction support into Lucene core.




[jira] [Created] (LUCENE-4077) ToParentBlockJoinCollector provides no way to access computed scores and the maxScore

2012-05-25 Thread Christoph Kaser (JIRA)
Christoph Kaser created LUCENE-4077:
---

 Summary: ToParentBlockJoinCollector provides no way to access 
computed scores and the maxScore
 Key: LUCENE-4077
 URL: https://issues.apache.org/jira/browse/LUCENE-4077
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/join
Affects Versions: 3.6, 3.5, 3.4
Reporter: Christoph Kaser


The constructor of ToParentBlockJoinCollector allows to turn on the tracking of 
parent scores and the maximum parent score, however there is no way to access 
those scores because:
* maxScore is a private field, and there is no getter
* TopGroups / GroupDocs does not provide access to the scores for the parent 
documents, only the children
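A hypothetical sketch of the kind of accessor being requested (the real ToParentBlockJoinCollector internals may differ; CollectorSketch and its fields are made up for illustration):

```java
// Hypothetical sketch only; not the actual ToParentBlockJoinCollector
// source. Shows the shape of the missing getter: the collector already
// tracks the maximum parent score privately, so exposing it is a matter
// of adding an accessor.
public class CollectorSketch {
    private float maxScore = Float.NaN; // NaN until something is collected

    // Called once per matching parent document with its computed score.
    void collect(float score) {
        if (Float.isNaN(maxScore) || score > maxScore) {
            maxScore = score;
        }
    }

    // The accessor LUCENE-4077 asks for: exposes the tracked maximum score.
    public float getMaxScore() {
        return maxScore;
    }
}
```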




[jira] [Updated] (SOLR-3486) The memory size of Solr caches should be configurable

2012-05-25 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated SOLR-3486:
---

Attachment: SOLR-3486.patch

Hi Shawn,

I modified the patch in order to make it easier to add this functionality to 
other cache implementations. All you need to do for SOLR-3393 to support a 
maximum memory size is to split your implementation into an LFU map (a regular 
map, with no evictions) which iterates (entrySet().iterator()) in frequency 
order, and an LFU cache (that will probably extend or wrap this LFU map). Then, 
to have an LFU cache with a fixed max memory size, just wrap your LFU map into 
a new SizableCache instance.

 The memory size of Solr caches should be configurable
 -

 Key: SOLR-3486
 URL: https://issues.apache.org/jira/browse/SOLR-3486
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Adrien Grand
Priority: Minor
 Attachments: SOLR-3486.patch, SOLR-3486.patch


 It is currently possible to configure the sizes of Solr caches based on the 
 number of entries of the cache. The problem is that the memory size of cached 
 values may vary a lot over time (depending on IndexReader.maxDoc and the 
 queries that are run) although the JVM heap size does not.
 Having a configurable max size in bytes would also help optimize cache 
 utilization, making it possible to store more values provided that they have 
 a small memory footprint.
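As an illustrative sketch (not Solr's actual cache API; the class and the caller-supplied size estimate are assumptions), a cache bounded by an estimated byte budget rather than an entry count could look like this:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only. An LRU cache bounded by an estimated total
// memory size in bytes; real implementations would estimate entry sizes
// with something like RamUsageEstimator instead of a caller-supplied value.
public class MemoryBoundedCache<K, V> {
    private final long maxBytes;
    private long usedBytes = 0;
    private final Map<K, Long> sizes = new HashMap<>();
    // accessOrder=true makes iteration go least-recently-used first.
    private final LinkedHashMap<K, V> map = new LinkedHashMap<>(16, 0.75f, true);

    public MemoryBoundedCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    public synchronized void put(K key, V value, long estimatedBytes) {
        Long old = sizes.put(key, estimatedBytes);
        if (old != null) {
            usedBytes -= old; // replacing an entry frees its old estimate
        }
        usedBytes += estimatedBytes;
        map.put(key, value);
        // Evict in least-recently-used order until within the byte budget.
        Iterator<K> it = map.keySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            K oldest = it.next();
            if (oldest.equals(key)) {
                continue; // never evict the entry that was just inserted
            }
            it.remove();
            usedBytes -= sizes.remove(oldest);
        }
    }

    public synchronized V get(K key) {
        return map.get(key);
    }

    public synchronized int size() {
        return map.size();
    }
}
```

The design choice mirrors the issue's point: the eviction trigger is the sum of per-entry size estimates, so many small values can coexist where a count-based cache would have evicted them.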




[jira] [Resolved] (SOLR-3487) XMLResponseParser does not handle named lists in doc fields

2012-05-25 Thread Sami Siren (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren resolved SOLR-3487.
--

Resolution: Fixed
  Assignee: Sami Siren

 XMLResponseParser does not handle named lists in doc fields
 ---

 Key: SOLR-3487
 URL: https://issues.apache.org/jira/browse/SOLR-3487
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Reporter: Sami Siren
Assignee: Sami Siren
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3487.patch


 For example, when one uses XML and specifies fl to contain [explain style=nl], 
 the parser currently cannot handle the response.
 I also noticed that the example tests are not run with XML (that would have 
 caught this earlier).




Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #121

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/121/changes

Changes:

[siren] SOLR-3487: handle named lists in xml response

--
[...truncated 4693 lines...]
resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[javac] Compiling 1 source file to 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\core\classes\java
[javac] warning: [options] bootstrap class path not set in conjunction with 
-source 1.6
[javac] 1 warning

compile-core:

init:

compile-test:
 [echo] Building queries...

ivy-availability-check:

ivy-fail:

ivy-configure:

resolve:

common.init:

compile-lucene-core:

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

compile-core:

compile-test-framework:

ivy-availability-check:

ivy-fail:

ivy-configure:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

compile-lucene-core:

compile-core:

common.compile-test:
[mkdir] Created dir: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\queries\classes\test
[javac] Compiling 11 source files to 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\queries\classes\test
[javac] warning: [options] bootstrap class path not set in conjunction with 
-source 1.6
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 warning
 [echo] Building queryparser...

ivy-availability-check:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\ivy-settings.xml

resolve:

common.init:

compile-lucene-core:

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

ivy-availability-check:

ivy-fail:

ivy-configure:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[javac] Compiling 1 source file to 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\core\classes\java
[javac] warning: [options] bootstrap class path not set in conjunction with 
-source 1.6
[javac] 1 warning

compile-core:

init:

compile-test:
 [echo] Building queryparser...

check-queries-uptodate:

jar-queries:

check-sandbox-uptodate:

jar-sandbox:

ivy-availability-check:

ivy-fail:

ivy-configure:

resolve:

common.init:

compile-lucene-core:

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:

compile-core:

compile-test-framework:

ivy-availability-check:

ivy-fail:

ivy-configure:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

compile-lucene-core:

compile-core:

common.compile-test:
[mkdir] Created dir: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\queryparser\classes\test
[javac] Compiling 40 source files to 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\queryparser\classes\test
[javac] warning: [options] bootstrap class path not set in conjunction with 
-source 1.6
[javac] 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\queryparser\src\test\org\apache\lucene\queryparser\util\QueryParserTestBase.java:1137:
 warning: [rawtypes] found raw type: Class
[javac]   QueryParser.class.getConstructor(new Class[] 
{CharStream.class});
[javac]^
[javac]   missing type arguments for generic class Class<T>
[javac]   where T is a type-variable:
[javac] T extends Object declared in class Class
[javac] 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\queryparser\src\test\org\apache\lucene\queryparser\util\QueryParserTestBase.java:1143:
 warning: [rawtypes] found raw type: Class
[javac]   QueryParser.class.getConstructor(new Class[] 
{QueryParserTokenManager.class});
[javac]   

[jira] [Commented] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB

2012-05-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283303#comment-13283303
 ] 

Jan Høydahl commented on LUCENE-4074:
-

Checked in a fix in 3.6 for the non-compiling TestSort.testRamBuffer. It referred 
to random().nextInt() instead of random.nextInt() - a clear copy/paste error from 
the trunk code.

 FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
 

 Key: LUCENE-4074
 URL: https://issues.apache.org/jira/browse/LUCENE-4074
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0, 3.6.1

 Attachments: LUCENE-4074.patch


 the BufferSize constructor accepts a size in MB as an integer and uses 
 multiplication to convert it to bytes. While it checks that the size in bytes 
 is less than 2048 MB, it does so after the byte conversion. If you pass a 
 value > 2047 to the ctor, the value overflows since all constants and methods 
 based on MB expect 32-bit signed ints. This does not even result in an 
 exception until the BufferSize is actually passed to the sorter.
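The overflow is easy to reproduce in isolation. The sketch below is illustrative only (not the actual Lucene sorter code) and shows why megabyte counts above 2047 wrap around in 32-bit arithmetic:

```java
// Illustrative only: reproduces the int overflow described above, not the
// actual Lucene BufferSize/sorter code.
public class BufferSizeOverflow {
    static final int MB = 1024 * 1024;

    // Converts a megabyte count to bytes with 32-bit int arithmetic,
    // mimicking the bug: the product wraps around for megabytes > 2047,
    // since 2048 * 2^20 = 2^31 exceeds Integer.MAX_VALUE (2^31 - 1).
    static int toBytesInt(int megabytes) {
        return megabytes * MB;
    }

    // A safe variant: widen to long before multiplying, then validate.
    static long toBytesLong(int megabytes) {
        return (long) megabytes * MB;
    }
}
```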




[jira] [Commented] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB

2012-05-25 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283305#comment-13283305
 ] 

Simon Willnauer commented on LUCENE-4074:
-

thanks jan! totally my fault! seems we don't have jenkins testing this though :(

 FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
 

 Key: LUCENE-4074
 URL: https://issues.apache.org/jira/browse/LUCENE-4074
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0, 3.6.1

 Attachments: LUCENE-4074.patch


 the BufferSize constructor accepts a size in MB as an integer and uses 
 multiplication to convert it to bytes. While it checks that the size in bytes 
 is less than 2048 MB, it does so after the byte conversion. If you pass a 
 value > 2047 to the ctor, the value overflows since all constants and methods 
 based on MB expect 32-bit signed ints. This does not even result in an 
 exception until the BufferSize is actually passed to the sorter.




[jira] [Created] (SOLR-3488) Create a Collections API for SolrCloud

2012-05-25 Thread Mark Miller (JIRA)
Mark Miller created SOLR-3488:
-

 Summary: Create a Collections API for SolrCloud
 Key: SOLR-3488
 URL: https://issues.apache.org/jira/browse/SOLR-3488
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller







Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #199

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/199/


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Mark Miller
I actually know what this one is now.

Jetty is shutting down, and the graceful timeout is too low, and so jetty 
interrupts the webapp, and while we are waiting for merges to finish on 
IW#close, an interrupt is thrown and we stop waiting. So the directory is then 
closed out from under the merge thread. So really, mostly a test issue it seems?

So I changed our jetty instances in tests to a 30 second graceful shutdown. 
Tests went from 6 minutes for me, to 33 minutes. I won't make this fix for now 
:) One idea is to perhaps do it just for this test - but even then it makes the 
test *much* longer, and there is no reason it can't happen on other tests that 
use jetty instances. It just happens to only show up in the test currently 
AFAICT.

On May 25, 2012, at 5:30 AM, Apache Jenkins Server wrote:

 Build: https://builds.apache.org/job/Solr-trunk/1865/
 
 1 tests failed.
 REGRESSION:  org.apache.solr.cloud.RecoveryZkTest.testDistribSearch
 
 Error Message:
 Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,]
 
 Stack Trace:
 java.lang.RuntimeException: Thread threw an uncaught exception, thread: 
 Thread[Lucene Merge Thread #2,6,]
   at 
 com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
   at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
   at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
   at 
 org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
   at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
   at 
 org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
   at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
   at 
 org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
   at 
 org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
   at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
   at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
 Caused by: org.apache.lucene.index.MergePolicy$MergeException: 
 org.apache.lucene.store.AlreadyClosedException: this Directory is closed
   at __randomizedtesting.SeedInfo.seed([8B4A827F28B6F16]:0)
   at 
 org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507)
   at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480)
 Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
 closed
   at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244)
   at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241)
   at 
 org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:345)
   at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3031)
   at 
 org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
   at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)
 
 
 
 
 Build Log (for compile errors):
 [...truncated 41930 lines...]
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

- Mark Miller
lucidimagination.com












-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Sami Siren
Just thinking out loud... shouldn't Solr(Cloud) manage such a situation
gracefully? I mean, in real life Solr instances can be killed or even
whole servers can go away. Would it be OK to ignore that exception
instead?
--
 Sami Siren

On Fri, May 25, 2012 at 3:01 PM, Mark Miller markrmil...@gmail.com wrote:
 I actually know what this one is now.

 Jetty is shutting down, and the graceful timeout is too low, and so jetty 
 interrupts the webapp, and while we are waiting for merges to finish on 
 IW#close, an interrupt is thrown and we stop waiting. So the directory is 
 then closed out from under the merge thread. So really, mostly a test issue 
 it seems?

 So I changed our jetty instances in tests to a 30 second graceful shutdown. 
 Tests went from 6 minutes for me, to 33 minutes. I won't make this fix for 
 now :) One idea is to perhaps do it just for this test - but even then it 
 makes the test *much* longer, and there is no reason it can't happen on other 
 tests that use jetty instances. It just happens to only show up in the test 
 currently AFAICT.

 On May 25, 2012, at 5:30 AM, Apache Jenkins Server wrote:

 Build: https://builds.apache.org/job/Solr-trunk/1865/

 1 tests failed.
 REGRESSION:  org.apache.solr.cloud.RecoveryZkTest.testDistribSearch

 Error Message:
 Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,]

 Stack Trace:
 java.lang.RuntimeException: Thread threw an uncaught exception, thread: 
 Thread[Lucene Merge Thread #2,6,]
       at 
 com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
       at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
       at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
       at 
 org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
       at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
       at 
 org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
       at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
       at 
 org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
       at 
 org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
       at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
       at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
       at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
       at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
 Caused by: org.apache.lucene.index.MergePolicy$MergeException: 
 org.apache.lucene.store.AlreadyClosedException: this Directory is closed
       at __randomizedtesting.SeedInfo.seed([8B4A827F28B6F16]:0)
       at 
 org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507)
       at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480)
 Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
 closed
       at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244)
       at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241)
       at 
 org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:345)
       at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3031)
       at 
 org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
       at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)




 Build Log (for compile errors):
 [...truncated 41930 lines...]



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional 

[jira] [Resolved] (LUCENE-2566) + - operators allow any amount of whitespace

2012-05-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved LUCENE-2566.
-

Resolution: Fixed

Checked in for 3.6.1

 + - operators allow any amount of whitespace
 

 Key: LUCENE-2566
 URL: https://issues.apache.org/jira/browse/LUCENE-2566
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 3.6
Reporter: Yonik Seeley
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 4.0, 3.6.1

 Attachments: LUCENE-2566-3x.patch, LUCENE-2566.patch


 As an example, (foo - bar) is treated like (foo -bar).
 It seems like for +- to be treated as unary operators, they should be 
 immediately followed by the operand.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #200

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/200/

--
[...truncated 13549 lines...]
   [junit4] Suite: org.apache.solr.analysis.TestPatternReplaceCharFilterFactory
   [junit4] Completed in 0.02s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.CopyFieldTest
   [junit4] Completed in 0.74s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestRussianLightStemFilterFactory
   [junit4] Completed in 0.01s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestPseudoReturnFields
   [junit4] Completed in 1.70s, 13 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.ReturnFieldsTest
   [junit4] Completed in 1.02s, 10 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestRealTimeGet
   [junit4] IGNOR/A 0.01s | TestRealTimeGet.testStressRecovery
   [junit4] Assumption #1: FIXME: This test is horribly slow sometimes on 
Windows!
   [junit4]   2 20987 T2115 oas.SolrTestCaseJ4.setUp ###Starting 
testStressRecovery
   [junit4]   2 20988 T2115 oas.SolrTestCaseJ4.tearDown ###Ending 
testStressRecovery
   [junit4]   2
   [junit4] Completed in 29.76s, 8 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.OverseerTest
   [junit4] Completed in 56.01s, 7 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.RecoveryZkTest
   [junit4] Completed in 29.48s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.NodeStateWatcherTest
   [junit4] Completed in 24.76s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.ZkSolrClientTest
   [junit4] Completed in 16.14s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.TestDistributedGrouping
   [junit4] Completed in 22.43s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.handler.component.DistributedSpellCheckComponentTest
   [junit4] Completed in 16.26s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestRangeQuery
   [junit4] Completed in 7.02s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestSort
   [junit4] Completed in 2.99s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestJmxIntegration
   [junit4] IGNORED 0.00s | TestJmxIntegration.testJmxOnCoreReload
   [junit4] Cause: Annotated @Ignore(timing problem? 
https://issues.apache.org/jira/browse/SOLR-2715)
   [junit4] Completed in 1.82s, 3 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.IndexBasedSpellCheckerTest
   [junit4] Completed in 1.24s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy2
   [junit4] Completed in 0.85s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.update.TestIndexingPerformance
   [junit4] Completed in 0.95s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.DirectSolrSpellCheckerTest
   [junit4] Completed in 1.17s, 2 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactoryTest
   [junit4] Completed in 1.00s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.update.DocumentBuilderTest
   [junit4] Completed in 1.03s, 11 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.SpatialFilterTest
   [junit4] Completed in 1.82s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.PolyFieldTest
   [junit4] Completed in 1.59s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterWFSTTest
   [junit4] Completed in 1.55s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestPropInject
   [junit4] Completed in 1.95s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.RequestHandlersTest
   [junit4] Completed in 1.13s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.highlight.FastVectorHighlighterTest
   [junit4] Completed in 1.13s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestDocSet
   [junit4] Completed in 0.73s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestReversedWildcardFilterFactory
   [junit4] Completed in 0.84s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.RequiredFieldsTest
   [junit4] Completed in 0.93s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.IndexReaderFactoryTest
   [junit4] Completed in 0.97s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.highlight.HighlighterConfigTest
   [junit4] Completed in 1.08s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestSolrQueryParser
   [junit4] Completed in 0.97s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.core.AlternateDirectoryTest
   [junit4] Completed in 0.95s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.UpdateRequestProcessorFactoryTest
   [junit4] Completed in 0.84s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.MultiTermTest
   [junit4] Completed in 0.43s, 3 tests
   [junit4]  
   [junit4] Suite: 

[jira] [Commented] (LUCENE-3131) XA Resource/Transaction support

2012-05-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283337#comment-13283337
 ] 

Michael McCandless commented on LUCENE-3131:


Lucene itself is already transactional (see 
http://blog.mikemccandless.com/2012/03/transactional-lucene.html ); it's just 
that we don't have an XA wrapper...

 XA Resource/Transaction  support
 

 Key: LUCENE-3131
 URL: https://issues.apache.org/jira/browse/LUCENE-3131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.1
Reporter: Magnus
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.1.1


 Please add XAResource/XATransaction support into Lucene core.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Mark Miller

On May 25, 2012, at 8:11 AM, Sami Siren wrote:

 Just thinking out loud... shouldn't solr(cloud) manage such a situation
 gracefully?

Currently, you can handle it gracefully if you up the graceful timeout in 
jetty. It's easy enough to do that with the jetty we ship, but it's painful 
(extremely so, it seems) to do it in tests.

In any case, I don't think it hurts anything practically? The merge thread 
fails, so you simply don't get those merges, I think? The problem with the 
tests is that the exception is thrown from the merge thread. We have no effect 
on that from Solr - the test framework picks up an uncaught exception in the 
thread, and our goose is cooked.

 I mean in real life solr instances can be killed or even
 whole servers can go away. Would it be ok to ignore that exception
 instead?

It's at the Lucene level really, so unless we try really hard to work around 
it, we would have to figure out if something different made sense there I think.

Right now, if it's waiting for merges to finish and gets interrupted, it throws 
an interrupted exception. Unless we explicitly try and kill the current merge 
threads, I'd think that could be a problem in any general code. You close the 
IW with wait for merges to finish = true, then you start closing other 
resources, because you assume you are done with the IW, but in fact merges can 
still be occurring if the thread was interrupted. And you might close resources 
merging depends on (i.e. the directory).

Lucene does not like interruptions in other cases as well, but unfortunately, 
running in a webapp, we can't easily always avoid them it seems.
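The race described above can be sketched with plain threads. This is only an analogy, not Lucene's actual code: the class names and messages below are hypothetical, and the `directoryOpen` flag stands in for the real `Directory` state.

```java
// Plain-threads sketch of the race: a "closer" waits for a background "merge"
// worker, is interrupted (Jetty's graceful timeout), stops waiting, and closes
// a shared resource while the worker still needs it.
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptedCloseRace {
    static final AtomicBoolean directoryOpen = new AtomicBoolean(true);

    public static void main(String[] args) throws Exception {
        Thread merge = new Thread(() -> {
            try {
                Thread.sleep(500);               // simulated long-running merge
            } catch (InterruptedException ignored) {}
            // The merge thread touches the directory after the closer gave up.
            if (!directoryOpen.get()) {
                System.out.println("AlreadyClosedException: this Directory is closed");
            }
        });
        merge.start();

        Thread closer = new Thread(() -> {
            try {
                merge.join();                    // "wait for merges to finish"
            } catch (InterruptedException e) {
                // The graceful-timeout interrupt lands here: we stop waiting...
            }
            directoryOpen.set(false);            // ...and close the directory anyway
        });
        closer.start();
        Thread.sleep(100);
        closer.interrupt();                      // graceful timeout too low
        closer.join();
        merge.join();
    }
}
```

Upping the graceful timeout just makes the interrupt arrive after the merge worker is done, which is why the 30 second shutdown made the failure go away (at the cost of much slower tests).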

 --
 Sami Siren
 
 On Fri, May 25, 2012 at 3:01 PM, Mark Miller markrmil...@gmail.com wrote:
 I actually know what this one is now.
 
 Jetty is shutting down, and the graceful timeout is too low, and so jetty 
 interrupts the webapp, and while we are waiting for merges to finish on 
 IW#close, an interrupt is thrown and we stop waiting. So the directory is 
 then closed out from under the merge thread. So really, mostly a test issue 
 it seems?
 
 So I changed our jetty instances in tests to a 30 second graceful shutdown. 
 Tests went from 6 minutes for me, to 33 minutes. I won't make this fix for 
 now :) One idea is to perhaps do it just for this test - but even then it 
 makes the test *much* longer, and there is no reason it can't happen on 
 other tests that use jetty instances. It just happens to only show up in the 
 test currently AFAICT.
 
 On May 25, 2012, at 5:30 AM, Apache Jenkins Server wrote:
 
 Build: https://builds.apache.org/job/Solr-trunk/1865/
 
 1 tests failed.
 REGRESSION:  org.apache.solr.cloud.RecoveryZkTest.testDistribSearch
 
 Error Message:
 Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread 
 #2,6,]
 
 Stack Trace:
 java.lang.RuntimeException: Thread threw an uncaught exception, thread: 
 Thread[Lucene Merge Thread #2,6,]
   at 
 com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
   at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
   at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
   at 
 org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
   at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
   at 
 org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
   at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
   at 
 org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
   at 
 org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
   at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
   at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
   at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
   at 
 

Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Sami Siren
On Fri, May 25, 2012 at 3:44 PM, Mark Miller markrmil...@gmail.com wrote:

 On May 25, 2012, at 8:11 AM, Sami Siren wrote:

 Just thinking out loud... shouldn't solr(cloud) manage such a situation
 gracefully?

 Currently, you can handle it gracefully if you up the graceful timeout in 
 jetty. It's easy enough to do that with the jetty we ship, but it's painful 
 (extremely it seems) to do it in tests.

 In any case, I don't think it hurts anything practically?

that was my point.

 the test framework picks up an uncaught exception in the thread, and our 
 goose is cooked.

By "ignoring the exception" I was trying to say that it should be
ignored from the POV of the test framework, i.e. not fail the build. I now
understand that it might not actually solve the issue...

--
 Sami Siren

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #122

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/122/


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2822) don't run update processors twice

2012-05-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283361#comment-13283361
 ] 

Mark Miller commented on SOLR-2822:
---

+1 - approach seems as elegant as we could shoot for right now. I much prefer 
it to juggling multiple chains.

I still worry about the 'clone doc' issue and update procs between distrib and 
run - if we do decide to not let procs live there, we should probably hard fail 
on it.

Latest patch looks good to me - let's commit and iterate on trunk.

 don't run update processors twice
 -

 Key: SOLR-2822
 URL: https://issues.apache.org/jira/browse/SOLR-2822
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud, update
Reporter: Yonik Seeley
 Fix For: 4.0

 Attachments: SOLR-2822.patch, SOLR-2822.patch, SOLR-2822.patch


 An update will first go through processors until it gets to the point where 
 it is forwarded to the leader (or forwarded to replicas if already on the 
 leader).
 We need a way to skip over the processors that were already run (perhaps by 
 using a processor chain dedicated to sub-updates?)
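One way to picture the idea in this issue is to have the forwarded update carry a record of which processors already ran, so the receiving node skips them. This is a hedged sketch in plain Java, not Solr's actual UpdateRequestProcessor API; all names here are hypothetical.

```java
// Sketch: an update remembers which processors already touched it on the
// first node, so the chain on the second node does not run them twice.
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SkipRanProcessors {
    static class Update {
        // Carried along with the forward to the leader/replicas.
        final Set<String> alreadyRan = new HashSet<>();
    }

    static void runChain(List<String> chain, Update u) {
        for (String processor : chain) {
            if (!u.alreadyRan.add(processor)) {
                continue;                        // skip: already ran upstream
            }
            System.out.println("running " + processor);
        }
    }

    public static void main(String[] args) {
        List<String> chain = Arrays.asList("dedupe", "distrib", "run");
        Update u = new Update();
        u.alreadyRan.add("dedupe");              // ran before the forward
        runChain(chain, u);
        // prints:
        // running distrib
        // running run
    }
}
```

A dedicated sub-update chain, as the description suggests, achieves the same effect by construction instead of by bookkeeping.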

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-3429) new GatherTransformer

2012-05-25 Thread Giovanni Bricconi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Bricconi resolved SOLR-3429.
-

   Resolution: Fixed
Fix Version/s: 4.0

I propose this implementation

 new GatherTransformer
 -

 Key: SOLR-3429
 URL: https://issues.apache.org/jira/browse/SOLR-3429
 Project: Solr
  Issue Type: New Feature
  Components: contrib - DataImportHandler
Affects Versions: 4.0
Reporter: Giovanni Bricconi
Priority: Minor
  Labels: json
 Fix For: 4.0

 Attachments: SOLR-3429.patch


 This is a new transformer for dih.
 I'm often asked to import a lot of fields; many of these fields are read only 
 and should not be searched.
 I found it useful to gather them in a single json field, returning them 
 untouched to the client.
 This patch provides a transformer that collects a list of db columns and 
 writes out a json map that contains all of them.
 A regression test is included. 
 A new dependency for jsonic has been added to dih, (already used by langid), 
 I can use a different library if needed.
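The gist of the transformer can be sketched without the DIH or jsonic dependencies. This is a minimal illustration under stated assumptions (plain Java, naive JSON string building with no escaping), not the patch's actual implementation:

```java
// Sketch: gather a list of read-only db columns into one JSON string field
// that can be stored and returned to the client untouched.
import java.util.LinkedHashMap;
import java.util.Map;

public class GatherSketch {
    static String gather(Map<String, Object> row, String... columns) {
        StringBuilder json = new StringBuilder("{");
        boolean first = true;
        for (String col : columns) {
            if (!row.containsKey(col)) continue;   // tolerate missing columns
            if (!first) json.append(",");
            json.append("\"").append(col).append("\":\"")
                .append(row.get(col)).append("\"");
            first = false;
        }
        return json.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("title", "Woodman");
        row.put("price", 42);
        System.out.println(gather(row, "title", "price"));
        // prints {"title":"Woodman","price":"42"}
    }
}
```

The real patch plugs this idea into a DIH transformer and uses jsonic for proper JSON serialization (escaping, types).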

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3131) XA Resource/Transaction support

2012-05-25 Thread Magnus (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283376#comment-13283376
 ] 

Magnus commented on LUCENE-3131:


Is an XA wrapper on the roadmap?

 XA Resource/Transaction  support
 

 Key: LUCENE-3131
 URL: https://issues.apache.org/jira/browse/LUCENE-3131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.1
Reporter: Magnus
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.1.1


 Please add XAResource/XATransaction support into Lucene core.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud

2012-05-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283382#comment-13283382
 ] 

Mark Miller commented on SOLR-3488:
---

I'll post an initial patch just for create soon. It's just a start though. I've 
added a bunch of comments for TODOs or things to consider for the future. I'd 
like to start simple just to get 'something' in though.

So initially, you can create a new collection and pass an existing collection 
name to determine which shards it's created on. It would also be nice to be able 
to explicitly pass the shard urls to use, as well as to simply offer X shards, Y 
replicas. In that case, perhaps the leader could handle ensuring that. You 
might also want to be able to simply say: create it on all known shards.

Further things to look at:

* other commands, like remove/delete.
* what to do when some create calls fail? Should we instead add a create node 
to a queue in zookeeper, and make the overseer responsible for checking for any 
jobs there, completing them (if needed), and then removing the job from the 
queue? Other ideas welcome.

 Create a Collections API for SolrCloud
 --

 Key: SOLR-3488
 URL: https://issues.apache.org/jira/browse/SOLR-3488
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2012-05-25 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283392#comment-13283392
 ] 

Alan Woodward commented on LUCENE-2878:
---

I think my next step is to have a go at implementing ReqOptSumScorer and 
ReqExclScorer, so that all the BooleanQuery cases work.  Testing it via the 
PosHighlighter seems to be the way to go as well.

This might take a little longer, in that it will require me to actually think 
about what I'm doing...

 Allow Scorer to expose positions and payloads aka. nuke spans 
 --

 Key: LUCENE-2878
 URL: https://issues.apache.org/jira/browse/LUCENE-2878
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: Positions Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
 mentor
 Fix For: Positions Branch

 Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, 
 PosHighlighter.patch


 Currently we have two somewhat separate types of queries: the ones which can 
 make use of positions (mainly spans) and payloads (spans). Yet Span*Query 
 doesn't really do scoring comparable to what other queries do, and at the end 
 of the day they duplicate a lot of code all over lucene. Span*Queries are 
 also limited to other Span*Query instances, such that you can not use a 
 TermQuery or a BooleanQuery with SpanNear or anything like that. 
 Besides the Span*Query limitation, other queries lack a quite interesting 
 feature: they can not score based on term proximity, since scorers don't 
 expose any positional information. All those problems bugged me for a while 
 now, so I started working on that using the bulkpostings API. I would have 
 done that first cut on trunk, but TermScorer there works on BlockReaders that 
 do not expose positions while the one in this branch does. I started adding a 
 new Positions class which users can pull from a scorer; to prevent unnecessary 
 positions enums I added ScorerContext#needsPositions and eventually 
 Scorer#needsPayloads to create the corresponding enum on demand. Yet, 
 currently only TermQuery / TermScorer implements this API and others simply 
 return null instead. 
 To show that the API really works, and that our BulkPostings work fine with 
 positions too, I cut over TermSpanQuery to use a TermScorer under the hood and 
 nuked TermSpans entirely. A nice side effect of this was that the Position 
 BulkReading implementation got some exercise, and it now :) works fully with 
 positions, while Payloads for bulkreading are kind of experimental in the 
 patch and only work with the Standard codec. 
 So all spans now work on top of TermScorer (I truly hate spans since today), 
 including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother 
 to implement the other codecs yet since I want to get feedback on the API and 
 on this first cut before I go on with it. I will upload the corresponding 
 patch in a minute. 
 I also had to cut over SpanQuery.getSpans(IR) to 
 SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk 
 first, but after that pain today I need a break first :).
 The patch passes all core tests 
 (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't 
 look into the MemoryIndex BulkPostings API yet)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #201

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/201/


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Mark Miller

On May 25, 2012, at 9:00 AM, Sami Siren wrote:

 by ignoring the exception I was trying to say that it should be
 ignored from POV of test framework, ie not fail the build. I now
 understand that it might not actually solve the issue...

Yeah, I suppose if we could tell the test framework, for this test, ignore this 
expected uncaught exception, that might help.

Usually you can work around this type of thing more cleanly though - so I don't 
know if it's worth the effort or extra code - if this ends up being its only 
use case, it's hard to argue we add the capability. And I suspect it would mean 
conning Dawid into sucking it up and updating our test jars? I think it also 
has a lot of potential for abuse.

But the fail sucks too, so I don't know...


- Mark Miller
lucidimagination.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)
Jochen Just created SOLR-3489:
-

 Summary: Config file replication less error prone
 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor


If the listing of configuration files that should be replicated contains a 
space, the following file is not replicated.
Example:
{{
<!-- the error in the configuration is the space before stopwords.txt. Because 
of that, that file is not replicated -->
<str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
}}

It would be nice, if that space simply would be ignored.
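The requested behavior amounts to trimming each entry when the comma-separated confFiles value is parsed. This is a hypothetical fix sketch in plain Java, not the actual ReplicationHandler code:

```java
// Sketch: parse a comma-separated confFiles value, trimming each entry so an
// accidental space (e.g. "test.txt, stopwords.txt") does not break replication.
import java.util.ArrayList;
import java.util.List;

public class ConfFilesParser {
    static List<String> parseConfFiles(String confFiles) {
        List<String> names = new ArrayList<>();
        for (String name : confFiles.split(",")) {
            String trimmed = name.trim();        // ignore stray whitespace
            if (!trimmed.isEmpty()) {            // ignore empty entries too
                names.add(trimmed);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(parseConfFiles("schema.xml,test.txt, stopwords.txt"));
        // prints [schema.xml, test.txt, stopwords.txt]
    }
}
```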

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jochen Just updated SOLR-3489:
--

Description: 
If the listing of configuration files that should be replicated contains a 
space, the following file is not replicated.
Example:
{code:xml}
<!-- The error in the configuration is the space before stopwords.txt.
 Because of that, that file is not replicated -->
<str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
{code}

It would be nice, if that space simply would be ignored.

  was:
If the listing of configuration files that should be replicated contains a 
space, the following file is not replicated.
Example:
{{
<!-- the error in the configuration is the space before stopwords.txt. Because 
of that, that file is not replicated -->
<str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
}}

It would be nice, if that space simply would be ignored.


 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor

 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




Re: Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #175

2012-05-25 Thread Mark Miller
This def happens more with Java 7 for me. Rather than seeing it like 1 out of 
100 at best, it is now happening about 1 out of 20.

Going on vacation for a week, so not sure if I will figure this out anytime 
soon, but now at least I can try some things and get more rapid and trustable 
feedback.


 On Thu, May 24, 2012 at 11:59 PM, Mark Miller markrmil...@gmail.com wrote:
 Just noticed this seems to happen fairly frequently in the java 7 windows 
 build, but I don't seem to see it in the java 6 windows build.
 
 I'll try and use Java 7 on my win machine when I get chance - should make it 
 easier to experiment with fixes if I can get the same results locally.

- Mark Miller
lucidimagination.com















[jira] [Assigned] (SOLR-2923) IllegalArgumentException when using useFilterForSortedQuery on an empty index

2012-05-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-2923:
-

Assignee: Mark Miller

 IllegalArgumentException when using useFilterForSortedQuery on an empty index
 -

 Key: SOLR-2923
 URL: https://issues.apache.org/jira/browse/SOLR-2923
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.6, 4.0
Reporter: Adrien Grand
Assignee: Mark Miller
Priority: Trivial
 Attachments: SOLR-2923.patch


 An IllegalArgumentException can occur under the following circumstances:
  - the index is empty,
  - {{useFilterForSortedQuery}} is enabled,
  - {{queryResultsCache}} is disabled.
 Here is what the exception and its stack trace look like (Solr trunk):
 {quote}
 numHits must be > 0; please use TotalHitCountCollector if you just need the
 total hit count
 java.lang.IllegalArgumentException: numHits must be > 0; please use
 TotalHitCountCollector if you just need the total hit count
   at 
 org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:917)
   at 
 org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1741)
   at 
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1211)
   at 
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:353)
   ...
 {quote}
 To reproduce this error from a fresh copy of Solr trunk, edit 
 {{example/solr/conf/solrconfig.xml}} to disable {{queryResultCache}} and 
 enable {{useFilterForSortedQuery}}. Then run {{ant run-example}} and issue a 
 query which sorts against any field 
 ({{http://localhost:8983/solr/select?q=*:*sort=manu+desc}} for example).
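The failure can be pictured in isolation. The sketch below is not Solr's actual fix and uses no Lucene code; collect() mimics the precondition that TopFieldCollector.create enforces, and collectSafely() shows a caller-side guard that falls back to a hit-count-only path (the TotalHitCountCollector analogue) when the index is empty:

```java
// Hedged illustration, not the attached SOLR-2923.patch; all names invented.
public class CollectorGuard {
    // Mimics TopFieldCollector.create's argument check.
    static String collect(int numHits) {
        if (numHits <= 0) {
            throw new IllegalArgumentException(
                "numHits must be > 0; please use TotalHitCountCollector"
                + " if you just need the total hit count");
        }
        return "top-" + numHits + "-collector";
    }

    // Caller-side guard: on an empty index there is nothing to sort,
    // so skip the sorting collector entirely.
    static String collectSafely(int numDocsInIndex, int requestedHits) {
        if (numDocsInIndex == 0 || requestedHits <= 0) {
            return "hit-count-only";
        }
        return collect(Math.min(requestedHits, numDocsInIndex));
    }
}
```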




[jira] [Updated] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jochen Just updated SOLR-3489:
--

Attachment: SOLR-3489_reproducing_config.tar.gz

Steps to reproduce:
# unpack SOLR-3489_reproducing_config.tar.gz into solr-example directory
# start master via {{java -Denable.master=true -Dsolr.solr.home=master -jar 
start.jar}}
# start slave via {{java -Denable.slave=true -Dsolr.solr.home=slave 
-Djetty.port=8984 -jar start.jar}}
# add document in master/singledoc.xml to master
# either replicate manually or wait 60 seconds

Result:
* test.txt will be replicated
* stopwords.txt won't

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Comment Edited] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283456#comment-13283456
 ] 

Jochen Just edited comment on SOLR-3489 at 5/25/12 2:14 PM:


Steps to reproduce:
# unpack [^SOLR-3489_reproducing_config.tar.gz] into solr-example directory
# start master via {{java -Denable.master=true -Dsolr.solr.home=master -jar 
start.jar}}
# start slave via {{java -Denable.slave=true -Dsolr.solr.home=slave 
-Djetty.port=8984 -jar start.jar}}
# add document in master/singledoc.xml to master
# either replicate manually or wait 60 seconds

Result:
* test.txt will be replicated
* stopwords.txt won't

  was (Author: jjaa):
Steps to reproduce:
# unpack SOLR-3489_reproducing_config.tar.gz into solr-example directory
# start master via {{java -Denable.master=true -Dsolr.solr.home=master -jar 
start.jar}}
# start slave via {{java -Denable.slave=true -Dsolr.solr.home=slave 
-Djetty.port=8984 -jar start.jar}}
# add document in master/singledoc.xml to master
# either replicate manually or wait 60 seconds

Result:
* test.txt will be replicated
* stopwords.txt won't
  
 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Robert Muir
On Fri, May 25, 2012 at 9:54 AM, Mark Miller markrmil...@gmail.com wrote:

 On May 25, 2012, at 9:00 AM, Sami Siren wrote:

 by ignoring the exception I was trying to say that it should be
 ignored from POV of test framework, ie not fail the build. I now
 understand that it might not actually solve the issue...

 Yeah, I suppose if we could tell the test framework, for this test, ignore 
 this expected uncaught exception, that might help.

 Usually you can work around this type of thing more cleanly though - so I 
 don't know if it's worth the effort or extra code - if this ends up being 
 its only use case, it's hard to argue we add the capability. And I suspect 
 it would mean conning dawid to suck it up and update our test jars? I think 
 it also has a lot of potential for abuse.


The exception-from-another-thread case is just an uncaught exception
handler. You can replace it with your own that handles things differently,
and restore the old one afterwards.

Here's an example of one that does this when the exception is really a jvm bug:

http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/miscellaneous/PatternAnalyzerTest.java
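The swap-and-restore pattern described above can be sketched in plain Java. This is only an illustration of the mechanism (the linked PatternAnalyzerTest does something similar when the exception is really a JVM bug); the class name is made up:

```java
public class HandlerSwapDemo {
    // Temporarily install our own default uncaught-exception handler,
    // capture the exception thrown by another thread, then restore the
    // previous handler so nothing else is affected.
    static Throwable capture() {
        Thread.UncaughtExceptionHandler saved =
            Thread.getDefaultUncaughtExceptionHandler();
        final Throwable[] captured = new Throwable[1];
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> captured[0] = e);
        try {
            Thread t = new Thread(() -> { throw new RuntimeException("expected"); });
            t.start();
            t.join();  // the handler has run once the thread has terminated
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        } finally {
            // Restore the old handler.
            Thread.setDefaultUncaughtExceptionHandler(saved);
        }
        return captured[0];
    }

    public static void main(String[] args) {
        System.out.println(capture().getMessage());
    }
}
```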

-- 
lucidimagination.com




[jira] [Commented] (SOLR-3486) The memory size of Solr caches should be configurable

2012-05-25 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283461#comment-13283461
 ] 

Shawn Heisey commented on SOLR-3486:


It's going to take me a while to digest what you've just said, but my first 
thought is that I can't change the implementation without destroying the O(1) 
nature.  The cache is implemented in two parts - a simple map (HashMap) for 
fast lookup, and an array of sets (LinkedHashSet[]) for fast frequency 
ordering.  When the frequency for an entry needs to be changed, it is removed 
from one set and added to another.

Although it's not implemented as an actual iterator method, I have code to 
iterate over the array.  I should probably create an iterator and backwards 
iterator, just to eliminate some duplicate code.  If I don't already have a 
remove method, I should be able to add one.
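The two-part structure described above can be sketched like this. It is a toy illustration of the frequency-bucket idea only (not Solr's or Shawn's actual cache code; names are invented): a HashMap gives O(1) lookup of an entry's frequency, and an array of LinkedHashSets bucketed by frequency makes bumping an entry two O(1) set operations.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

public class FreqBuckets<K> {
    private final Map<K, Integer> freqOf = new HashMap<>();
    private final LinkedHashSet<K>[] buckets;  // index = access frequency

    @SuppressWarnings("unchecked")
    public FreqBuckets(int maxFreq) {
        buckets = new LinkedHashSet[maxFreq + 1];
        for (int i = 0; i <= maxFreq; i++) buckets[i] = new LinkedHashSet<>();
    }

    // Record one more access: remove the key from its old frequency set
    // and add it to the next one, both O(1).
    public void touch(K key) {
        int f = freqOf.getOrDefault(key, 0);
        int next = Math.min(f + 1, buckets.length - 1);
        buckets[f].remove(key);
        buckets[next].add(key);
        freqOf.put(key, next);
    }

    public int frequency(K key) {
        return freqOf.getOrDefault(key, 0);
    }
}
```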


 The memory size of Solr caches should be configurable
 -

 Key: SOLR-3486
 URL: https://issues.apache.org/jira/browse/SOLR-3486
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Adrien Grand
Priority: Minor
 Attachments: SOLR-3486.patch, SOLR-3486.patch


 It is currently possible to configure the sizes of Solr caches based on the 
 number of entries of the cache. The problem is that the memory size of cached 
 values may vary a lot over time (depending on IndexReader.maxDoc and the 
 queries that are run) although the JVM heap size does not.
 Having a configurable max size in bytes would also help optimize cache 
 utilization, making it possible to store more values provided that they have 
 a small memory footprint.




[jira] [Updated] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jochen Just updated SOLR-3489:
--

Attachment: SOLR-3489.patch

The attached patch should solve that problem.

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Commented] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments

2012-05-25 Thread sebastian L. (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283467#comment-13283467
 ] 

sebastian L. commented on LUCENE-3440:
--

Hi Koji, 
hi Simon,

if there is something to do for me, please let me know. 

Maybe it would be better to split the patch in several smaller ones, e.g.

1. Use Getters/Setters where possible in FVH 
2. Make FieldFragList interface and BaseFieldFragList abstract class
3. Introduction of SimpleFieldFragList and SimpleFragListBuilder as default  
4. Introduction of WeightedFieldFragList and WeightedFragListBuilder  
5. Integration into Solr

When's the 4.0 release scheduled, anyway?

A patch for trunk 1342490 is on its way.

 FastVectorHighlighter: IDF-weighted terms for ordered fragments 
 

 Key: LUCENE-3440
 URL: https://issues.apache.org/jira/browse/LUCENE-3440
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Reporter: sebastian L.
Priority: Minor
  Labels: FastVectorHighlighter
 Fix For: 4.0

 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, 
 LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, 
 weight-vs-boost_table01.html, weight-vs-boost_table02.html


 The FastVectorHighlighter assigns an equal weight to every term found in a
 fragment, which ranks fragments with a high number of words or, in the worst
 case, a high number of very common words higher than fragments that contain
 *all* of the terms used in the original query.
 This patch provides ordered fragments with IDF-weighted terms:
 total weight = total weight + IDF for unique term per fragment * boost of
 query;
 The ranking formula should be the same, or at least similar, to the one used
 in org.apache.lucene.search.highlight.QueryTermScorer.
 The patch is simple, but it works for us.
 Some ideas:
 - A better approach would be moving the whole fragment scoring into a
 separate class.
 - Switch scoring via a parameter
 - Exact phrases should be given an even better score, regardless of whether a
 phrase query was executed or not
 - The edismax/dismax parameters pf, ps, and pf^boost should be observed and
 corresponding fragments should be ranked higher
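The scoring rule quoted in the description can be illustrated in isolation. This is a toy sketch, not the attached patch, and the names are invented: each *unique* query term in a fragment contributes its IDF times the query boost, so a fragment covering all query terms outranks one that merely repeats a common term.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FragmentScore {
    // total weight += IDF(term) * boost, counting each term once per fragment.
    static double score(String[] fragmentTerms, Map<String, Double> idf, double boost) {
        Set<String> seen = new HashSet<>();
        double total = 0.0;
        for (String term : fragmentTerms) {
            if (idf.containsKey(term) && seen.add(term)) {
                total += idf.get(term) * boost;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> idf = Map.of("rare", 4.0, "common", 0.5);
        // Repeating a common term does not help; covering the rare term does.
        System.out.println(score(new String[]{"common", "common", "common"}, idf, 1.0));
        System.out.println(score(new String[]{"rare", "common"}, idf, 1.0));
    }
}
```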




[jira] [Commented] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283469#comment-13283469
 ] 

Jochen Just commented on SOLR-3489:
---

The patch is based on branch lucene_solr_36

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Updated] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments

2012-05-25 Thread sebastian L. (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sebastian L. updated LUCENE-3440:
-

Attachment: LUCENE-3440.patch

Patch for trunk (1342490)

 FastVectorHighlighter: IDF-weighted terms for ordered fragments 
 

 Key: LUCENE-3440
 URL: https://issues.apache.org/jira/browse/LUCENE-3440
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Reporter: sebastian L.
Priority: Minor
  Labels: FastVectorHighlighter
 Fix For: 4.0

 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, LUCENE-3440.patch, 
 LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, 
 weight-vs-boost_table01.html, weight-vs-boost_table02.html


 The FastVectorHighlighter assigns an equal weight to every term found in a
 fragment, which ranks fragments with a high number of words or, in the worst
 case, a high number of very common words higher than fragments that contain
 *all* of the terms used in the original query.
 This patch provides ordered fragments with IDF-weighted terms:
 total weight = total weight + IDF for unique term per fragment * boost of
 query;
 The ranking formula should be the same, or at least similar, to the one used
 in org.apache.lucene.search.highlight.QueryTermScorer.
 The patch is simple, but it works for us.
 Some ideas:
 - A better approach would be moving the whole fragment scoring into a
 separate class.
 - Switch scoring via a parameter
 - Exact phrases should be given an even better score, regardless of whether a
 phrase query was executed or not
 - The edismax/dismax parameters pf, ps, and pf^boost should be observed and
 corresponding fragments should be ranked higher




[jira] [Updated] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jochen Just updated SOLR-3489:
--

Attachment: (was: SOLR-3489_reproducing_config.tar.gz)

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Updated] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jochen Just updated SOLR-3489:
--

Attachment: SOLR-3489_reproducing_config.tar.gz

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud

2012-05-25 Thread Sami Siren (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283478#comment-13283478
 ] 

Sami Siren commented on SOLR-3488:
--

bq. should we instead add a create node to a queue in zookeeper? Make the 
overseer responsible for checking for any jobs there, completing them (if 
needed) and then removing the job from the queue? 

I like this idea. I would also refactor the current ZkController-Overseer 
communication to use this same technique.

 Create a Collections API for SolrCloud
 --

 Key: SOLR-3488
 URL: https://issues.apache.org/jira/browse/SOLR-3488
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller






[jira] [Commented] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283485#comment-13283485
 ] 

Jack Krupansky commented on SOLR-3489:
--

It would be nice to add similar protection against spaces before and after the
colon for aliases, as well as a check for an empty name before and after the
colon.

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Updated] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jochen Just updated SOLR-3489:
--

Attachment: SOLR-3489_reproducing_config.tar.gz

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Updated] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jochen Just updated SOLR-3489:
--

Attachment: (was: SOLR-3489_reproducing_config.tar.gz)

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Commented] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread Jochen Just (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283488#comment-13283488
 ] 

Jochen Just commented on SOLR-3489:
---

I will look into that, but not before next week, I guess.

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice, if that space simply would be ignored.




[jira] [Commented] (SOLR-2923) IllegalArgumentException when using useFilterForSortedQuery on an empty index

2012-05-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283493#comment-13283493
 ] 

Mark Miller commented on SOLR-2923:
---

patch looks good to me

 IllegalArgumentException when using useFilterForSortedQuery on an empty index
 -

 Key: SOLR-2923
 URL: https://issues.apache.org/jira/browse/SOLR-2923
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.6, 4.0
Reporter: Adrien Grand
Assignee: Mark Miller
Priority: Trivial
 Attachments: SOLR-2923.patch


 An IllegalArgumentException can occur under the following circumstances:
  - the index is empty,
  - {{useFilterForSortedQuery}} is enabled,
  - {{queryResultsCache}} is disabled.
 Here is what the exception and its stack trace look like (Solr trunk):
 {quote}
 numHits must be > 0; please use TotalHitCountCollector if you just need the
 total hit count
 java.lang.IllegalArgumentException: numHits must be > 0; please use
 TotalHitCountCollector if you just need the total hit count
   at 
 org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:917)
   at 
 org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1741)
   at 
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1211)
   at 
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:353)
   ...
 {quote}
 To reproduce this error from a fresh copy of Solr trunk, edit 
 {{example/solr/conf/solrconfig.xml}} to disable {{queryResultCache}} and 
 enable {{useFilterForSortedQuery}}. Then run {{ant run-example}} and issue a 
 query which sorts against any field 
 ({{http://localhost:8983/solr/select?q=*:*&sort=manu+desc}} for example).

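The failure mode above is easy to sketch in miniature: with an empty index, the hit count used to size the collector collapses to zero and trips the collector's precondition. This Python analogue only mirrors the Lucene names for illustration; it is not the actual Lucene or Solr code.

```python
def create_top_field_collector(num_hits):
    """Toy stand-in for TopFieldCollector.create's precondition check."""
    if num_hits <= 0:
        raise ValueError("numHits must be > 0; please use TotalHitCountCollector "
                         "if you just need the total hit count")
    return {"num_hits": num_hits, "hits": []}

def sort_doc_set(matching_doc_count):
    # With useFilterForSortedQuery on and queryResultCache off, the collector
    # is sized from the matching doc count -- which is zero on an empty index.
    return create_top_field_collector(matching_doc_count)

print(sort_doc_set(1))  # fine on a non-empty result set
```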



[jira] [Updated] (LUCENE-4072) CharFilter that Unicode-normalizes input

2012-05-25 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4072:


Attachment: LUCENE-4072.patch

attached is the filter, turned into a patch.

however, I added an additional random test and it currently fails... will look 
into this more.

 CharFilter that Unicode-normalizes input
 

 Key: LUCENE-4072
 URL: https://issues.apache.org/jira/browse/LUCENE-4072
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Ippei UKAI
 Attachments: LUCENE-4072.patch, 
 ippeiukai-ICUNormalizer2CharFilter-4752cad.zip


 I'd like to contribute a CharFilter that Unicode-normalizes input with ICU4J.
 The benefit of having this process as CharFilter is that tokenizer can work 
 on normalised text while offset-correction ensuring fast vector highlighter 
 and other offset-dependent features do not break.
 The implementation is available at following repository:
 https://github.com/ippeiukai/ICUNormalizer2CharFilter
 Unfortunately this is my unpaid side-project and cannot spend much time to 
 merge my work to Lucene to make appropriate patch. I'd appreciate it if 
 anyone could give it a go. I'm happy to relicense it to whatever that meets 
 your needs.
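The kind of normalization the description refers to can be demonstrated with Python's standard unicodedata module. This only illustrates NFKC normalization itself, not the ICU4J Normalizer2 API or the offset correction the CharFilter has to perform.

```python
import unicodedata

# NFKC folds compatibility characters: the "fi" ligature becomes two letters
# and full-width forms become plain ASCII. Because the text length changes,
# a CharFilter doing this must also correct token offsets so highlighting
# still points at the right characters in the original input.
for s in ["\ufb01le", "\uff26\uff55\uff4c\uff4c\uff57\uff49\uff44\uff54\uff48"]:
    print(s, "->", unicodedata.normalize("NFKC", s))
```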




[jira] [Commented] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments

2012-05-25 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283496#comment-13283496
 ] 

Koji Sekiguchi commented on LUCENE-3440:


Hi Sebastian!

bq. Maybe it would be better to split the patch in several smaller ones, e.g.

This is a great idea and it helps me a lot! If you could provide them one by 
one for trunk, I can review each smaller patch and commit them one by 
one.

 FastVectorHighlighter: IDF-weighted terms for ordered fragments 
 

 Key: LUCENE-3440
 URL: https://issues.apache.org/jira/browse/LUCENE-3440
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Reporter: sebastian L.
Priority: Minor
  Labels: FastVectorHighlighter
 Fix For: 4.0

 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, LUCENE-3440.patch, 
 LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, 
 weight-vs-boost_table01.html, weight-vs-boost_table02.html


 The FastVectorHighlighter uses for every term found in a fragment an equal 
 weight, which causes a higher ranking for fragments with a high number of 
 words or, in the worst case, a high number of very common words than 
 fragments that contains *all* of the terms used in the original query. 
 This patch provides ordered fragments with IDF-weighted terms: 
 total weight = total weight + IDF for unique term per fragment * boost of 
 query; 
 The ranking-formula should be the same, or at least similar, to that one used 
 in org.apache.lucene.search.highlight.QueryTermScorer.
 The patch is simple, but it works for us. 
 Some ideas:
 - A better approach would be moving the whole fragments-scoring into a 
 separate class.
 - Switch scoring via parameter 
 - Exact phrases should be given a even better score, regardless if a 
 phrase-query was executed or not
 - edismax/dismax-parameters pf, ps and pf^boost should be observed and 
 corresponding fragments should be ranked higher 
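The proposed formula ("total weight += IDF of each unique term per fragment * query boost") can be sketched as follows. The IDF form is the classic Lucene one, and all names here are illustrative rather than the patch's actual code.

```python
import math

def idf(num_docs, doc_freq):
    # Classic Lucene IDF: 1 + ln(numDocs / (docFreq + 1)); rare terms score high.
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def fragment_weight(fragment_terms, doc_freqs, num_docs, boost=1.0):
    # Each unique matched term contributes its IDF once per fragment, so a
    # fragment containing a rare query term outranks one that merely repeats
    # a very common word many times.
    return sum(idf(num_docs, doc_freqs[t]) for t in set(fragment_terms)) * boost

doc_freqs = {"lucene": 3, "the": 950}
print(fragment_weight(["lucene"], doc_freqs, 1000))
print(fragment_weight(["the", "the", "the"], doc_freqs, 1000))
```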




[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud

2012-05-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283510#comment-13283510
 ] 

Yonik Seeley commented on SOLR-3488:


bq. should we instead add a create node to a queue in zookeeper? 

Yeah, a work queue in ZK makes perfect sense.  Perhaps serialize the params to 
a JSON map/object per line?

Possible parameters:
- name of the collection
- the config for the collection
- number of shards in the new collection
- default replication factor

Operations:
 - add a collection
 - remove a collection
- different options here... leave cores up, bring cores down, completely 
remove cores (and data)
 - change collection properties (replication factor, config)
 - expand collection (split shards)
 - add/remove a collection alias

Shard operations:
 - add a shard (more for custom sharding)
 - remove a shard
 - change shard properties (replication factor)
 - split a shard
 - add/remove a shard alias

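The "JSON map/object per line" idea amounts to something like the following sketch: one serialized operation per queue entry, to be picked up by the Overseer. The field names below are invented for illustration and are not the format SolrCloud actually adopted.

```python
import json

# One queued operation per line; a worker (the Overseer) would pop and
# execute each entry. All key names here are hypothetical.
ops = [
    {"operation": "createcollection", "name": "collection2",
     "config": "myconf", "numShards": 4, "replicationFactor": 2},
    {"operation": "removecollection", "name": "oldcollection"},
]
queue = "\n".join(json.dumps(op, sort_keys=True) for op in ops)
print(queue)
```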

 Create a Collections API for SolrCloud
 --

 Key: SOLR-3488
 URL: https://issues.apache.org/jira/browse/SOLR-3488
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller






[jira] [Comment Edited] (SOLR-3488) Create a Collections API for SolrCloud

2012-05-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283510#comment-13283510
 ] 

Yonik Seeley edited comment on SOLR-3488 at 5/25/12 2:58 PM:
-

bq. should we instead add a create node to a queue in zookeeper? 

Yeah, a work queue in ZK makes perfect sense.  Perhaps serialize the params to 
a JSON map/object per line?
edit: or perhaps it makes more sense for each operation to be a separate file 
(which is what I think you wrote anyway)

Possible parameters:
- name of the collection
- the config for the collection
- number of shards in the new collection
- default replication factor

Operations:
 - add a collection
 - remove a collection
- different options here... leave cores up, bring cores down, completely 
remove cores (and data)
 - change collection properties (replication factor, config)
 - expand collection (split shards)
 - add/remove a collection alias

Shard operations:
 - add a shard (more for custom sharding)
 - remove a shard
 - change shard properties (replication factor)
 - split a shard
 - add/remove a shard alias


  was (Author: ysee...@gmail.com):
bq. should we instead add a create node to a queue in zookeeper? 

Yeah, a work queue in ZK makes perfect sense.  Perhaps serialize the params to 
a JSON map/object per line?

Possible parameters:
- name of the collection
- the config for the collection
- number of shards in the new collection
- default replication factor

Operations:
 - add a collection
 - remove a collection
- different options here... leave cores up, bring cores down, completely 
remove cores (and data)
 - change collection properties (replication factor, config)
 - expand collection (split shards)
 - add/remove a collection alias

Shard operations:
 - add a shard (more for custom sharding)
 - remove a shard
 - change shard properties (replication factor)
 - split a shard
 - add/remove a shard alias

  
 Create a Collections API for SolrCloud
 --

 Key: SOLR-3488
 URL: https://issues.apache.org/jira/browse/SOLR-3488
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller






[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches

2012-05-25 Thread Mark Harwood (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Harwood updated LUCENE-4069:
-

Attachment: PrimaryKey40PerformanceTestSrc.zip
BloomFilterCodec40.patch

 Segment-level Bloom filters for a 2 x speed up on rare term searches
 

 Key: LUCENE-4069
 URL: https://issues.apache.org/jira/browse/LUCENE-4069
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 3.6
Reporter: Mark Harwood
Priority: Minor
 Fix For: 3.6.1

 Attachments: BloomFilterCodec40.patch, 
 MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip


 An addition to each segment which stores a Bloom filter for selected fields 
 in order to give fast-fail to term searches, helping avoid wasted disk access.
 Best suited for low-frequency fields e.g. primary keys on big indexes with 
 many segments but also speeds up general searching in my tests.
 Overview slideshow here: 
 http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments
 Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU
 Patch based on 3.6 codebase attached.
 There are no API changes currently - to play just add a field with _blm on 
 the end of the name to invoke special indexing/querying capability. Clearly a 
 new Field or schema declaration(!) would need adding to APIs to configure the 
 service properly.
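The fast-fail idea is easy to sketch: a per-segment Bloom filter answers "definitely absent" without touching the terms dictionary, at the cost of occasional false positives (which just fall through to the normal lookup). This is a minimal Python version; the hash choices and sizing are illustrative, not the patch's.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a big integer used as a bit set

    def _positions(self, term):
        for i in range(self.num_hashes):
            digest = hashlib.md5(f"{i}:{term}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, term):
        for p in self._positions(term):
            self.bits |= 1 << p

    def might_contain(self, term):
        # False means the term is definitely not in this segment, so the
        # expensive terms-dictionary lookup (and disk seek) can be skipped.
        return all(self.bits >> p & 1 for p in self._positions(term))

bf = BloomFilter()
for pk in ("doc-1", "doc-2", "doc-3"):  # e.g. primary keys in one segment
    bf.add(pk)
print(bf.might_contain("doc-2"), bf.might_contain("doc-999"))
```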




[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches

2012-05-25 Thread Mark Harwood (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Harwood updated LUCENE-4069:
-

Attachment: BloomFilterCodec40.patch
PrimaryKey40PerformanceTestSrc.zip

I've ported this Bloom Filtering code to work as a 4.0 Codec now.
I see a 35% improvement over standard Codecs on random lookups on a warmed 
index. 

I also notice that the PulsingCodec is no longer faster than the standard Codec 
- is this news to people? I thought it was supposed to be the way forward.

My test rig (adapted from Mike's original primary key test rig here 
http://blog.mikemccandless.com/2010/06/lucenes-pulsingcodec-on-primary-key.html)
 is attached as a zip.
The new BloomFilteringCodec is also attached here as a patch.

Searches against plain text fields also look to be faster (using AOL500k 
queries searching Wikipedia English) but obviously that particular test rig is 
harder to include as an attachment here.

I can open a separate JIRA issue for this 4.0 version of the code if that makes 
more sense.



 Segment-level Bloom filters for a 2 x speed up on rare term searches
 

 Key: LUCENE-4069
 URL: https://issues.apache.org/jira/browse/LUCENE-4069
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 3.6
Reporter: Mark Harwood
Priority: Minor
 Fix For: 3.6.1

 Attachments: BloomFilterCodec40.patch, 
 MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip


 An addition to each segment which stores a Bloom filter for selected fields 
 in order to give fast-fail to term searches, helping avoid wasted disk access.
 Best suited for low-frequency fields e.g. primary keys on big indexes with 
 many segments but also speeds up general searching in my tests.
 Overview slideshow here: 
 http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments
 Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU
 Patch based on 3.6 codebase attached.
 There are no API changes currently - to play just add a field with _blm on 
 the end of the name to invoke special indexing/querying capability. Clearly a 
 new Field or schema declaration(!) would need adding to APIs to configure the 
 service properly.




[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches

2012-05-25 Thread Mark Harwood (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Harwood updated LUCENE-4069:
-

Attachment: (was: BloomFilterCodec40.patch)

 Segment-level Bloom filters for a 2 x speed up on rare term searches
 

 Key: LUCENE-4069
 URL: https://issues.apache.org/jira/browse/LUCENE-4069
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 3.6
Reporter: Mark Harwood
Priority: Minor
 Fix For: 3.6.1

 Attachments: BloomFilterCodec40.patch, 
 MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip


 An addition to each segment which stores a Bloom filter for selected fields 
 in order to give fast-fail to term searches, helping avoid wasted disk access.
 Best suited for low-frequency fields e.g. primary keys on big indexes with 
 many segments but also speeds up general searching in my tests.
 Overview slideshow here: 
 http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments
 Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU
 Patch based on 3.6 codebase attached.
 There are no API changes currently - to play just add a field with _blm on 
 the end of the name to invoke special indexing/querying capability. Clearly a 
 new Field or schema declaration(!) would need adding to APIs to configure the 
 service properly.




[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches

2012-05-25 Thread Mark Harwood (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Harwood updated LUCENE-4069:
-

Attachment: (was: PrimaryKey40PerformanceTestSrc.zip)

 Segment-level Bloom filters for a 2 x speed up on rare term searches
 

 Key: LUCENE-4069
 URL: https://issues.apache.org/jira/browse/LUCENE-4069
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 3.6
Reporter: Mark Harwood
Priority: Minor
 Fix For: 3.6.1

 Attachments: BloomFilterCodec40.patch, 
 MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip


 An addition to each segment which stores a Bloom filter for selected fields 
 in order to give fast-fail to term searches, helping avoid wasted disk access.
 Best suited for low-frequency fields e.g. primary keys on big indexes with 
 many segments but also speeds up general searching in my tests.
 Overview slideshow here: 
 http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments
 Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU
 Patch based on 3.6 codebase attached.
 There are no API changes currently - to play just add a field with _blm on 
 the end of the name to invoke special indexing/querying capability. Clearly a 
 new Field or schema declaration(!) would need adding to APIs to configure the 
 service properly.




[jira] [Comment Edited] (SOLR-3488) Create a Collections API for SolrCloud

2012-05-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283382#comment-13283382
 ] 

Mark Miller edited comment on SOLR-3488 at 5/25/12 3:12 PM:


I'll post an initial patch just for create soon. It's just a start though. I've 
added a bunch of comments for TODOs or things to consider for the future. I'd 
like to start simple just to get 'something' in though.

So initially, you can create a new collection and pass an existing collection 
name to determine which shards it's created on. Would also be nice to be able 
to explicitly pass the shard urls to use, as well as simply offer X shards, Y 
replicas. In that case, perhaps the -leader- overseer could handle ensuring 
that. You might also want to be able to simply say, create it on all known 
shards.

Further things to look at:

* other commands, like remove/delete.
* what to do when some create calls fail? should we instead add a create node 
to a queue in zookeeper? Make the overseer responsible for checking for any 
jobs there, completing them (if needed) and then removing the job from the 
queue? Other ideas.

  was (Author: markrmil...@gmail.com):
I'll post an initial patch just for create soon. It's just a start though. 
I've added a bunch of comments for TODOs or things to consider for the future. 
I'd like to start simple just to get 'something' in though.

So initially, you can create a new collection and pass an existing collection 
name to determine which shards it's created on. Would also be nice to be able 
to explicitly pass the shard urls to use, as well as simply offer X shards, Y 
replicas. In that case, perhaps the leader could handle ensuring that. You 
might also want to be able to simply say, create it on all known shards.

Further things to look at:

* other commands, like remove/delete.
* what to do when some create calls fail? should we instead add a create node 
to a queue in zookeeper? Make the overseer responsible for checking for any 
jobs there, completing them (if needed) and then removing the job from the 
queue? Other ideas.
  
 Create a Collections API for SolrCloud
 --

 Key: SOLR-3488
 URL: https://issues.apache.org/jira/browse/SOLR-3488
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller






[jira] [Resolved] (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax

2012-05-25 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-2058.
--

Resolution: Fixed
  Assignee: James Dyer

Committed to Trunk, r1342681.

This is the May 17, 2012 patch which is a touched-up version of Ron Mayer's 
work from August 31, 2010 (Thank you!).

 Adds optional phrase slop to edismax pf2, pf3 and pf parameters with 
 field~slop^boost syntax
 

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
 Environment: n/a
Reporter: Ron Mayer
Assignee: James Dyer
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2058-and-3351-not-finished.patch, SOLR-2058.patch, 
 edismax_pf_with_slop_v2.1.patch, edismax_pf_with_slop_v2.patch, 
 pf2_with_slop.patch


 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3E
 {quote}
 From  Ron Mayer r...@0ape.com
 ... my results might  be even better if I had a couple different pf2s with 
 different ps's  at the same time.   In particular.   One with ps=0 to put a 
 high boost on ones the have  the right ordering of words.  For example 
 insuring that [the query]:
   red hat black jacket
  boosts only documents with red hats and not black hats.   And another 
 pf2 with a more modest boost with ps=5 or so to handle the query above also 
 boosting docs with 
   red baseball hat.
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3E]
 {quote}
 From  Yonik Seeley yo...@lucidimagination.com
 Perhaps fold it into the pf/pf2 syntax?
 pf=text^2  // current syntax... makes phrases with a boost of 2
 pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
 a boost of 2
 That actually seems pretty natural given the lucene query syntax - an
 actual boosted sloppy phrase query already looks like
 {{text:"foo bar"~1^2}}
 -Yonik
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3E]
 {quote}
 From  Chris Hostetter hossman_luc...@fucit.org
 Big +1 to this idea ... the existing ps param can stick around as the 
 default for any field that doesn't specify its own slop in the pf/pf2/pf3 
 fields using the ~ syntax.
 -Hoss
 {quote}

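The `field~slop^boost` syntax committed here can be parsed along these lines. This is an illustrative sketch of the token grammar, not edismax's actual parser, and the fallback to a default slop stands in for the `ps` parameter.

```python
import re

# field, optional ~slop, optional ^boost -- e.g. "text~1^2"
PF_TOKEN = re.compile(r"^(?P<field>[^~^]+)(?:~(?P<slop>\d+))?(?:\^(?P<boost>[\d.]+))?$")

def parse_pf(token, default_slop=0):
    m = PF_TOKEN.match(token)
    if m is None:
        raise ValueError(f"bad pf token: {token!r}")
    # When no ~slop is given, fall back to a default (the ps parameter's role).
    slop = int(m.group("slop")) if m.group("slop") else default_slop
    boost = float(m.group("boost")) if m.group("boost") else 1.0
    return m.group("field"), slop, boost

print(parse_pf("text^2"))    # current syntax: boost only
print(parse_pf("text~1^2"))  # proposed syntax: slop 1 and boost 2
```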



[jira] [Resolved] (SOLR-2923) IllegalArgumentException when using useFilterForSortedQuery on an empty index

2012-05-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-2923.
---

   Resolution: Fixed
Fix Version/s: 4.0

Thanks Adrien!

 IllegalArgumentException when using useFilterForSortedQuery on an empty index
 -

 Key: SOLR-2923
 URL: https://issues.apache.org/jira/browse/SOLR-2923
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.6, 4.0
Reporter: Adrien Grand
Assignee: Mark Miller
Priority: Trivial
 Fix For: 4.0

 Attachments: SOLR-2923.patch


 An IllegalArgumentException can occur under the following circumstances:
  - the index is empty,
  - {{useFilterForSortedQuery}} is enabled,
  - {{queryResultsCache}} is disabled.
 Here are what the exception and its stack trace look like (Solr trunk):
 {quote}
 numHits must be > 0; please use TotalHitCountCollector if you just need the 
 total hit count
 java.lang.IllegalArgumentException: numHits must be > 0; please use 
 TotalHitCountCollector if you just need the total hit count
   at 
 org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:917)
   at 
 org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1741)
   at 
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1211)
   at 
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:353)
   ...
 {quote}
 To reproduce this error from a fresh copy of Solr trunk, edit 
 {{example/solr/conf/solrconfig.xml}} to disable {{queryResultCache}} and 
 enable {{useFilterForSortedQuery}}. Then run {{ant run-example}} and issue a 
 query which sorts against any field 
 ({{http://localhost:8983/solr/select?q=*:*&sort=manu+desc}} for example).




[jira] [Commented] (SOLR-2923) IllegalArgumentException when using useFilterForSortedQuery on an empty index

2012-05-25 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283559#comment-13283559
 ] 

Adrien Grand commented on SOLR-2923:


Hi Mark, thanks for the review!

 IllegalArgumentException when using useFilterForSortedQuery on an empty index
 -

 Key: SOLR-2923
 URL: https://issues.apache.org/jira/browse/SOLR-2923
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.6, 4.0
Reporter: Adrien Grand
Assignee: Mark Miller
Priority: Trivial
 Fix For: 4.0

 Attachments: SOLR-2923.patch


 An IllegalArgumentException can occur under the following circumstances:
  - the index is empty,
  - {{useFilterForSortedQuery}} is enabled,
  - {{queryResultsCache}} is disabled.
 Here are what the exception and its stack trace look like (Solr trunk):
 {quote}
 numHits must be > 0; please use TotalHitCountCollector if you just need the 
 total hit count
 java.lang.IllegalArgumentException: numHits must be > 0; please use 
 TotalHitCountCollector if you just need the total hit count
   at 
 org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:917)
   at 
 org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1741)
   at 
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1211)
   at 
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:353)
   ...
 {quote}
 To reproduce this error from a fresh copy of Solr trunk, edit 
 {{example/solr/conf/solrconfig.xml}} to disable {{queryResultCache}} and 
 enable {{useFilterForSortedQuery}}. Then run {{ant run-example}} and issue a 
 query which sorts against any field 
 ({{http://localhost:8983/solr/select?q=*:*&sort=manu+desc}} for example).




[jira] [Updated] (SOLR-3488) Create a Collections API for SolrCloud

2012-05-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-3488:
--

Attachment: SOLR-3488.patch

I'm going on vacation for a week, so here is my early work on just getting 
something basic going. It does not involve any overseer stuff yet.

Someone feel free to take it - commit it and iterate, or iterate in patch form 
- whatever makes sense. I'll help when I get back if there is more to do, and 
if no one makes any progress, I'll continue on it when I get back.

Currently, I've copied the core admin handler pattern and made a collections 
handler. There is one simple test and currently the only way to choose which 
nodes the collection is put on is to give an existing template collection.

The test asserts nothing at the moment - all very early work. But I imagine we 
will be changing direction a fair amount, so that's good I think.



 Create a Collections API for SolrCloud
 --

 Key: SOLR-3488
 URL: https://issues.apache.org/jira/browse/SOLR-3488
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Attachments: SOLR-3488.patch







[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches

2012-05-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283574#comment-13283574
 ] 

Michael McCandless commented on LUCENE-4069:


bq. I see a 35% improvement over standard Codecs on random lookups on a warmed 
index.

Impressive!  This is for primary key lookups?

It looks like the primary keys are GUID-like right?  (Ie randomly generated).  
I wonder if they had some structure instead (eg '%09d' % (id++)) how the 
results would look...

bq. I also notice that the PulsingCodec is no longer faster than standard Codec 
- is this news to people as I thought it was supposed to be the way forward?

That's baffling to me: it should only save seeks vs Lucene40 codec, so on a 
cold index you should see substantial gains, and on a warm index I'd still 
expect some gains.  Not sure what's up...

bq. I can open a separate JIRA issue for this 4.0 version of the code if that 
makes more sense.

I think it's fine to do it here?  Really 3.6.x is only for bug fixes now ... so 
I think we should commit this to trunk.

I wonder if you can wrap any other PostingsFormat (ie instead of hard-coding to 
Lucene40PostingsFormat)?  This way users can wrap any PF they have w/ the bloom 
filter...

Can you use FixedBitSet instead of OpenBitSet?  Or is there a reason to use 
OpenBitSet here...?

 Segment-level Bloom filters for a 2 x speed up on rare term searches
 

 Key: LUCENE-4069
 URL: https://issues.apache.org/jira/browse/LUCENE-4069
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 3.6
Reporter: Mark Harwood
Priority: Minor
 Fix For: 3.6.1

 Attachments: BloomFilterCodec40.patch, 
 MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip


 An addition to each segment which stores a Bloom filter for selected fields 
 in order to give fast-fail to term searches, helping avoid wasted disk access.
 Best suited for low-frequency fields e.g. primary keys on big indexes with 
 many segments but also speeds up general searching in my tests.
 Overview slideshow here: 
 http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments
 Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU
 Patch based on 3.6 codebase attached.
 There are no API changes currently - to play just add a field with _blm on 
 the end of the name to invoke special indexing/querying capability. Clearly a 
 new Field or schema declaration(!) would need adding to APIs to configure the 
 service properly.




[jira] [Updated] (SOLR-3486) The memory size of Solr caches should be configurable

2012-05-25 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated SOLR-3486:
---

Attachment: LFUMap.java

I've just uploaded LFUMap.java based on your implementation of LFUCache. To 
have an LFU cache with a configurable maximum size in bytes, just wrap an 
instance of this class into a SizableCache.

I uploaded this file to show how SizableCache could be used with different 
kinds of backends. But building an LFUCache is a different issue. I think we 
should continue the discussion on LFUCache on SOLR-3393 and only discuss 
configurability of the memory size of Solr caches here. Feel free to reuse the 
code in LFUMap.java if you want, just beware that I didn't test it much. :-)
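As an aside, the byte-bounded eviction idea can be sketched independently of the LFU policy. Everything below (the class name, the per-entry sizing lambda) is illustrative only and is not taken from the attached LFUMap.java or the SOLR-3486 patch:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongBiFunction;

// Sketch of a byte-bounded cache: an LRU map that evicts entries once an
// estimated memory footprint (not an entry count) exceeds a limit. The
// per-entry sizer is a stand-in for a real memory estimator.
class ByteBoundedCache<K, V> extends LinkedHashMap<K, V> {
  private final long maxBytes;
  private final ToLongBiFunction<K, V> sizer;
  private long currentBytes;

  ByteBoundedCache(long maxBytes, ToLongBiFunction<K, V> sizer) {
    super(16, 0.75f, true); // access order: iteration starts at the LRU entry
    this.maxBytes = maxBytes;
    this.sizer = sizer;
  }

  @Override
  public V put(K key, V value) {
    V old = super.put(key, value);
    if (old != null) currentBytes -= sizer.applyAsLong(key, old);
    currentBytes += sizer.applyAsLong(key, value);
    // Evict least-recently-used entries until back under the byte budget.
    Iterator<Map.Entry<K, V>> it = entrySet().iterator();
    while (currentBytes > maxBytes && it.hasNext()) {
      Map.Entry<K, V> lru = it.next();
      currentBytes -= sizer.applyAsLong(lru.getKey(), lru.getValue());
      it.remove();
    }
    return old;
  }
}
```

With a budget of 20 bytes and 10-byte values, inserting a third entry evicts the least-recently-used one, regardless of entry count.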

 The memory size of Solr caches should be configurable
 -

 Key: SOLR-3486
 URL: https://issues.apache.org/jira/browse/SOLR-3486
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Adrien Grand
Priority: Minor
 Attachments: LFUMap.java, SOLR-3486.patch, SOLR-3486.patch


 It is currently possible to configure the sizes of Solr caches based on the 
 number of entries of the cache. The problem is that the memory size of cached 
 values may vary a lot over time (depending on IndexReader.maxDoc and the 
 queries that are run) although the JVM heap size does not.
 Having a configurable max size in bytes would also help optimize cache 
 utilization, making it possible to store more values provided that they have 
 a small memory footprint.




Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Dawid Weiss
 Yeah, I suppose if we could tell the test framework, for this test, ignore 
 this expected uncaught exception, that might help.

The point of handling uncaught exceptions, thread leaks etc. at the
test framework's level is in essence to capture bad tests or
unanticipated conditions, not failures that we know of or can predict.
The same goes for interrupting leaked threads (because there is no way
to do it otherwise if you don't know anything about a given thread).

That said, there are obviously ways to handle the above situation --
from trying to speed up jetty shutdown to capturing that uncaught
error. Why is jetty so slow to shut down? And what does slow mean here?

 Tests went from 6 minutes for me, to 33 minutes.

I don't think this can be explained by shutting down jetty... this
seems too long. Can you provide a repeatable test case that would
demonstrate the failure you mentioned? Once I have it, it'll be easier
to try to come up with workarounds.

Dawid




[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches

2012-05-25 Thread Mark Harwood (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283583#comment-13283583
 ] 

Mark Harwood commented on LUCENE-4069:
--

My current focus is speeding up primary key lookups but this may have 
applications outside of that (Zipf tells us there is a lot of low frequency 
stuff in free text).

Following the principle that the best IO is no IO, the Bloom filter helps us 
quickly understand which segments are even worth looking in. That has to be a 
win overall.
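The fast-fail principle can be sketched with a toy per-segment filter. The hash functions, sizing, and names below are illustrative only, not what the attached patch does:

```java
// Minimal Bloom-filter sketch of the per-segment fast-fail idea: a lookup
// that returns false means the term is definitely not in the segment, so
// the expensive terms-dictionary seek can be skipped entirely.
class SegmentBloom {
  private final long[] bits;
  private final int numBits;

  SegmentBloom(int numBits) {
    this.numBits = numBits;
    this.bits = new long[(numBits + 63) / 64];
  }

  // Cheap seeded string hash; a real filter would use something stronger.
  private int hash(String term, int seed) {
    int h = seed;
    for (int i = 0; i < term.length(); i++) h = h * 31 + term.charAt(i);
    return Math.floorMod(h, numBits);
  }

  void add(String term) {
    for (int seed = 1; seed <= 2; seed++) {
      int bit = hash(term, seed);
      bits[bit >> 6] |= 1L << (bit & 63);
    }
  }

  // false = definitely absent (skip this segment); true = maybe present.
  boolean mayContain(String term) {
    for (int seed = 1; seed <= 2; seed++) {
      int bit = hash(term, seed);
      if ((bits[bit >> 6] & (1L << (bit & 63))) == 0) return false;
    }
    return true;
  }
}
```

For a rare term (e.g. a primary key), most segments return false immediately, which is where the claimed speedup comes from; a true is only a "maybe" and still requires the normal lookup.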

I started trying to write this Codec as a wrapper for any other Codec (it 
simply listens to a stream of terms and stores a bitset of recorded hashes in a 
.blm file). However that was trickier than I expected - I'd need to encode a 
special entry in my blm files just to know the name of the delegated codec I 
needed to instantiate at read-time because Lucene's normal Codec-instantiation 
logic would be looking for BloomCodec and I'd have to discover the delegate 
that was used to write all of the non-blm files.

I haven't looked at FixedBitSet but I imagine it could be used instead.
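The delegate-discovery problem described above can be sketched as follows. The header format and names here are hypothetical, not the patch's actual .blm layout:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Since Lucene instantiates a codec by name at read time, a wrapping codec
// must somehow record which delegate wrote the non-.blm files. One way out
// is to write the delegate's name into the bloom file's header and read it
// back before opening the other files.
class BloomHeader {
  static void write(DataOutput out, String delegateName) throws IOException {
    out.writeUTF("BLM1");        // magic/version marker
    out.writeUTF(delegateName);  // e.g. "Lucene40", recovered at read time
  }

  static String readDelegateName(DataInput in) throws IOException {
    String magic = in.readUTF();
    if (!"BLM1".equals(magic)) throw new IOException("not a bloom file");
    return in.readUTF();
  }
}
```

Round-tripping the header through a byte buffer recovers the delegate's name, which could then be used to look up the wrapped PostingsFormat.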





[jira] [Commented] (LUCENE-3131) XA Resource/Transaction support

2012-05-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283588#comment-13283588
 ] 

Michael McCandless commented on LUCENE-3131:


bq. Is XA wrapper in the roadmap?

There is no roadmap in open source.  Instead, users and devs scratch their own 
itches and contribute patches to fix things, add new features, etc.

So once someone who understands XA and Lucene contributes/iterates on a patch, 
then we'll have XA support... it could be someone out there has already built 
it but just hasn't offered it back yet...

 XA Resource/Transaction  support
 

 Key: LUCENE-3131
 URL: https://issues.apache.org/jira/browse/LUCENE-3131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.1
Reporter: Magnus
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.1.1


 Please add XAResoure/XATransaction support into Lucene core.




[jira] [Created] (SOLR-3490) When DocumentObjectBinder encounters an invalid setter method, it should add that to the runtimeexception message.

2012-05-25 Thread Nicholas DiPiazza (JIRA)
Nicholas DiPiazza created SOLR-3490:
---

 Summary: When DocumentObjectBinder encounters an invalid setter 
method, it should add that to the runtimeexception message. 
 Key: SOLR-3490
 URL: https://issues.apache.org/jira/browse/SOLR-3490
 Project: Solr
  Issue Type: Improvement
Affects Versions: 3.6
 Environment: All
Reporter: Nicholas DiPiazza
Priority: Minor


While trying to use QueryResponse.getBeans(Class<T> type), I have an 
application getting the RuntimeException: "Invalid setter method. Must have one 
and only one parameter."

This is from 
org.apache.solr.client.solrj.beans.DocumentObjectBinder.DocField.storeType()

I was forced to get out the debugger in order to get the name of the Pojo and 
the Setter it is referring to. 

Please add information into the RuntimeException.

throw new RuntimeException("Invalid setter method " + setter.getName()
    + " in class " + setter.getDeclaringClass().getName()
    + ". Setter method must have one and only one parameter.");




Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Dawid Weiss
 Yeah, I suppose if we could tell the test framework, for this test, ignore 
 this expected uncaught exception, that might help.

This is not a problem technically; for a single test suite I'd just
wrap everything with a rule that would temporarily capture unhandled
exceptions, that's it. I suspect a lot of the other exceptions/problems
we're seeing are due to leaked threads and unclosed jetty/zk sockets,
so I'd rather work on trying to make this more efficient.

I am pretty swamped with other things at the moment but if you can
give me a test case that somehow shows these long jetty shutdown
times it'd be a big help.

Dawid




[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283593#comment-13283593
 ] 

Michael McCandless commented on LUCENE-4062:


Thanks Adrien, this looks great!  I'll commit soon...

 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.1

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations, 
 one that is very fast and is based on a byte/short/integer/long array 
 (Direct*) and another one which packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components from using it. On the other hand, if you store 
 21 bits integers in a Direct32, this is a space loss of (32-21)/32=35%.
 If you accept to trade some space for speed, you could store 3 of these 21 
 bits integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64 which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21 bits values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12 bits version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method select the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here is what implementations would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 53: Packed64
  * 54: Direct64
  * 55: Direct64
  * 56: Direct64
  * 57: Direct64
  * 58: Direct64
  * 59: Direct64
  * 60: Direct64
  * 61: Direct64
  * 62: Direct64
 Under 32 bits per value, only 13, 17 and 22-26 bits per value would still 
 choose the slower Packed64 implementation. Allowing a 50% overhead would 
 prevent the packed implementation from being selected for bits per value under 32. 
 Allowing an overhead of 32 bits per value would make sure that a Direct* 
 implementation is always selected.
 Next steps would be to:
  * make lucene components use this {{getMutable}} method and let users decide 
 what trade-off better suits them,
  * write a Packed32SingleBlock implementation if necessary (I didn't do it 
 because I have no 32-bits computer to test the performance improvements).
 I think this would allow more fine-grained control over the speed/space 
 trade-off, what do you think?
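The three-values-per-long layout described above might be sketched like this (the class is illustrative, not the actual Packed64SingleBlock implementation):

```java
// Sketch of the single-block idea for 21-bit values: three values per
// 64-bit long (63 bits used, 1 bit wasted per block), so every get/set
// touches exactly one long - no cross-block reads/writes like Packed64.
class Packed21 {
  private static final int VALUES_PER_BLOCK = 3;   // floor(64 / 21)
  private static final long MASK = (1L << 21) - 1;
  private final long[] blocks;

  Packed21(int valueCount) {
    blocks = new long[(valueCount + VALUES_PER_BLOCK - 1) / VALUES_PER_BLOCK];
  }

  long get(int index) {
    int shift = (index % VALUES_PER_BLOCK) * 21;
    return (blocks[index / VALUES_PER_BLOCK] >>> shift) & MASK;
  }

  void set(int index, long value) {
    int block = index / VALUES_PER_BLOCK;
    int shift = (index % VALUES_PER_BLOCK) * 21;
    blocks[block] = (blocks[block] & ~(MASK << shift)) | ((value & MASK) << shift);
  }
}
```

For 1000 values this allocates 334 longs (2672 bytes) versus the 2625 bytes of a fully packed layout, which matches the "less than 2% more space" figure quoted above.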





[jira] [Resolved] (SOLR-2822) don't run update processors twice

2012-05-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-2822.


Resolution: Fixed
  Assignee: Hoss Man

Committed revision 1342743.


 don't run update processors twice
 -

 Key: SOLR-2822
 URL: https://issues.apache.org/jira/browse/SOLR-2822
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud, update
Reporter: Yonik Seeley
Assignee: Hoss Man
 Fix For: 4.0

 Attachments: SOLR-2822.patch, SOLR-2822.patch, SOLR-2822.patch


 An update will first go through processors until it gets to the point where 
 it is forwarded to the leader (or forwarded to replicas if already on the 
 leader).
 We need a way to skip over the processors that were already run (perhaps by 
 using a processor chain dedicated to sub-updates?




[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches

2012-05-25 Thread Mark Harwood (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283615#comment-13283615
 ] 

Mark Harwood commented on LUCENE-4069:
--

Update - I've discovered this Bloom Filter Codec currently has a bug where it 
doesn't handle indexes with more than one field.
It's probably all tangled up in the PerField... codec logic so I need to do 
some more digging.





Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #205

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/205/changes

Changes:

[markrmiller] SOLR-2923: IllegalArgumentException when using 
useFilterForSortedQuery on an empty index.

--
[...truncated 10778 lines...]
   [junit4] Completed in 0.03s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.util.FileUtilsTest
   [junit4] Completed in 0.01s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestCodecSupport
   [junit4] Completed in 0.26s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.ConvertedLegacyTest
   [junit4] Completed in 5.05s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.component.DebugComponentTest
   [junit4] Completed in 1.49s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.BasicDistributedZkTest
   [junit4] Completed in 54.73s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestRangeQuery
   [junit4] Completed in 9.22s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestRecovery
   [junit4] Completed in 13.61s, 9 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.BasicZkTest
   [junit4] Completed in 8.74s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.update.AutoCommitTest
   [junit4] Completed in 9.68s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.TestGroupingSearch
   [junit4] Completed in 6.01s, 12 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.component.QueryElevationComponentTest
   [junit4] Completed in 7.61s, 7 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.update.PeerSyncTest
   [junit4] Completed in 5.34s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterFSTTest
   [junit4] Completed in 1.65s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.MoreLikeThisHandlerTest
   [junit4] Completed in 1.30s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.StandardRequestHandlerTest
   [junit4] Completed in 1.06s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.BasicFunctionalityTest
   [junit4] IGNORED 0.00s | BasicFunctionalityTest.testDeepPaging
   [junit4] Cause: Annotated @Ignore(See SOLR-1726)
   [junit4] Completed in 3.08s, 23 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.SolrInfoMBeanTest
   [junit4] Completed in 0.94s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.update.SolrCmdDistributorTest
   [junit4] Completed in 2.08s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest
   [junit4] Completed in 1.63s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy2
   [junit4] Completed in 1.15s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.update.TestIndexingPerformance
   [junit4] Completed in 0.89s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestPseudoReturnFields
   [junit4] Completed in 1.44s, 13 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.highlight.HighlighterTest
   [junit4] Completed in 1.99s, 27 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactoryTest
   [junit4] Completed in 0.81s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.SpatialFilterTest
   [junit4] Completed in 1.52s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.servlet.NoCacheHeaderTest
   [junit4] Completed in 1.04s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.update.SolrIndexConfigTest
   [junit4] Completed in 1.82s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestDocSet
   [junit4] Completed in 0.63s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.response.TestPHPSerializedResponseWriter
   [junit4] Completed in 1.00s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.DisMaxRequestHandlerTest
   [junit4] Completed in 1.09s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.JsonLoaderTest
   [junit4] Completed in 1.05s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.IndexReaderFactoryTest
   [junit4] Completed in 0.88s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestExtendedDismaxParser
   [junit4] Completed in 9.15s, 8 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.TestCollationField
   [junit4] Completed in 0.45s, 8 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.IndexSchemaRuntimeFieldTest
   [junit4] Completed in 1.19s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.core.AlternateDirectoryTest
   [junit4] Completed in 1.01s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.UpdateRequestProcessorFactoryTest
   [junit4] Completed in 1.06s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.ReturnFieldsTest
   [junit4] Completed in 1.25s, 10 tests
   [junit4]  
   [junit4] Suite: 

[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2012-05-25 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283631#comment-13283631
 ] 

Simon Willnauer commented on LUCENE-2878:
-

bq. This might take a little longer, in that it will require me to actually 
think about what I'm doing...
no worries, good job so far. Did the updated patch make sense to you? I think 
you've had a good warmup phase; now we can go somewhat deeper!



 Allow Scorer to expose positions and payloads aka. nuke spans 
 --

 Key: LUCENE-2878
 URL: https://issues.apache.org/jira/browse/LUCENE-2878
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: Positions Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
 mentor
 Fix For: Positions Branch

 Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, 
 PosHighlighter.patch


 Currently we have two somewhat separate types of queries, the one which can 
 make use of positions (mainly spans) and payloads (spans). Yet Span*Query 
 doesn't really do scoring comparable to what other queries do and at the end 
 of the day they are duplicating a lot of code all over Lucene. Span*Queries are 
 also limited to other Span*Query instances such that you cannot use a 
 TermQuery or a BooleanQuery with SpanNear or anything like that. 
 Besides the Span*Query limitation, other queries lack a quite interesting 
 feature: they cannot score based on term proximity, since scorers don't 
 expose any positional information. All those problems bugged me for a while 
 now, so I started working on that using the bulkpostings API. I would have done 
 that first cut on trunk but TermScorer is working on BlockReader that do not 
 expose positions while the one in this branch does. I started adding a new 
 Positions class which users can pull from a scorer, to prevent unnecessary 
 positions enums I added ScorerContext#needsPositions and eventually 
 Scorer#needsPayloads to create the corresponding enum on demand. Yet, 
 currently only TermQuery / TermScorer implements this API and other simply 
 return null instead. 
 To show that the API really works and our BulkPostings work fine too with 
 positions I cut over TermSpanQuery to use a TermScorer under the hood and 
 nuked TermSpans entirely. A nice side effect of this was that the Position 
 BulkReading implementation got some exercise, which now :) all works with 
 positions, while Payloads for bulkreading are kind of experimental in the 
 patch and those only work with Standard codec. 
 So all spans now work on top of TermScorer ( I truly hate spans since today ) 
 including the ones that need Payloads (StandardCodec ONLY)!!  I didn't bother 
 to implement the other codecs yet since I want to get feedback on the API and 
 on this first cut before I go on with it. I will upload the corresponding 
 patch in a minute. 
 I also had to cut over SpanQuery.getSpans(IR) to 
 SpanQuery.getSpans(AtomicReaderContext) which I should probably do on trunk 
 first but after that pain today I need a break first :).
 The patch passes all core tests 
 (org.apache.lucene.search.highlight.HighlighterTest still fails but I didn't 
 look into the MemoryIndex BulkPostings API yet)




[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14333 - Failure

2012-05-25 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14333/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.FullSolrCloudTest.testDistribSearch

Error Message:
Timeout occured while waiting response from server at: 
http://localhost:56592/solr/collection1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting 
response from server at: http://localhost:56592/solr/collection1
at 
__randomizedtesting.SeedInfo.seed([B8E9683C451CA579:390FE6243243C545]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:433)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:209)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at 
org.apache.solr.cloud.FullSolrCloudTest.index_specific(FullSolrCloudTest.java:498)
at 
org.apache.solr.cloud.FullSolrCloudTest.brindDownShardIndexSomeDocsAndRecover(FullSolrCloudTest.java:713)
at 
org.apache.solr.cloud.FullSolrCloudTest.doTest(FullSolrCloudTest.java:550)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:680)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 

Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Mark Miller

On May 25, 2012, at 12:26 PM, Dawid Weiss wrote:

 I don't think this can be explained by shutting down jetty... this
 seems too long. Can you provide a repeatable test case that would
 demonstrate the failure you mentioned? Once I have it, it'll be easier
 to try to come up with workarounds.
 
 Dawid

No, I don't think it would be that easy to make a repeatable test case, so I 
don't think I'll have near-term time for it. This one is not really a practical 
issue, so it's low on my priority list. It repeats on jenkins on the rare occasion ;)

Jetty seems more likely than the test framework to me - IW#close happens well 
before the test is over, and in the main thread, and that is what is 
interrupted (waiting for merges to finish)...and Jetty will send an interrupt 
on shutdown after the graceful shutdown timeout. Increasing that timeout will 
drastically lessen the chances of it happening - but we start and shut down 
jetties serially, and that is likely why it's so much longer - some tests use a 
lot of jetties.

Trying to stop jetties in parallel might be one thing to try obviously. 
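One generic way that parallel stop could be sketched is below. This is illustrative only, not the test framework's actual shutdown code: plain `Runnable` stand-ins replace the jetty instances, and the pool sizing and timeout are arbitrary choices.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelStopper {

    /** Fire every stop() concurrently, then wait for all of them to finish. */
    public static void stopAll(List<Runnable> stoppers, long timeoutSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, stoppers.size()));
        for (Runnable stop : stoppers) {
            pool.submit(stop); // each jetty's stop runs on its own thread
        }
        pool.shutdown(); // accept no new tasks, let the in-flight stops drain
        try {
            pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
    }
}
```

With N jetties each taking up to the graceful timeout, the serial cost is roughly N times the timeout, while the parallel version is bounded by the slowest single jetty.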

- Mark Miller
lucidimagination.com


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-25 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-4062.


   Resolution: Fixed
Fix Version/s: (was: 4.1)
   4.0

Thanks Adrien!

 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.0

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations, 
 one that is very fast and is based on a byte/short/integer/long array 
 (Direct*) and another one which packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components from using it. On the other hand, if you store 
 21 bits integers in a Direct32, this is a space loss of (32-21)/32=35%.
 If you accept to trade some space for speed, you could store 3 of these 21 
 bits integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64 which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21 bits values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12 bits version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method selects the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here is what implementations would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 53: Packed64
  * 54: Direct64
  * 55: Direct64
  * 56: Direct64
  * 57: Direct64
  * 58: Direct64
  * 59: Direct64
  * 60: Direct64
  * 61: Direct64
  * 62: Direct64
 Under 32 bits per value, only 13, 17 and 22-26 bits per value would still 
 choose the slower Packed64 implementation. Allowing a 50% overhead would 
 prevent the packed implementation from being selected for bits per value under 32. 
 Allowing an overhead of 32 bits per value would make sure that a Direct* 
 implementation is always selected.
 Next steps would be to:
  * make Lucene components use this {{getMutable}} method and let users decide 
 which trade-off best suits them,
  * write a Packed32SingleBlock implementation if necessary (I didn't do it 
 because I have no 32-bits computer to test the performance improvements).
 I think this would allow more fine-grained control over the speed/space 
 trade-off. What do you think?
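As a rough illustration of the selection rule described above, here is a sketch in Java. The class names follow the table in the issue, but the candidate lists, the ordering, and the waste computation are a reconstruction, not the committed Lucene code.

```java
/**
 * Illustrative sketch of getMutable-style selection: pick the fastest
 * implementation whose wasted bits per value fit the acceptable overhead.
 */
public class PackedSelectionSketch {

    // Block sizes for which a Packed64SingleBlock-style class is assumed to exist.
    private static final int[] SINGLE_BLOCK_BITS = {1, 2, 3, 4, 5, 6, 7, 9, 10, 12, 16, 21, 32};

    public static String chooseImpl(int bitsPerValue, float acceptableOverheadPerValue) {
        // Direct* is fastest: use it if its padding fits the overhead budget.
        for (int direct : new int[] {8, 16, 32, 64}) {
            if (direct >= bitsPerValue && direct - bitsPerValue <= acceptableOverheadPerValue) {
                return "Direct" + direct;
            }
        }
        // Packed64SingleBlock*: round bitsPerValue up to a supported block size.
        // Waste per value = padding inside each stored value, plus the unused
        // bits at the top of every 64-bit block amortized over its values.
        for (int block : SINGLE_BLOCK_BITS) {
            if (block < bitsPerValue) {
                continue;
            }
            int valuesPerBlock = 64 / block;
            float waste = (block - bitsPerValue)
                    + (64 - valuesPerBlock * block) / (float) valuesPerBlock;
            if (waste <= acceptableOverheadPerValue) {
                return "Packed64SingleBlock" + block;
            }
            break; // larger block sizes only waste more
        }
        // Fall back to the space-optimal but slower two-block implementation.
        return "Packed64";
    }
}
```

With `acceptableOverheadPerValue = 0.2f * bitsPerValue` this reproduces the mapping above, e.g. Direct8 for 7 bits, Packed64SingleBlock21 for 21 bits, and Packed64 for 13 bits.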

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #206

2012-05-25 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/206/changes





[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2012-05-25 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-2878:
--

Attachment: LUCENE-2878.patch

New patch, implementing positions() for ReqExclScorer and ReqOptSumScorer, with 
a couple of basic tests.

These just return Conj/Disj PositionIterators, ignoring the excluded Scorers. 
They work in the simple cases that I've got here, but they may need to be made 
more complex when we take proximity searches into account.
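For context, the shape of the API being filled in here is roughly the following. Names like PositionIterator and nextPosition() follow the patch discussion, but this is an illustrative sketch, not the committed API, and NO_MORE_POSITIONS is an assumed sentinel.

```java
// Illustrative only: a scorer-side positions iterator as discussed in this issue.
interface PositionIterator {
    int NO_MORE_POSITIONS = Integer.MAX_VALUE;

    /** Advance to the next match position in the current document. */
    int nextPosition();
}

// A trivial backing implementation over a fixed array of positions.
class ArrayPositionIterator implements PositionIterator {
    private final int[] positions;
    private int upto;

    ArrayPositionIterator(int[] positions) {
        this.positions = positions;
    }

    @Override
    public int nextPosition() {
        return upto < positions.length ? positions[upto++] : NO_MORE_POSITIONS;
    }
}
```

A conjunction or disjunction iterator would then merge several of these streams, which is where the proximity-aware complexity mentioned above would live.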

 Allow Scorer to expose positions and payloads aka. nuke spans 
 --

 Key: LUCENE-2878
 URL: https://issues.apache.org/jira/browse/LUCENE-2878
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: Positions Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
 mentor
 Fix For: Positions Branch

 Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, 
 PosHighlighter.patch


 Currently we have two somewhat separate types of queries: the ones which can 
 make use of positions (mainly spans) and payloads (spans). Yet Span*Query 
 doesn't really do scoring comparable to what other queries do, and at the end 
 of the day they duplicate a lot of code all over Lucene. Span*Queries are 
 also limited to other Span*Query instances, such that you cannot use a 
 TermQuery or a BooleanQuery with SpanNear or anything like that. 
 Besides the Span*Query limitation, other queries lack a quite interesting 
 feature: they cannot score based on term proximity, since scorers don't 
 expose any positional information. All those problems bugged me for a while, 
 so I started working on that using the bulkpostings API. I would have done 
 the first cut on trunk, but TermScorer there works on a BlockReader that does 
 not expose positions, while the one in this branch does. I started adding a new 
 Positions class which users can pull from a scorer; to prevent unnecessary 
 positions enums I added ScorerContext#needsPositions and eventually 
 Scorer#needsPayloads to create the corresponding enum on demand. Yet, 
 currently only TermQuery / TermScorer implements this API and others simply 
 return null instead. 
 To show that the API really works, and that our BulkPostings work fine with 
 positions too, I cut TermSpanQuery over to use a TermScorer under the hood and 
 nuked TermSpans entirely. A nice side effect of this was that the Position 
 BulkReading implementation got some exercise, which now all works with 
 positions :) Payloads for bulk reading are still kind of experimental in the 
 patch and only work with the Standard codec. 
 So all spans now work on top of TermScorer (I truly hate spans since today), 
 including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother 
 to implement the other codecs yet since I want to get feedback on the API and 
 on this first cut before I go on with it. I will upload the corresponding 
 patch in a minute. 
 I also had to cut SpanQuery.getSpans(IR) over to 
 SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk 
 first, but after that pain today I need a break first :).
 The patch passes all core tests 
 (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't 
 look into the MemoryIndex BulkPostings API yet).







Re: [JENKINS] Solr-trunk - Build # 1865 - Failure

2012-05-25 Thread Mark Miller

On May 25, 2012, at 2:07 PM, Mark Miller wrote:

 Trying to stop jetties in parallel might be one thing to try obviously. 

But I still expected to see an ugly slowdown on many tests (e.g. even 30 seconds 
* 10 tests is a significant add).

It may be we simply have to do it in this one test though (add to the graceful 
exit time) - other tests don't have enough indexing occurring to cause long 
merges at the end, I think.

- Mark Miller
lucidimagination.com




[jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible

2012-05-25 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283725#comment-13283725
 ] 

Andrzej Bialecki  commented on LUCENE-4055:
---

+1, this looks very good.

One comment re. SegmentInfoPerCommit: this class is not extensible and contains 
a fixed set of attributes. In LUCENE-3837 this or a similar place would be the 
ideal mechanism to carry info about stacked segments, since this information is 
specific to a commit point. Unfortunately, there are no Map<String,String> 
attributes at this level, so I guess for now this type of aux data will have to 
be put in SegmentInfos.userData even though it's not index-global but 
segment-specific.

 Refactor SegmentInfo / FieldInfo to make them extensible
 

 Key: LUCENE-4055
 URL: https://issues.apache.org/jira/browse/LUCENE-4055
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/codecs
Reporter: Andrzej Bialecki 
Assignee: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-4055.patch


 After LUCENE-4050 is done the resulting SegmentInfo / FieldInfo classes 
 should be made abstract so that they can be extended by Codec-s.







[jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible

2012-05-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283738#comment-13283738
 ] 

Robert Muir commented on LUCENE-4055:
-

Right, but I think this is correct: the codec should be responsible for 
encode/decode of inverted index segments only (the whole problem here 
originally was trying to have it also look after commits).

So it really shouldn't be customizing things about the commit, as that creates 
a confusing impedance mismatch.

I think things like stacked segments in LUCENE-3837 that need to do things 
other than implement encoding/decoding of a segment should be above the codec 
level: since it's a separate concern, if someone wants to have updatable fields, 
that's unrelated to the integer compression algorithm used or what not.


 Refactor SegmentInfo / FieldInfo to make them extensible
 

 Key: LUCENE-4055
 URL: https://issues.apache.org/jira/browse/LUCENE-4055
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/codecs
Reporter: Andrzej Bialecki 
Assignee: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-4055.patch


 After LUCENE-4050 is done the resulting SegmentInfo / FieldInfo classes 
 should be made abstract so that they can be extended by Codec-s.







[jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible

2012-05-25 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283755#comment-13283755
 ] 

Andrzej Bialecki  commented on LUCENE-4055:
---

bq. stacked segments in LUCENE-3837 that need to do things other than implement 
encoding/decoding of segment should be above the codec level ..
Certainly, that's why it would make sense to put this extended info in 
SegmentInfoPerCommit and not in any file handled by Codec.

 Refactor SegmentInfo / FieldInfo to make them extensible
 

 Key: LUCENE-4055
 URL: https://issues.apache.org/jira/browse/LUCENE-4055
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/codecs
Reporter: Andrzej Bialecki 
Assignee: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-4055.patch


 After LUCENE-4050 is done the resulting SegmentInfo / FieldInfo classes 
 should be made abstract so that they can be extended by Codec-s.







[jira] [Comment Edited] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible

2012-05-25 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283755#comment-13283755
 ] 

Andrzej Bialecki  edited comment on LUCENE-4055 at 5/25/12 9:32 PM:


bq. stacked segments in LUCENE-3837 that need to do things other than implement 
encoding/decoding of segment should be above the codec level ..
Certainly, that's why it would make sense to put this extended info in 
SegmentInfoPerCommit and not in any file handled by Codec. My comment was about 
the lack of easy extensibility of the codec-independent per-segment data 
(SegmentInfoPerCommit - info about stacked data is per-segment and per-commit), 
so LUCENE-3837 will need to use for now the codec-independent index-global data 
(SegmentInfos). It's not ideal but not a deal breaker either, especially since 
we now have version info in both of these places.

  was (Author: ab):
bq. stacked segments in LUCENE-3837 that need to do things other than 
implement encoding/decoding of segment should be above the codec level ..
Certainly, that's why it would make sense to put this extended info in 
SegmentInfoPerCommit and not in any file handled by Codec.
  
 Refactor SegmentInfo / FieldInfo to make them extensible
 

 Key: LUCENE-4055
 URL: https://issues.apache.org/jira/browse/LUCENE-4055
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/codecs
Reporter: Andrzej Bialecki 
Assignee: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-4055.patch


 After LUCENE-4050 is done the resulting SegmentInfo / FieldInfo classes 
 should be made abstract so that they can be extended by Codec-s.







[jira] [Resolved] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-3489.
---

Resolution: Fixed

Thanks for reporting. Your patch (which is identical to the trunk code) is 
committed to branch 3_6.

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Assignee: Jan Høydahl
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice if that space were simply ignored.
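A tolerant parse along those lines might look like the following sketch. `parseConfFiles` is a hypothetical helper for illustration, not the actual replication handler code.

```java
import java.util.ArrayList;
import java.util.List;

public class ConfFilesSketch {

    /** Split the confFiles value on commas, trimming stray whitespace and
     *  dropping empty entries, so " stopwords.txt" is still replicated. */
    public static List<String> parseConfFiles(String confFiles) {
        List<String> names = new ArrayList<>();
        for (String name : confFiles.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }
}
```

Trimming each entry rather than the whole string is what makes the space before stopwords.txt harmless.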




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-3489) Config file replication less error prone

2012-05-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-3489:
-

Assignee: Jan Høydahl

 Config file replication less error prone
 

 Key: SOLR-3489
 URL: https://issues.apache.org/jira/browse/SOLR-3489
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Affects Versions: 3.6
Reporter: Jochen Just
Assignee: Jan Høydahl
Priority: Minor
 Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz


 If the listing of configuration files that should be replicated contains a 
 space, the following file is not replicated.
 Example:
 {code:xml}
 <!-- The error in the configuration is the space before stopwords.txt.
  Because of that, that file is not replicated -->
 <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
 {code}
 It would be nice if that space were simply ignored.






