Sponsoring porting work

2012-06-11 Thread Itamar Syn-Hershko
Hi devs,

We are looking to sponsor porting work, to help keep up the pace of
development and bring Lucene.Net closer to Java Lucene. Unfortunately the
amount of work I can put into this is very limited, and being up to speed
with Lucene is important to us, hence the idea to offer sponsorship.

I'm not entirely sure how these things work under the Apache umbrella, but
I'd imagine there isn't a real issue doing that. All work will be handed
back to the project under the ASL of course. I'd appreciate any guidance if
needed.

In the meantime, interested parties are welcome to contact me privately.

Itamar.


[jira] [Commented] (LUCENENET-493) Make lucene.net culture insensitive (like the java version)

2012-06-11 Thread Christopher Currens (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293189#comment-13293189
 ] 

Christopher Currens commented on LUCENENET-493:
---

This is rather annoying, actually.  Java has tests for different cultures wired 
into the test suite.  Interestingly enough, so do we, but because of the 
differences between JUnit and NUnit (namely attribute based test discovery), we 
can't override the test running implementation in the same way java does.  So, 
the code we've ported for testing cultures does not work...period.  NUnit 
supports changing the cultures via attributes, but only a single culture.  
MbUnit allows multiple cultures and will run the test each time in that 
culture.  We should find a workaround.

 Make lucene.net culture insensitive (like the java version)
 ---

 Key: LUCENENET-493
 URL: https://issues.apache.org/jira/browse/LUCENENET-493
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core, Lucene.Net Test
Affects Versions: Lucene.Net 3.0.3
Reporter: Luc Vanlerberghe
  Labels: patch
 Fix For: Lucene.Net 3.0.3

 Attachments: Lucenenet-493.patch


 In Java, conversion of the basic types to and from strings is locale 
 (culture) independent. For localized input/output one needs to use the 
 classes in the java.text package.
 In .Net, conversion of the basic types to and from strings depends on the 
 default Culture.  Otherwise you have to specify CultureInfo.InvariantCulture 
 explicitly.
 Some of the testcases in lucene.net fail if they are not run on a machine 
 with culture set to US.
 In the current version of lucene.net there are patches here and there that 
 try to correct for some specific cases by using string replacement (like  
 System.Double.Parse(s.Replace(".", 
 CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator))), but that 
 seems really ugly.
 I submit a patch here that removes the old workarounds and replaces them by 
 calls to classes in the Lucene.Net.Support namespace that try to handle the 
 conversions in a compatible way.
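
The Java behaviour described above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not taken from the attached patch; the sketch only assumes the standard java.lang and java.text classes.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical demo class; names are illustrative and not part of Lucene or the patch.
final class LocaleParsingDemo {

    // Locale-independent: Double.parseDouble always expects '.' as the decimal separator,
    // regardless of the JVM default locale.
    static double parseInvariant(String s) {
        return Double.parseDouble(s);
    }

    // Locale-aware: java.text.NumberFormat follows the conventions of the given locale.
    static double parseLocalized(String s, Locale locale) {
        try {
            return NumberFormat.getInstance(locale).parse(s).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a number in locale " + locale + ": " + s, e);
        }
    }

    // Locale-independent formatting: Double.toString never uses a locale-specific separator.
    static String formatInvariant(double d) {
        return Double.toString(d);
    }
}
```

For example, parseInvariant("1.5") and parseLocalized("1,5", Locale.GERMANY) both yield 1.5. A .NET port that wants the first behaviour has to pass CultureInfo.InvariantCulture explicitly, which is the gap the issue describes.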

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (LUCENE-4118) FastVectorHighlighter fail to highlight taking in input some proximity query.

2012-06-11 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned LUCENE-4118:
--

Assignee: Koji Sekiguchi

 FastVectorHighlighter fail to highlight taking in input some proximity query.
 -

 Key: LUCENE-4118
 URL: https://issues.apache.org/jira/browse/LUCENE-4118
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/highlighter
Affects Versions: 3.4, 5.0
Reporter: Emanuele Lombardi
Assignee: Koji Sekiguchi
  Labels: FastVectorHighlighter
 Attachments: FVHPatch.txt


 There are 2 related bugs with proximity queries:
 1) If a phrase contains n repeated terms, the FVH module fails to highlight 
 it.
 see testRepeatedTermsWithSlop
 2) If you search with the terms reversed, the FVH module fails to highlight it.
 see testReversedTermsWithSlop




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1348623 - in /lucene/dev/branches/branch_4x: ./ dev-tools/ lucene/ lucene/analysis/ lucene/analysis/common/ lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std31/

2012-06-11 Thread Simon Willnauer
fixed thanks dawid!

On Sun, Jun 10, 2012 at 7:12 PM, Dawid Weiss dawid.we...@gmail.com wrote:
 Synchonizer -> Synchronizer?

 D.

 On Sun, Jun 10, 2012 at 6:42 PM,  sim...@apache.org wrote:
 Author: simonw
 Date: Sun Jun 10 16:42:55 2012
 New Revision: 1348623

 URL: http://svn.apache.org/viewvc?rev=1348623&view=rev
 Log:
 LUCENE-4116: fix concurrency test for DWPTStallControl

 Modified:
    lucene/dev/branches/branch_4x/   (props changed)
    lucene/dev/branches/branch_4x/dev-tools/   (props changed)
    lucene/dev/branches/branch_4x/lucene/   (props changed)
    lucene/dev/branches/branch_4x/lucene/BUILD.txt   (props changed)
    lucene/dev/branches/branch_4x/lucene/CHANGES.txt   (props changed)
    lucene/dev/branches/branch_4x/lucene/JRE_VERSION_MIGRATION.txt   (props 
 changed)
    lucene/dev/branches/branch_4x/lucene/LICENSE.txt   (props changed)
    lucene/dev/branches/branch_4x/lucene/MIGRATE.txt   (props changed)
    lucene/dev/branches/branch_4x/lucene/NOTICE.txt   (props changed)
    lucene/dev/branches/branch_4x/lucene/README.txt   (props changed)
    lucene/dev/branches/branch_4x/lucene/analysis/   (props changed)
    lucene/dev/branches/branch_4x/lucene/analysis/common/   (props changed)
    
 lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std31/package.html
    (props changed)
    
 lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std34/package.html
    (props changed)
    lucene/dev/branches/branch_4x/lucene/backwards/   (props changed)
    lucene/dev/branches/branch_4x/lucene/benchmark/   (props changed)
    lucene/dev/branches/branch_4x/lucene/build.xml   (props changed)
    lucene/dev/branches/branch_4x/lucene/common-build.xml   (props changed)
    lucene/dev/branches/branch_4x/lucene/core/   (props changed)
    
 lucene/dev/branches/branch_4x/lucene/core/src/java/org/apache/lucene/index/DocumentsWriterStallControl.java
    
 lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestDocumentsWriterStallControl.java
    lucene/dev/branches/branch_4x/lucene/demo/   (props changed)
    lucene/dev/branches/branch_4x/lucene/facet/   (props changed)
    lucene/dev/branches/branch_4x/lucene/grouping/   (props changed)
    lucene/dev/branches/branch_4x/lucene/highlighter/   (props changed)
    lucene/dev/branches/branch_4x/lucene/ivy-settings.xml   (props changed)
    lucene/dev/branches/branch_4x/lucene/join/   (props changed)
    lucene/dev/branches/branch_4x/lucene/memory/   (props changed)
    lucene/dev/branches/branch_4x/lucene/misc/   (props changed)
    lucene/dev/branches/branch_4x/lucene/module-build.xml   (props changed)
    lucene/dev/branches/branch_4x/lucene/queries/   (props changed)
    lucene/dev/branches/branch_4x/lucene/queryparser/   (props changed)
    lucene/dev/branches/branch_4x/lucene/sandbox/   (props changed)
    lucene/dev/branches/branch_4x/lucene/site/   (props changed)
    lucene/dev/branches/branch_4x/lucene/spatial/   (props changed)
    lucene/dev/branches/branch_4x/lucene/suggest/   (props changed)
    lucene/dev/branches/branch_4x/lucene/test-framework/   (props changed)
    lucene/dev/branches/branch_4x/lucene/tools/   (props changed)
    lucene/dev/branches/branch_4x/solr/   (props changed)
    lucene/dev/branches/branch_4x/solr/CHANGES.txt   (props changed)
    lucene/dev/branches/branch_4x/solr/LICENSE.txt   (props changed)
    lucene/dev/branches/branch_4x/solr/NOTICE.txt   (props changed)
    lucene/dev/branches/branch_4x/solr/README.txt   (props changed)
    lucene/dev/branches/branch_4x/solr/build.xml   (props changed)
    lucene/dev/branches/branch_4x/solr/cloud-dev/   (props changed)
    lucene/dev/branches/branch_4x/solr/common-build.xml   (props changed)
    lucene/dev/branches/branch_4x/solr/contrib/   (props changed)
    lucene/dev/branches/branch_4x/solr/core/   (props changed)
    lucene/dev/branches/branch_4x/solr/dev-tools/   (props changed)
    lucene/dev/branches/branch_4x/solr/example/   (props changed)
    lucene/dev/branches/branch_4x/solr/lib/   (props changed)
    lucene/dev/branches/branch_4x/solr/lib/httpclient-LICENSE-ASL.txt   
 (props changed)
    lucene/dev/branches/branch_4x/solr/lib/httpclient-NOTICE.txt   (props 
 changed)
    lucene/dev/branches/branch_4x/solr/lib/httpcore-LICENSE-ASL.txt   (props 
 changed)
    lucene/dev/branches/branch_4x/solr/lib/httpcore-NOTICE.txt   (props 
 changed)
    lucene/dev/branches/branch_4x/solr/lib/httpmime-LICENSE-ASL.txt   (props 
 changed)
    lucene/dev/branches/branch_4x/solr/lib/httpmime-NOTICE.txt   (props 
 changed)
    lucene/dev/branches/branch_4x/solr/scripts/   (props changed)
    lucene/dev/branches/branch_4x/solr/solrj/   (props changed)
    lucene/dev/branches/branch_4x/solr/test-framework/   (props changed)
    lucene/dev/branches/branch_4x/solr/testlogging.properties   (props 
 changed)
    lucene/dev/branches/branch_4x/solr/webapp/   (props changed)

 

[JENKINS] Lucene-Solr-trunk-Windows-Java7-64 - Build # 291 - Failure!

2012-06-11 Thread jenkins
Build: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/291/

1 tests failed.
REGRESSION:  org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<494> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<494> but was:<0>
at 
__randomizedtesting.SeedInfo.seed([B032809CBA0B0DAB:3866BF4614F76053]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:716)
at 
org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:254)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at 
org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at 
org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log:
[...truncated 17404 lines...]
   [junit4]   2 56759 T3040 C189 REQ [collection1] webapp=/solr 
path=/replication 
params={command=filecontent&checksum=true&generation=16&wt=filestream&file=_9.fnm}
 status=0 QTime=0 
   [junit4]   2 56764 T3040 C189 REQ 

[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml

2012-06-11 Thread Bernd Fehling (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292688#comment-13292688
 ] 

Bernd Fehling commented on SOLR-2724:
-

So this issue has been stale since March, and Solr now hangs between heaven and 
hell: defaultSearchField is disabled in schema.xml and marked as 
deprecated, but the method getDefaultSearchFieldName() still exists and now 
gives no fallback to a default. This is bad and breaks pieces of Solr, like 
edismax and several more.
And the solution with df in the defaults of RequestHandler is also not working.

Now what, revert and release a fixed 3.6.2 or fix getDefaultSearchFieldName() 
and release a 3.6.2?

What about defaultOperator, does it produce the same kind of 
problems as defaultSearchField?



 Deprecate defaultSearchField and defaultOperator defined in schema.xml
 --

 Key: SOLR-2724
 URL: https://issues.apache.org/jira/browse/SOLR-2724
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis, search
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: 
 SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 I've always been surprised to see the <defaultSearchField> element and 
 <solrQueryParser defaultOperator="OR"/> defined in the schema.xml file since 
 the first time I saw them.  They just seem out of place to me since they are 
 more query parser related than schema related. But not only are they 
 misplaced, I feel they shouldn't exist. For query parsers, we already have a 
 df parameter that works just fine, and explicit field references. And the 
 default lucene query operator should stay at OR -- if a particular query 
 wants different behavior then use q.op or simply use OR.
 <similarity> seems like something better placed in solrconfig.xml than in the 
 schema. 
 In my opinion, defaultSearchField and defaultOperator configuration elements 
 should be deprecated in Solr 3.x and removed in Solr 4.  And similarity 
 should move to solrconfig.xml. I am willing to do it, provided there is 
 consensus on it of course.




[jira] [Commented] (SOLR-3526) Remove classfile dependency on ZooKeeper from CoreContainer

2012-06-11 Thread Michael Froh (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292689#comment-13292689
 ] 

Michael Froh commented on SOLR-3526:


Oh, thanks a lot for pointing that out, Hoss! I had completely missed that part.

My wish for the removal of the KeeperException reference from CoreContainer 
still stands, but using NoOpDistributingUpdateProcessorFactory lets me remove 
my current hacky solution (adding a dummy org.apache.zookeeper.KeeperException 
in one of my libraries).

 Remove classfile dependency on ZooKeeper from CoreContainer
 ---

 Key: SOLR-3526
 URL: https://issues.apache.org/jira/browse/SOLR-3526
 Project: Solr
  Issue Type: Wish
  Components: SolrCloud
Affects Versions: 4.0
Reporter: Michael Froh

 We are using Solr as a library embedded within an existing application, and 
 are currently developing toward using 4.0 when it is released.
 We are currently instantiating SolrCores with null CoreDescriptors (and hence 
 no CoreContainer), since we don't need SolrCloud functionality (and do not 
 want to depend on ZooKeeper).
 A couple of months ago, SearchHandler was modified to try to retrieve a 
 ShardHandlerFactory from the CoreContainer. I was able to work around this by 
 specifying a dummy ShardHandlerFactory in the config.
 Now UpdateRequestProcessorChain is inserting a DistributedUpdateProcessor 
 into my chains, again triggering a NPE when trying to dereference the 
 CoreDescriptor.
 I would happily place the SolrCores in CoreContainers, except that 
 CoreContainer imports and references org.apache.zookeeper.KeeperException, 
 which we do not have (and do not want) in our classpath. Therefore, I get a 
 ClassNotFoundException when loading the CoreContainer class.
 Ideally (IMHO), ZkController should isolate the ZooKeeper dependency, and 
 simply rethrow KeeperExceptions as 
 org.apache.solr.common.cloud.ZooKeeperException (or some Solr-hosted checked 
 exception). Then CoreContainer could remove the offending import/references.
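
The isolation idea in the last paragraph might look roughly like the sketch below. Every class here is a hypothetical stand-in (ThirdPartyException for KeeperException, ClusterController for ZkController, Container for CoreContainer); none of this is actual Solr code. It uses an unchecked wrapper exception for brevity, where the issue text leaves the exact exception type open.

```java
// Hypothetical stand-in for a third-party checked exception such as KeeperException.
class ThirdPartyException extends Exception {
    ThirdPartyException(String message) { super(message); }
}

// Hypothetical project-owned exception, playing the role of a Solr-hosted exception type.
class ClusterException extends RuntimeException {
    ClusterException(String message, Throwable cause) { super(message, cause); }
}

// Hypothetical stand-in for the third-party client library.
class ThirdPartyClient {
    String read(String path) throws ThirdPartyException {
        if (path.isEmpty()) {
            throw new ThirdPartyException("empty path");
        }
        return "data@" + path;
    }
}

// The only class whose source mentions the third-party types.
class ClusterController {
    private final ThirdPartyClient client = new ThirdPartyClient();

    String readNode(String path) {
        try {
            return client.read(path);
        } catch (ThirdPartyException e) {
            // Translate at the boundary so callers never see the third-party type.
            throw new ClusterException("could not read " + path, e);
        }
    }
}

// Callers reference only ClusterController and ClusterException.
class Container {
    private final ClusterController controller = new ClusterController();

    String describe(String path) {
        try {
            return controller.readNode(path);
        } catch (ClusterException e) {
            return "error: " + e.getMessage();
        }
    }
}
```

With this shape, Container's source has no import of or reference to the third-party exception, which is the property the wish asks for in CoreContainer.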




[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud

2012-06-11 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292697#comment-13292697
 ] 

Tommaso Teofili commented on SOLR-3488:
---

bq.  I think though, that we should really change how things work - so that you 
just pass the number of shards and the number of replicas, and the overseer 
just ensures the collection is on the right number of nodes. Then we don't have 
to have this 'template' collection to figure out what nodes to create on - or 
explicitly pass the nodes.

sure, +1.

bq. Sami has a distributed work queue for the overseer setup now, and I'm 
working on integrating this with that.

that looks great. By the way, I think it would be good if that could be also 
(optionally) used for indexing in SolrCloud.

 Create a Collections API for SolrCloud
 --

 Key: SOLR-3488
 URL: https://issues.apache.org/jira/browse/SOLR-3488
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Attachments: SOLR-3488.patch, SOLR-3488_2.patch







[jira] [Updated] (SOLR-3532) Promote shutdown method to SolrServer

2012-06-11 Thread Sami Siren (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-3532:
-

Attachment: SOLR-3532.patch

trivial patch

 Promote shutdown method to SolrServer
 -

 Key: SOLR-3532
 URL: https://issues.apache.org/jira/browse/SOLR-3532
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Sami Siren
Assignee: Sami Siren
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3532.patch


 Currently every java client implements shutdown, (LBHttpSolrServer has 
 close). I think it makes sense to promote #shutdown method to SolrServer.
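
As a rough illustration of the proposal, the sketch below promotes shutdown to a common base class. All names are hypothetical (SearchClient stands in for SolrServer, LoadBalancedClient for a client that already has close); this is not the content of the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class standing in for SolrServer.
abstract class SearchClient {
    // Promoted to the base type so every client can be released uniformly.
    abstract void shutdown();
}

// Hypothetical client that already had a shutdown method.
class HttpClientImpl extends SearchClient {
    boolean closed = false;

    @Override
    void shutdown() {
        closed = true;
    }
}

// Hypothetical client whose existing method is named close().
class LoadBalancedClient extends SearchClient {
    final List<SearchClient> delegates = new ArrayList<>();
    boolean closed = false;

    // Existing, differently named method, kept for compatibility.
    void close() {
        closed = true;
        for (SearchClient c : delegates) {
            c.shutdown();
        }
    }

    // The promoted method simply delegates to the existing one.
    @Override
    void shutdown() {
        close();
    }
}

final class ClientLifecycle {
    // Callers can release any client without knowing its concrete type.
    static void shutdownAll(List<? extends SearchClient> clients) {
        for (SearchClient c : clients) {
            c.shutdown();
        }
    }
}
```

The benefit shown is that code holding only the base type no longer needs a cast or an instanceof check to release resources.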




[jira] [Created] (SOLR-3532) Promote shutdown method to SolrServer

2012-06-11 Thread Sami Siren (JIRA)
Sami Siren created SOLR-3532:


 Summary: Promote shutdown method to SolrServer
 Key: SOLR-3532
 URL: https://issues.apache.org/jira/browse/SOLR-3532
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Sami Siren
Assignee: Sami Siren
Priority: Minor
 Fix For: 4.0
 Attachments: SOLR-3532.patch

Currently every java client implements shutdown, (LBHttpSolrServer has close). 
I think it makes sense to promote #shutdown method to SolrServer.




[jira] [Created] (SOLR-3533) Show CharFilters in Schema Browser

2012-06-11 Thread Erik Hatcher (JIRA)
Erik Hatcher created SOLR-3533:
--

 Summary: Show CharFilters in Schema Browser
 Key: SOLR-3533
 URL: https://issues.apache.org/jira/browse/SOLR-3533
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Priority: Minor
 Fix For: 4.0


Schema Browser (on trunk) currently does not show CharFilters.  The example/ 
schema has this definition that can be used to demonstrate, though it needs to 
be uncommented:

{code}
<fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" >
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}   




[jira] [Updated] (SOLR-3533) Show CharFilters in Schema Browser

2012-06-11 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-3533:
---

Attachment: SOLR-3533.patch

This patch adds char filters to the index and query analysis sections.

 Show CharFilters in Schema Browser
 --

 Key: SOLR-3533
 URL: https://issues.apache.org/jira/browse/SOLR-3533
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3533.patch


 Schema Browser (on trunk) currently does not show CharFilters.  The example/ 
 schema has this definition that can be used to demonstrate, though it needs 
 to be uncommented:
 {code}
 <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" >
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 {code}   




[jira] [Assigned] (SOLR-3533) Show CharFilters in Schema Browser

2012-06-11 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher reassigned SOLR-3533:
--

Assignee: Stefan Matheis (steffkes)

I'm going to assign to Stefan to vet/commit, to ensure my first foray into the 
new UI is on track.

 Show CharFilters in Schema Browser
 --

 Key: SOLR-3533
 URL: https://issues.apache.org/jira/browse/SOLR-3533
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3533.patch


 Schema Browser (on trunk) currently does not show CharFilters.  The example/ 
 schema has this definition that can be used to demonstrate, though it needs 
 to be uncommented:
 {code}
 <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" >
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 {code}   




[jira] [Commented] (SOLR-3533) Show CharFilters in Schema Browser

2012-06-11 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292738#comment-13292738
 ] 

Erik Hatcher commented on SOLR-3533:


Another nicety would be to make the mapping parameter be special like 
synonyms and words are now in order to (eventually, I know it's not enabled 
on trunk at the moment) link them.

 Show CharFilters in Schema Browser
 --

 Key: SOLR-3533
 URL: https://issues.apache.org/jira/browse/SOLR-3533
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3533.patch


 Schema Browser (on trunk) currently does not show CharFilters.  The example/ 
 schema has this definition that can be used to demonstrate, though it needs 
 to be uncommented:
 {code}
 <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" >
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 {code}   




[JENKINS] Lucene-Solr-4.x-Windows-Java6-64 - Build # 46 - Failure!

2012-06-11 Thread jenkins
Build: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java6-64/46/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.RecoveryZkTest.testDistribSearch

Error Message:
Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #0,6,]

Stack Trace:
java.lang.RuntimeException: Thread threw an uncaught exception, thread: 
Thread[Lucene Merge Thread #0,6,]
    at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
    at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
    at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed
    at __randomizedtesting.SeedInfo.seed([918444B633412A98]:0)
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480)
Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed
    at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244)
    at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241)
    at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:321)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3149)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)




Build Log:
[...truncated 22375 lines...]
   [junit4]   2 50960 T349 oascc.ZkStateReader.updateCloudState Manual update of cluster state initiated
   [junit4]   2 50960 T349 oascc.ZkStateReader.updateCloudState Updating cloud state from ZooKeeper... 
   [junit4]   2 50961 T349 oasc.Overseer$CloudStateUpdater.run Announcing new cluster state
   [junit4]   2 50963 T209 oascc.ZkStateReader$2.process A cluster state change has occurred
   [junit4]   2 50963 T205 oascc.ZkStateReader$2.process A cluster state change has occurred
   [junit4]   2 50965 T250 oascc.ZkStateReader$2.process A cluster state change has occurred
   [junit4]   2 50986 T155 oasc.CoreContainer.shutdown Shutting down CoreContainer instance=1794423774
   [junit4]   2 50986 T155 oasc.RecoveryStrategy.close WARNING Stopping recovery for core collection1 zkNodeName=127.0.0.1:61728_solr_collection1
   [junit4]   2 50986 T155 oasc.SolrCore.close [collection1]  CLOSING SolrCore org.apache.solr.core.SolrCore@4b919723
   [junit4]   2 50991 T155 oasc.SolrCore.closeSearcher [collection1] Closing main searcher on request.
   [junit4]   2 50991 T155 oasu.DirectUpdateHandler2.close closing DirectUpdateHandler2{commits=6,autocommits=0,soft 

[jira] [Commented] (SOLR-3533) Show CharFilters in Schema Browser

2012-06-11 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292740#comment-13292740
 ] 

Erik Hatcher commented on SOLR-3533:


bq. in order to (eventually, I know it's not enabled on trunk at the moment) 
link them

FYI - the link to the mapping file in this example is this:  
http://localhost:8983/solr/admin/file?file=mapping-ISOLatin1Accent.txt
(optionally with core name in the URL too of course), so maybe we can spin off 
another issue to add these links in to those files now?

 Show CharFilters in Schema Browser
 --

 Key: SOLR-3533
 URL: https://issues.apache.org/jira/browse/SOLR-3533
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3533.patch


 Schema Browser (on trunk) currently does not show CharFilters.  The example/ 
 schema has this definition that can be used to demonstrate, though it needs 
 to be uncommented:
 {code}
 <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 {code}   

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3533) Show CharFilters in Schema Browser

2012-06-11 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292741#comment-13292741
 ] 

Erik Hatcher commented on SOLR-3533:


Oh, and my patch also changes "Filters" to "Token Filters" so that it is labeled a 
little differently from "Char Filters".




[jira] [Created] (LUCENE-4129) add CodecHeader to .frq and .prx

2012-06-11 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-4129:
---

 Summary: add CodecHeader to .frq and .prx
 Key: LUCENE-4129
 URL: https://issues.apache.org/jira/browse/LUCENE-4129
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir


We did this for all other files, but not .frq/.prx.

Currently the postings writer only records itself in the blocktree terms 
dictionary, which is fine, but that's really documenting the .tim itself: that 
it is Blocktree with Lucene40Postings metadata.

I think we should put headers in .frq/.prx as well; for example, that could 
detect file jumbling.




[jira] [Updated] (LUCENE-4129) add CodecHeader to .frq and .prx

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4129:


Attachment: LUCENE-4129.patch

Patch: this found a bug in NestedPulsing in the disk full tests.

I also changed pulsing to be more clear that the inner postings reader/writer 
is being closed here: theoretically it's possible the pulsing reader/writer ctor 
could throw an exception and we would have a leak.

 add CodecHeader to .frq and .prx
 

 Key: LUCENE-4129
 URL: https://issues.apache.org/jira/browse/LUCENE-4129
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-4129.patch


 We did this for all other files, but not .frq/.prx.
 Currently the postings writer only records itself in the blocktree terms 
 dictionary, which is fine, but that's really documenting the .tim itself: that 
 it is Blocktree with Lucene40Postings metadata.
 I think we should put headers in .frq/.prx as well; for example, that could 
 detect file jumbling.




[jira] [Updated] (LUCENE-4129) add CodecHeader to .frq and .prx

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4129:


Attachment: LUCENE-4129.patch

updated patch to actually check the header :)

 add CodecHeader to .frq and .prx
 

 Key: LUCENE-4129
 URL: https://issues.apache.org/jira/browse/LUCENE-4129
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-4129.patch, LUCENE-4129.patch


 We did this for all other files, but not .frq/.prx.
 Currently the postings writer only records itself in the blocktree terms 
 dictionary, which is fine, but that's really documenting the .tim itself: that 
 it is Blocktree with Lucene40Postings metadata.
 I think we should put headers in .frq/.prx as well; for example, that could 
 detect file jumbling.




LBHttpSolrServer doc needs a little improvement

2012-06-11 Thread Jack Krupansky
The wiki for LBHttpSolrServer is a little out of date. It says the feature is 
“experimental” and “currently being developed” even though SOLR-844 is closed. 
There are a couple of “LB!HttpSolrServer” links that point to nonexistent 
pages. The class javadoc has half but not all of the doc from the wiki. The 
simplest solution may be to move the rest of the wiki doc into the javadoc. I’m 
not sure what should be done with the wiki though. How can a wiki link to 
javadoc when the link depends on the Solr release? Or maybe just make the wiki and 
javadoc the same.

And the SolrJ wiki makes no mention of LBHttpSolrServer.

-- Jack Krupansky

[jira] [Updated] (LUCENE-4128) add safety to preflex segmentinfo upgrade

2012-06-11 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4128:
---

Fix Version/s: 4.0

 add safety to preflex segmentinfo upgrade
 ---

 Key: LUCENE-4128
 URL: https://issues.apache.org/jira/browse/LUCENE-4128
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-4128.patch


 Currently the one-time upgrade depends on whether the upgraded .si file 
 exists, and the writing is done in a try/finally so it's removed if an 
 IOException happens.
 But I think a power loss or something else could happen in the middle of 
 this, leaving a bogus upgraded .si file that the user would have to 
 remove manually (they probably wouldn't know to).
 I think we should instead have a marker file on completion, which we 
 create after we successfully fsync the upgraded .si file. This way, if 
 something happens, we just rewrite the thing.




[jira] [Updated] (LUCENE-4128) add safety to preflex segmentinfo upgrade

2012-06-11 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4128:
---

Attachment: LUCENE-4128.patch

Patch.




[jira] [Commented] (LUCENE-4128) add safety to preflex segmentinfo upgrade

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292765#comment-13292765
 ] 

Robert Muir commented on LUCENE-4128:
-

We write a codec header for the upgraded marker file, so instead of relying 
upon File.exists we could add a deprecated method
to SegmentInfos hasMarkerFile that just opens it and does CheckHeader, 
returning false if there is any exception?
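
To make the suggestion above concrete, here is a minimal standalone sketch of a marker file that is written with a small header and later validated, with any I/O problem treated as "not upgraded". It uses plain java.io rather than Lucene's Directory or CodecUtil classes, and the names MarkerFileSketch, writeMarker, MAGIC, CODEC and VERSION are assumptions made up for illustration; only the hasMarkerFile name comes from the comment above.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch; not Lucene's CodecUtil or SegmentInfos. */
final class MarkerFileSketch {
  static final int MAGIC = 0x4d41524b;          // arbitrary constant chosen for this sketch
  static final String CODEC = "UpgradedMarker"; // hypothetical codec name
  static final int VERSION = 0;

  /** Writes magic, codec name and version, then forces the bytes to disk. */
  static void writeMarker(Path marker) throws IOException {
    try (FileOutputStream fos = new FileOutputStream(marker.toFile());
         DataOutputStream out = new DataOutputStream(fos)) {
      out.writeInt(MAGIC);
      out.writeUTF(CODEC);
      out.writeInt(VERSION);
      out.flush();
      fos.getFD().sync(); // fsync before the marker is trusted
    }
  }

  /** Returns true only if the marker can be opened and its header is intact. */
  static boolean hasMarkerFile(Path marker) {
    try (InputStream is = Files.newInputStream(marker);
         DataInputStream in = new DataInputStream(is)) {
      return in.readInt() == MAGIC
          && CODEC.equals(in.readUTF())
          && in.readInt() == VERSION;
    } catch (IOException e) {
      return false; // missing, truncated or otherwise unreadable marker
    }
  }
}
```

The point of checking the header instead of mere existence is that a marker left half-written by a crash reads as "not upgraded", so the upgrade is simply redone.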




[jira] [Updated] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

2012-06-11 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4061:
---

Attachment: LUCENE-4061.patch

Fixing the concurrency issue was hairy, and required lots of changes to 
DirTaxoWriter:

* Needed a ReaderManager, so added such in core under o.a.l.index. Separately, 
I think that we should move RefManager to o.a.l.util instead of o.a.l.search.

* DirTaxoWriter was not very well built for concurrency :), so many changes had 
to be done to it.

* TaxoWriterCache.hasRoom(int) has been replaced by isFull().

* TestDirTaxoWriter has been enhanced to sometimes, during nightly builds, use 
a NoOpCache, as it uncovered some bugs too! (It makes the test horribly 
slow, hence the nightly criterion, and even then the chances are very low.)

I ran DirTaxoWriter.testConcurrency over 1000 times with no failures, so I'm 
inclined to believe the concurrency issues are now resolved. Still, a second 
(and third and even a fourth) look by someone else would be appreciated.

I'll commit it tomorrow if no one objects, and port to 4x.

 Improvements to DirectoryTaxonomyWriter (synchronization and others)
 

 Key: LUCENE-4061
 URL: https://issues.apache.org/jira/browse/LUCENE-4061
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 4.0

 Attachments: LUCENE-4061.patch, LUCENE-4061.patch


 DirTaxoWriter synchronizes in too many places. For instance addCategory() is 
 fully synchronized, while only a small part of it needs to be.
 Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the 
 TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is 
 interested can query the specific cache impl it has. Currently, only 
 Cl2oTaxoWriterCache supports it.
 If the changes will be simple, I'll port them to 3.6.1 as well.
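
The synchronization point in the description can be illustrated with a small standalone sketch: look up the category without a lock first, and only take the lock for the part that assigns a new ordinal. This is not DirectoryTaxonomyWriter's code (the real class also writes to an index); NarrowLockSketch and its fields are hypothetical names used only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of narrowing a synchronized region; not the real DirTaxoWriter. */
final class NarrowLockSketch {
  private final ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
  private int nextOrdinal = 0; // guarded by "this"

  int addCategory(String path) {
    Integer cached = cache.get(path); // fast path: no lock for known categories
    if (cached != null) {
      return cached;
    }
    synchronized (this) {
      Integer again = cache.get(path); // re-check: another thread may have added it
      if (again != null) {
        return again;
      }
      int ordinal = nextOrdinal++;
      cache.put(path, ordinal);
      return ordinal;
    }
  }

  synchronized int size() {
    return nextOrdinal;
  }
}
```

Only the assignment of a new ordinal is serialized here; lookups of categories that already exist never contend on the lock.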




[jira] [Commented] (LUCENE-4128) add safety to preflex segmentinfo upgrade

2012-06-11 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292770#comment-13292770
 ] 

Michael McCandless commented on LUCENE-4128:


Good idea, I'll do that!




[jira] [Created] (LUCENE-4130) CompoundFileDirectory.listAll is broken

2012-06-11 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-4130:
---

 Summary: CompoundFileDirectory.listAll is broken
 Key: LUCENE-4130
 URL: https://issues.apache.org/jira/browse/LUCENE-4130
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir


The files returned by listAll are not actually the files in the CFS.




[jira] [Updated] (LUCENE-4120) FST should use packed integer arrays

2012-06-11 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-4120:
-

Affects Version/s: (was: 5.0)
Fix Version/s: (was: 5.0)
   4.0

 FST should use packed integer arrays
 

 Key: LUCENE-4120
 URL: https://issues.apache.org/jira/browse/LUCENE-4120
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.0


 There are some places where an int[] could be advantageously replaced with a 
 packed integer array.
 I am thinking (at least) of:
  * FST.nodeAddress (GrowableWriter)
  * FST.inCounts (GrowableWriter)
  * FST.nodeRefToAddress (read-only Reader)
 The serialization/deserialization methods should be modified too in order to 
 take advantage of PackedInts.get{Reader,Writer}.
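
As an illustration of why a packed array can beat an int[], here is a minimal standalone sketch that stores values with a fixed number of bits each inside a long[]. It is not Lucene's PackedInts or GrowableWriter implementation; PackedArraySketch and its methods are hypothetical names, and the sketch assumes at most 32 bits per value.

```java
/** Hypothetical fixed-width packed array; not Lucene's PackedInts. */
final class PackedArraySketch {
  private final long[] blocks;
  private final int bitsPerValue;
  private final long mask;

  PackedArraySketch(int size, int bitsPerValue) {
    if (bitsPerValue < 1 || bitsPerValue > 32) {
      throw new IllegalArgumentException("bitsPerValue must be in 1..32");
    }
    this.bitsPerValue = bitsPerValue;
    this.mask = (1L << bitsPerValue) - 1;
    long totalBits = (long) size * bitsPerValue;
    this.blocks = new long[(int) ((totalBits + 63) / 64)];
  }

  void set(int index, long value) {
    long bitPos = (long) index * bitsPerValue;
    int block = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    blocks[block] = (blocks[block] & ~(mask << shift)) | ((value & mask) << shift);
    int spill = shift + bitsPerValue - 64; // bits that overflow into the next block
    if (spill > 0) {
      int kept = bitsPerValue - spill;     // bits that fit in the first block
      long highMask = mask >>> kept;
      blocks[block + 1] = (blocks[block + 1] & ~highMask) | ((value & mask) >>> kept);
    }
  }

  long get(int index) {
    long bitPos = (long) index * bitsPerValue;
    int block = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    long value = blocks[block] >>> shift;
    int spill = shift + bitsPerValue - 64;
    if (spill > 0) {
      value |= blocks[block + 1] << (bitsPerValue - spill);
    }
    return value & mask;
  }

  /** Bytes used by the backing array, for comparison with 4 bytes per int. */
  long ramBytes() {
    return 8L * blocks.length;
  }
}
```

With 100 values of 7 bits each, this sketch needs 11 longs (88 bytes) where an int[] of the same length would need 400 bytes, at the cost of some shifting and masking on every access.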




[jira] [Updated] (LUCENE-4130) CompoundFileDirectory.listAll is broken

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4130:


Attachment: LUCENE-4130_test.patch

test case

 CompoundFileDirectory.listAll is broken
 ---

 Key: LUCENE-4130
 URL: https://issues.apache.org/jira/browse/LUCENE-4130
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-4130_test.patch


 The files returned by listAll are not actually the files in the CFS.




[jira] [Created] (LUCENE-4131) .cfs/.cfe should have a codecheader

2012-06-11 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-4131:
---

 Summary: .cfs/.cfe should have a codecheader
 Key: LUCENE-4131
 URL: https://issues.apache.org/jira/browse/LUCENE-4131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Robert Muir


The new .cfs is trickier, but I still think we can do it. We should 
definitely fix this for .cfe.




[jira] [Updated] (SOLR-3518) No `hits` in SolrResp. NamedList if distrib=true

2012-06-11 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3518:


Attachment: SOLR-3518-4.0-1.patch

Patch for trunk adding the `hits` field to the SolrQueryResponse's NamedList. 
It's only returned in the final response, not in intermediate shard requests in 
a distributed search.

Most likely not a good solution but it seems to work fine for now. Please 
improve.

 No `hits` in SolrResp. NamedList if distrib=true
 

 Key: SOLR-3518
 URL: https://issues.apache.org/jira/browse/SOLR-3518
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
 Environment: 5.0-SNAPSHOT 1346798 - markus - 2012-06-06 11:38:15
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3518-4.0-1.patch


 The hits field in the NamedList obtained from SolrQueryResponse.toLog() is 
 not available for distrib=true requests. The hits field is also not written 
 to the log.
 See also: 
 http://lucene.472066.n3.nabble.com/SolrDispatchFilter-no-hits-in-response-NamedList-if-distrib-true-td3987751.html




[jira] [Updated] (LUCENE-4129) add CodecHeader to .frq and .prx

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4129:


Attachment: LUCENE-4129.patch

Also added a test, TestAllFilesHaveCodecHeader. It currently has to ignore 
.cfs/.cfe and also not recurse into them until we fix LUCENE-4130 and LUCENE-4131.

 add CodecHeader to .frq and .prx
 

 Key: LUCENE-4129
 URL: https://issues.apache.org/jira/browse/LUCENE-4129
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-4129.patch, LUCENE-4129.patch, LUCENE-4129.patch


 We did this for all other files, but not .frq/.prx.
 Currently the postings writer only records itself in the blocktree terms 
 dictionary, which is fine, but that's really documenting the .tim itself: that 
 it is Blocktree with Lucene40Postings metadata.
 I think we should put headers in .frq/.prx as well; for example, that could 
 detect file jumbling.




[jira] [Updated] (LUCENE-4128) add safety to preflex segmentinfo upgrade

2012-06-11 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4128:
---

Attachment: LUCENE-4128.patch

New patch w/ separate segmentWasUpgraded method checking the codec header.

 add safety to preflex segmentinfo upgrade
 ---

 Key: LUCENE-4128
 URL: https://issues.apache.org/jira/browse/LUCENE-4128
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-4128.patch, LUCENE-4128.patch


 Currently the one-time upgrade depends on whether the upgraded .si file 
 exists, and the writing is done in a try/finally so it's removed if an 
 IOException happens.
 But I think a power loss or something else could happen in the middle of 
 this, leaving a bogus upgraded .si file that the user would have to 
 remove manually (they probably wouldn't know to).
 I think we should instead have a marker file on completion, which we 
 create after we successfully fsync the upgraded .si file. This way, if 
 something happens, we just rewrite the thing.




Re: Issues with whitespace tokenization in QueryParser

2012-06-11 Thread Robert Muir
Welcome John!

Basically the tricky part about this issue is how Analyzer integrates
into the parsing workflow: it is as hossman says on the issue.

You can edit the .jflex file so that _TERM_CHAR is defined differently
and regenerate, and you will see what I mean from the tests that fail.

The crux of the problem is that currently, if you have +foo bar -baz,
we split on whitespace, apply the operators, and then run the analyzer on
each portion.
So you get +foo, bar, -baz, and then we analyze foo, bar, and baz respectively.

But if you just remove the whitespace tokenization, you will get +foo
bar, -baz, which is different.

So to make this kind of thing work as expected, I think the analyzer
would have to be integrated at an earlier stage, before the operators are
applied, e.g. as part of the lexing process.

NOTE: I definitely don't want to discourage you from tackling this
issue, but I think it's fair to mention that there is a workaround:
if you can preprocess your queries yourself (maybe you don't
allow all the Lucene syntax to your users or something like that), you
can escape the whitespace yourself, such as rain\ coat, and I think
your synonyms will work as expected.
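
A rough standalone sketch of the splitting behaviour and the escaping workaround described above, for illustration only. It is not the real QueryParser (which is generated from a grammar and handles far more syntax); WhitespaceSplitSketch, splitClauses and escapeWhitespace are hypothetical names, and the sketch only models "split on unescaped whitespace, keep escaped characters literally".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of whitespace splitting; not Lucene's QueryParser. */
final class WhitespaceSplitSketch {

  /** Splits a query on whitespace that is not preceded by a backslash. */
  static List<String> splitClauses(String query) {
    List<String> clauses = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    for (int i = 0; i < query.length(); i++) {
      char c = query.charAt(i);
      if (c == '\\' && i + 1 < query.length()) {
        current.append(query.charAt(++i)); // escaped character is kept literally
      } else if (Character.isWhitespace(c)) {
        if (current.length() > 0) {        // unescaped whitespace ends the clause
          clauses.add(current.toString());
          current.setLength(0);
        }
      } else {
        current.append(c);
      }
    }
    if (current.length() > 0) {
      clauses.add(current.toString());
    }
    return clauses;
  }

  /** Preprocessing step: escape spaces so a multi-word term stays in one clause. */
  static String escapeWhitespace(String term) {
    return term.replace(" ", "\\ ");
  }
}
```

In this sketch, splitting +foo bar -baz yields three clauses, each of which would be analyzed on its own, while rain\ coat stays a single clause, so a synonym filter downstream would see both words together.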

On Sun, Jun 10, 2012 at 11:03 PM, John Berryman jfberry...@gmail.com wrote:
 According to https://issues.apache.org/jira/browse/LUCENE-2605, the Lucene
 QueryParser tokenizes on white space before giving any text to the Analyzer.
 This makes it impossible to use multi-term synonyms because the
 SynonymFilter only receives one word at a time.

 Resolution to this would really help with my current project. My project
 client sells clothing and accessories online. They have plenty of examples
 of compound words e.g. rain coat. But some of these compound words are
 really tripping them up. A prime example is that a search for dress shoes
 returns a list of dresses and random shoes (not necessarily dress shoes). I
 wish that I was able to synonym compound words to single tokens (e.g. dress
 shoes = dress_shoes), but with this whitespace tokenization issue, it's
 impossible.

 Has anything happened with this bug recently? For a short time I've got a
 client that would be willing to pay for this issue to be fixed if it's not
 too much of a rabbit hole. Anyone care to catch me up with what this might
 entail?

 --
 LinkedIn
 Twitter




-- 
lucidimagination.com




[jira] [Commented] (LUCENE-4128) add safety to preflex segmentinfo upgrade

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292784#comment-13292784
 ] 

Robert Muir commented on LUCENE-4128:
-

+1, thanks!

 add safety to preflex segmentinfo upgrade
 ---

 Key: LUCENE-4128
 URL: https://issues.apache.org/jira/browse/LUCENE-4128
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-4128.patch, LUCENE-4128.patch


 Currently the one-time upgrade depends on whether the upgraded .si file 
 exists. The writing is done in a try/finally, so it's removed if an 
 IOException happens.
 But I think a power loss or something else could happen in the middle of 
 this and leave a bogus upgraded .si file, which the user would then have to 
 remove manually (they probably wouldn't know to).
 I think we should instead write a marker file on completion, created only 
 after we successfully fsync the upgraded .si file. That way, if 
 something happens, we just rewrite the thing.
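
The marker-file idea above could look roughly like the following. This is a minimal sketch using java.nio rather than Lucene's Directory API; the class name, method names and the ".done" suffix are assumptions made for illustration, not what the attached patch does.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; file names and method names are not Lucene's.
class UpgradeMarkerSketch {
    // Writes the upgraded file, forces it to disk, and only then creates the marker.
    static void upgrade(Path upgraded, Path marker, byte[] contents) throws IOException {
        try (FileChannel ch = FileChannel.open(upgraded,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(contents);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // fsync file content and metadata
        }
        // The marker's existence means the upgraded file was fully written.
        Files.write(marker, new byte[0]);
    }

    // The upgrade counts as done only if the marker exists; a leftover
    // upgraded file without a marker is simply rewritten.
    static boolean needsUpgrade(Path marker) {
        return !Files.exists(marker);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("si-upgrade");
        Path upgraded = dir.resolve("_0_upgraded.si");
        Path marker = dir.resolve("_0_upgraded.si.done");
        if (needsUpgrade(marker)) {
            upgrade(upgraded, marker, "segment info".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("needs upgrade again? " + needsUpgrade(marker));
    }
}
```

The point of the ordering is that the check no longer depends on the upgraded file itself, so a half-written file left by a crash is never trusted.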

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Created] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-4132:
--

 Summary: IndexWriterConfig live settings
 Key: LUCENE-4132
 URL: https://issues.apache.org/jira/browse/LUCENE-4132
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Shai Erera
Priority: Minor
 Fix For: 4.0, 5.0


A while ago there was a discussion about making some IW settings live and I 
remember that RAM buffer size was one of them. Judging from IW code, I see that 
RAM buffer can be changed live as IW never caches it.

However, I don't remember which other settings were decided to be live and I 
don't see any documentation in IW nor IWC for that. IW.getConfig mentions:

{code}
* <b>NOTE:</b> some settings may be changed on the
* returned {@link IndexWriterConfig}, and will take
* effect in the current IndexWriter instance.  See the
* javadocs for the specific setters in {@link
* IndexWriterConfig} for details.
{code}

But there's no text on e.g. IWC.setRAMBuffer mentioning that.

I think it'd be good to make it easier for users to tell which of the 
settings are live. There are a few possible ways to do it:

* Introduce a custom @live.setting tag on the relevant IWC.set methods, and add 
special text for them in build.xml
** Or, drop the tag and just document it clearly.

* Split IWC into two interfaces, LiveConfig and OneTimeConfig (name proposals 
are welcome!), have IWC implement both, and introduce another method, IW.getLiveConfig, 
which returns the live interface, thereby clearly letting the user know which 
of the settings are live.
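
The two-interface option could be sketched as follows. Everything here is a stand-in written for illustration: WriterConfigSketch plays the role of IWC and WriterSketch the role of IW, the choice of RAM buffer size as the live setting and codec as the one-time setting is an assumption, and none of this is existing Lucene code.

```java
// Hypothetical stand-ins for illustration; none of these are Lucene's real types.
interface LiveConfig {
    double getRAMBufferSizeMB();
    LiveConfig setRAMBufferSizeMB(double mb);
}

interface OneTimeConfig {
    String getCodecName();
    OneTimeConfig setCodecName(String name);
}

// Plays the role of IWC: one object implementing both views.
class WriterConfigSketch implements LiveConfig, OneTimeConfig {
    private volatile double ramBufferSizeMB = 16.0; // assumed default
    private String codecName = "default";

    @Override public double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    @Override public WriterConfigSketch setRAMBufferSizeMB(double mb) {
        this.ramBufferSizeMB = mb;
        return this;
    }
    @Override public String getCodecName() { return codecName; }
    @Override public WriterConfigSketch setCodecName(String name) {
        this.codecName = name;
        return this;
    }
}

// Plays the role of IW.
class WriterSketch {
    private final WriterConfigSketch config;

    WriterSketch(WriterConfigSketch config) { this.config = config; }

    // Only the live view is handed out, so callers see just the live setters.
    LiveConfig getLiveConfig() { return config; }

    // Reads the value on every call instead of caching it; that is what makes it live.
    boolean shouldFlush(double usedMB) { return usedMB >= config.getRAMBufferSizeMB(); }
}

class LiveConfigDemo {
    public static void main(String[] args) {
        WriterSketch writer = new WriterSketch(new WriterConfigSketch().setCodecName("custom"));
        System.out.println(writer.shouldFlush(10.0)); // false with the 16 MB default
        writer.getLiveConfig().setRAMBufferSizeMB(8.0);
        System.out.println(writer.shouldFlush(10.0)); // true after the live change
    }
}
```

With this shape the compiler, rather than the javadocs, tells the user which setters are live.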

It'd be good if IWC itself could only expose setXYZ methods for the live 
settings though. So perhaps, off the top of my head, we can do something like 
this:
* Introduce a Config object, which is essentially what IWC is today, and pass 
it to IW.
* IW will create a different object, IWC, from that Config, and IW.getConfig will 
return IWC.
* IWC itself will only have setXYZ methods for the live settings.

It adds another object, but user code doesn't change: it still creates a 
Config object when initializing IW, and only needs to handle a different type if it 
ever calls IW.getConfig.

Maybe that's not such a bad idea?




[JENKINS] Lucene-Solr-4.x-Windows-Java7-64 - Build # 41 - Failure!

2012-06-11 Thread jenkins
Build: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java7-64/41/

1 tests failed.
REGRESSION:  org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<494> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<494> but was:<0>
at __randomizedtesting.SeedInfo.seed([5CB24A4021D11B91:D4E6759A8F2D7669]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:716)
at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:254)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log:
[...truncated 17470 lines...]
   [junit4]   2> 63741 T2697 C164 REQ [collection1] webapp=/solr path=/replication params={command=filecontent&checksum=true&generation=16&wt=filestream&file=_9_Lucene40_0.tim} status=0 QTime=0 
   [junit4]   2> 63746 T2697 C164 

[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292786#comment-13292786
 ] 

Robert Muir commented on LUCENE-4132:
-

I don't think we should add another Config object, making things complicated 
for such a very very expert use case.
Even ordinary users need to use IWC, and 99% of them don't care about changing 
things live.

I'm also nervous about documenting which things can/cannot be changed live 
unless there are unit tests for each one.
If we want to refactor IndexWriter in some way that really cleans it up but 
makes something un-live, then I think
that's totally fair game and we should be able to do it, but the docs shouldn't 
be wrong.





[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292788#comment-13292788
 ] 

Shai Erera commented on LUCENE-4132:


{quote}
I don't think we should add another Config object, making things complicated 
for such a very very expert use case.
Even ordinary users need to use IWC, and 99% of them don't care about changing 
things live.
{quote}

I'm not proposing to complicate matters for 99.9% of the users. On the contrary 
-- users will still do:

{code}
IndexWriterConfig config = new IndexWriterConfig(...);
// configure it
IndexWriter writer = new IndexWriter(dir, config);
{code}

Only expert users who want to change some settings live will do:
{code}
Config conf = writer.getConfig(); // NOTE: it's a different type
conf.setSomething();
{code}

Config can be an IW internal type and most users won't even be aware of it. 
Today we document that the IWC given to the IW ctor is cloned, and it will remain as 
such. Only instead of being cloned to an IWC type, it will be cloned to a 
Config (or LiveConfig) type.

IWC documentation isn't changed, IW.getConfig changes by removing that NOTE, 
and if you care about configuring IW live, you can do so through LiveConfig. 
And we can test that type too!
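
As a rough illustration of the clone-to-a-different-type idea, here is a minimal sketch. FullConfigSketch stands in for the config the user builds, LiveConfigSketch for the type the writer hands back, and IndexWriterSketch for IW; all three names are hypothetical, and treating RAM buffer size as the only live setting is an assumption.

```java
// Hypothetical stand-ins; the real IndexWriterConfig and IndexWriter are not used here.
final class FullConfigSketch {
    private double ramBufferSizeMB = 16.0; // assumed default
    private String codecName = "default";

    FullConfigSketch setRAMBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; return this; }
    FullConfigSketch setCodecName(String name) { this.codecName = name; return this; }
    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    String getCodecName() { return codecName; }
}

// All getters, but a setter only for the setting assumed to be live.
final class LiveConfigSketch {
    private volatile double ramBufferSizeMB;
    private final String codecName;

    LiveConfigSketch(FullConfigSketch source) {
        this.ramBufferSizeMB = source.getRAMBufferSizeMB();
        this.codecName = source.getCodecName();
    }

    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    String getCodecName() { return codecName; }
    LiveConfigSketch setRAMBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; return this; }
}

final class IndexWriterSketch {
    private final LiveConfigSketch config;

    // The caller's config is copied, so later changes to it do not reach this writer.
    IndexWriterSketch(FullConfigSketch config) { this.config = new LiveConfigSketch(config); }

    LiveConfigSketch getConfig() { return config; } // NOTE: a different type than was passed in
}

class CloneToLiveConfigDemo {
    public static void main(String[] args) {
        FullConfigSketch conf = new FullConfigSketch().setCodecName("custom");
        IndexWriterSketch writer = new IndexWriterSketch(conf);
        conf.setRAMBufferSizeMB(64.0);               // no effect on the writer
        writer.getConfig().setRAMBufferSizeMB(32.0); // takes effect
        System.out.println(writer.getConfig().getRAMBufferSizeMB()); // 32.0
        System.out.println(writer.getConfig().getCodecName());       // custom
    }
}
```

Because the returned type has no setter for the codec, a call that could not take effect does not compile, and the small live type can be unit tested on its own.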





[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292789#comment-13292789
 ] 

Robert Muir commented on LUCENE-4132:
-

Right, but I suppose changing live settings isn't necessarily the only use case 
for writer.getConfig()?

Today someone can take the config off of there and set it on another writer (it 
will be privately cloned).
So I think if we want to do it this way, we could just keep getConfig as is, 
and add getLiveConfig, which 
actually returns the same object, just cast through that interface.






[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292792#comment-13292792
 ] 

Robert Muir commented on LUCENE-4132:
-

OK, actually I was partially wrong: one can no longer actually use the IWC from 
a writer, since it's marked as owned.
But they can still grab it and look at stuff like getIndexDeletionPolicy, even 
though that's not live.
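
For readers unfamiliar with the "marked as owned" behaviour, one way such a guard can work is sketched below. This is an assumption-laden illustration, not Lucene's implementation: the class names, the markOwned method and the use of IllegalStateException are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-ins; the names and the exception type are assumptions.
class OwnedConfigSketch {
    private final AtomicBoolean owned = new AtomicBoolean(false);

    // Succeeds once; any later call means a second writer tried to reuse the config.
    void markOwned() {
        if (!owned.compareAndSet(false, true)) {
            throw new IllegalStateException("config is already owned by a writer");
        }
    }
}

class OwningWriterSketch {
    private final OwnedConfigSketch config;

    OwningWriterSketch(OwnedConfigSketch config) {
        config.markOwned();
        this.config = config;
    }

    OwnedConfigSketch getConfig() { return config; }
}

class OwnedConfigDemo {
    public static void main(String[] args) {
        OwningWriterSketch first = new OwningWriterSketch(new OwnedConfigSketch());
        try {
            new OwningWriterSketch(first.getConfig());
            System.out.println("reuse allowed");
        } catch (IllegalStateException e) {
            System.out.println("reuse rejected: " + e.getMessage());
        }
    }
}
```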

I guess to be less confusing we should add getLiveConfig and just remove 
getConfig completely?





[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292793#comment-13292793
 ] 

Shai Erera commented on LUCENE-4132:


bq. Today someone can take the config off of there and set it on another writer 
(it will be privately cloned)

True, but I'm not aware of such use, and still someone can cache the IWC 
himself and pass it to multiple writers?

If getConfig() returns an IWC which has setters, that'll confuse the user for 
sure, because those settings won't take effect.

I prefer that getConfig return the new LiveConfig type, with few setters and 
all getters (i.e. all getXYZ from IWC), and let whoever wants to pass the same 
IWC instance to other writers handle it himself.

Alternatively, we can add another ctor which takes a LiveConfig object, the one 
returned from getConfig(), but I prefer to avoid that until someone actually 
tells us that he shares the same IWC with other writers and cannot cache it 
himself.





[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292794#comment-13292794
 ] 

Robert Muir commented on LUCENE-4132:
-

Sorry, instead of nuking getConfig, make it pkg-private. Things like 
RandomIndexWriter want to peek into some
un-live settings (like codec), and I think we should still be able to look at these 
things for tests :)





[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292795#comment-13292795
 ] 

Shai Erera commented on LUCENE-4132:


bq. I guess to be less confusing we should add getLiveConfig and just remove 
getConfig completely?

Yes that's the proposal - either getConfig or getLiveConfig, but return a 
LiveConfig object with all the getters of IWC, and only the setters that we 
want to support.





[jira] [Commented] (LUCENE-4132) IndexWriterConfig live settings

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292797#comment-13292797
 ] 

Robert Muir commented on LUCENE-4132:
-

{quote}
True, but I'm not aware of such use, and still someone can cache the IWC 
himself and pass it to multiple writers?
{quote}

I'm just talking about the general issue that IW.getConfig is not only used to 
change settings live.
Today our tests use this to peek at the settings on the IW (see my 
RandomIndexWriter example)...


 IndexWriterConfig live settings
 ---

 Key: LUCENE-4132
 URL: https://issues.apache.org/jira/browse/LUCENE-4132
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Shai Erera
Priority: Minor
 Fix For: 4.0, 5.0


 A while ago there was a discussion about making some IW settings live and I 
 remember that RAM buffer size was one of them. Judging from IW code, I see 
 that RAM buffer can be changed live as IW never caches it.
 However, I don't remember which other settings were decided to be live and 
 I don't see any documentation in IW nor IWC for that. IW.getConfig mentions:
 {code}
 * <b>NOTE:</b> some settings may be changed on the
 * returned {@link IndexWriterConfig}, and will take
 * effect in the current IndexWriter instance.  See the
 * javadocs for the specific setters in {@link
 * IndexWriterConfig} for details.
 {code}
 But there's no text on e.g. IWC.setRAMBuffer mentioning that.
 I think that it'd be good if we make it easier for users to tell which of the 
 settings are live ones. There are a few possible ways to do it:
 * Introduce a custom @live.setting tag on the relevant IWC.set methods, and 
 add special text for them in build.xml
 ** Or, drop the tag and just document it clearly.
 * Split IWC into two interfaces, LiveConfig and OneTimeConfig (name 
 proposals are welcome!), have IWC implement both, and introduce another 
 IW.getLiveConfig which will return that interface, thereby clearly letting 
 the user know which of the settings are live.
 It'd be good if IWC itself could only expose setXYZ methods for the live 
 settings though. So perhaps, off the top of my head, we can do something like 
 this:
 * Introduce a Config object, which is essentially what IWC is today, and pass 
 it to IW.
 * IW will create a different object, IWC from that Config and IW.getConfig 
 will return IWC.
 * IWC itself will only have setXYZ methods for the live settings.
 It adds another object, but user code doesn't change - it still creates a 
 Config object when initializing IW, and only needs to handle a different type 
 if it ever calls IW.getConfig.
 Maybe that's not such a bad idea?
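
The last proposal above can be sketched roughly as follows. This is only an illustration: the class names (Config, LiveConfig, SketchWriter), the choice of which setting is one-time, and the default values are all assumptions made for the example, not Lucene's actual API. The point is that the object returned by getConfig exposes getters for everything but setters only for live settings.

```java
// Hypothetical names throughout; this is not Lucene's actual API.

/** Full configuration, built by the user and read once when the writer is created. */
final class Config {
    private double ramBufferSizeMB = 16.0; // treated as a live setting in this sketch
    private int maxThreadStates = 8;       // treated as a one-time setting in this sketch

    Config setRAMBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; return this; }
    Config setMaxThreadStates(int n) { this.maxThreadStates = n; return this; }
    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    int getMaxThreadStates() { return maxThreadStates; }
}

/** View returned by the writer: getters for all settings, setters only for live ones. */
final class LiveConfig {
    private volatile double ramBufferSizeMB; // volatile so a change is visible to other threads
    private final int maxThreadStates;       // final: cannot change after construction

    LiveConfig(Config c) {
        this.ramBufferSizeMB = c.getRAMBufferSizeMB();
        this.maxThreadStates = c.getMaxThreadStates();
    }

    LiveConfig setRAMBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; return this; }
    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    int getMaxThreadStates() { return maxThreadStates; }
}

/** Stand-in for IW: copies the Config once and never caches the live value. */
final class SketchWriter {
    private final LiveConfig live;

    SketchWriter(Config c) { this.live = new LiveConfig(c); }

    LiveConfig getConfig() { return live; }

    /** Reads the live setting on every call, so a change takes effect immediately. */
    boolean shouldFlush(double usedMB) { return usedMB >= live.getRAMBufferSizeMB(); }
}
```

With this shape, code that tries to change a one-time setting through getConfig fails at compile time, because LiveConfig has no such setter. Code that only peeks at settings (as the tests do) keeps working through the getters.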




[jira] [Updated] (LUCENE-4120) FST should use packed integer arrays

2012-06-11 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-4120:
-

Attachment: LUCENE-4120.patch

Patch. I don't fully understand how FST packing works so I would appreciate it 
if someone familiar with it could review this patch.

 FST should use packed integer arrays
 

 Key: LUCENE-4120
 URL: https://issues.apache.org/jira/browse/LUCENE-4120
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-4120.patch


 There are some places where an int[] could be advantageously replaced with a 
 packed integer array.
 I am thinking (at least) of:
  * FST.nodeAddress (GrowableWriter)
  * FST.inCounts (GrowableWriter)
  * FST.nodeRefToAddress (read-only Reader)
 The serialization/deserialization methods should be modified too in order to 
 take advantage of PackedInts.get{Reader,Writer}.
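
To illustrate why a packed array can beat an int[], here is a minimal, self-contained sketch of fixed-width bit packing. It is not Lucene's PackedInts or GrowableWriter (GrowableWriter also grows the bit width on demand, which this sketch does not); the class name PackedIntArray and its methods are hypothetical.

```java
/** Fixed-width packed array: each value occupies bitsPerValue bits inside a long[]. */
final class PackedIntArray {
    private final long[] blocks;
    private final int bitsPerValue;
    private final int size;
    private final long mask;

    PackedIntArray(int size, int bitsPerValue) {
        if (size < 0 || bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("size must be >= 0 and bitsPerValue in [1, 32]");
        }
        this.size = size;
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) size * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) >>> 6)]; // round up to whole longs
    }

    /** Number of bits needed to store values in [0, maxValue]. */
    static int bitsRequired(long maxValue) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    long get(int index) {
        checkIndex(index);
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // The value straddles two longs: take the high bits from the next block.
            value |= blocks[block + 1] << bitsInFirst;
        }
        return value & mask;
    }

    void set(int index, long value) {
        checkIndex(index);
        if ((value & ~mask) != 0) {
            throw new IllegalArgumentException("value does not fit in " + bitsPerValue + " bits");
        }
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // Write the remaining high bits into the start of the next block.
            blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst)) | (value >>> bitsInFirst);
        }
    }

    /** Bytes used by the backing array (object headers ignored). */
    long ramBytesUsed() { return 8L * blocks.length; }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }
}
```

For 100 values that never exceed 1000, bitsRequired gives 10 bits per value, so the backing array holds 16 longs (128 bytes) where an int[100] would hold 400 bytes of payload.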




Re: VOTE: Lucene/Solr 4.0-ALPHA

2012-06-11 Thread Mark Miller

On Jun 10, 2012, at 11:22 AM, Jack Krupansky wrote:

> I reviewed the Solr 4.0 wiki, which sounds as if the intent is for a single 
> alpha and a single beta

If someone is willing to assemble a release and the release can get the release 
votes, there is no reason we can't have multiple alphas or betas.

Most things put on the wiki are guidelines or hopes more than anything - 
nothing is really set in stone. It's all subject to change given who expends 
what effort and what circumstances accumulate.

Bottom line, anyone can be an RM, anyone can build an alpha, beta, release 
candidate, etc. You just need to get three PMC members to vote for your 
release. Given that, it does not make a lot of sense to put too much into 
intent or plans IMO.

If circumstances warrant it, and someone is willing to make the releases, I'm 
sure we will do whatever makes the most sense given the feedback we get from 
the first alpha.

Maybe we have one alpha and multiple betas. Maybe we have one Alpha and decide 
to release. I think it makes sense to plan (hope?) minimally - that is, one 
alpha, one beta sounds reasonable in terms of a bunch of cats stating intent - 
and let further work arise from the release response.

- Mark Miller
lucidimagination.com



Re: LBHttpSolrServer doc needs a little improvement

2012-06-11 Thread Mark Miller

On Jun 11, 2012, at 7:28 AM, Jack Krupansky wrote:

> The wiki for LBHttpSolrServer is a little out of date. It says the feature is 
> “experimental” and “currently being developed” even though SOLR-844 is 
> closed. There are a couple of “LB!HttpSolrServer” links that point to 
> nonexistent pages. The class javadoc has half but not all of the doc from the 
> wiki. The simplest solution may be to move the rest of the wiki doc into the 
> javadoc. I’m not sure what should be done with the wiki though. How can a 
> wiki link to javadoc when the link depends on the Solr release? Or, maybe just 
> make the wiki and javadoc be the same.
>
> And the SolrJ wiki makes no mention of LBHttpSolrServer.
>
> -- Jack Krupansky

Feel free to jump in and make improvements - anyone can edit the wiki, and 
there are many instances of out of date information, or holes in information.

- Mark



[JENKINS] Lucene-Solr-trunk-Windows-Java6-64 - Build # 505 - Failure!

2012-06-11 Thread jenkins
Build: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/505/

1 tests failed.
REGRESSION:  org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<498> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<498> but was:<0>
at 
__randomizedtesting.SeedInfo.seed([EC0818932746A69C:645C274989BACB64]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication(TestReplicationHandler.java:391)
at 
org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:250)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at 
org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at 
org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log:
[...truncated 12840 lines...]
   [junit4]   2> 45148 T1174 C20 REQ [collection1] webapp=/solr 
path=/replication 
params={command=filecontent&checksum=true&generation=7&wt=filestream&file=_3.fdx}
 status=0 QTime=0 
   [junit4]   2> 45148 T1182 C21 REQ [collection1] 

[jira] [Created] (LUCENE-4133) FastVectorHighlighter: A weighted approach for ordered fragments

2012-06-11 Thread Sebastian Lutze (JIRA)
Sebastian Lutze created LUCENE-4133:
---

 Summary: FastVectorHighlighter: A weighted approach for ordered 
fragments
 Key: LUCENE-4133
 URL: https://issues.apache.org/jira/browse/LUCENE-4133
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.0, 5.0
Reporter: Sebastian Lutze
Priority: Minor
 Fix For: 4.0
 Attachments: LUCENE-4133.patch

The FastVectorHighlighter currently disregards IDF-weights for matching terms 
within generated fragments. In the worst case, a fragment that contains a high 
number of very common words is scored higher than a fragment that contains 
*all* of the terms which have been used in the original query.

This patch provides ordered fragments with IDF-weighted terms:

*For each distinct matching term per fragment:* 
_weight = weight + IDF * boost_

*For each fragment:* 
_weight = weight * numTerms * 1 / sqrt( numTerms )_

|weight| total weight of fragment 
|IDF| inverse document frequency for each distinct matching term
|boost| query boost as provided, for example _term^2_
|numTerms| total number of matching terms per fragment 


*Method:*

{code:java}
  public void add( int startOffset, int endOffset, List<WeightedPhraseInfo> phraseInfoList ) {

    float totalBoost = 0;

    List<SubInfo> subInfos = new ArrayList<SubInfo>();
    HashSet<String> distinctTerms = new HashSet<String>();

    // counts every matching term, including repeats
    int length = 0;

    for( WeightedPhraseInfo phraseInfo : phraseInfoList ){
      subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) );
      for ( TermInfo ti : phraseInfo.getTermsInfos() ) {
        // only the first occurrence of a term contributes its weight
        if ( distinctTerms.add( ti.getText() ) )
          totalBoost += ti.getWeight() * phraseInfo.getBoost();
        length++;
      }
    }
    totalBoost *= length * ( 1 / Math.sqrt( length ) );

    getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) );
  }
{code}

The ranking-formula should be the same, or at least similar, to that one used 
in QueryTermScorer.
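
As a worked illustration of the formula above, the following standalone sketch computes the weight of two fragments. It does not use the highlighter classes: FragmentWeightSketch and its Term type are hypothetical stand-ins, the IDF values are made up, a single boost is applied to the whole fragment rather than one per phrase, and the guard for an empty fragment is an addition of this sketch. Note that weight * numTerms * 1 / sqrt( numTerms ) is the same as weight * sqrt( numTerms ), which is what the code uses.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, self-contained illustration of the fragment weighting formula. */
final class FragmentWeightSketch {

    /** A matched term and its (made-up) IDF weight; stands in for TermInfo. */
    static final class Term {
        final String text;
        final float idf;

        Term(String text, float idf) {
            this.text = text;
            this.idf = idf;
        }
    }

    /**
     * Sums IDF * boost over distinct terms, then scales by sqrt(numTerms),
     * where numTerms counts every match including repeats.
     */
    static float fragmentWeight(List<Term> matches, float boost) {
        Set<String> distinct = new HashSet<String>();
        float weight = 0f;
        int numTerms = 0;
        for (Term t : matches) {
            if (distinct.add(t.text)) {
                weight += t.idf * boost; // first occurrence only
            }
            numTerms++;
        }
        if (numTerms == 0) {
            return 0f; // guard added in this sketch: avoids 0 * (1 / 0)
        }
        return (float) (weight * Math.sqrt(numTerms));
    }
}
```

With an IDF of 0.5 for a common word, a fragment matching it four times gets 0.5 * sqrt(4) = 1.0. A fragment matching two rarer terms once each, with IDFs of 3.0 and 4.0, gets 7.0 * sqrt(2), roughly 9.9, so it ranks first even though it has fewer matches.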

*This patch contains:*

* a changed class-member in FieldPhraseList (termInfos to termsInfos)
* a changed local variable in SimpleFieldFragList (score to totalBoost)
* a missing @Override added in SimpleFragListBuilder
* class WeightedFieldFragList, an implementation of FieldFragList
* class WeightedFragListBuilder, an implementation of BaseFragListBuilder
* class WeightedFragListBuilderTest, a simple test-case 
* updated docs for FVH 

Last part (see also LUCENE-4091, LUCENE-4107, LUCENE-4113) of LUCENE-3440. 





[jira] [Updated] (LUCENE-4133) FastVectorHighlighter: A weighted approach for ordered fragments

2012-06-11 Thread Sebastian Lutze (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Lutze updated LUCENE-4133:


Attachment: LUCENE-4133.patch

 FastVectorHighlighter: A weighted approach for ordered fragments
 

 Key: LUCENE-4133
 URL: https://issues.apache.org/jira/browse/LUCENE-4133
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.0, 5.0
Reporter: Sebastian Lutze
Priority: Minor
  Labels: FastVectorHighlighter
 Fix For: 4.0

 Attachments: LUCENE-4133.patch





[jira] [Commented] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments

2012-06-11 Thread Sebastian Lutze (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292822#comment-13292822
 ] 

Sebastian Lutze commented on LUCENE-3440:
-

Hi Koji,
  
bq. Is the next the last one? 

almost. :) Next thing would be Solr-Integration. 

So, I just realized: trunk is not trunk anymore! 

This one is for branch_4x: 

https://issues.apache.org/jira/browse/LUCENE-4133 

Tests are fine. 


 FastVectorHighlighter: IDF-weighted terms for ordered fragments 
 

 Key: LUCENE-3440
 URL: https://issues.apache.org/jira/browse/LUCENE-3440
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Reporter: Sebastian Lutze
Priority: Minor
  Labels: FastVectorHighlighter
 Fix For: 4.0

 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, LUCENE-3440.patch, 
 LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, 
 weight-vs-boost_table01.html, weight-vs-boost_table02.html


 The FastVectorHighlighter gives every term found in a fragment an equal 
 weight, which ranks fragments with a high number of words or, in the worst 
 case, a high number of very common words above fragments that contain *all* 
 of the terms used in the original query. 
 This patch provides ordered fragments with IDF-weighted terms: 
 total weight = total weight + IDF for unique term per fragment * boost of 
 query; 
 The ranking-formula should be the same, or at least similar, to that one used 
 in org.apache.lucene.search.highlight.QueryTermScorer.
 The patch is simple, but it works for us. 
 Some ideas:
 - A better approach would be moving the whole fragments-scoring into a 
 separate class.
 - Switch scoring via parameter 
 - Exact phrases should be given an even better score, regardless of whether a 
 phrase-query was executed or not
 - edismax/dismax-parameters pf, ps and pf^boost should be observed and 
 corresponding fragments should be ranked higher 




[jira] [Resolved] (LUCENE-2748) Convert all Lucene web properties to use the ASF CMS

2012-06-11 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved LUCENE-2748.
-

Resolution: Fixed

 Convert all Lucene web properties to use the ASF CMS
 

 Key: LUCENE-2748
 URL: https://issues.apache.org/jira/browse/LUCENE-2748
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Attachments: modify_ui.diff


 The new CMS has a lot of nice features (and some kinks to still work out) and 
 Forrest just doesn't cut it anymore, so we should move to the ASF CMS: 
 http://apache.org/dev/cms.html




[JENKINS] Lucene-Solr-trunk-Windows-Java7-64 - Build # 293 - Failure!

2012-06-11 Thread jenkins
Build: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/293/

1 tests failed.
REGRESSION:  org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<498> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<498> but was:<0>
at 
__randomizedtesting.SeedInfo.seed([B511A30BDD1F4E06:3D459CD173E323FE]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication(TestReplicationHandler.java:391)
at 
org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:250)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at 
org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at 
org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log:
[...truncated 13675 lines...]
   [junit4]   2> 35971 T582 C20 REQ [collection1] webapp=/solr 
path=/replication params={command=filelist&wt=javabin&generation=7} status=0 
QTime=1 
   [junit4]   2> 35972 T599 oash.SnapPuller.fetchLatestIndex Number of files in 

[jira] [Commented] (LUCENE-4129) add CodecHeader to .frq and .prx

2012-06-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292842#comment-13292842
 ] 

Uwe Schindler commented on LUCENE-4129:
---

I am fine with the patch. I would like to fix the CFS issues, too. But we 
already have an issue for that.

 add CodecHeader to .frq and .prx
 

 Key: LUCENE-4129
 URL: https://issues.apache.org/jira/browse/LUCENE-4129
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-4129.patch, LUCENE-4129.patch, LUCENE-4129.patch


 We did this for all other files, but not .frq/.prx.
 Currently the postings writer only records itself in the blocktree terms 
 dictionary, which is fine, but that's really documenting the .tim itself, that 
 it is Blocktree with Lucene40Postings metadata.
 I think we should put headers in .frq/.prx as well: e.g. it could detect file 
 jumbling.
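
A rough sketch of how a per-file header can detect file jumbling, for example a .prx file opened where a .frq file was expected. The layout used here (a magic number, a codec name, a version) and all names in HeaderSketch are assumptions for illustration only; this is not Lucene's on-disk format or its CodecUtil API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical header scheme: magic number, codec name, version, then the payload. */
final class HeaderSketch {
    static final int MAGIC = 0x1234ABCD; // arbitrary value chosen for this sketch

    /** Returns the bytes of a "file" consisting of the header followed by the payload. */
    static byte[] fileWithHeader(String codec, int version, byte[] payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(MAGIC);
            out.writeUTF(codec);
            out.writeInt(version);
            out.write(payload);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("in-memory write failed", e);
        }
    }

    /**
     * Verifies the header and returns the version found.
     * Throws IllegalStateException if the magic, codec name or version does not match.
     */
    static int checkHeader(byte[] file, String expectedCodec, int minVersion, int maxVersion) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
            int magic = in.readInt();
            if (magic != MAGIC) {
                throw new IllegalStateException("bad magic: 0x" + Integer.toHexString(magic));
            }
            String codec = in.readUTF();
            if (!codec.equals(expectedCodec)) {
                throw new IllegalStateException(
                    "codec mismatch: expected " + expectedCodec + " but found " + codec);
            }
            int version = in.readInt();
            if (version < minVersion || version > maxVersion) {
                throw new IllegalStateException("unsupported version: " + version);
            }
            return version;
        } catch (IOException e) {
            // Covers truncated files (EOFException) and malformed names.
            throw new IllegalStateException("truncated or unreadable header", e);
        }
    }
}
```

If each file type writes a different codec name, opening the wrong file fails immediately at checkHeader instead of producing garbage postings later.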




[jira] [Commented] (LUCENE-4129) add CodecHeader to .frq and .prx

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292846#comment-13292846
 ] 

Robert Muir commented on LUCENE-4129:
-

I will look into the CFS stuff too after this one!

 add CodecHeader to .frq and .prx
 

 Key: LUCENE-4129
 URL: https://issues.apache.org/jira/browse/LUCENE-4129
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-4129.patch, LUCENE-4129.patch, LUCENE-4129.patch


 We did this for all other files, but not .frq/.prx.
 Currently the postings writer only records itself in the blocktree terms 
 dictionary, which is fine, but that's really documenting the .tim itself, that 
 it is Blocktree with Lucene40Postings metadata.
 I think we should put headers in .frq/.prx as well: e.g. it could detect file 
 jumbling.




Re: LBHttpSolrServer doc needs a little improvement

2012-06-11 Thread Jack Krupansky

Thanks. I wasn't sure what the policy was for wiki updates by non-committers.

I updated the wiki as I specified, including the addition of a reference on 
the SolrJ wiki. I also fixed a bunch of typos and a couple of errors in the 
example code.


-- Jack Krupansky

-Original Message- 
From: Mark Miller

Sent: Monday, June 11, 2012 9:48 AM
To: dev@lucene.apache.org
Subject: Re: LBHttpSolrServer doc needs a little improvement


On Jun 11, 2012, at 7:28 AM, Jack Krupansky wrote:

The wiki for LBHttpSolrServer is a little out of date. It says the feature is 
“experimental” and “currently being developed” even though SOLR-844 is 
closed. There are a couple of “LB!HttpSolrServer” links that point to 
nonexistent pages. The class javadoc has half but not all of the doc from 
the wiki. The simplest solution may be to move the rest of the wiki doc 
into the javadoc. I’m not sure what should be done with the wiki though. 
How can a wiki link to javadoc when the link depends on Solr release? Or, 
maybe just make the wiki and javadoc be the same.


And the SolrJ wiki makes no mention of LBHttpSolrServer.

-- Jack Krupansky


Feel free to jump in and make improvements - anyone can edit the wiki, and 
there are many instances of out of date information, or holes in 
information.


- Mark



[jira] [Updated] (LUCENE-4133) FastVectorHighlighter: A weighted approach for ordered fragments

2012-06-11 Thread Sebastian Lutze (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Lutze updated LUCENE-4133:


Description: 
The FastVectorHighlighter currently disregards IDF-weights for matching terms 
within generated fragments. In the worst case, a fragment that contains a high 
number of very common words is scored higher than a fragment that contains 
*all* of the terms which have been used in the original query.

This patch provides ordered fragments with IDF-weighted terms:

*For each distinct matching term per fragment:* 
_weight = weight + IDF * boost_

*For each fragment:* 
_weight = weight * length * 1 / sqrt( length )_

|weight| total weight of fragment 
|IDF| inverse document frequency for each distinct matching term
|boost| query boost as provided, for example _term^2_
|length| total number of non-distinct matching terms per fragment 


*Method:*

{code:java}
  public void add( int startOffset, int endOffset, List<WeightedPhraseInfo> phraseInfoList ) {

    float totalBoost = 0;

    List<SubInfo> subInfos = new ArrayList<SubInfo>();
    HashSet<String> distinctTerms = new HashSet<String>();

    // counts every matching term, including repeats
    int length = 0;

    for( WeightedPhraseInfo phraseInfo : phraseInfoList ){
      subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) );
      for ( TermInfo ti : phraseInfo.getTermsInfos() ) {
        // only the first occurrence of a term contributes its weight
        if ( distinctTerms.add( ti.getText() ) )
          totalBoost += ti.getWeight() * phraseInfo.getBoost();
        length++;
      }
    }
    totalBoost *= length * ( 1 / Math.sqrt( length ) );

    getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) );
  }
{code}

The ranking formula should be the same as, or at least similar to, the one used 
in QueryTermScorer.
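
As a worked illustration of the two formulas above, here is a minimal standalone sketch. MatchedTerm and FragmentWeightSketch are hypothetical names invented for this example and are not classes from the patch; the early return for an empty list is also an addition of the sketch, made to avoid a NaN result.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for one matching term: its text, IDF and query boost. */
final class MatchedTerm {
    final String text;
    final float idf;
    final float boost;

    MatchedTerm(String text, float idf, float boost) {
        this.text = text;
        this.idf = idf;
        this.boost = boost;
    }
}

final class FragmentWeightSketch {

    /** Adds IDF * boost once per distinct term, then scales by length * 1 / sqrt(length). */
    static float weigh(List<MatchedTerm> matches) {
        if (matches.isEmpty()) {
            return 0f; // assumption of this sketch: avoids 0 * (1 / sqrt(0)), which is NaN
        }
        Set<String> distinctTerms = new HashSet<>();
        float weight = 0f;
        int length = 0;
        for (MatchedTerm m : matches) {
            if (distinctTerms.add(m.text)) {
                weight += m.idf * m.boost; // counted once per distinct term
            }
            length++; // every match counts, distinct or not
        }
        return (float) (weight * length * (1 / Math.sqrt(length)));
    }
}
```

For two matches of a term with IDF 3 and boost 1 plus two matches of a term with IDF 0.5 and boost 2, the distinct-term sum is 3 + 1 = 4, length is 4, and the fragment weight is 4 * 4 * 1 / sqrt(4) = 8.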

*This patch contains:*

* a changed class-member in FieldPhraseList (termInfos to termsInfos)
* a changed local variable in SimpleFieldFragList (score to totalBoost)
* a missing @Override added in SimpleFragListBuilder
* class WeightedFieldFragList, an implementation of FieldFragList
* class WeightedFragListBuilder, an implementation of BaseFragListBuilder
* class WeightedFragListBuilderTest, a simple test-case 
* updated docs for FVH 

Last part (see also LUCENE-4091, LUCENE-4107, LUCENE-4113) of LUCENE-3440. 


  was:
The FastVectorHighlighter currently disregards IDF-weights for matching terms 
within generated fragments. In the worst case, a fragment that contains a high 
number of very common words is scored higher than a fragment that contains 
*all* of the terms used in the original query.

This patch provides ordered fragments with IDF-weighted terms:

*For each distinct matching term per fragment:* 
_weight = weight + IDF * boost_

*For each fragment:* 
_weight = weight * numTerms * 1 / sqrt( numTerms )_

|weight| total weight of fragment 
|IDF| inverse document frequency for each distinct matching term
|boost| query boost as provided, for example _term^2_
|numTerms| total number of matching terms per fragment 


*Method:*

{code:java}
  public void add( int startOffset, int endOffset, List<WeightedPhraseInfo> phraseInfoList ) {

    float totalBoost = 0;

    List<SubInfo> subInfos = new ArrayList<SubInfo>();
    HashSet<String> distinctTerms = new HashSet<String>();

    int length = 0;

    for( WeightedPhraseInfo phraseInfo : phraseInfoList ){
      subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) );
      for ( TermInfo ti : phraseInfo.getTermsInfos() ) {
        if ( distinctTerms.add( ti.getText() ) )
          totalBoost += ti.getWeight() * phraseInfo.getBoost();
        length++;
      }
    }
    totalBoost *= length * ( 1 / Math.sqrt( length ) );

    getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) );
  }
{code}

The ranking formula should be the same as, or at least similar to, the one used 
in QueryTermScorer.

*This patch contains:*

* a changed class-member in FieldPhraseList (termInfos to termsInfos)
* a changed local variable in SimpleFieldFragList (score to totalBoost)
* a missing @Override added in SimpleFragListBuilder
* class WeightedFieldFragList, an implementation of FieldFragList
* class WeightedFragListBuilder, an implementation of BaseFragListBuilder
* class WeightedFragListBuilderTest, a simple test-case 
* updated docs for FVH 

Last part (see also LUCENE-4091, LUCENE-4107, LUCENE-4113) of LUCENE-3440. 



 FastVectorHighlighter: A weighted approach for ordered fragments
 

 Key: LUCENE-4133
 URL: https://issues.apache.org/jira/browse/LUCENE-4133
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.0, 5.0

[jira] [Resolved] (LUCENE-4129) add CodecHeader to .frq and .prx

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4129.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.0

 add CodecHeader to .frq and .prx
 

 Key: LUCENE-4129
 URL: https://issues.apache.org/jira/browse/LUCENE-4129
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 4.0, 5.0

 Attachments: LUCENE-4129.patch, LUCENE-4129.patch, LUCENE-4129.patch


 We did this for all other files, but not .frq/.prx.
 Currently the postings writer only records itself in the blocktree terms 
 dictionary, which is fine, but that's really documenting the .tim itself, that 
 it is Blocktree with Lucene40Postings metadata.
 I think we should put headers in .frq/.prx as well: e.g. it could detect file 
 jumbling.
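
To illustrate how a per-file header could detect jumbled files, here is a minimal standalone sketch. It does not use Lucene's own header code: CodecHeaderSketch, its MAGIC value, and the codec names in the usage note are all made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class CodecHeaderSketch {

    // Hypothetical magic number for this sketch; not claimed to be Lucene's value.
    static final int MAGIC = 0x48445231;

    /** Prepends magic, codec name and version to the payload. */
    static byte[] withHeader(String codec, int version, byte[] payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(MAGIC);
            out.writeUTF(codec);
            out.writeInt(version);
            out.write(payload);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true only if the file starts with the magic, the expected codec and a known version. */
    static boolean hasHeader(byte[] file, String expectedCodec, int maxVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            if (in.readInt() != MAGIC) {
                return false; // not one of our files at all
            }
            if (!in.readUTF().equals(expectedCodec)) {
                return false; // a different file type was put in this file's place
            }
            int version = in.readInt();
            return version >= 0 && version <= maxVersion;
        } catch (IOException e) {
            return false; // truncated or unreadable header
        }
    }
}
```

With this in place, a file written as withHeader("SketchFrq", 0, data) passes hasHeader(file, "SketchFrq", 0) but fails hasHeader(file, "SketchPrx", 0), which is the kind of mix-up the header is meant to catch.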

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Updated] (LUCENE-4131) .cfs/.cfe should have a codecheader

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4131:


Attachment: LUCENE-4131_cfe.patch

trivial patch for .cfe

Looking at the .cfs now

 .cfs/.cfe should have a codecheader
 ---

 Key: LUCENE-4131
 URL: https://issues.apache.org/jira/browse/LUCENE-4131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-4131_cfe.patch


 The new .cfs is more tricky, but I still think we can do it. we should 
 definitely fix this for .cfe




[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays

2012-06-11 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292885#comment-13292885
 ] 

Dawid Weiss commented on LUCENE-4120:
-

I looked at the patch and it looks good to me but I didn't really analyze it 
in-depth. As for fst packing, the idea is fairly simple -- you reduce the 
overall size of the fst by moving states which have lots of incoming arcs to 
offsets which compress well (in vcoding). At least I think that's what Mike 
implemented (Mike is an unpredictable genius :) ).

This presentation has some details:
http://ciaa-fsmnlp-2011.univ-tours.fr/ciaa/upload/files/Weiss-Daciuk.pdf

 FST should use packed integer arrays
 

 Key: LUCENE-4120
 URL: https://issues.apache.org/jira/browse/LUCENE-4120
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-4120.patch


 There are some places where an int[] could be advantageously replaced with a 
 packed integer array.
 I am thinking (at least) of:
  * FST.nodeAddress (GrowableWriter)
  * FST.inCounts (GrowableWriter)
  * FST.nodeRefToAddress (read-only Reader)
 The serialization/deserialization methods should be modified too in order to 
 take advantage of PackedInts.get{Reader,Writer}.
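
To make the space saving concrete, here is a minimal sketch of a fixed-width packed integer array. PackedIntsSketch is a hypothetical class written for this illustration; it is not Lucene's PackedInts API and ignores growth, serialization and the other features the real classes provide.

```java
final class PackedIntsSketch {
    private final long[] blocks;
    private final int bitsPerValue;
    private final int size;
    private final long mask;

    PackedIntsSketch(int size, int bitsPerValue) {
        if (size < 0 || bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("size must be >= 0 and bitsPerValue in 1..32");
        }
        this.size = size;
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) size * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) / 64)];
    }

    void set(int index, long value) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
        if ((value & ~mask) != 0) throw new IllegalArgumentException("value does not fit");
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            // The value straddles two blocks: its top `spill` bits go to the low end of the next block.
            int lowBits = bitsPerValue - spill;
            blocks[block + 1] = (blocks[block + 1] & ~(mask >>> lowBits)) | (value >>> lowBits);
        }
    }

    long get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            value |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return value & mask;
    }

    /** Bytes used by the backing array, for comparison with 4 * size for an int[]. */
    long bytesUsed() {
        return 8L * blocks.length;
    }
}
```

Ten values that each fit in 13 bits need three longs (24 bytes) here, against 40 bytes for an int[10]; the price is the shifting and masking on every access.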




[jira] [Updated] (LUCENE-4130) CompoundFileDirectory.listAll is broken

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4130:


Attachment: LUCENE-4130.patch

The problem is the ad-hoc substring'ing done in listAll: it doesn't work with 
norms/dv because they use CFS filenames with segment suffixes.

Instead of this substring, I added an IndexFileNames.parseSegmentName that is 
just like stripSegmentName, except it returns the other part.

 CompoundFileDirectory.listAll is broken
 ---

 Key: LUCENE-4130
 URL: https://issues.apache.org/jira/browse/LUCENE-4130
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-4130.patch, LUCENE-4130_test.patch


 The files returned by listAll are not actually the files in the CFS.




[jira] [Resolved] (SOLR-3211) Allow parameter override in conjunction with spellcheck.maxCollationTries

2012-06-11 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-3211.
--

Resolution: Fixed

Committed...Trunk: 1348936, Branch_4x: r1348937

 Allow parameter override in conjunction with spellcheck.maxCollationTries
 ---

 Key: SOLR-3211
 URL: https://issues.apache.org/jira/browse/SOLR-3211
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Affects Versions: 3.6, 4.0
Reporter: James Dyer
Assignee: James Dyer
Priority: Minor
 Fix For: 4.0, 5.0

 Attachments: SOLR-3211.patch


 A couple users on the mailing list recently asked about being able to 
 override the mm parameter when SpellCheckComponent issues queries to check 
 for # hits for a collation candidate.  The issue is if the query had mm=0, 
 pretty much everything will generate hits.  But for collation checking 
 purposes, a low mm is almost never desirable.
 It might be worthwhile to generalize this to let other parameters be 
 overridden as well.




[jira] [Commented] (LUCENE-4130) CompoundFileDirectory.listAll is broken

2012-06-11 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292918#comment-13292918
 ] 

Michael McCandless commented on LUCENE-4130:


+1, sneaky.

 CompoundFileDirectory.listAll is broken
 ---

 Key: LUCENE-4130
 URL: https://issues.apache.org/jira/browse/LUCENE-4130
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-4130.patch, LUCENE-4130_test.patch


 The files returned by listAll are not actually the files in the CFS.




[jira] [Resolved] (LUCENE-4128) add safety to preflex segmentinfo upgrade

2012-06-11 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-4128.


Resolution: Fixed

 add safety to preflex segmentinfo upgrade
 ---

 Key: LUCENE-4128
 URL: https://issues.apache.org/jira/browse/LUCENE-4128
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-4128.patch, LUCENE-4128.patch


 Currently the one-time-upgrade depends on whether the upgraded .si file 
 exists. And the writing is done in a try/finally so it's removed if an 
 IOException happens.
 But I think there could be a power-loss or something else in the middle of 
 this, the upgraded .si file could be bogus, and then the user would have to 
 manually remove it (they probably wouldn't know).
 I think instead we should just have a marker file on completion, that we 
 create after we successfully fsync the upgraded .si file. This way if 
 something happens we just rewrite the thing.
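
The ordering described above (write, fsync, then create the marker) could look roughly like the following sketch. UpgradeMarkerSketch and the file names in the usage note are hypothetical and say nothing about how the committed patch is structured.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class UpgradeMarkerSketch {

    /** Writes the upgraded file, fsyncs it, and only then creates the marker file. */
    static void upgrade(Path upgradedFile, Path marker, byte[] contents) {
        try {
            try (FileChannel ch = FileChannel.open(upgradedFile,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.TRUNCATE_EXISTING)) {
                ByteBuffer buf = ByteBuffer.wrap(contents);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                ch.force(true); // flush data and metadata to the device
            }
            // Only reached once the upgraded file has been synced.
            try (FileChannel ch = FileChannel.open(marker,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE)) {
                ch.force(true);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Without a marker, any upgraded file on disk is treated as suspect and rewritten. */
    static boolean needsUpgrade(Path marker) {
        return !Files.exists(marker);
    }
}
```

A caller would check needsUpgrade(marker) on open and call upgrade(...) again if it returns true. Because the upgraded file is opened with TRUNCATE_EXISTING, a half-written file left by a crash is simply overwritten.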




[jira] [Updated] (LUCENE-4130) CompoundFileDirectory.listAll is broken

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4130:


Attachment: LUCENE-4130.patch

cleaned-up patch, removing the duplicate code of 'find segment 
boundary/indexOf' between stripSegmentName and parseSegmentName (so it's not 
easy to break the relationship between the two), and returning empty string 
from parse (which is more correct, also means CFS is transparent for files 
without a segment prefix). also removed the TODO from 
TestAllFilesHaveCodecHeader to recurse into CFS.

I think this is ready to commit
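
The relationship between the two methods can be sketched as follows. This is a standalone approximation, not the patch itself: SegmentNameSketch and its boundary rule (an underscore after the first character, otherwise the first dot) are assumptions made for illustration, as is returning an empty segment name when no boundary is found.

```java
final class SegmentNameSketch {

    /** Shared boundary lookup, so the two public methods cannot drift apart. */
    private static int indexOfSegmentBoundary(String filename) {
        int idx = filename.indexOf('_', 1); // e.g. "_3_nrm.cfs": the suffix starts at the second '_'
        if (idx == -1) {
            idx = filename.indexOf('.'); // e.g. "_0.fnm": the suffix starts at the extension
        }
        return idx;
    }

    /** Returns the part after the segment name, e.g. ".fnm" for "_0.fnm". */
    static String stripSegmentName(String filename) {
        int idx = indexOfSegmentBoundary(filename);
        return idx == -1 ? filename : filename.substring(idx);
    }

    /** Returns the segment name itself, e.g. "_0" for "_0.fnm", or "" if there is no prefix. */
    static String parseSegmentName(String filename) {
        int idx = indexOfSegmentBoundary(filename);
        return idx == -1 ? "" : filename.substring(0, idx);
    }
}
```

With a single boundary helper, parseSegmentName(f) + stripSegmentName(f) always reproduces f, and a name without a segment prefix such as ".fnm" passes through unchanged with an empty segment name.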

 CompoundFileDirectory.listAll is broken
 ---

 Key: LUCENE-4130
 URL: https://issues.apache.org/jira/browse/LUCENE-4130
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-4130.patch, LUCENE-4130.patch, 
 LUCENE-4130_test.patch


 The files returned by listAll are not actually the files in the CFS.




[jira] [Resolved] (LUCENE-4130) CompoundFileDirectory.listAll is broken

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4130.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.0

 CompoundFileDirectory.listAll is broken
 ---

 Key: LUCENE-4130
 URL: https://issues.apache.org/jira/browse/LUCENE-4130
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0, 5.0

 Attachments: LUCENE-4130.patch, LUCENE-4130.patch, 
 LUCENE-4130_test.patch


 The files returned by listAll are not actually the files in the CFS.




[JENKINS] Lucene-Solr-trunk-Windows-Java7-64 - Build # 294 - Still Failing!

2012-06-11 Thread jenkins
Build: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/294/

1 tests failed.
FAILED:  org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<494> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<494> but was:<0>
at __randomizedtesting.SeedInfo.seed([7C7B1BC6CDCC693D:F42F241C633004C5]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:716)
at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:254)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log:
[...truncated 17504 lines...]
   [junit4]   2> 60601 T2776 C193 REQ [collection1] webapp=/solr path=/replication params={command=filecontent&checksum=true&generation=16&wt=filestream&file=_9.fnm} status=0 QTime=0 
   [junit4]   2> 60604 T2776 C193 REQ 

Grouping - Boosting large groups

2012-06-11 Thread corwin
Hi forum,

I've implemented grouping using the TermFirstPassGroupingCollector and
TermSecondPassGroupingCollector, pretty much exactly as in the example in the
API docs. This works really well. I'm getting the groups sorted by the computed
relevance, and within each group the docs are sorted by a numeric field. So
far, so good.

Now I want to make things more complicated by boosting larger groups in
addition to the existing relevance sort. For example, if the first result
has a relevancy score of 1 and the group has 2 docs and the second group has
a score of 0.9 and 4 docs, I want to boost the second group so it will
appear before the first.

Basically I'm trying to boost the groups according to the number of elements
in the groups.

I couldn't figure out how to do that or find an example anywhere.

I hope I'm making sense 

Thanks in advance.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Grouping-Boosting-large-groups-tp3988959.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.




[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays

2012-06-11 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292958#comment-13292958
 ] 

Michael McCandless commented on LUCENE-4120:


Patch looks great!

Kuromoji's TokenInfoDictionaryBuilder doesn't compile w/ the patch
... it just needs the added arg to FST.pack.

It seems sort of odd to have the new .save method on ReaderImpl... can
it be on Mutable/Impl instead, or, maybe FST does its own saving or
something?

In all the places we now pass random.nextFloat() for
acceptableOverheadRatio (to FST.pack or MemoryPostingsFormat),
shouldn't it be COMPACT .. FASTEST instead of 0.0 .. 1.0?

Can you fix the comment for FST.pack?  It's no longer necessarily 8
bytes per node ... maybe just say up to 8 bytes per node, depending
on acceptableOverheadRatio?

Maybe rename the new PackedInts.getWriter method to eg
getWriterByFormat?  I was confused on just staring at it...


 FST should use packed integer arrays
 

 Key: LUCENE-4120
 URL: https://issues.apache.org/jira/browse/LUCENE-4120
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-4120.patch


 There are some places where an int[] could be advantageously replaced with a 
 packed integer array.
 I am thinking (at least) of:
  * FST.nodeAddress (GrowableWriter)
  * FST.inCounts (GrowableWriter)
  * FST.nodeRefToAddress (read-only Reader)
 The serialization/deserialization methods should be modified too in order to 
 take advantage of PackedInts.get{Reader,Writer}.




[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays

2012-06-11 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292966#comment-13292966
 ] 

Michael McCandless commented on LUCENE-4120:


bq. As for fst packing, the idea is fairly simple – you reduce the overall size 
of the fst by moving states which have lots of incoming arcs to offsets which 
compress well (in vcoding).

That's all I did, inspired by your talk/paper... I think we could do more :)

 FST should use packed integer arrays
 

 Key: LUCENE-4120
 URL: https://issues.apache.org/jira/browse/LUCENE-4120
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-4120.patch


 There are some places where an int[] could be advantageously replaced with a 
 packed integer array.
 I am thinking (at least) of:
  * FST.nodeAddress (GrowableWriter)
  * FST.inCounts (GrowableWriter)
  * FST.nodeRefToAddress (read-only Reader)
 The serialization/deserialization methods should be modified too in order to 
 take advantage of PackedInts.get{Reader,Writer}.




[jira] [Updated] (LUCENE-4131) .cfs/.cfe should have a codecheader

2012-06-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4131:


Attachment: LUCENE-4131.patch

patch also including CFS (for branch_4x). Trunk won't need the ugly stuff 
because it won't need to support 3.0 indexes

 .cfs/.cfe should have a codecheader
 ---

 Key: LUCENE-4131
 URL: https://issues.apache.org/jira/browse/LUCENE-4131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-4131.patch, LUCENE-4131_cfe.patch


 The new .cfs is more tricky, but I still think we can do it. we should 
 definitely fix this for .cfe




[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

2012-06-11 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292977#comment-13292977
 ] 

Simon Willnauer commented on LUCENE-4061:
-

patch looks good. I wonder if you can't create the ReaderManager in advance and 
make it final. I mean if you do add categories which seems to be the purpose of 
that writer you need it anyway and the costs should be considerably low. That 
would remove the need for locking on it entirely. 

 Improvements to DirectoryTaxonomyWriter (synchronization and others)
 

 Key: LUCENE-4061
 URL: https://issues.apache.org/jira/browse/LUCENE-4061
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 4.0

 Attachments: LUCENE-4061.patch, LUCENE-4061.patch


 DirTaxoWriter synchronizes in too many places. For instance addCategory() is 
 fully synchronized, while only a small part of it needs to be.
 Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the 
 TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is 
 interested can query the specific cache impl it has. Currently, only 
 Cl2oTaxoWriterCache supports it.
 If the changes will be simple, I'll port them to 3.6.1 as well.




[jira] [Created] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)

2012-06-11 Thread Hoss Man (JIRA)
Hoss Man created LUCENE-4134:


 Summary: modify release process/scripts to use svn for rc/release 
publishing (svnpubsub)
 Key: LUCENE-4134
 URL: https://issues.apache.org/jira/browse/LUCENE-4134
 Project: Lucene - Java
  Issue Type: Task
Reporter: Hoss Man


By the end of 2012, all of www.apache.org *INCLUDING THE DIST DIR* must be 
entirely managed using svnpubsub ... our use of the Apache CMS for 
lucene.apache.org puts us in compliance for our main website, but the dist dir 
use for publishing release artifacts also needs to be managed via svn.




[jira] [Commented] (LUCENE-4131) .cfs/.cfe should have a codecheader

2012-06-11 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292981#comment-13292981
 ] 

Michael McCandless commented on LUCENE-4131:


+1

 .cfs/.cfe should have a codecheader
 ---

 Key: LUCENE-4131
 URL: https://issues.apache.org/jira/browse/LUCENE-4131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-4131.patch, LUCENE-4131_cfe.patch


 The new .cfs is more tricky, but I still think we can do it. we should 
 definitely fix this for .cfe




[jira] [Assigned] (SOLR-3533) Show CharFilters in Schema Browser

2012-06-11 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher reassigned SOLR-3533:
--

Assignee: Erik Hatcher  (was: Stefan Matheis (steffkes))

 Show CharFilters in Schema Browser
 --

 Key: SOLR-3533
 URL: https://issues.apache.org/jira/browse/SOLR-3533
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3533.patch


 Schema Browser (on trunk) currently does not show CharFilters.  The example/ 
 schema has this definition that can be used to demonstrate, though it needs 
 to be uncommented:
 {code}
 <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" >
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 {code}   




[jira] [Resolved] (SOLR-3533) Show CharFilters in Schema Browser

2012-06-11 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher resolved SOLR-3533.


Resolution: Fixed

Went ahead and committed this to trunk and branch_4x

 Show CharFilters in Schema Browser
 --

 Key: SOLR-3533
 URL: https://issues.apache.org/jira/browse/SOLR-3533
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3533.patch


 Schema Browser (on trunk) currently does not show CharFilters.  The example/ 
 schema has this definition that can be used to demonstrate, though it needs 
 to be uncommented:
 {code}
 <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 {code}




[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)

2012-06-11 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292986#comment-13292986
 ] 

Hoss Man commented on LUCENE-4134:
--


Recent email from INFRA...
{noformat}
 FYI: infrastructure policy regarding website hosting has
 changed as of November 2011: we are requiring all websites
 and dist/ dirs to be svnpubsub or ASF CMS backed by the end of 2012.
 If your PMC has already met this requirement congratulations,
 you can ignore the remainder of this post.
 
 As stated on http://www.apache.org/dev/project-site.html#svnpubsub
 we are migrating our webserver infrastructure to 100% svnpubsub
 over the course of 2012.  If your site does not currently make
 use of this technology, it is time to consider a migration effort,
 as rsync-based sites will be PERMANENTLY FROZEN in Jan 2013 due 

...

 NOTE: the policy for dist/ dirs for managing project releases is
 similar.  We have setup a dedicated svn server for handling this,
 please contact infra when you are ready to start using it.
{noformat}

Some docs...

http://www.apache.org/dev/release.html#upload-ci

At a minimum we need to open a Jira with INFRA when we are ready for them to 
set up https://dist.apache.org/repos/dist/release/lucene and start using it 
for subsequent release publishing (instead of copying to the magic dist dir 
on people.apache.org and waiting for rsync).  But as part of this new process 
there will also be a https://dist.apache.org/repos/dist/dev/lucene directory 
where release candidates can be put for review (instead of 
people.apache.org/~releasemanager/...), and if/when they are voted successfully 
a simple svn mv to dist/release/lucene makes them official and pushes them to 
the mirrors.

So we should also change our release scripts to start svn committing the 
release candidates there instead of scping to people.apache.org
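
A minimal sketch of the two steps described above (stage an RC under dist/dev, 
then promote it with a server-side svn mv). This is not the existing release 
script; the function names and the directory naming under dev/lucene and 
release/lucene are assumptions for illustration. The functions only build the 
command lines, so a release manager can still review and run them by hand.

```python
# Sketch only: builds the svn command lines without running them.
# The RC and release directory names are hypothetical.
DIST_ROOT = "https://dist.apache.org/repos/dist"


def stage_rc_command(local_dir, rc_name):
    """Command to upload a local release candidate directory for review."""
    target = f"{DIST_ROOT}/dev/lucene/{rc_name}"
    return ["svn", "import", "-m", f"Stage {rc_name} for vote", local_dir, target]


def publish_command(rc_name, release_name):
    """Command to promote a voted RC: a server-side move into release/."""
    source = f"{DIST_ROOT}/dev/lucene/{rc_name}"
    target = f"{DIST_ROOT}/release/lucene/{release_name}"
    return ["svn", "mv", "-m", f"Publish {release_name}", source, target]
```

Printing " ".join(publish_command(...)) gives a line that can be pasted into a 
shell after the vote passes.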


 modify release process/scripts to use svn for rc/release publishing 
 (svnpubsub)
 ---

 Key: LUCENE-4134
 URL: https://issues.apache.org/jira/browse/LUCENE-4134
 Project: Lucene - Java
  Issue Type: Task
Reporter: Hoss Man

 By the end of 2012, all of www.apache.org *INCLUDING THE DIST DIR* must be 
 entirely managed using svnpubsub ... our use of the Apache CMS for 
 lucene.apache.org puts us in compliance for our main website, but the dist 
 dir use for publishing release artifacts also needs to be managed via svn.




[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292988#comment-13292988
 ] 

Robert Muir commented on LUCENE-4134:
-

I agree we should do the first part. As for the second part, I personally don't 
want to use any scripts that ssh or svn commit automatically, so it's no problem 
for me. 

I think instead we should just have instructions on where we should commit 
things manually in ReleaseTODO etc. If someone wants to add automation that's 
great, but I just don't like automation when it comes to my passwords.

 modify release process/scripts to use svn for rc/release publishing 
 (svnpubsub)
 ---

 Key: LUCENE-4134
 URL: https://issues.apache.org/jira/browse/LUCENE-4134
 Project: Lucene - Java
  Issue Type: Task
Reporter: Hoss Man

 By the end of 2012, all of www.apache.org *INCLUDING THE DIST DIR* must be 
 entirely managed using svnpubsub ... our use of the Apache CMS for 
 lucene.apache.org puts us in compliance for our main website, but the dist 
 dir use for publishing release artifacts also needs to be managed via svn.




[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)

2012-06-11 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292994#comment-13292994
 ] 

Hoss Man commented on LUCENE-4134:
--

rmuir: doesn't the automation already exist in buildAndPushRelease.py ? ... 
doesn't that automatically scp the RCs to 
people.apache.org:~you/public_html/staging_area/ ?

i'm just suggesting we change that to do the svn commit to 
https://dist.apache.org/repos/dist/dev/lucene ... the RCs are still uploaded 
automatically, they would just start getting uploaded to an INFRA blessed 
location that would make it easier to (manually) publish them post-VOTE with an 
svn mv

 modify release process/scripts to use svn for rc/release publishing 
 (svnpubsub)
 ---

 Key: LUCENE-4134
 URL: https://issues.apache.org/jira/browse/LUCENE-4134
 Project: Lucene - Java
  Issue Type: Task
Reporter: Hoss Man

 By the end of 2012, all of www.apache.org *INCLUDING THE DIST DIR* must be 
 entirely managed using svnpubsub ... our use of the Apache CMS for 
 lucene.apache.org puts us in compliance for our main website, but the dist 
 dir use for publishing release artifacts also needs to be managed via svn.




[jira] [Created] (LUCENE-4135) TestNumericQueryParser fails on java 7

2012-06-11 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-4135:
---

 Summary: TestNumericQueryParser fails on java 7
 Key: LUCENE-4135
 URL: https://issues.apache.org/jira/browse/LUCENE-4135
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 4.0
Reporter: Robert Muir


http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java7-64/49/

Seed reproduces on branch-4x for me on linux as well:
ant test  -Dtestcase=TestNumericQueryParser -Dtests.seed=E6EC0E1871B28E1E 
-Dtests.multiplier=3 -Dtests.locale=es_PE -Dtests.timezone=Africa/Tunis 
-Dargs=-Dfile.encoding=UTF-8




[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)

2012-06-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292997#comment-13292997
 ] 

Robert Muir commented on LUCENE-4134:
-

Hoss: right, I'm saying I don't use that script :)

I think we should still lay out the new instructions on the wiki for people who 
don't want scripts svn committing for them, that's all.

 modify release process/scripts to use svn for rc/release publishing 
 (svnpubsub)
 ---

 Key: LUCENE-4134
 URL: https://issues.apache.org/jira/browse/LUCENE-4134
 Project: Lucene - Java
  Issue Type: Task
Reporter: Hoss Man

 By the end of 2012, all of www.apache.org *INCLUDING THE DIST DIR* must be 
 entirely managed using svnpubsub ... our use of the Apache CMS for 
 lucene.apache.org puts us in compliance for our main website, but the dist 
 dir use for publishing release artifacts also needs to be managed via svn.




[jira] [Commented] (LUCENE-4134) modify release process/scripts to use svn for rc/release publishing (svnpubsub)

2012-06-11 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293001#comment-13293001
 ] 

Hoss Man commented on LUCENE-4134:
--

Ah ... sorry ... i thought you were saying we shouldn't add automation for 
this ... didn't realize you meant "i don't use the automation we currently have"

bq. I think we should still lay out the new instructions on the wiki for people 
who don't want scripts svn committing for them, that's all.

+1

 modify release process/scripts to use svn for rc/release publishing 
 (svnpubsub)
 ---

 Key: LUCENE-4134
 URL: https://issues.apache.org/jira/browse/LUCENE-4134
 Project: Lucene - Java
  Issue Type: Task
Reporter: Hoss Man

 By the end of 2012, all of www.apache.org *INCLUDING THE DIST DIR* must be 
 entirely managed using svnpubsub ... our use of the Apache CMS for 
 lucene.apache.org puts us in compliance for our main website, but the dist 
 dir use for publishing release artifacts also needs to be managed via svn.




[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays

2012-06-11 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293012#comment-13293012
 ] 

Dawid Weiss commented on LUCENE-4120:
-

bq. That's all I did, inspired by your talk/paper... I think we could do more 

Remember I didn't talk about my _failed_ attempts, there is a very likely 
chance you may be thinking about those ;) 

 FST should use packed integer arrays
 

 Key: LUCENE-4120
 URL: https://issues.apache.org/jira/browse/LUCENE-4120
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-4120.patch


 There are some places where an int[] could be advantageously replaced with a 
 packed integer array.
 I am thinking (at least) of:
  * FST.nodeAddress (GrowableWriter)
  * FST.inCounts (GrowableWriter)
  * FST.nodeRefToAddress (read-only Reader)
 The serialization/deserialization methods should be modified too in order to 
 take advantage of PackedInts.get{Reader,Writer}.
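
 To make the idea concrete, here is a small Python sketch of a fixed-width 
 packed integer array: every value takes only as many bits as the largest 
 value needs, instead of a full 32-bit int. It illustrates the technique 
 only and is not Lucene's PackedInts API; PackedArray and bits_required are 
 hypothetical names, and growing the bit width on demand (as a GrowableWriter 
 would) is left out.

```python
def bits_required(max_value):
    """Number of bits needed to store values in the range 0..max_value."""
    return max(1, max_value.bit_length())


class PackedArray:
    """Fixed-width packed integer array backed by a bytearray (illustrative)."""

    def __init__(self, size, bits_per_value):
        if not 1 <= bits_per_value <= 64:
            raise ValueError("bits_per_value must be in 1..64")
        self.size = size
        self.bits = bits_per_value
        self.mask = (1 << bits_per_value) - 1
        # All values are stored back to back, rounded up to whole bytes.
        self.data = bytearray((size * bits_per_value + 7) // 8)

    def _span(self, index):
        """Return (first byte, bit offset in that byte, byte count) for a slot."""
        if not 0 <= index < self.size:
            raise IndexError(index)
        start_bit = index * self.bits
        first_byte, offset = divmod(start_bit, 8)
        n_bytes = (offset + self.bits + 7) // 8
        return first_byte, offset, n_bytes

    def get(self, index):
        first_byte, offset, n_bytes = self._span(index)
        chunk = int.from_bytes(self.data[first_byte:first_byte + n_bytes], "little")
        return (chunk >> offset) & self.mask

    def set(self, index, value):
        if not 0 <= value <= self.mask:
            raise ValueError("value does not fit in bits_per_value")
        first_byte, offset, n_bytes = self._span(index)
        chunk = int.from_bytes(self.data[first_byte:first_byte + n_bytes], "little")
        chunk &= ~(self.mask << offset)   # clear the old value's bits
        chunk |= value << offset          # write the new value
        self.data[first_byte:first_byte + n_bytes] = chunk.to_bytes(n_bytes, "little")
```

 With five node addresses whose maximum is 1023, each slot needs 10 bits, so 
 the whole array fits in 7 bytes rather than the 20 bytes of five 32-bit ints.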




[JENKINS] Lucene-Solr-trunk-Linux-Java6-64 - Build # 862 - Failure!

2012-06-11 Thread jenkins
Build: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/862/

2 tests failed.
REGRESSION:  org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta3.testCompositePk_DeltaImport_empty

Error Message:
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed

Stack Trace:
org.apache.solr.common.SolrException: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
at __randomizedtesting.SeedInfo.seed([722383488ECB025B:B52018272F5E5F59]:0)
at org.apache.solr.util.TestHarness.update(TestHarness.java:260)
at org.apache.solr.util.TestHarness.checkUpdateStatus(TestHarness.java:304)
at org.apache.solr.util.TestHarness.validateUpdate(TestHarness.java:274)
at org.apache.solr.SolrTestCaseJ4.checkUpdateU(SolrTestCaseJ4.java:413)
at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:392)
at org.apache.solr.SolrTestCaseJ4.assertU(SolrTestCaseJ4.java:386)
at org.apache.solr.SolrTestCaseJ4.clearIndex(SolrTestCaseJ4.java:758)
at org.apache.solr.handler.dataimport.TestSqlEntityProcessorDelta3.setUp(TestSqlEntityProcessorDelta3.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:873)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
Caused by: 

[jira] [Resolved] (LUCENE-3949) Fix license headers in all Java files to not be in Javadocs /** format

2012-06-11 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-3949.
--

Resolution: Fixed

Committed revision 1348980. - trunk
Committed revision 1348984. - 4x


 Fix license headers in all Java files to not be in Javadocs /** format
 --

 Key: LUCENE-3949
 URL: https://issues.apache.org/jira/browse/LUCENE-3949
 Project: Lucene - Java
  Issue Type: Task
Reporter: Uwe Schindler
Assignee: Hoss Man
 Fix For: 4.0

 Attachments: LUCENE-3949.patch, fix-license-jdoc.pl


 Our current License headers in all .java files are (for a reason I don't 
 know) in Javadocs format. Means, when you have a class without javadocs, the 
 License header is used as Javadocs.
 I reviewed lots of other Apache projects, most of them use the correct /* 
 header, but some (including Lucene+Solr) the Javadocs one. We should change 
 this.




Re: Issues with whitespace tokenization in QueryParser

2012-06-11 Thread Chris Hostetter
: 
: NOTE: I definitely don't want to discourage you from tackling this
: issue, but I think its fair to mention there is a workaround, and
: thats if you can preprocess your queries yourself (maybe you dont
: allow all the lucene syntax to your users or something like that), you
: can escape the whitespace yourself such as rain\ coat, and I think
: your synonyms will work as expected.

Alternatively: use a QueryParser that doesn't know/care about any special 
markup and just analyzes the entire input against a single (configured) 
field and generates the appropriate query -- Solr's FieldQParser works 
this way for example.

You have to pick a tradeoff between "i want to support query operators 
like ':', '+', '-', and ' ' that let me build up BooleanQuery objects and 
query specific fields" vs "i want the entire query string analyzed as one 
chunk"

:  really tripping them up. A prime example is that a search for dress shoes
:  returns a list of dresses and random shoes (not necessarily dress shoes). I
:  wish that I was able to synonym compound words to single tokens (e.g. dress
:  shoes = dress_shoes), but with this whitespace tokenization issue, it's
:  impossible.

this is one of the main use cases of the DismaxQParser (and now 
EDismaxQParser as well) with the pf param in solr ... you can have it 
query for both "dress" and/or "shoes" in some set of fields (qf) but also 
for the entire phrase "dress shoes" in a distinct set of fields (pf) which 
get a higher score.

http://wiki.apache.org/solr/DisMax
http://wiki.apache.org/solr/DisMaxQParserPlugin
http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/
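
As a rough illustration of the qf/pf split described above, this Python 
sketch builds the request parameters for such a query. The field names and 
the boost value are made up, and dismax_params is a hypothetical helper; 
only defType, q, qf and pf are actual Solr parameter names.

```python
from urllib.parse import urlencode


def dismax_params(user_query, qf, pf):
    """Build a dismax query string: qf matches single terms, pf boosts the phrase."""
    return urlencode({
        "defType": "dismax",
        "q": user_query,
        "qf": " ".join(qf),   # fields searched for each term
        "pf": " ".join(pf),   # fields where the whole phrase earns extra score
    })


# Hypothetical fields: match terms in name/description, boost phrase hits in name.
query_string = dismax_params("dress shoes", ["name", "description"], ["name^10"])
```

Documents that merely contain "dress" or "shoes" still match through qf, but 
those containing the phrase in a pf field should rank higher.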



-Hoss




[jira] [Commented] (LUCENE-4131) .cfs/.cfe should have a codecheader

2012-06-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293028#comment-13293028
 ] 

Uwe Schindler commented on LUCENE-4131:
---

+1, the crazy code is funny to read & understand ;-]

 .cfs/.cfe should have a codecheader
 ---

 Key: LUCENE-4131
 URL: https://issues.apache.org/jira/browse/LUCENE-4131
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-4131.patch, LUCENE-4131_cfe.patch


 The new .cfs is more tricky, but I still think we can do it. we should 
 definitely fix this for .cfe




[JENKINS] Lucene-Solr-trunk-Windows-Java6-64 - Build # 507 - Failure!

2012-06-11 Thread jenkins
Build: 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/507/

1 tests failed.
REGRESSION:  org.apache.solr.handler.TestReplicationHandler.test

Error Message:
expected:<498> but was:<0>

Stack Trace:
java.lang.AssertionError: expected:<498> but was:<0>
at __randomizedtesting.SeedInfo.seed([FB5AB0EC651FBCE9:730E8F36CBE3D111]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication(TestReplicationHandler.java:391)
at org.apache.solr.handler.TestReplicationHandler.test(TestReplicationHandler.java:250)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log:
[...truncated 17145 lines...]
   [junit4]   2> 43351 T3021 oash.SnapPuller.fetchLatestIndex Number of files in latest index in master: 10
   [junit4]   2> 43356 T3004 C168 REQ [collection1] webapp=/solr path=/replication 

[jira] [Created] (SOLR-3534) dismax and edismax should default to df when qf is absent.

2012-06-11 Thread David Smiley (JIRA)
David Smiley created SOLR-3534:
--

 Summary: dismax and edismax should default to df when qf is 
absent.
 Key: SOLR-3534
 URL: https://issues.apache.org/jira/browse/SOLR-3534
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.0
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor


The dismax and edismax query parsers should default to df when the qf 
parameter is absent.  They only use the defaultSearchField in schema.xml as a 
fallback now.
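
A minimal sketch of the proposed lookup order, assuming request parameters 
arrive as a plain dict: qf first, then df, and only then the schema-level 
defaultSearchField. The function name and its arguments are hypothetical and 
do not mirror Solr's actual classes.

```python
def resolve_query_fields(params, schema_default_field=None):
    """Pick the fields dismax/edismax would search, in the proposed order."""
    qf = params.get("qf", "").split()
    if qf:
        return qf                      # explicit qf always wins
    df = params.get("df")
    if df:
        return [df]                    # proposed: fall back to df next
    if schema_default_field:
        return [schema_default_field]  # current fallback: schema.xml default
    raise ValueError("no qf, df, or defaultSearchField to search against")
```

With this order, a request handler that only sets df keeps working for 
dismax queries even after defaultSearchField is removed from schema.xml.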




[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml

2012-06-11 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293054#comment-13293054
 ] 

David Smiley commented on SOLR-2724:


Bernd:
Support for a default search field name still exists mostly for compatibility, 
and in perhaps some peoples' views as a matter of preference.  It wasn't 
actually required before its deprecation.  I thought it was only the fallback 
for parsing a lucene query but you indeed point out dismax has it as a fallback 
for 'qf', and it's used by the highlighter as a fallback for 'hl.fl' although 
it appears the highlighter consults 'df' too.  The main point behind its 
deprecation is that I think you should be explicit in a request which field(s) 
apply to what query strings or other features because the schema (schema.xml) 
can't know.  The same applies to the default query operator which is even more 
of an odd duck sitting in schema.xml.

Bernd, simply define qf in your request handler definition to make Solr 
respond correctly to the same queries you had before.  Arguably, Dismax/Edismax 
should consult df as a default when qf isn't specified.  I created 
SOLR-3534 for this issue.

 Deprecate defaultSearchField and defaultOperator defined in schema.xml
 --

 Key: SOLR-2724
 URL: https://issues.apache.org/jira/browse/SOLR-2724
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis, search
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: 
 SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 I've always been surprised to see the <defaultSearchField> element and 
 <solrQueryParser defaultOperator="OR"/> defined in the schema.xml file since 
 the first time I saw them.  They just seem out of place to me since they are 
 more query parser related than schema related. But not only are they 
 misplaced, I feel they shouldn't exist. For query parsers, we already have a 
 df parameter that works just fine, and explicit field references. And the 
 default lucene query operator should stay at OR -- if a particular query 
 wants different behavior then use q.op or simply use OR.
 <similarity> Seems like something better placed in solrconfig.xml than in the 
 schema. 
 In my opinion, defaultSearchField and defaultOperator configuration elements 
 should be deprecated in Solr 3.x and removed in Solr 4.  And <similarity> 
 should move to solrconfig.xml. I am willing to do it, provided there is 
 consensus on it of course.




[jira] [Commented] (LUCENE-4061) Improvements to DirectoryTaxonomyWriter (synchronization and others)

2012-06-11 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293063#comment-13293063
 ] 

Shai Erera commented on LUCENE-4061:


Believe me, I wanted to avoid it too, but ReaderManager is allocated like that 
for a few reasons:

* It's lazy, as the comment in the code says -- it's a waste to open an IR if 
your DirTaxoWriter session is going to be short-lived.
** Personally, I think this is a minor issue, and if it were only that, I'd 
make it final.

* The TaxoWriterCache can be 'complete', which means all the categories 
currently known to DirTW are cached. In that case, it is a waste to keep the 
reader open, so we close it.
** This is true for Cl2oCache, since it keeps all categories in memory.
** But LruCache is not like that, since it potentially evicts entries from the 
cache. So it can be 'complete' only until it evicts the first entry, after 
which it will never be complete again, and we'll need to keep the reader open.

Currently, when we don't need ReaderManager, we close it. We also don't open it 
until a few cache misses occur. Changing that would mean sacrificing efficiency 
by always keeping a Reader open, even if it's not needed. It wastes RAM, file 
handles and whatnot.

Not sure it's worth it. What do you think?
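
To make the trade-off concrete, here is a rough Python sketch of the two ideas above: a reader that is opened lazily and released when the cache is 'complete', and an LRU cache that stops being complete at its first eviction. All class and method names are hypothetical stand-ins, not the DirectoryTaxonomyWriter API, and the sketch opens the reader on the first miss that needs it rather than after a few misses.

```python
from collections import OrderedDict


class LruCategoryCache:
    """Hypothetical LRU cache that is 'complete' only until its first eviction."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.entries = OrderedDict()
        self.has_evicted = False

    def put(self, category, ordinal):
        self.entries[category] = ordinal
        self.entries.move_to_end(category)
        if len(self.entries) > self.max_size:
            self.entries.popitem(last=False)  # drop the least recently used entry
            self.has_evicted = True           # from now on the cache is never complete

    def get(self, category):
        if category in self.entries:
            self.entries.move_to_end(category)
            return self.entries[category]
        return None

    def is_complete(self):
        return not self.has_evicted


class TaxonomyLookup:
    """Opens a stand-in reader lazily and releases it when it is not needed."""

    def __init__(self, cache, open_reader):
        self.cache = cache
        self.open_reader = open_reader  # callable returning a dict that stands in for a reader
        self.reader = None              # opened lazily, so it cannot be a final field
        self.opens = 0

    def find(self, category):
        ordinal = self.cache.get(category)
        if ordinal is not None:
            return ordinal
        if self.cache.is_complete():
            # Every known category is cached, so a miss means "unknown".
            self.reader = None  # release the reader; it is not needed
            return None
        if self.reader is None:
            self.reader = self.open_reader()
            self.opens += 1
        return self.reader.get(category)
```

With a cache that never evicts, find() never opens the reader at all, which is the saving the lazy allocation is meant to preserve.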

 Improvements to DirectoryTaxonomyWriter (synchronization and others)
 

 Key: LUCENE-4061
 URL: https://issues.apache.org/jira/browse/LUCENE-4061
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 4.0

 Attachments: LUCENE-4061.patch, LUCENE-4061.patch


 DirTaxoWriter synchronizes in too many places. For instance addCategory() is 
 fully synchronized, while only a small part of it needs to be.
 Additionally, getCacheMemoryUsage looks bogus - it depends on the type of the 
 TaxoWriterCache. No code uses it, so I'd like to remove it -- whoever is 
 interested can query the specific cache impl it has. Currently, only 
 Cl2oTaxoWriterCache supports it.
 If the changes turn out to be simple, I'll port them to 3.6.1 as well.
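
The general idea of synchronizing only the part that needs it can be sketched as follows. This is an illustration in Python under stated assumptions, not the patch attached to the issue: CategoryWriter and add_category are hypothetical names, and the unlocked read relies on CPython dictionary lookups being safe alongside locked writes (Java code would need a concurrent map for the same pattern).

```python
import threading


class CategoryWriter:
    """Hypothetical writer that locks only while assigning a new ordinal."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ordinals = {}
        self._next_ordinal = 0

    def add_category(self, path):
        # Fast path: a category that is already known needs no lock.
        ordinal = self._ordinals.get(path)
        if ordinal is not None:
            return ordinal
        with self._lock:
            # Re-check under the lock; another thread may have added it meanwhile.
            ordinal = self._ordinals.get(path)
            if ordinal is None:
                ordinal = self._next_ordinal
                self._next_ordinal += 1
                self._ordinals[path] = ordinal
            return ordinal
```

Repeated calls for an existing category, which are likely the common case, then never contend for the lock.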




[jira] [Commented] (SOLR-3534) dismax and edismax should default to df when qf is absent.

2012-06-11 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293073#comment-13293073
 ] 

Jack Krupansky commented on SOLR-3534:
--

I would also suggest that the default, if neither qf nor df is present, should 
be "text", preferably as a symbolic constant.

 dismax and edismax should default to df when qf is absent.
 --

 Key: SOLR-3534
 URL: https://issues.apache.org/jira/browse/SOLR-3534
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.0
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor

 The dismax and edismax query parsers should default to df when the qf 
 parameter is absent.  They only use the defaultSearchField in schema.xml as a 
 fallback now.
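
The proposed fallback order can be written down as a small function. This is only a sketch of the behavior requested in the issue, not the dismax/edismax implementation; resolve_query_fields and its arguments are hypothetical names.

```python
def resolve_query_fields(params, schema_default_field=None):
    """Pick the fields to search: qf first, then df, then the schema default."""
    qf = params.get("qf", "").split()
    if qf:
        return qf  # entries may carry boosts, e.g. "title^2"
    df = params.get("df")
    if df:
        return [df]
    if schema_default_field:
        return [schema_default_field]
    raise ValueError("no qf, df, or schema default search field available")
```

Raising an error in the last case reflects one possible choice; a hard-coded final default would be another.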




[jira] [Commented] (SOLR-3534) dismax and edismax should default to df when qf is absent.

2012-06-11 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293074#comment-13293074
 ] 

David Smiley commented on SOLR-3534:


Re: the "text" default -- that would be yet another default and, worse, IMO, 
it would be too hidden a default.  Being explicit by specifying a parameter on 
the request is best, IMO.

 dismax and edismax should default to df when qf is absent.
 --

 Key: SOLR-3534
 URL: https://issues.apache.org/jira/browse/SOLR-3534
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.0
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor

 The dismax and edismax query parsers should default to df when the qf 
 parameter is absent.  They only use the defaultSearchField in schema.xml as a 
 fallback now.




[jira] [Created] (SOLR-3536) How can I treat special character as alpha. like !@#$%^*().

2012-06-11 Thread phatak.prachi (JIRA)
phatak.prachi created SOLR-3536:
---

 Summary: How can I treat special character as alpha. like 
!@#$%^*().
 Key: SOLR-3536
 URL: https://issues.apache.org/jira/browse/SOLR-3536
 Project: Solr
  Issue Type: Wish
Reporter: phatak.prachi


I need to allow search on the special characters. For example, if I have
Wi-Fi
RET-34
Wi fi

and the user enters only "-", then it should return Wi-Fi and RET-34.




[jira] [Created] (SOLR-3535) Add block support for XMLLoader

2012-06-11 Thread Mikhail Khludnev (JIRA)
Mikhail Khludnev created SOLR-3535:
--

 Summary: Add block support for XMLLoader
 Key: SOLR-3535
 URL: https://issues.apache.org/jira/browse/SOLR-3535
 Project: Solr
  Issue Type: Sub-task
  Components: update
Affects Versions: 4.1, 5.0
Reporter: Mikhail Khludnev
Priority: Minor


I'd like to add the following update xml message:

<add-block>
<doc></doc>
<doc></doc>
</add-block>

out of scope for now: 
* other update formats
* update log support (NRT), should not be a big deal
* overwrite feature support for block updates - it's more complicated, I'll 
tell you why

Alternatives:
* wdyt about adding an attribute to the current tag: {pre}<add block="true">{pre} 
* or we can establish a RunBlockUpdateProcessor which treats every 
<add></add> as a block.

*Test is included!!*
How would you suggest improving the patch?
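
For illustration, a client could assemble the proposed add-block message along these lines. The add-block element name comes from the proposal above; the helper build_add_block, the document ids, and the use of the usual field name="..." child elements inside each doc are assumptions.

```python
import xml.etree.ElementTree as ET


def build_add_block(docs):
    """Serialize a list of {field_name: value} dicts as one add-block message."""
    root = ET.Element("add-block")
    for fields in docs:
        doc = ET.SubElement(root, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Two hypothetical documents that should be indexed as one block.
message = build_add_block([{"id": "1"}, {"id": "2"}])
print(message)
```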




[jira] [Updated] (SOLR-3535) Add block support for XMLLoader

2012-06-11 Thread Mikhail Khludnev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-3535:
---

Attachment: SOLR-3535.patch

 Add block support for XMLLoader
 ---

 Key: SOLR-3535
 URL: https://issues.apache.org/jira/browse/SOLR-3535
 Project: Solr
  Issue Type: Sub-task
  Components: update
Affects Versions: 4.1, 5.0
Reporter: Mikhail Khludnev
Priority: Minor
 Attachments: SOLR-3535.patch


 I'd like to add the following update xml message:
 <add-block>
 <doc></doc>
 <doc></doc>
 </add-block>
 out of scope for now: 
 * other update formats
 * update log support (NRT), should not be a big deal
 * overwrite feature support for block updates - it's more complicated, I'll 
 tell you why
 Alternatives:
 * wdyt about adding an attribute to the current tag: {pre}<add block="true">{pre} 
 * or we can establish a RunBlockUpdateProcessor which treats every 
 <add></add> as a block.
 *Test is included!!*
 How would you suggest improving the patch?




[jira] [Resolved] (SOLR-3536) How can I treat special character as alpha. like !@#$%^*().

2012-06-11 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-3536.


Resolution: Not A Problem

Please ask questions like this on the solr-user mailing list...

http://lucene.apache.org/solr/discussion.html

 How can I treat special character as alpha. like !@#$%^*().
 

 Key: SOLR-3536
 URL: https://issues.apache.org/jira/browse/SOLR-3536
 Project: Solr
  Issue Type: Wish
Reporter: phatak.prachi
  Labels: newbie

 I need to allow search on the special characters. For example, if I have
 Wi-Fi
 RET-34
 Wi fi
 and the user enters only "-", then it should return Wi-Fi and RET-34.




[jira] [Resolved] (SOLR-2352) TermVectorComponent fails with Undefined Field errors for score, *, or any Solr 4x psuedo-fields used in the fl param.

2012-06-11 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-2352.


Resolution: Fixed

Committed revision 1349012. - trunk
Committed revision 1349013.  - 4x


 TermVectorComponent fails with Undefined Field errors for score, *, or any 
 Solr 4x psuedo-fields used in the fl param.
 --

 Key: SOLR-2352
 URL: https://issues.apache.org/jira/browse/SOLR-2352
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 3.1
 Environment: Ubuntu 10.04/Arch solr 3.x branch r1058326
Reporter: Jed Glazner
Assignee: Hoss Man
 Fix For: 4.0

 Attachments: SOLR-2352.patch


 When searching using the term vector component and setting fl=*,score the 
 result is an HTTP 400 error 'undefined field: *'. If you disable the TVC the 
 search works properly.
 Example bad request...
 {code}http://localhost:8983/solr/select/?qt=tvrh&q=includes:[*+TO+*]&fl=*{code}
 3.1 stack trace:
 {noformat}
 SEVERE: org.apache.solr.common.SolrException: undefined field: *
     at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142)
 ...
 {noformat}
 The workaround is to explicitly use the tv.fl param when using pseudo-fields 
 in the fl...
 {code}http://localhost:8983/solr/select/?qt=tvrh&q=includes:[*+TO+*]&fl=*&tv.fl=includes{code}
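
A client-side sketch of that workaround, assuming the request parameters are qt, q, fl and tv.fl as in the example URL: the helper always sends an explicit tv.fl so the term vector component never has to interpret fl=*. The function name term_vector_url is hypothetical, and joining several fields with commas is an assumption about how tv.fl accepts a list.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def term_vector_url(base_url, query, term_vector_fields):
    """Build a select URL that names the term vector fields explicitly."""
    params = [
        ("qt", "tvrh"),
        ("q", query),
        ("fl", "*"),                              # pseudo-field is fine for the main response
        ("tv.fl", ",".join(term_vector_fields)),  # explicit list for the TVC
    ]
    return base_url + "?" + urlencode(params)


url = term_vector_url("http://localhost:8983/solr/select/", "includes:[* TO *]", ["includes"])
decoded = parse_qs(urlsplit(url).query)
print(decoded["tv.fl"])
```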




[jira] [Created] (SOLR-3537) TermVectorComponent should support globs in fl and tv.fl combined with per-field overrides of other params

2012-06-11 Thread Hoss Man (JIRA)
Hoss Man created SOLR-3537:
--

 Summary: TermVectorComponent should support globs in fl and tv.fl 
combined with per-field overrides of other params
 Key: SOLR-3537
 URL: https://issues.apache.org/jira/browse/SOLR-3537
 Project: Solr
  Issue Type: Task
Reporter: Hoss Man


TermVectorComponent should be improved so that fields can be specified in 
tv.fl (or fl) using globs, à la the ReturnFields helper class.  Per-field 
overrides for the various options TVC supports should work with all fields, 
even if specified as part of a glob.
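
As a rough illustration of the requested behavior, the sketch below expands glob patterns against a list of known field names and then resolves an option per field, letting an f.<field>.<option> override win over the general setting. It is not how ReturnFields or TermVectorComponent are implemented; the function names and field names are hypothetical, and Python's fnmatch accepts a richer pattern syntax (?, [seq]) than Solr's globs may.

```python
from fnmatch import fnmatchcase


def expand_field_globs(field_spec, known_fields):
    """Return the known fields matched by a comma or space separated glob list."""
    patterns = field_spec.replace(",", " ").split()
    matched = []
    for name in known_fields:
        if name not in matched and any(fnmatchcase(name, p) for p in patterns):
            matched.append(name)
    return matched


def per_field_option(params, field, option, default=False):
    """Resolve a boolean option, preferring the per-field override."""
    specific = params.get("f.%s.%s" % (field, option))
    if specific is not None:
        return specific == "true"
    general = params.get(option)
    return general == "true" if general is not None else default


known = ["title", "title_exact", "body", "id"]
params = {"tv.fl": "title*,body", "tv.tf": "true", "f.title_exact.tv.tf": "false"}
fields = expand_field_globs(params["tv.fl"], known)
options = {name: per_field_option(params, name, "tv.tf") for name in fields}
print(fields, options)
```

Here title_exact is selected only through the glob title*, yet its per-field override still applies, which is the combination the issue asks for.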



