[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6965 - Failure

2011-04-11 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6965/

1 tests failed.
REGRESSION:  org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2894)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:589)
at java.lang.StringBuffer.append(StringBuffer.java:337)
at java.text.RuleBasedCollator.getCollationKey(RuleBasedCollator.java:617)
at org.apache.lucene.collation.CollationKeyFilter.incrementToken(CollationKeyFilter.java:93)
at org.apache.lucene.collation.CollationTestBase.assertThreadSafe(CollationTestBase.java:304)
at org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe(TestCollationKeyAnalyzer.java:89)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1082)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1010)




Build Log (for compile errors):
[...truncated 5272 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2276) Support for cologne phonetic

2011-04-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018217#comment-13018217
 ] 

Uwe Schindler commented on SOLR-2276:
-

Hi Marc,
To remove the unneeded cast, which leads to an unchecked warning (calling 
static methods on instances is also a no-go):
Replace:
{code}
clazz = (Class<? extends Encoder>) this.getClass().forName(name);
{code}
By:
{code}
clazz = Class.forName(name).asSubclass(Encoder.class);
{code}

 Support for cologne phonetic
 

 Key: SOLR-2276
 URL: https://issues.apache.org/jira/browse/SOLR-2276
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4.1
 Environment: Apache Commons Codec 1.5
Reporter: Marc Pompl
 Fix For: 4.0

 Attachments: ColognePhonetic-patch-with-reflection.txt

   Original Estimate: 2h
  Remaining Estimate: 2h

 As soon as Apache Commons Codec 1.5 is released, please support the new 
 encoder ColognePhonetic.
 See JIRA for CODEC-106.
 It is fundamental for phonetic searches if you are indexing German names. 
 Other indexers are optimized for English (words).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (SOLR-2276) Support for cologne phonetic

2011-04-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018217#comment-13018217
 ] 

Uwe Schindler edited comment on SOLR-2276 at 4/11/11 6:51 AM:
--

Hi Marc,
To remove the unneeded cast, which leads to an unchecked warning (calling 
static methods on instances is also a no-go):
Replace:
{code}
clazz = (Class<? extends Encoder>) this.getClass().forName(name);
{code}
By:
{code}
clazz = Class.forName(name).asSubclass(Encoder.class);
{code}

Additionally, maybe the reflection should try twice:
- First try by prepending the package name 
(org.apache.commons.codec.language.), as the user expects the encoder in this 
package. With your code, the user has to add the full class name.
- Then use the name param as the full class name.

The only problem with the reflection code is that it is case sensitive, so it 
behaves differently from the registry map.
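
A minimal sketch of the two-step lookup described above, for illustration only; it is not the code from the attached patch. So that it runs on a plain JDK without Commons Codec, `java.lang.` stands in for `org.apache.commons.codec.language.`, and the caller passes the base type (for example `CharSequence` in place of `Encoder`). The names `EncoderClassResolver` and `resolve` are hypothetical.

```java
/** Hypothetical helper, for illustration only. */
final class EncoderClassResolver {

  // Assumption: stand-in for "org.apache.commons.codec.language." so that
  // the sketch runs without Commons Codec on the classpath.
  static final String DEFAULT_PACKAGE = "java.lang.";

  /**
   * Resolves a class by name: first with the default package prepended,
   * then with the name taken as a fully qualified class name.
   * asSubclass() replaces the unchecked cast and throws ClassCastException
   * if the class that was found is not a subtype of baseType.
   */
  static <T> Class<? extends T> resolve(String name, Class<T> baseType) {
    try {
      return Class.forName(DEFAULT_PACKAGE + name).asSubclass(baseType);
    } catch (ClassNotFoundException notInDefaultPackage) {
      try {
        return Class.forName(name).asSubclass(baseType);
      } catch (ClassNotFoundException e) {
        throw new IllegalArgumentException("Unknown class: " + name, e);
      }
    }
  }
}
```

With these stand-ins, `resolve("StringBuilder", CharSequence.class)` is found through the prefix, while `resolve("java.util.ArrayList", java.util.List.class)` falls through to the second attempt. Both attempts remain case sensitive, which is the difference from the registry map noted above.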

  was (Author: thetaphi):
Hi Marc,
To remove the unneeded cast, which leads to an unchecked warning (calling 
static methods on instances is also a no-go):
Replace:
{code}
clazz = (Class<? extends Encoder>) this.getClass().forName(name);
{code}
By:
{code}
clazz = Class.forName(name).asSubclass(Encoder.class);
{code}
  



[jira] [Assigned] (SOLR-2463) Using an evaluator outside the scope of an entity results in a null context

2011-04-11 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-2463:
---

Assignee: Shalin Shekhar Mangar

 Using an evaluator outside the scope of an entity results in a null context
 ---

 Key: SOLR-2463
 URL: https://issues.apache.org/jira/browse/SOLR-2463
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.1, 3.1.1, 4.0
Reporter: Robert Zotter
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 3.1.1


 When using an Evaluator outside an entity element, the Context argument is 
 null.
 {code:title=foo.LowerCaseFunctionEvaluator.java|borderStyle=solid}
 public class LowerCaseFunctionEvaluator extends Evaluator {
   public String evaluate(String expression, Context context) {
     List l = EvaluatorBag.parseParams(expression, 
         context.getVariableResolver());

     if (l.size() != 1) {
       throw new RuntimeException("'toLowerCase' must have only one parameter");
     }
     return l.get(0).toString().toLowerCase();
   }
 }
 {code}
 {code:title=data-config.xml|borderStyle=solid}
 <dataSource name="..."
  type="..."
  driver="..."
  url="..."
  user="${dataimporter.functions.toLowerCase('THIS_WILL_NOT_WORK')}"
  password="..."/>
 {code}
 {code:title=data-config.xml|borderStyle=solid}
 <entity name="..."
  dataSource="..."
  query="select * from ${dataimporter.functions.toLowerCase('THIS_WILL_WORK')}"/>
 {code}
 This use case worked in 1.4.




[jira] [Updated] (LUCENE-2798) Randomize indexed collation key testing

2011-04-11 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe updated LUCENE-2798:


Attachment: LUCENE-2798.patch

work in progress: JDK-only Analyzer-only test: 
{{TestCollationKeyAnalyzer.testRandomizedCollationKeySort()}}.

The test succeeds most of the time I run it, but sometimes fails, e.g. for 
these seeds:

* 3253903552510972177:-5236779063463918718
* 1469913545269555695:-7929666046197505961

Robert, would you please take a look at the code and see if you can figure out 
why the test fails?

 Randomize indexed collation key testing
 ---

 Key: LUCENE-2798
 URL: https://issues.apache.org/jira/browse/LUCENE-2798
 Project: Lucene - Java
  Issue Type: Test
  Components: Analysis
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-2798.patch


 Robert Muir noted on #lucene IRC channel today that Lucene's indexed 
 collation key testing is currently fragile (for example, they had to be 
 revisited when Robert upgraded the ICU dependency in LUCENE-2797 because of 
 Unicode 6.0 collation changes) and coverage is trivial (only 5 locales 
 tested, and no collator options are exercised).  This affects both the JDK 
 implementation in {{modules/analysis/common/}} and the ICU implementation 
 under {{modules/icu/}}.
 The key thing to test is that the order of the indexed terms is the same as 
 that provided by the Collator itself.  Instead of the current set of static 
 tests, this could be achieved via indexing randomly generated terms' 
 collation keys (and collator options) and then comparing the index terms' 
 order to the order provided by the Collator over the original terms.
 Since different terms may produce the same collation key, however, the order 
 of indexed terms is inherently unstable.  When performing runtime collation, 
 the Collator addresses the sort stability issue by adding a secondary sort 
 over the normalized original terms.  In order to directly compare Collator's 
 sort with Lucene's collation key sort, a secondary sort will need to be 
 applied to Lucene's indexed terms as well. Robert has suggested indexing the 
 original terms in addition to their collation keys, then using a Sort over 
 the original terms as the secondary sort.
 Another complication: Lucene 3.X uses Java's UTF-16 term comparison, and 
 trunk uses UTF-8 order, so the implemented secondary sort will need to 
 respect that.
 From #lucene:
 {quote}
 rmuir__: so i think we have to on 3.x, sort the 'expected list' with 
 Collator.compare, if thats equal, then as a tiebreak use String.compareTo
 rmuir__: and in the index sort on the collated field, followed by the 
 original term
 rmuir__: in 4.x we do the same thing, but dont use String.compareTo as the 
 tiebreak for the expected list
 rmuir__: instead compare codepoints (iterating character.codepointAt, or 
 comparing .getBytes(UTF-8))
 {quote}
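
The tiebreak scheme in the IRC excerpt above can be sketched as follows. This is an illustration only, not code from LUCENE-2798.patch; the class and method names are made up. It relies on the fact that unsigned UTF-8 byte order is the same as code point order, whereas `String.compareTo` compares UTF-16 code units, so the two can disagree when supplementary characters are involved.

```java
import java.text.Collator;
import java.util.Comparator;

/** Hypothetical helper, for illustration only. */
final class CollationTiebreak {

  /** 3.x style: Collator order, ties broken in UTF-16 code unit order. */
  static Comparator<String> utf16Tiebreak(Collator collator) {
    return (a, b) -> {
      int c = collator.compare(a, b);
      return c != 0 ? c : a.compareTo(b);
    };
  }

  /** Trunk style: Collator order, ties broken in code point order. */
  static Comparator<String> codePointTiebreak(Collator collator) {
    return (a, b) -> {
      int c = collator.compare(a, b);
      return c != 0 ? c : compareCodePoints(a, b);
    };
  }

  /** Compares two strings code point by code point. */
  static int compareCodePoints(String a, String b) {
    int i = 0;
    int j = 0;
    while (i < a.length() && j < b.length()) {
      int cpA = a.codePointAt(i);
      int cpB = b.codePointAt(j);
      if (cpA != cpB) {
        return cpA < cpB ? -1 : 1;
      }
      i += Character.charCount(cpA);
      j += Character.charCount(cpB);
    }
    // One string is a prefix of the other (or they are equal):
    // the one with characters left over sorts last.
    return (a.length() - i) - (b.length() - j);
  }
}
```

The expected list would then be sorted with one of these comparators, depending on the branch, before being compared with the order of the indexed terms.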




[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6977 - Failure

2011-04-11 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6977/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.CloudStateUpdateTest.testCoreRegistration

Error Message:
null

Stack Trace:
org.apache.solr.common.cloud.ZooKeeperException: 
at org.apache.solr.cloud.ZkController.init(ZkController.java:301)
at org.apache.solr.cloud.ZkController.init(ZkController.java:133)
at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:164)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:333)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:242)
at org.apache.solr.cloud.CloudStateUpdateTest.testCoreRegistration(CloudStateUpdateTest.java:216)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1232)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1160)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /live_nodes/lucene.zones.apache.org:8983_solr
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:347)
at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:308)
at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:290)
at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:260)
at org.apache.solr.cloud.ZkController.createEphemeralLiveNode(ZkController.java:372)
at org.apache.solr.cloud.ZkController.init(ZkController.java:285)




Build Log (for compile errors):
[...truncated 8844 lines...]






[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6981 - Failure

2011-04-11 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6981/

1 tests failed.
REGRESSION:  org.apache.lucene.index.TestIndexReaderReopen.testThreadSafety

Error Message:
Error occurred in thread Thread-70: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/test/1/test3311650109925518976tmp/_c_2.pyl (Too many open files in system)

Stack Trace:
junit.framework.AssertionFailedError: Error occurred in thread Thread-70: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/test/1/test3311650109925518976tmp/_c_2.pyl (Too many open files in system)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1232)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1160)
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/test/1/test3311650109925518976tmp/_c_2.pyl (Too many open files in system)
at org.apache.lucene.index.TestIndexReaderReopen.testThreadSafety(TestIndexReaderReopen.java:833)




Build Log (for compile errors):
[...truncated 3216 lines...]






[jira] [Updated] (LUCENE-3017) FST should differentiate between final vs non-final stop nodes

2011-04-11 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3017:
---

Attachment: LUCENE-3017.patch

Patch.

 FST should differentiate between final vs non-final stop nodes
 --

 Key: LUCENE-3017
 URL: https://issues.apache.org/jira/browse/LUCENE-3017
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3017.patch


 I'm breaking out this one improvement from LUCENE-2948...
 Currently, if a node has no outgoing edges (a stop node) the FST
 forcefully marks this as a final node, but it need not do this.  I.e.,
 whether that node is final or not should be orthogonal to whether it
 has arcs leaving or not.
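
 To make the distinction concrete, here is a toy automaton in which "is final" and "has no outgoing arcs" are independent properties. It is only a sketch under simplifying assumptions (no outputs, one char per arc); `TinyAutomaton` and its members are hypothetical names and are not Lucene's FST API.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy automaton, for illustration only; not Lucene's FST API. */
final class TinyAutomaton {

  static final class Node {
    final Map<Character, Node> arcs = new HashMap<>();
    // Stored explicitly instead of being derived from arcs.isEmpty().
    boolean isFinal;

    /** A stop node is a node with no outgoing arcs. */
    boolean isStop() {
      return arcs.isEmpty();
    }
  }

  /** Accepts the input only if the walk ends on a node marked final. */
  static boolean accepts(Node root, String input) {
    Node node = root;
    for (int i = 0; i < input.length(); i++) {
      node = node.arcs.get(input.charAt(i));
      if (node == null) {
        return false; // no arc for this character
      }
    }
    return node.isFinal;
  }
}
```

 In this model a stop node that is not final is simply a dead end: the walk can reach it, but no input is accepted there.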




[jira] [Created] (LUCENE-3017) FST should differentiate between final vs non-final stop nodes

2011-04-11 Thread Michael McCandless (JIRA)
FST should differentiate between final vs non-final stop nodes
--

 Key: LUCENE-3017
 URL: https://issues.apache.org/jira/browse/LUCENE-3017
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0
 Attachments: LUCENE-3017.patch

I'm breaking out this one improvement from LUCENE-2948...

Currently, if a node has no outgoing edges (a stop node) the FST
forcefully marks this as a final node, but it need not do this.  I.e.,
whether that node is final or not should be orthogonal to whether it
has arcs leaving or not.





[jira] [Commented] (LUCENE-3017) FST should differentiate between final vs non-final stop nodes

2011-04-11 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018282#comment-13018282
 ] 

Dawid Weiss commented on LUCENE-3017:
-

Ehm... an automaton with zero-arc nodes that are not final is no longer an 
automaton, but a graph of some sort... I mean -- what is the interpretation of 
an empty non-final node? There is no sequence in the input that corresponds to 
this path, and it is a prefix of some path in the input that you can't get from 
this automaton, right? 

This is slowly becoming very confusing... the patch looks all right, but I'm 
wondering whether the API overall is still clear.




[jira] [Commented] (LUCENE-2798) Randomize indexed collation key testing

2011-04-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018283#comment-13018283
 ] 

Robert Muir commented on LUCENE-2798:
-

just a glance: 

it may be the use of _TestUtil.randomUnicodeString here.
it is not just avoiding supplementaries, but also avoiding things like U+

bottom line: there are serious bugs in this stuff, and I think even my current 
testThreadSafe is not completely avoiding them (I seem to trigger an OOM from 
the JRE impl every few days).

I've thought about @Ignore'ing our current testThreadSafe for this reason... I 
don't like dancing around known bugs in a test like this; it makes the test 
stupid. Somehow this stuff needs to get fixed in ICU/OpenJDK.





[jira] [Commented] (LUCENE-2939) Highlighter should try and use maxDocCharsToAnalyze in WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as when using CachingTokenStream

2011-04-11 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018319#comment-13018319
 ] 

Grant Ingersoll commented on LUCENE-2939:
-

Mark,

Seems like we can move forward with this now that the release is out.  Do you 
have time or do you want me to take it?

 Highlighter should try and use maxDocCharsToAnalyze in 
 WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as 
 when using CachingTokenStream
 

 Key: LUCENE-2939
 URL: https://issues.apache.org/jira/browse/LUCENE-2939
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/highlighter
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 3.1.1, 3.2, 4.0

 Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch, 
 LUCENE-2939.patch


 huge documents can be drastically slower than need be because the entire 
 field is added to the memory index
 this cost can be greatly reduced in many cases if we try and respect 
 maxDocCharsToAnalyze
 things can be improved even further by respecting this setting with 
 CachingTokenStream




[jira] [Commented] (SOLR-2276) Support for cologne phonetic

2011-04-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018326#comment-13018326
 ] 

Robert Muir commented on SOLR-2276:
---

Hi Marc, thanks for notifying us of the Commons Codec release.

Uwe: I agree with your suggestions. I just want to mention that I think we 
should still keep the registry map, because, for example, Caverphone is 
deprecated in the 1.5 release (and there are Caverphone1 and Caverphone2). So 
the map is useful for us to shield our users from changes (we can map 
Caverphone to Caverphone2 or whichever one is equivalent). Even if this one is 
removed in Commons Codec 1.6, this string value can still work.






[jira] [Commented] (SOLR-2276) Support for cologne phonetic

2011-04-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018330#comment-13018330
 ] 

Uwe Schindler commented on SOLR-2276:
-

One addition about the registry: the registry is not synchronized, but it is 
now lazily updated as soon as a new encoder class occurs, so registry.put() 
could be called from different threads and break the HashMap.
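
One possible way to address this, sketched under assumptions and not taken from the patch: back the registry with a `ConcurrentHashMap` and do the lazy registration through `computeIfAbsent`, so that concurrent lookups cannot corrupt the map. The class `EncoderRegistry`, its methods, and the upper-casing of keys are hypothetical; JDK classes stand in for the Commons Codec encoders so that the sketch runs on its own.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry, for illustration only. */
final class EncoderRegistry {

  // ConcurrentHashMap tolerates concurrent put/get, unlike a plain HashMap.
  private final Map<String, Class<?>> registry = new ConcurrentHashMap<>();

  /** Registers an alias; keys are upper-cased so lookups ignore case. */
  void register(String alias, Class<?> clazz) {
    registry.put(alias.toUpperCase(Locale.ROOT), clazz);
  }

  /**
   * Looks up a registered alias first; if there is none, loads the class
   * reflectively (case sensitive) and caches it atomically.
   */
  Class<?> lookup(String name) {
    return registry.computeIfAbsent(name.toUpperCase(Locale.ROOT), key -> {
      try {
        return Class.forName(name);
      } catch (ClassNotFoundException e) {
        throw new IllegalArgumentException("Unknown encoder: " + name, e);
      }
    });
  }
}
```

An alias such as "Caverphone" could be registered once and pointed at whichever implementation is current, while names that are not registered still resolve through reflection on first use.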




[jira] [Created] (LUCENE-3018) Lucene Native Directory implementation need automated build

2011-04-11 Thread Simon Willnauer (JIRA)
Lucene Native Directory implementation need automated build
---

 Key: LUCENE-3018
 URL: https://issues.apache.org/jira/browse/LUCENE-3018
 Project: Lucene - Java
  Issue Type: Wish
  Components: Build
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
 Fix For: 4.0


Currently the native directory impl in contrib/misc requires manual action to 
compile the C code, which is (partially) documented in 
 
https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html

It would be nice if we had an ant task, plus documentation for all platforms 
on how to compile the code and set up the prerequisites.




Fwd: Hadoop patch builds for other Projects

2011-04-11 Thread Grant Ingersoll
FYI

Begin forwarded message:

 From: Nigel Daley nda...@mac.com
 Date: April 10, 2011 11:20:41 AM EDT
 To: bui...@apache.org
 Subject: Re: Hadoop patch builds for other Projects
 Reply-To: bui...@apache.org
 
 Hey Grant.  Sorry for the late reply.
 
 I revamped the precommit testing in the fall so that it doesn't use Jira 
 email anymore to trigger a build.  The process is controlled by
 https://builds.apache.org/hudson/job/PreCommit-Admin/
 which has some documentation up at the top of the job.  You can look at the 
 config of the job (do you have access?) to see what it's doing.  Any project 
 could use this same admin job -- you just need to ask me to add the project 
 to the Jira filter used by the admin job 
 (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100
  ) once you have the downstream job(s) setup for your specific project.  For 
 Hadoop we have 3 downstream builds configured which also have some 
 documentation:
 https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/
 https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/
 https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/
 
 Let me know if you have questions or can't see these job configs.
 
 Cheers,
 Nige
 
 On Mar 30, 2011, at 8:37 AM, Grant Ingersoll wrote:
 
 Over in Lucene, we are interested in setting up a patch testing framework 
 similar to what Hadoop does.  That is, when a new patch comes in, we would 
 like to apply it to the trunk, test it, check whether it meets our 
 requirements, and then post a comment on the JIRA issue giving it a 
 preliminary vote.
 
 Does anyone know what the process is for setting this up?  Is there a wiki 
 or other instructions for it anywhere?  Or does, perhaps, Jenkins have a 
 plugin that supports this kind of thing?  As I recall from talking w/ Nigel 
 about this before, it involves a fair amount of scripting and some mail 
 processing work.
 
 Thanks,
 Grant
 




[jira] [Commented] (LUCENE-544) MultiFieldQueryParser field boost multiplier

2011-04-11 Thread Rene Scheibe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018332#comment-13018332
 ] 

Rene Scheibe commented on LUCENE-544:
-

There is a problem with parsing the query one* two*.

Using the tests from the patch, I would expect:

(b:one*^5.0 t:one*^10.0) (b:two*^5.0 t:two*^10.0)

But I get:

(b:one* t:one*) (b:two* t:two*)

 MultiFieldQueryParser field boost multiplier
 

 Key: LUCENE-544
 URL: https://issues.apache.org/jira/browse/LUCENE-544
 Project: Lucene - Java
  Issue Type: Improvement
  Components: QueryParser
Reporter: Karl Wettin
Priority: Trivial
 Attachments: MultiFieldQueryParser.java, MultiFieldQueryParser.java, 
 MultiFieldQueryParser.java.diff, MultiFieldQueryParser.java.diff, 
 QueryParserPatch


 Allows specific boosting per field, e.g. +(name:foo^1 description:foo^0.1).
 Went from String[] field to MultiFieldQueryParser.FieldSetting[] field in 
 constructor. 



[jira] [Commented] (LUCENE-3018) Lucene Native Directory implementation need automated build

2011-04-11 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018335#comment-13018335
 ] 

Varun Thacker commented on LUCENE-3018:
---

I'll take up this task, as it would help me understand the community process 
involved in submitting a patch. 
I can use this experience to patch LUCENE-2793 and LUCENE-2795 over the summer.



[jira] [Commented] (LUCENE-2798) Randomize indexed collation key testing

2011-04-11 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018353#comment-13018353
 ] 

Steven Rowe commented on LUCENE-2798:
-

bq. it may be the use of _TestUtil.randomUnicodeString here.

It may, but the first above-listed seed produces this mismatch (strings are 
printed out as arrays of codepoints):

{noformat}
java.lang.AssertionError: ---
Indexed string #45: [141]
 Sorted string #45: [141]
---
Indexed string #46: [32]
 Sorted string #46: [28, 777]
---
Indexed string #47: [28, 777]
 Sorted string #47: [32]

Collator strength: SECONDARY  Collator decomposition: CANONICAL_DECOMPOSITION
{noformat}

#46 and #47 include neither supplementary chars nor problematic BMP chars.

I wrote a test including just [32] and [28,777] as indexed strings, and the 
same mismatch occurs for random locales, regardless of collator decomposition, 
and for all collator strengths except PRIMARY.
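One way to narrow this down is to check, for a single pair of strings, whether Collator.compare and the byte order of the two collation keys agree. The sketch below is only an illustration and is not taken from the attached patch: the class and method names are made up, the locale is an arbitrary choice, and it assumes the indexed keys end up ordered as unsigned bytes, most significant byte first.

```java
import java.text.Collator;
import java.util.Locale;

public class CollationOrderCheck {
  /** Builds a String from code points, matching how the report prints strings. */
  public static String fromCodePoints(int... codePoints) {
    return new String(codePoints, 0, codePoints.length);
  }

  /** Compares two byte arrays as unsigned bytes, most significant byte first. */
  public static int compareUnsigned(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  /** True if Collator.compare and the collation key byte order agree in sign. */
  public static boolean ordersAgree(Collator collator, String s1, String s2) {
    int direct = Integer.signum(collator.compare(s1, s2));
    int viaKeys = Integer.signum(compareUnsigned(
        collator.getCollationKey(s1).toByteArray(),
        collator.getCollationKey(s2).toByteArray()));
    return direct == viaKeys;
  }

  public static void main(String[] args) {
    // Locale, strength and decomposition here are assumptions for the demo.
    Collator collator = Collator.getInstance(Locale.ENGLISH);
    collator.setStrength(Collator.SECONDARY);
    collator.setDecomposition(Collator.NO_DECOMPOSITION);
    String a = fromCodePoints(32);       // [32]
    String b = fromCodePoints(28, 777);  // [28, 777]
    System.out.println("orders agree: " + ordersAgree(collator, a, b));
  }
}
```

If the two orders disagree for this pair, the problem would be reproducible without Lucene in the picture at all.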


 Randomize indexed collation key testing
 ---

 Key: LUCENE-2798
 URL: https://issues.apache.org/jira/browse/LUCENE-2798
 Project: Lucene - Java
  Issue Type: Test
  Components: Analysis
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-2798.patch


 Robert Muir noted on #lucene IRC channel today that Lucene's indexed 
 collation key testing is currently fragile (for example, they had to be 
 revisited when Robert upgraded the ICU dependency in LUCENE-2797 because of 
 Unicode 6.0 collation changes) and coverage is trivial (only 5 locales 
 tested, and no collator options are exercised).  This affects both the JDK 
 implementation in {{modules/analysis/common/}} and the ICU implementation 
 under {{modules/icu/}}.
 The key thing to test is that the order of the indexed terms is the same as 
 that provided by the Collator itself.  Instead of the current set of static 
 tests, this could be achieved via indexing randomly generated terms' 
 collation keys (and collator options) and then comparing the index terms' 
 order to the order provided by the Collator over the original terms.
 Since different terms may produce the same collation key, however, the order 
 of indexed terms is inherently unstable.  When performing runtime collation, 
 the Collator addresses the sort stability issue by adding a secondary sort 
 over the normalized original terms.  In order to directly compare Collator's 
 sort with Lucene's collation key sort, a secondary sort will need to be 
 applied to Lucene's indexed terms as well. Robert has suggested indexing the 
 original terms in addition to their collation keys, then using a Sort over 
 the original terms as the secondary sort.
 Another complication: Lucene 3.X uses Java's UTF-16 term comparison, and 
 trunk uses UTF-8 order, so the implemented secondary sort will need to 
 respect that.
 From #lucene:
 {quote}
 rmuir__: so i think we have to on 3.x, sort the 'expected list' with 
 Collator.compare, if thats equal, then as a tiebreak use String.compareTo
 rmuir__: and in the index sort on the collated field, followed by the 
 original term
 rmuir__: in 4.x we do the same thing, but dont use String.compareTo as the 
 tiebreak for the expected list
 rmuir__: instead compare codepoints (iterating character.codepointAt, or 
 comparing .getBytes(UTF-8))
 {quote}
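 The tie-break described in the IRC excerpt above could be sketched roughly as follows. This is an illustration under assumptions, not code from the patch: the class and method names are hypothetical, and the code point comparison stands in for UTF-8 byte order (the two orders coincide for valid strings).

```java
import java.text.Collator;
import java.util.Comparator;

public class ExpectedOrder {
  /** 3.x style: Collator order first, then String.compareTo (UTF-16 code unit order). */
  public static Comparator<String> utf16TieBreak(final Collator collator) {
    return (a, b) -> {
      int c = collator.compare(a, b);
      return c != 0 ? c : a.compareTo(b);
    };
  }

  /** Trunk style: Collator order first, then code point order. */
  public static Comparator<String> codePointTieBreak(final Collator collator) {
    return (a, b) -> {
      int c = collator.compare(a, b);
      return c != 0 ? c : compareCodePoints(a, b);
    };
  }

  /** Compares two strings code point by code point. */
  public static int compareCodePoints(String a, String b) {
    int i = 0;
    int j = 0;
    while (i < a.length() && j < b.length()) {
      int cpA = a.codePointAt(i);
      int cpB = b.codePointAt(j);
      if (cpA != cpB) {
        return cpA < cpB ? -1 : 1;
      }
      i += Character.charCount(cpA);
      j += Character.charCount(cpB);
    }
    // Whichever string still has input left sorts after the other.
    return (a.length() - i) - (b.length() - j);
  }
}
```

 The two tie-breaks only differ when supplementary characters are involved: U+FFFF sorts after a surrogate pair in UTF-16 code unit order, but before it in code point order.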




[jira] [Commented] (LUCENE-2798) Randomize indexed collation key testing

2011-04-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018365#comment-13018365
 ] 

Robert Muir commented on LUCENE-2798:
-

{quote}
I wrote a test including just [32] and [28,777] as indexed strings, and the 
same mismatch occurs for random locales, regardless of collator decomposition, 
and for all collator strengths except PRIMARY.
{quote}

Without looking too hard (are these hex values?): in your debugging it would be 
useful to print the sort key as well. Are the sort keys the same?

But FYI, the bugs I found in collation somehow corrupted the internal state of 
RuleBasedCollator, so the exact strings you are looking at might simply be a 
symptom.





[Lucene.Net] Request help

2011-04-11 Thread Rafael Bueno
Good morning. I wonder if you can help me with a problem. I am working on a 
project that uses version 1.0 of Castle ActiveRecord, and we are migrating to 
version 3.0 RC. That version requires the assembly Lucene.Net, 
Version=2.9.2.2, Culture=neutral, PublicKeyToken=f5940d1699e37ff1, but on the 
Lucene.Net website I can only find the assembly Lucene.Net, Version=2.9.2.2, 
Culture=neutral, PublicKeyToken=null. The dependencies occur in this order: 
Castle.ActiveRecord.dll -> NHibernate.Search.dll -> Lucene.Net.dll. Could you 
tell me where I can download the assembly?

[jira] [Created] (LUCENE-3019) FVH: uncontrollable color tags

2011-04-11 Thread Koji Sekiguchi (JIRA)
FVH: uncontrollable color tags
--

 Key: LUCENE-3019
 URL: https://issues.apache.org/jira/browse/LUCENE-3019
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/highlighter
Affects Versions: 3.1, 3.0.3, 2.9.4, 4.0
Reporter: Koji Sekiguchi
Priority: Trivial
 Fix For: 3.2, 4.0


Multi-colored tags are a feature of FVH, but which color is used for each term 
is uncontrollable (or, more precisely, unexpected by users).




[jira] [Commented] (LUCENE-2798) Randomize indexed collation key testing

2011-04-11 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018374#comment-13018374
 ] 

Steven Rowe commented on LUCENE-2798:
-

bq. Without looking too hard (are these hex values?) 

No, it's just the output from Arrays.toString(int[]), which outputs decimal.

bq. in your debugging it would be useful to print the sort key as well.

Agreed. Here's the output:

{quote}
java.lang.AssertionError: ---
Indexed string #0: [32]
Indexed collation key: [0, 0, 0, 119, 0, 0]
 Sorted string #0: [28, 777]
Sorted collation key: [0, 0, 0, -101, 0, 0]
---
Indexed string #1: [28, 777]
Indexed collation key: [0, 0, 0, -101, 0, 0]
 Sorted string #1: [32]
Sorted collation key: [0, 0, 0, 119, 0, 0]

Collator strength: SECONDARY  Collator decomposition: NO_DECOMPOSITION
{quote}

(again with the Arrays.toString() for the byte array from the collation keys - 
obviously not ideal in that they're first converted to signed integers...)
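For reading key dumps like the one above, the bytes can be masked to unsigned values before printing. A minimal sketch, with a made-up class name and helper that are not part of the patch:

```java
public class UnsignedBytes {
  /** Formats a byte array like Arrays.toString, but with each byte shown as 0..255. */
  public static String toUnsignedString(byte[] bytes) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < bytes.length; i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(bytes[i] & 0xFF); // mask away the sign extension
    }
    return sb.append(']').toString();
  }

  public static void main(String[] args) {
    byte[] key = {0, 0, 0, -101, 0, 0};
    System.out.println(toUnsignedString(key)); // prints [0, 0, 0, 155, 0, 0]
  }
}
```

Read this way, the key printed as -101 is 155, which is greater than 119 when the bytes are compared as unsigned values.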

bq. Are the sort keys the same?

No.




Re: [Lucene.Net] Request help

2011-04-11 Thread Robert Jordan

On 11.04.2011 15:18, Rafael Bueno wrote:

Good morning. I wonder if you can help me with a problem. Working on
a project that uses version 1.0 of Castle ActiveRecords and we are
migrating to version 3.0 RC, but this version requires the assembly
Lucene.Net, Version = 2.9.2.2, Culture = neutral, PublicKeyToken =
f5940d1699e37ff1, but the website Lucene.NET's just found the
assembly Lucene.Net, Version = 2.9.2.2, Culture = neutral,
PublicKeyToken = null. The dependencies occur in that order
Castle.ActiveRecord.dll -  NHibernate.Search.dll -  Lucene.
Net.dll. await contact with the place where I can download the
assembly.


Lucene.Net was never distributed with a strong name (PublicKeyToken
!= null). This means that you have to ask the NHibernate devs
where to obtain this special Lucene.Net version, because no one
else would be able to provide it.

Robert



[jira] [Updated] (LUCENE-3019) FVH: uncontrollable color tags

2011-04-11 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated LUCENE-3019:
---

Attachment: LUCENE-3019.patch

The patch. It fixes the problem when usePhraseHighlighter=true.

When the flag is false and FVH works on an N-gram field, quite a few terms may 
be created in the tree, which makes the colors uncontrollable.

But I think using usePhraseHighlighter=false with an N-gram field is rare, so 
the attached patch should be enough.




[jira] [Commented] (LUCENE-2798) Randomize indexed collation key testing

2011-04-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018383#comment-13018383
 ] 

Robert Muir commented on LUCENE-2798:
-

also i don't see any check that preflex codec isn't in use for this test?






[jira] [Commented] (LUCENE-2798) Randomize indexed collation key testing

2011-04-11 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018386#comment-13018386
 ] 

Steven Rowe commented on LUCENE-2798:
-

bq. also i don't see any check that preflex codec isn't in use for this test?

{{TestCollationKeyAnalyzer.setUp()}} handles it:
{code:java}
  @Override
  public void setUp() throws Exception {
    super.setUp();
    assumeFalse("preflex format only supports UTF-8 encoded bytes",
        "PreFlex".equals(CodecProvider.getDefault().getDefaultFieldCodec()));
  }
{code}

And in practice, the test gets skipped 25% of the time as a result of this.





[jira] [Updated] (LUCENE-2798) Randomize indexed collation key testing

2011-04-11 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe updated LUCENE-2798:


Attachment: LUCENE-2798.patch

Added two-term collation sort test; added collation key debug printing.




[jira] [Created] (SOLR-2464) potential slowness in QueryValueSource

2011-04-11 Thread Yonik Seeley (JIRA)
potential slowness in QueryValueSource
--

 Key: SOLR-2464
 URL: https://issues.apache.org/jira/browse/SOLR-2464
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.1
Reporter: Yonik Seeley
Priority: Minor


If the scorer returns null for a segment in QueryValueSource, we'll attempt to 
create a new scorer each time we're consulted about a doc in that segment.
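The general shape of a remedy is to remember that creation was already attempted for the segment, even when the result was null. The sketch below is a generic illustration of that idea, not the committed fix: LazyScorerHolder and its Supplier-based factory are hypothetical stand-ins for whatever QueryValueSource actually uses to build a scorer.

```java
import java.util.function.Supplier;

public class LazyScorerHolder<S> {
  private final Supplier<S> factory; // hypothetical stand-in for per-segment scorer creation
  private S scorer;                  // may legitimately stay null for a segment
  private boolean attempted;         // true once creation has been tried
  private int creationAttempts;

  public LazyScorerHolder(Supplier<S> factory) {
    this.factory = factory;
  }

  /** Returns the scorer, or null if the segment has none; creation runs at most once. */
  public S get() {
    if (!attempted) {
      attempted = true;
      creationAttempts++;
      scorer = factory.get();
    }
    return scorer;
  }

  /** Number of times the factory was invoked (for illustration only). */
  public int creationAttempts() {
    return creationAttempts;
  }
}
```

A null check alone (scorer == null) cannot distinguish "not created yet" from "created, but there is nothing to score", which is why the sketch keeps a separate flag.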




[jira] [Assigned] (SOLR-2464) potential slowness in QueryValueSource

2011-04-11 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley reassigned SOLR-2464:
--

Assignee: Yonik Seeley




[jira] [Assigned] (LUCENE-3018) Lucene Native Directory implementation need automated build

2011-04-11 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer reassigned LUCENE-3018:
---

Assignee: Varun Thacker

I added you to the contributors list. You can now assign issues to yourself.




[jira] [Resolved] (SOLR-2464) potential slowness in QueryValueSource

2011-04-11 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-2464.


Resolution: Fixed

Committed a fix for 3x.  For trunk, the fix is part of SOLR-2443.




[jira] [Commented] (LUCENE-3017) FST should differentiate between final vs non-final stop nodes

2011-04-11 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018401#comment-13018401
 ] 

Michael McCandless commented on LUCENE-3017:


Well... for LUCENE-2948, I need this to handle term prefixes that are
in the terms index and are also valid terms.

For example, I could have term "foo", a prefix of many other terms
("foobar", "foobaz", etc.), and so the path f-o-o is in the terms index
(pointing to a block that has all these other terms), i.e. ending on a
zero-arc node.

If that ending zero-arc node is final, I know "foo" is a valid term
and I must seek to the block to load it, but if it's not final, I know
it cannot exist in the index, and I can fail fast (return NOT_FOUND
from seek("foo")).
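To make the fail-fast case concrete, here is a toy sketch using a plain character trie rather than Lucene's FST API. Everything in it (PrefixIndexSketch, SeekStatus, addPath, LOAD_BLOCK) is hypothetical and only models the decision described above.

```java
import java.util.HashMap;
import java.util.Map;

public class PrefixIndexSketch {
  public enum SeekStatus { LOAD_BLOCK, NOT_FOUND }

  private static final class Node {
    final Map<Character, Node> arcs = new HashMap<>();
    boolean isFinal; // independent of whether the node has outgoing arcs
  }

  private final Node root = new Node();

  /** Adds an index path; markFinal says whether the path itself is a valid term. */
  public void addPath(String path, boolean markFinal) {
    Node node = root;
    for (char c : path.toCharArray()) {
      node = node.arcs.computeIfAbsent(c, k -> new Node());
    }
    node.isFinal = markFinal;
  }

  public SeekStatus seek(String term) {
    Node node = root;
    for (int i = 0; i < term.length(); i++) {
      if (node != root && node.arcs.isEmpty()) {
        // Stop node reached with input left over: the term can only be in the block.
        return SeekStatus.LOAD_BLOCK;
      }
      node = node.arcs.get(term.charAt(i));
      if (node == null) {
        return SeekStatus.NOT_FOUND;
      }
    }
    // Input exhausted: only a final node means the term itself may exist.
    return node.isFinal ? SeekStatus.LOAD_BLOCK : SeekStatus.NOT_FOUND;
  }
}
```

With the path f-o-o added as non-final, seek("foo") can return NOT_FOUND without touching the block, while seek("foobar") still has to load it.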


 FST should differentiate between final vs non-final stop nodes
 --

 Key: LUCENE-3017
 URL: https://issues.apache.org/jira/browse/LUCENE-3017
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3017.patch


 I'm breaking out this one improvement from LUCENE-2948...
 Currently, if a node has no outgoing edges (a stop node) the FST
 forcefully marks this as a final node, but it need not do this.  Ie,
 whether that node is final or not should be orthogonal to whether it
 has arcs leaving or not.




[jira] [Commented] (SOLR-1307) Provide a standard way to reload plugins

2011-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13018417#comment-13018417
 ] 

Jan Høydahl commented on SOLR-1307:
---

Monitoring files is not enough. It must also work for SolrCloud/ZK. A hook for 
coreReloaded() and one for configChanged() sound reasonable.
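As a rough idea of what such hooks might look like, here is a hypothetical sketch. None of these types exist in Solr: ReloadListener, ReloadNotifier and CountingPlugin are invented for illustration, and only the two method names come from the suggestion above.

```java
import java.util.ArrayList;
import java.util.List;

public class ReloadHooksSketch {
  /** Hypothetical callback interface; not an existing Solr API. */
  public interface ReloadListener {
    void coreReloaded();
    void configChanged(String resourceName);
  }

  /** Hypothetical dispatcher that a file watcher or a ZK watcher could both drive. */
  public static final class ReloadNotifier {
    private final List<ReloadListener> listeners = new ArrayList<>();

    public void register(ReloadListener listener) {
      listeners.add(listener);
    }

    public void fireCoreReloaded() {
      for (ReloadListener l : listeners) {
        l.coreReloaded();
      }
    }

    public void fireConfigChanged(String resourceName) {
      for (ReloadListener l : listeners) {
        l.configChanged(resourceName);
      }
    }
  }

  /** Example plugin that just records what it was told. */
  public static final class CountingPlugin implements ReloadListener {
    public int reloads;
    public final List<String> changedResources = new ArrayList<>();

    @Override
    public void coreReloaded() {
      reloads++;
    }

    @Override
    public void configChanged(String resourceName) {
      changedResources.add(resourceName);
    }
  }
}
```

Keeping the event source behind the notifier is the point: the plugin does not need to know whether the change came from a local file or from ZooKeeper.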

 Provide a standard way to reload plugins
 

 Key: SOLR-1307
 URL: https://issues.apache.org/jira/browse/SOLR-1307
 Project: Solr
  Issue Type: New Feature
  Components: search, update
Reporter: Shalin Shekhar Mangar
 Fix For: Next


 Currently, Solr plugins have no standard way to reload themselves. Each 
 plugin invents its own mechanism e.g. SpellCheckComponent. For others, even 
 small changes to configuration files are visible only after a core reload. 
 Examples include changing elevate.xml, stopwords.txt etc.
 We should provide a standard way for plugins to reload themselves on events 
 relevant to them.




RE: [Lucene.Net] Board Report for March

2011-04-11 Thread Scott Lombard
I copied last month's report and updated it.  Please review my changes and
revise as necessary.

Thanks
Scott


 -Original Message-
 From: Stefan Bodewig [mailto:bode...@apache.org]
 Sent: Monday, April 11, 2011 8:31 AM
 To: lucene-net-...@lucene.apache.org
 Subject: [Lucene.Net] Board Report for March
 
 Hi all,
 
 a quick heads up, since I can't find any content so far.  Your board
 report input is needed by Wednesday in
 http://wiki.apache.org/incubator/April2011
 
 Thanks
 
 Stefan



[jira] [Created] (LUCENE-3020) better payload testing with mockanalyzer

2011-04-11 Thread Robert Muir (JIRA)
better payload testing with mockanalyzer


 Key: LUCENE-3020
 URL: https://issues.apache.org/jira/browse/LUCENE-3020
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir
 Fix For: 3.2, 4.0
 Attachments: LUCENE-3020.patch

MockAnalyzer currently always indexes some fixed-length payloads.

Instead it should decide for each field randomly (and remember it for that 
field):
* if the field should index no payloads at all
* field should index fixed length payloads
* field should index variable length payloads.
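
The per-field decision described above could be sketched roughly as follows. This is not the code from LUCENE-3020.patch; PayloadMode and PerFieldPayloadChooser are hypothetical names used only to show the idea of "pick randomly once, then remember".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical names; this is not the code from LUCENE-3020.patch.
enum PayloadMode { NONE, FIXED_LENGTH, VARIABLE_LENGTH }

class PerFieldPayloadChooser {
    private final Random random;
    private final Map<String, PayloadMode> chosen = new HashMap<>();

    PerFieldPayloadChooser(Random random) { this.random = random; }

    // Picks a mode at random the first time a field is seen, then
    // returns the same mode for every later call with that field.
    synchronized PayloadMode modeFor(String fieldName) {
        PayloadMode mode = chosen.get(fieldName);
        if (mode == null) {
            PayloadMode[] all = PayloadMode.values();
            mode = all[random.nextInt(all.length)];
            chosen.put(fieldName, mode);
        }
        return mode;
    }
}
```

Passing the Random in from outside keeps a failing run reproducible from its seed, which matters for randomized tests.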





[jira] [Updated] (LUCENE-3020) better payload testing with mockanalyzer

2011-04-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3020:


Attachment: LUCENE-3020.patch

 better payload testing with mockanalyzer
 

 Key: LUCENE-3020
 URL: https://issues.apache.org/jira/browse/LUCENE-3020
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir
 Fix For: 3.2, 4.0

 Attachments: LUCENE-3020.patch


 MockAnalyzer currently always indexes some fixed-length payloads.
 Instead it should decide for each field randomly (and remember it for that 
 field):
 * if the field should index no payloads at all
 * field should index fixed length payloads
 * field should index variable length payloads.




Re: Images on Wiki kaput

2011-04-11 Thread Jan Høydahl
All attachments on the wiki are disabled, I think for security reasons?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 11. apr. 2011, at 18.39, Otis Gospodnetic wrote:

> Hi,
>
> I was looking at
> http://wiki.apache.org/solr/SolrReplication#Admin_Page_for_Replication the
> other day and noticed the screenshots are all missing/broken.  I think I saw
> some other Wiki pages with the same problem.  Has something changed about the
> Wiki that caused this?  I think not all images on the Wiki are broken, but
> some are.
>
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
 
 
 





[jira] [Commented] (LUCENE-3020) better payload testing with mockanalyzer

2011-04-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018444#comment-13018444
 ] 

Robert Muir commented on LUCENE-3020:
-

I committed this (revision 1091132). I want Hudson to chug away on this; if 
there are lots of false failures or problems, I can revert.

 better payload testing with mockanalyzer
 

 Key: LUCENE-3020
 URL: https://issues.apache.org/jira/browse/LUCENE-3020
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir
 Fix For: 3.2, 4.0

 Attachments: LUCENE-3020.patch


 MockAnalyzer currently always indexes some fixed-length payloads.
 Instead it should decide for each field randomly (and remember it for that 
 field):
 * if the field should index no payloads at all
 * field should index fixed length payloads
 * field should index variable length payloads.




[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6994 - Failure

2011-04-11 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6994/

11 tests failed.
REGRESSION:  
org.apache.lucene.benchmark.byTask.TestPerfTasksLogic.testLineDocFile

Error Message:
Error: cannot init PerfRunData!

Stack Trace:
java.lang.Exception: Error: cannot init PerfRunData!
at 
org.apache.lucene.benchmark.byTask.Benchmark.init(Benchmark.java:56)
at 
org.apache.lucene.benchmark.BenchmarkTestCase.execBenchmark(BenchmarkTestCase.java:66)
at 
org.apache.lucene.benchmark.byTask.TestPerfTasksLogic.testLineDocFile(TestPerfTasksLogic.java:424)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1232)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1160)
Caused by: java.lang.InstantiationException: 
org.apache.lucene.analysis.MockAnalyzer
at java.lang.Class.newInstance0(Class.java:357)
at java.lang.Class.newInstance(Class.java:325)
at 
org.apache.lucene.benchmark.byTask.tasks.NewAnalyzerTask.createAnalyzer(NewAnalyzerTask.java:47)
at 
org.apache.lucene.benchmark.byTask.PerfRunData.init(PerfRunData.java:81)
at 
org.apache.lucene.benchmark.byTask.Benchmark.init(Benchmark.java:53)


REGRESSION:  
org.apache.lucene.benchmark.byTask.TestPerfTasksLogic.testReadTokens

Error Message:
Error: cannot init PerfRunData!

Stack Trace:
java.lang.Exception: Error: cannot init PerfRunData!
at 
org.apache.lucene.benchmark.byTask.Benchmark.init(Benchmark.java:56)
at 
org.apache.lucene.benchmark.BenchmarkTestCase.execBenchmark(BenchmarkTestCase.java:66)
at 
org.apache.lucene.benchmark.byTask.TestPerfTasksLogic.testReadTokens(TestPerfTasksLogic.java:463)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1232)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1160)
Caused by: java.lang.InstantiationException: 
org.apache.lucene.analysis.MockAnalyzer
at java.lang.Class.newInstance0(Class.java:357)
at java.lang.Class.newInstance(Class.java:325)
at 
org.apache.lucene.benchmark.byTask.tasks.NewAnalyzerTask.createAnalyzer(NewAnalyzerTask.java:47)
at 
org.apache.lucene.benchmark.byTask.PerfRunData.init(PerfRunData.java:81)
at 
org.apache.lucene.benchmark.byTask.Benchmark.init(Benchmark.java:53)


REGRESSION:  
org.apache.lucene.benchmark.byTask.TestPerfTasksLogic.testShingleAnalyzer

Error Message:
Error creating Analyzer

Stack Trace:
java.lang.RuntimeException: Error creating Analyzer
at 
org.apache.lucene.benchmark.byTask.tasks.NewShingleAnalyzerTask.doLogic(NewShingleAnalyzerTask.java:81)
at 
org.apache.lucene.benchmark.byTask.tasks.PerfTask.runAndMaybeStats(PerfTask.java:143)
at 
org.apache.lucene.benchmark.byTask.tasks.TaskSequence.doSerialTasks(TaskSequence.java:197)
at 
org.apache.lucene.benchmark.byTask.tasks.TaskSequence.doLogic(TaskSequence.java:138)
at 
org.apache.lucene.benchmark.byTask.tasks.PerfTask.runAndMaybeStats(PerfTask.java:143)
at 
org.apache.lucene.benchmark.byTask.utils.Algorithm.execute(Algorithm.java:301)
at 
org.apache.lucene.benchmark.byTask.Benchmark.execute(Benchmark.java:76)
at 
org.apache.lucene.benchmark.BenchmarkTestCase.execBenchmark(BenchmarkTestCase.java:67)
at 
org.apache.lucene.benchmark.byTask.TestPerfTasksLogic.testShingleAnalyzer(TestPerfTasksLogic.java:1025)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1232)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1160)
Caused by: java.lang.InstantiationException: 
org.apache.lucene.analysis.MockAnalyzer
at java.lang.Class.newInstance0(Class.java:357)
at java.lang.Class.newInstance(Class.java:325)
at 
org.apache.lucene.benchmark.byTask.tasks.NewAnalyzerTask.createAnalyzer(NewAnalyzerTask.java:47)
at 
org.apache.lucene.benchmark.byTask.tasks.NewShingleAnalyzerTask.setAnalyzer(NewShingleAnalyzerTask.java:59)
at 
org.apache.lucene.benchmark.byTask.tasks.NewShingleAnalyzerTask.doLogic(NewShingleAnalyzerTask.java:76)


REGRESSION:  
org.apache.lucene.benchmark.byTask.feeds.DocMakerTest.testIndexProperties

Error Message:
org.apache.lucene.analysis.MockAnalyzer

Stack Trace:
java.lang.InstantiationException: org.apache.lucene.analysis.MockAnalyzer
at java.lang.Class.newInstance0(Class.java:357)
at java.lang.Class.newInstance(Class.java:325)
at 
org.apache.lucene.benchmark.byTask.tasks.NewAnalyzerTask.createAnalyzer(NewAnalyzerTask.java:47)
at 
org.apache.lucene.benchmark.byTask.PerfRunData.init(PerfRunData.java:81)
at 
org.apache.lucene.benchmark.byTask.feeds.DocMakerTest.doTestIndexProperties(DocMakerTest.java:82)
at 

[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6995 - Still Failing

2011-04-11 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6995/

1 tests failed.
REGRESSION:  org.apache.lucene.index.TestIndexWriter.testCommitOnCloseDiskUsage

Error Message:
writer used too much space after close: endDiskUsage=694768 startDiskUsage=3730 
max=559500

Stack Trace:
junit.framework.AssertionFailedError: writer used too much space after close: 
endDiskUsage=694768 startDiskUsage=3730 max=559500
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1232)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1160)
at 
org.apache.lucene.index.TestIndexWriter.testCommitOnCloseDiskUsage(TestIndexWriter.java:535)




Build Log (for compile errors):
[...truncated 3155 lines...]






[jira] [Created] (SOLR-2465) QueryElevationComponent should be reloadable w/o commit

2011-04-11 Thread Grant Ingersoll (JIRA)
QueryElevationComponent should be reloadable w/o commit
---

 Key: SOLR-2465
 URL: https://issues.apache.org/jira/browse/SOLR-2465
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor


It would be helpful if you could reload the elevation rules without having to 
do a commit and reload the core.




[jira] [Commented] (SOLR-2465) QueryElevationComponent should be reloadable w/o commit

2011-04-11 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018491#comment-13018491
 ] 

Otis Gospodnetic commented on SOLR-2465:


Related: reload synonyms without having to do a commit and reloading the core 
(n.b. reloading the core doesn't seem to actually reload synonyms.txt currently 
anyway!)
See 
http://search-lucene.com/m/2ExL22IsDPk1/otissubj=Reloading+synonyms+txt+without+downtime

SOLR-1307 seems to be related.


 QueryElevationComponent should be reloadable w/o commit
 ---

 Key: SOLR-2465
 URL: https://issues.apache.org/jira/browse/SOLR-2465
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor

 It would be helpful if you could reload the elevation rules without having to 
 do a commit and reload the core.




termInfosIndexDivisor typo in the Solr-UIMA config?

2011-04-11 Thread Otis Gospodnetic
Hi,

I was looking at term index divisor and spotted this:

.../lucene-solr-3.1$ ffxg -i IndexDivisor
./solr/src/test-files/solr/conf/solrconfig-termindex.xml:<int name="setTermIndexDivisor">12</int>
./solr/src/test-files/solr/conf/solrconfig-xinclude.xml:<int name="setTermIndexDivisor">12</int>
./solr/contrib/uima/src/test/resources/solr-uima/conf/solrconfig.xml:  <!-- To set the termInfosIndexDivisor, do this: -->
./solr/contrib/uima/src/test/resources/solr-uima/conf/solrconfig.xml:    <int name="termInfosIndexDivisor">12</int> </indexReaderFactory >   HERE
./solr/example/solr/conf/solrconfig.xml:  <!-- By explicitly declaring the Factory, the termIndexDivisor can
./solr/example/solr/conf/solrconfig.xml:   <int name="setTermIndexDivisor">12</int>


Is that termInfosIndexDivisor a typo in there?  Should it be 
setTermIndexDivisor like in the other configs?


Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/





[jira] [Updated] (SOLR-2276) Support for cologne phonetic

2011-04-11 Thread Marc Pompl (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marc Pompl updated SOLR-2276:
-

Attachment: (was: ColognePhonetic-patch-with-reflection.txt)

 Support for cologne phonetic
 

 Key: SOLR-2276
 URL: https://issues.apache.org/jira/browse/SOLR-2276
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4.1
 Environment: Apache Commons Codec 1.5
Reporter: Marc Pompl
 Fix For: 4.0

   Original Estimate: 2h
  Remaining Estimate: 2h

 As soon as Apache Commons Codec 1.5 is released, please support the new 
 encoder ColognePhonetic.
 See JIRA for CODEC-106.
 It is fundamental for phonetic searches if you are indexing German names. 
 Other indexers are optimized for English (words).




[jira] [Commented] (SOLR-874) Dismax parser exceptions on trailing OPERATOR

2011-04-11 Thread James Gilliland (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018502#comment-13018502
 ] 

James Gilliland commented on SOLR-874:
--

I don't know if it's directly related to this issue, but I found the same error 
with people searching for "foo AND - AND bar".
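
One way to picture a workaround for queries like these is a pre-processing step that drops operators with a missing operand before the string reaches the parser. The sketch below is only an illustration under that assumption. It is not the fix in the attached SOLR-874 patches, DanglingOperators is a hypothetical helper, and it deliberately ignores quoting, parentheses and field prefixes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Solr or of the SOLR-874 patches.
class DanglingOperators {
    private static boolean isOperator(String t) {
        return t.equals("AND") || t.equals("OR") || t.equals("NOT");
    }

    // Removes bare "+" / "-" tokens, binary operators without a left
    // operand, and any operators left dangling at the end of the query.
    static String strip(String query) {
        List<String> kept = new ArrayList<>();
        for (String t : query.trim().split("\\s+")) {
            if (t.isEmpty() || t.equals("+") || t.equals("-")) continue;
            boolean binary = t.equals("AND") || t.equals("OR");
            boolean noLeftOperand = kept.isEmpty() || isOperator(kept.get(kept.size() - 1));
            if (binary && noLeftOperand) continue;
            kept.add(t);
        }
        // Drop trailing operators, which have no right operand.
        while (!kept.isEmpty() && isOperator(kept.get(kept.size() - 1))) {
            kept.remove(kept.size() - 1);
        }
        return String.join(" ", kept);
    }
}
```

With this sketch, "foo AND - AND bar" becomes "foo AND bar" and "ipod AND" becomes "ipod". Other malformed inputs, such as repeated NOT, are not handled.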

 Dismax parser exceptions on trailing OPERATOR
 -

 Key: SOLR-874
 URL: https://issues.apache.org/jira/browse/SOLR-874
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.3
Reporter: Erik Hatcher
Assignee: Erik Hatcher
 Fix For: Next

 Attachments: SOLR-874-1.3.patch, SOLR-874-1.4.1.patch, SOLR-874.patch


 Dismax is supposed to be immune to parse exceptions, but alas it's not:
 http://localhost:8983/solr/select?defType=dismax&qf=name&q=ipod+AND
 kaboom!
 Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'ipod AND': 
 Encountered "<EOF>" at line 1, column 8.
 Was expecting one of:
 <NOT> ...
 "+" ...
 "-" ...
 "(" ...
 "*" ...
 <QUOTED> ...
 <TERM> ...
 <PREFIXTERM> ...
 <WILDTERM> ...
 "[" ...
 "{" ...
 <NUMBER> ...
 <TERM> ...
 "*" ...
 
   at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:175)
   at 
 org.apache.solr.search.DismaxQParser.parse(DisMaxQParserPlugin.java:138)
   at org.apache.solr.search.QParser.getQuery(QParser.java:88)




[jira] [Updated] (SOLR-2276) Support for cologne phonetic

2011-04-11 Thread Marc Pompl (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marc Pompl updated SOLR-2276:
-

Attachment: ColognePhonetic-patch-with-reflection.txt

Thanks for the useful hints. Java's API always has some nifty news, even for 
experienced developers. ;-)
Code was modified.
Besides, the exception handling is really ugly, in order to fulfill the 
requirement to do reflection twice. 

 Support for cologne phonetic
 

 Key: SOLR-2276
 URL: https://issues.apache.org/jira/browse/SOLR-2276
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4.1
 Environment: Apache Commons Codec 1.5
Reporter: Marc Pompl
 Fix For: 4.0

 Attachments: ColognePhonetic-patch-with-reflection.txt

   Original Estimate: 2h
  Remaining Estimate: 2h

 As soon as Apache Commons Codec 1.5 is released, please support the new 
 encoder ColognePhonetic.
 See JIRA for CODEC-106.
 It is fundamental for phonetic searches if you are indexing German names. 
 Other indexers are optimized for English (words).




[jira] [Commented] (LUCENE-3017) FST should differentiate between final vs non-final stop nodes

2011-04-11 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018511#comment-13018511
 ] 

Dawid Weiss commented on LUCENE-3017:
-

I know. I was just pointing out the fact that it does get fairly complex, but I 
don't have any constructive ideas how to make it simpler, so I'll simply shut 
up now :)

 FST should differentiate between final vs non-final stop nodes
 --

 Key: LUCENE-3017
 URL: https://issues.apache.org/jira/browse/LUCENE-3017
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3017.patch


 I'm breaking out this one improvement from LUCENE-2948...
 Currently, if a node has no outgoing edges (a stop node) the FST
 forcefully marks this as a final node, but it need not do this.  Ie,
 whether that node is final or not should be orthogonal to whether it
 has arcs leaving or not.
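
To make the distinction concrete, here is a toy illustration in which "final" is a flag stored on the node and "stop" is derived from the arc list, so the two can vary independently. ToyNode is a hypothetical class for this example only and is not one of Lucene's FST classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy automaton node; not Lucene's FST implementation.
class ToyNode {
    final List<ToyNode> arcs = new ArrayList<>(); // outgoing edges
    boolean isFinal;                              // set independently of arcs

    // A stop node is simply a node with no outgoing arcs.
    boolean isStop() { return arcs.isEmpty(); }
}
```

In this toy model all four combinations are representable, including a stop node that is not final, which is the case the issue says is currently forced to final.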




[jira] [Issue Comment Edited] (SOLR-1307) Provide a standard way to reload plugins

2011-04-11 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018531#comment-13018531
 ] 

Otis Gospodnetic edited comment on SOLR-1307 at 4/11/11 7:37 PM:
-

+1
Related:
* 
http://search-lucene.com/m/2ExL22IsDPk1/solr-1307subj=Reloading+synonyms+txt+without+downtime
* SOLR-2465

  was (Author: otis):
+1
Related:
http://search-lucene.com/m/2ExL22IsDPk1/solr-1307subj=Reloading+synonyms+txt+without+downtime
SOLR-2465
  
 Provide a standard way to reload plugins
 

 Key: SOLR-1307
 URL: https://issues.apache.org/jira/browse/SOLR-1307
 Project: Solr
  Issue Type: New Feature
  Components: search, update
Reporter: Shalin Shekhar Mangar
 Fix For: Next


 Currently, Solr plugins have no standard way to reload themselves. Each 
 plugin invents its own mechanism e.g. SpellCheckComponent. For others, even 
 small changes to configuration files are visible only after a core reload. 
 Examples include changing elevate.xml, stopwords.txt etc.
 We should provide a standard way for plugins to reload themselves on events 
 relevant to them.




[jira] [Commented] (SOLR-1307) Provide a standard way to reload plugins

2011-04-11 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018531#comment-13018531
 ] 

Otis Gospodnetic commented on SOLR-1307:


+1
Related:
http://search-lucene.com/m/2ExL22IsDPk1/solr-1307subj=Reloading+synonyms+txt+without+downtime
SOLR-2465

 Provide a standard way to reload plugins
 

 Key: SOLR-1307
 URL: https://issues.apache.org/jira/browse/SOLR-1307
 Project: Solr
  Issue Type: New Feature
  Components: search, update
Reporter: Shalin Shekhar Mangar
 Fix For: Next


 Currently, Solr plugins have no standard way to reload themselves. Each 
 plugin invents its own mechanism e.g. SpellCheckComponent. For others, even 
 small changes to configuration files are visible only after a core reload. 
 Examples include changing elevate.xml, stopwords.txt etc.
 We should provide a standard way for plugins to reload themselves on events 
 relevant to them.




[jira] [Updated] (SOLR-2383) Velocity: Generalize range and date facet display

2011-04-11 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-2383:
--

Attachment: SOLR-2383.patch

Updated to trunk

 Velocity: Generalize range and date facet display
 -

 Key: SOLR-2383
 URL: https://issues.apache.org/jira/browse/SOLR-2383
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Reporter: Jan Høydahl
  Labels: facet, range, velocity
 Attachments: SOLR-2383.patch, SOLR-2383.patch, SOLR-2383.patch, 
 SOLR-2383.patch


 The Velocity (/browse) GUI has a hardcoded price range facet and a hardcoded 
 manufacturedate_dt date facet. We need a general solution that works for any 
 facet.range and facet.date.




Commit SOLR-2383 ?

2011-04-11 Thread Jan Høydahl
SOLR-2383 fixes some annoying limitations of Velocity GUI. It's not perfect, 
but better than the current hardcoding. I'd love for someone to test it and 
consider committing.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com





returning raw payloads in Solr

2011-04-11 Thread Peter Wilkins
Hi All:
I have found a wealth of information online about using payloads to influence 
the rank of search results, which seems to be the prevalent use case.

I want to do something a little different.  I want to return the raw payload I 
have associated with the token(s) contained in the search query.  I think this 
is Jira issue LUCENE-1888. I have used the examples I have found online to 
store the payloads in the index, which I can verify using Luke.   What isn't 
apparent is the chain of objects I need to return that data in the result set.

Could someone indicate which objects I need to use to get the results back in 
Solr?  
It appears I need to implement a QParserPlugin and a QParser, but what do I 
need to implement to get my payload results from the QParser into the results?
Do I need a similarity() method if I don't intend to use the payload for 
boosting?  

Any guidance would be greatly appreciated,
Peter



[jira] [Updated] (LUCENE-2956) Support updateDocument() with DWPTs

2011-04-11 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2956:


Attachment: LUCENE-2956.patch

Attaching an initial patch. This patch uses an entirely non-blocking approach to 
deletes based on a specialized linked list that only uses CAS operations.
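
For readers unfamiliar with the technique, the following is a generic sketch of a linked list that is appended to using only compare-and-set (CAS), in the style of the Michael and Scott queue. It is not the data structure from LUCENE-2956.patch. CasAppendQueue and its methods are hypothetical names, and the patch's specialized list may differ in important ways.

```java
import java.util.concurrent.atomic.AtomicReference;

// Generic CAS-based append-only list; not the structure used in the patch.
class CasAppendQueue<T> {
    static final class Node<T> {
        final T item;
        final AtomicReference<Node<T>> next = new AtomicReference<>(null);
        Node(T item) { this.item = item; }
    }

    private final Node<T> head = new Node<>(null); // sentinel, holds no item
    private final AtomicReference<Node<T>> tail = new AtomicReference<>(head);

    // Lock-free append: retry until this thread's CAS links its node.
    void add(T item) {
        Node<T> node = new Node<>(item);
        while (true) {
            Node<T> last = tail.get();
            Node<T> next = last.next.get();
            if (next != null) {
                // Tail is lagging behind; help advance it and retry.
                tail.compareAndSet(last, next);
            } else if (last.next.compareAndSet(null, node)) {
                // Linked successfully; swing tail (a failed CAS is fine, others help).
                tail.compareAndSet(last, node);
                return;
            }
        }
    }

    // Counts items by walking from the sentinel; a snapshot, not atomic.
    int size() {
        int n = 0;
        for (Node<T> cur = head.next.get(); cur != null; cur = cur.next.get()) n++;
        return n;
    }
}
```

No thread ever holds a lock here: a thread that loses a CAS race simply re-reads the tail and tries again, which is what makes the approach non-blocking.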

Since this issue is quite complex, I tried to add as many useful comments as 
possible inline in the patch to make reviewing easier. So for details, check out 
the patch.

All tests on the realtime branch pass with this patch. (Once in a while I have a 
failure in the healthiness test, but the assumptions in that test seem to be too 
strict and I need to fix that separately.)

Reviews are very very much appreciated!



 Support updateDocument() with DWPTs
 ---

 Key: LUCENE-2956
 URL: https://issues.apache.org/jira/browse/LUCENE-2956
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: Realtime Branch
Reporter: Michael Busch
Assignee: Simon Willnauer
Priority: Minor
 Fix For: Realtime Branch

 Attachments: LUCENE-2956.patch


 With separate DocumentsWriterPerThreads (DWPT) it can currently happen that 
 the delete part of an updateDocument() is flushed and committed separately 
 from the corresponding new document.
 We need to make sure that updateDocument() is always an atomic operation from 
 a IW.commit() and IW.getReader() perspective.  See LUCENE-2324 for more 
 details.




Solr 1.4.1 compatible with Lucene 3.0.1?

2011-04-11 Thread RichSimon

Short story: I am using Lucene 3.0.1, and I'm trying to run Solr 1.4.1. I
get an error starting the embedded Solr server that says it cannot find the
method FSDirectory.getDirectory. The release notes for Solr 1.4.1 says it is
compatible with Lucene 2.9.3, and I see that Lucene 3.0.1 does not have the
FSDirectory.getDirectory method any more. Downgrading Lucene to 2.9.x is
not an option for me. What version of Solr should I use for Lucene 3.0.1?
(We're just starting with Solr, so changing that version is not hard.) Or,
do I have to upgrade both Solr and Lucene?

Thanks,

-Rich

Here's the long story:
I am using Lucene 3.0.1, and I'm trying to run Solr 1.4.1. I have not used
any other version of Lucene. We have an existing project using Lucene 3.0.1,
and we want to start using Solr. When I try to initialize an embedded Solr
server, like so:


// Locate solr.xml under the Solr home directory.
String solrHome = PATH_TO_SOLR_HOME;
File home = new File(solrHome);
File solrXML = new File(home, "solr.xml");

// Load the cores defined in solr.xml.
coreContainer = new CoreContainer();
coreContainer.load(solrHome, solrXML);

embeddedSolr = new EmbeddedSolrServer(coreContainer, SOLR_CORE);



I get this error:

[04-08 11:48:39] ERROR CoreContainer [main]: java.lang.NoSuchMethodError: org.apache.lucene.store.FSDirectory.getDirectory(Ljava/lang/String;)Lorg/apache/lucene/store/FSDirectory;
    at org.apache.solr.spelling.AbstractLuceneSpellChecker.initIndex(AbstractLuceneSpellChecker.java:186)
    at org.apache.solr.spelling.AbstractLuceneSpellChecker.init(AbstractLuceneSpellChecker.java:101)
    at org.apache.solr.spelling.IndexBasedSpellChecker.init(IndexBasedSpellChecker.java:56)
    at org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:274)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:508)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:588)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:428)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:278)


Looking at Google posts about this, it seemed that this can be caused by a
version mismatch between the Lucene version in use and the one Solr tries to
use. I noticed a Lucene version tag in the example solrconfig.xml that I’m
modifying:
 
  <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
 
I tried changing it to LUCENE_301, changing it to LUCENE_30, and commenting it
out, but I still get the same error. Using
LucenePackage.get().getImplementationVersion() shows this as the Lucene
version:
  
Lucene version: 3.0.1 912433 - 2010-02-21 23:51:22

I also printed my classpath and found the following lucene jars:
lucene-analyzers-3.0.1.jar
lucene-core-3.0.1.jar
lucene-highlighter-3.0.1.jar
lucene-memory-3.0.1.jar
lucene-misc-2.9.3.jar
lucene-queries-2.9.3.jar
lucene-snowball-2.9.3.jar
lucene-spellchecker-2.9.3.jar

The FSDirectory class is in lucene-core. I decompiled the class file in the
jar, and did not see a getDirectory method. Also, I used a ClassLoader
statement to get an instance of the FSDirectory class my code is using, and
printed out the methods; no getDirectory method.

I gather from the Lucene Javadoc that the getDirectory method is in
FSDirectory for 2.4.0 and for 2.9.0, but is gone in 3.0.1 (the version I'm
using). 
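
The kind of check described above (listing a loaded class's methods to see whether one exists) can be sketched with plain reflection. MethodProbe is a hypothetical helper written for this illustration. The example probes java.lang.String so that it runs without Lucene on the classpath.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical diagnostic helper; uses only standard JDK reflection.
class MethodProbe {
    // Returns true if the class declares or inherits a public method with this name.
    static boolean hasPublicMethod(Class<?> type, String methodName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) return true;
        }
        return false;
    }

    // Reports where a class was loaded from; JDK core classes have no code source.
    static String codeSource(Class<?> type) {
        CodeSource src = type.getProtectionDomain().getCodeSource();
        return src == null ? "bootstrap" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) {
        System.out.println(hasPublicMethod(String.class, "length"));
        System.out.println(hasPublicMethod(String.class, "getDirectory"));
        System.out.println(codeSource(MethodProbe.class));
    }
}
```

With Lucene on the classpath, the same calls could be pointed at Class.forName("org.apache.lucene.store.FSDirectory"). Printing the code source for that class would also show which jar it was actually loaded from, which may help given that the classpath listed above mixes 3.0.1 and 2.9.3 jars.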

Is Lucene 3.0.1 completely incompatible with Solr 1.4.1? Is there some way
to use the luceneMatchVersion tag to tell Solr what version I want to use?


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-1-4-1-compatible-with-Lucene-3-0-1-tp2806828p2806828.html
Sent from the Solr - Dev mailing list archive at Nabble.com.




[jira] [Commented] (LUCENE-2956) Support updateDocument() with DWPTs

2011-04-11 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018645#comment-13018645
 ] 

Jason Rutherglen commented on LUCENE-2956:
--

I think I have an idea, however can you explain the ticketQueue?

 Support updateDocument() with DWPTs
 ---

 Key: LUCENE-2956
 URL: https://issues.apache.org/jira/browse/LUCENE-2956
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: Realtime Branch
Reporter: Michael Busch
Assignee: Simon Willnauer
Priority: Minor
 Fix For: Realtime Branch

 Attachments: LUCENE-2956.patch


 With separate DocumentsWriterPerThreads (DWPT) it can currently happen that 
 the delete part of an updateDocument() is flushed and committed separately 
 from the corresponding new document.
 We need to make sure that updateDocument() is always an atomic operation from 
 a IW.commit() and IW.getReader() perspective.  See LUCENE-2324 for more 
 details.




[jira] [Commented] (SOLR-1566) Allow components to add fields to outgoing documents

2011-04-11 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018654#comment-13018654
 ] 

Yonik Seeley commented on SOLR-1566:


Hmmm, I think I just hit another issue:

http://localhost:8983/solr/browse

{code}
java.lang.ClassCastException: org.apache.solr.response.ResultContext cannot be cast to org.apache.solr.common.SolrDocumentList
    at org.apache.solr.response.PageTool.init(PageTool.java:46)
    at org.apache.solr.response.VelocityResponseWriter.write(VelocityResponseWriter.java:62)
{code}

 Allow components to add fields to outgoing documents
 

 Key: SOLR-1566
 URL: https://issues.apache.org/jira/browse/SOLR-1566
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Noble Paul
 Fix For: 4.0

 Attachments: SOLR-1566-DocTransformer.patch, 
 SOLR-1566-DocTransformer.patch, SOLR-1566-DocTransformer.patch, 
 SOLR-1566-DocTransformer.patch, SOLR-1566-DocTransformer.patch, 
 SOLR-1566-DocTransformer.patch, SOLR-1566-gsi.patch, SOLR-1566-rm.patch, 
 SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, 
 SOLR-1566-rm.patch, SOLR-1566.patch, SOLR-1566.patch, SOLR-1566.patch, 
 SOLR-1566.patch, SOLR-1566_parsing.patch


 Currently it is not possible for components to add fields to outgoing 
 documents which are not in the stored fields of the document.  This makes 
 it cumbersome to add computed fields/metadata.




[jira] [Commented] (SOLR-2383) Velocity: Generalize range and date facet display

2011-04-11 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018657#comment-13018657
 ] 

Yonik Seeley commented on SOLR-2383:


I started to take a look at this, but ran into exceptions due to SOLR-1566

 Velocity: Generalize range and date facet display
 -

 Key: SOLR-2383
 URL: https://issues.apache.org/jira/browse/SOLR-2383
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Reporter: Jan Høydahl
  Labels: facet, range, velocity
 Attachments: SOLR-2383.patch, SOLR-2383.patch, SOLR-2383.patch, 
 SOLR-2383.patch


 The Velocity (/browse) GUI has a hardcoded price range facet and a hardcoded 
 manufacturedate_dt date facet. We need a general solution that works for any 
 facet.range and facet.date.




[HUDSON] Lucene-trunk - Build # 1527 - Failure

2011-04-11 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1527/

No tests ran.

Build Log (for compile errors):
[...truncated 10134 lines...]






[jira] [Resolved] (LUCENE-3020) better payload testing with mockanalyzer

2011-04-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-3020.
-

Resolution: Fixed

finished the monster merge to branch_3x, revision 1091277

 better payload testing with mockanalyzer
 

 Key: LUCENE-3020
 URL: https://issues.apache.org/jira/browse/LUCENE-3020
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir
 Fix For: 3.2, 4.0

 Attachments: LUCENE-3020.patch


 MockAnalyzer currently always indexes some fixed-length payloads.
 Instead it should decide for each field randomly (and remember the choice 
 for that field) whether:
 * the field should index no payloads at all
 * the field should index fixed-length payloads
 * the field should index variable-length payloads
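
The per-field decision described above can be sketched as follows. This is an illustration only, not the code in LUCENE-3020.patch; the class name, the enum, and the method names are all hypothetical. The point is that the random choice is made once per field name and then remembered.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch: choose a payload mode per field once, then reuse it. */
class PerFieldPayloadChooser {
    /** The three options listed in the issue description. */
    enum PayloadMode { NONE, FIXED_LENGTH, VARIABLE_LENGTH }

    private final Random random;
    private final Map<String, PayloadMode> chosen = new HashMap<>();

    PerFieldPayloadChooser(Random random) {
        this.random = random;
    }

    /** Picks a mode the first time a field is seen; later calls return the same mode. */
    synchronized PayloadMode modeFor(String fieldName) {
        return chosen.computeIfAbsent(fieldName, f -> {
            PayloadMode[] modes = PayloadMode.values();
            return modes[random.nextInt(modes.length)];
        });
    }

    public static void main(String[] args) {
        PerFieldPayloadChooser chooser = new PerFieldPayloadChooser(new Random(42L));
        for (String field : new String[] {"title", "body", "title"}) {
            System.out.println(field + " -> " + chooser.modeFor(field));
        }
    }
}
```

Passing the Random in from outside keeps the choice reproducible from a test seed, which is the usual reason for randomized test helpers to avoid creating their own generator.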




[jira] [Created] (LUCENE-3021) randomize skipInterval in tests

2011-04-11 Thread Robert Muir (JIRA)
randomize skipInterval in tests
---

 Key: LUCENE-3021
 URL: https://issues.apache.org/jira/browse/LUCENE-3021
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir


We probably don't test the multi-level skipping very well, but skipInterval etc. 
is now private to the codec, so for better test coverage we should pass it as a 
parameter to the postings writers and randomize it via mockrandomcodec.




[jira] [Updated] (LUCENE-3021) randomize skipInterval in tests

2011-04-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3021:


Attachment: LUCENE-3021.patch

Just a really quick patch. I hit lots of problems with Standard with this 
(maybe just bad asserts? I haven't even looked).


 randomize skipInterval in tests
 ---

 Key: LUCENE-3021
 URL: https://issues.apache.org/jira/browse/LUCENE-3021
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir
 Attachments: LUCENE-3021.patch


 We probably don't test the multi-level skipping very well, but skipInterval 
 etc. is now private to the codec, so for better test coverage we should pass 
 it as a parameter to the postings writers and randomize it via mockrandomcodec.




[jira] [Updated] (LUCENE-3021) randomize skipInterval in tests

2011-04-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3021:


Attachment: LUCENE-3021.patch

Oh duh, I sometimes created skipInterval=1.

Everything seems fine now with -Dtests.codec=MockRandom.
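
A minimal sketch of drawing a random skip interval that can never be 1. It assumes, based only on the comment above, that a value of 1 was what broke the first patch. The class name and the upper bound are hypothetical and are not taken from LUCENE-3021.patch.

```java
import java.util.Random;

/** Hypothetical sketch: pick a random skip interval with a lower bound of 2. */
class RandomSkipInterval {
    /** Assumption: 1 is the value to avoid, so 2 is the smallest interval drawn. */
    static final int MIN_SKIP_INTERVAL = 2;
    /** Arbitrary upper bound, chosen only for illustration. */
    static final int MAX_SKIP_INTERVAL = 10;

    /** Returns a value in [MIN_SKIP_INTERVAL, MAX_SKIP_INTERVAL], both inclusive. */
    static int next(Random random) {
        int span = MAX_SKIP_INTERVAL - MIN_SKIP_INTERVAL + 1;
        return MIN_SKIP_INTERVAL + random.nextInt(span);
    }

    public static void main(String[] args) {
        Random random = new Random(1234L);
        for (int i = 0; i < 5; i++) {
            System.out.println(next(random));
        }
    }
}
```

Adding the lower bound to nextInt(span), instead of calling nextInt(max) directly, is what rules out the values 0 and 1.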

 randomize skipInterval in tests
 ---

 Key: LUCENE-3021
 URL: https://issues.apache.org/jira/browse/LUCENE-3021
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir
 Attachments: LUCENE-3021.patch, LUCENE-3021.patch


 We probably don't test the multi-level skipping very well, but skipInterval 
 etc. is now private to the codec, so for better test coverage we should pass 
 it as a parameter to the postings writers and randomize it via mockrandomcodec.




[jira] [Updated] (LUCENE-3010) Add the ability for the Lucene Benchmark code to read Solr configuration information for testing Analyzer/Filter Chains

2011-04-11 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen updated LUCENE-3010:


Description: I would like to be able to use the Lucene Benchmark code in 
Lucene contrib with Solr to run some indexing tests.  It would be nice if 
Lucene Benchmark could read my Solr configuration rather than having to 
translate my filter chain and other parameters into Lucene java code.  This 
relates to LUCENE-2845,   (was: I would like to be able to use the Lucene 
Benchmark code in Lucene contrib with Solr to run some indexing tests.  It 
would be nice if Lucene Benchmark could read my Solr configuration rather than 
having to translate my filter chain and other parameters into Lucene java code. 
 This relates to Lucene 2845, )

 Add the ability for the  Lucene Benchmark code to read Solr configuration 
 information for testing Analyzer/Filter Chains
 

 Key: LUCENE-3010
 URL: https://issues.apache.org/jira/browse/LUCENE-3010
 Project: Lucene - Java
  Issue Type: Wish
  Components: contrib/benchmark
Reporter: Tom Burton-West
Priority: Trivial

 I would like to be able to use the Lucene Benchmark code in Lucene contrib 
 with Solr to run some indexing tests.  It would be nice if Lucene Benchmark 
 could read my Solr configuration rather than having to translate my filter 
 chain and other parameters into Lucene java code.  This relates to 
 LUCENE-2845, 




[Lucene.Net] Board Report for March

2011-04-11 Thread Stefan Bodewig
Hi all,

a quick heads up, since I can't find any content so far.  Your board
report input is needed by Wednesday in
http://wiki.apache.org/incubator/April2011

Thanks

Stefan


[Lucene.Net] [jira] [Created] (LUCENENET-409) Invalid Base exception in DateField.StringToTime()

2011-04-11 Thread Neal Granroth (JIRA)
Invalid Base exception in DateField.StringToTime()
--

 Key: LUCENENET-409
 URL: https://issues.apache.org/jira/browse/LUCENENET-409
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.4
Reporter: Neal Granroth


The Lucene.Net.Documents.DateField.StringToTime() method called by 
StringToDate() appears to specify an invalid value for the base in the .NET 
Convert.ToInt64() call.  When a DateField value in a legacy index is read, or 
Lucene.NET 2.9.4 is used with legacy code that relies upon DateField, the 
following exception occurs whenever StringToDate() is called:

System.ArgumentException: Invalid Base.
   at System.Convert.ToInt64(String value, Int32 fromBase)
   at Lucene.Net.Documents.DateField.StringToTime(String s)
   at Lucene.Net.Documents.DateField.StringToDate(String s)
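
For context, Convert.ToInt64(String, Int32) is documented to accept only the bases 2, 8, 10 and 16, and throws ArgumentException for anything else. If the port passes base 36 here, as Java's Character.MAX_RADIX would suggest (an assumption; the report does not quote the call), the value has to be parsed by hand. A minimal sketch of such a parser, written in Java for illustration, with a hypothetical class name:

```java
/** Hypothetical sketch: parse a non-negative base-36 string without library radix support. */
class Base36 {
    static final int RADIX = 36;

    /** Parses digits 0-9 and letters a-z (case-insensitive); no sign is accepted. */
    static long parse(String s) {
        if (s == null || s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        long result = 0L;
        for (int i = 0; i < s.length(); i++) {
            int digit = Character.digit(s.charAt(i), RADIX);
            if (digit < 0) {
                throw new NumberFormatException("not a base-36 digit: " + s.charAt(i));
            }
            // The exact-arithmetic helpers throw ArithmeticException on overflow.
            result = Math.addExact(Math.multiplyExact(result, (long) RADIX), (long) digit);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "0fs8q4p00"; // arbitrary sample value, not taken from a real index
        // For valid non-negative input this should agree with the built-in parser.
        System.out.println(parse(sample) == Long.parseLong(sample, RADIX));
    }
}
```

The same digit-by-digit loop carries over to other languages whose standard conversion routines are limited to a few bases.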

