[jira] [Closed] (LUCENENET-479) QueryParser.SetEnablePositionIncrements(false) doesn't work

2012-03-22 Thread Christopher Currens (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Currens closed LUCENENET-479.
-

Resolution: Fixed

This was fixed along with re-porting the parser in LUCENENET-478.

Additionally, SetEnablePositionIncrements and GetEnablePositionIncrements now 
use a bool instead of a class, and are exposed as a single public property 
with a getter and setter (EnablePositionIncrements).

 QueryParser.SetEnablePositionIncrements(false) doesn't work
 ---

 Key: LUCENENET-479
 URL: https://issues.apache.org/jira/browse/LUCENENET-479
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Christopher Currens
 Fix For: Lucene.Net 3.0.3


 Trying to disable position increments via SetEnablePositionIncrements(false) 
 has no effect, at least on phrase queries.
 By default, parsing the input "Query with Stopwords" should return a phrase 
 query that, converted to a string, looks similar to: query ? stopwords, 
 where ? is a null term in the phrase query.
 With EnablePositionIncrements set to false, the resulting query should look 
 similar to: query stopwords.  However, calling 
 SetEnablePositionIncrements(false) has no effect on the resulting query.
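
To make the expected behaviour concrete, here is a toy model in Java (the
language of the upstream parser that was re-ported). It is not the Lucene or
Lucene.Net API: the class PhraseSketch, its render method, and the stop-word
list are all hypothetical. It only shows that a removed stop word leaves a gap
(printed as ?) when position increments are enabled and no gap when they are
disabled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model only: not the Lucene or Lucene.Net QueryParser API.
final class PhraseSketch {
    // Hypothetical stop-word list, chosen for the illustration.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("with", "the", "a"));

    /** Renders the phrase terms, using "?" for positions vacated by stop words. */
    static String render(String input, boolean enablePositionIncrements) {
        List<String> out = new ArrayList<>();
        for (String token : input.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (STOP_WORDS.contains(token)) {
                // A removed stop word leaves a hole only when increments are kept.
                if (enablePositionIncrements) {
                    out.add("?");
                }
            } else {
                out.add(token);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(render("Query with Stopwords", true));
        System.out.println(render("Query with Stopwords", false));
    }
}
```

With increments enabled the sketch keeps a placeholder for the removed word;
with increments disabled the two remaining terms become adjacent, which is the
behaviour the report says the setter failed to produce.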

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Closed] (LUCENENET-466) optimisation for the GermanStemmer.vb

2012-03-22 Thread Christopher Currens (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Currens closed LUCENENET-466.
-

Resolution: Fixed

I've added a new stemmer in trunk called GermanDIN2Stemmer.  You can tell 
GermanAnalyzer to use it via some new constructors that take a bool indicating 
whether you want the DIN-5007-2 stemmer instead of the default DIN-5007-1 
stemmer.

This won't break compatibility for users who want to keep the old default DIN1 
stemmer, but lets anyone who wants the DIN2 stemmer opt in.

 optimisation for the GermanStemmer.vb
 --

 Key: LUCENENET-466
 URL: https://issues.apache.org/jira/browse/LUCENENET-466
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g, Lucene.Net 3.0.3
Reporter: Prescott Nasser
Priority: Minor
 Fix For: Lucene.Net 3.0.3


 I have a little optimisation for the GermanStemmer.vb (in 
 Contrib.Analyzers) class. At the moment the function Substitute 
 converts the German umlauts ä to a, ö to o and ü to u. This 
 is not the correct German transliteration. They must be converted to ae, 
 oe and ue. So I can write the name Björn or Bjoern but not 
 Bjorn. With this optimisation a user can search for Björn and also 
 find Bjoern.
  
 Here is the optimized code snippet:
  
 else if ( buffer[c] == 'ä' )
 {
     buffer[c] = 'a';
     buffer.Insert(c + 1, 'e');
 }
 else if ( buffer[c] == 'ö' )
 {
     buffer[c] = 'o';
     buffer.Insert(c + 1, 'e');
 }
 else if ( buffer[c] == 'ü' )
 {
     buffer[c] = 'u';
     buffer.Insert(c + 1, 'e');
 }
  
 Thank You
 Björn
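
For readers who want to try the proposed substitution in isolation, here is a
minimal standalone sketch in Java. It is not the GermanStemmer or
GermanDIN2Stemmer source: the class UmlautSketch and its expand method are
hypothetical, and it handles only the three lowercase umlauts named in the
report. The source is kept ASCII-only by writing the umlauts as Unicode
escapes.

```java
// Hypothetical helper, not the GermanStemmer or GermanDIN2Stemmer source.
final class UmlautSketch {
    /** Expands lowercase German umlauts to their two-letter forms (ae, oe, ue). */
    static String expand(String term) {
        StringBuilder buffer = new StringBuilder(term);
        for (int c = 0; c < buffer.length(); c++) {
            char ch = buffer.charAt(c);
            char base;
            if (ch == '\u00e4') {        // a with diaeresis
                base = 'a';
            } else if (ch == '\u00f6') { // o with diaeresis
                base = 'o';
            } else if (ch == '\u00fc') { // u with diaeresis
                base = 'u';
            } else {
                continue;                // every other character is left alone
            }
            buffer.setCharAt(c, base);
            buffer.insert(c + 1, 'e');
            c++;                         // step over the 'e' that was just inserted
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Both spellings of the name end up as the same term.
        System.out.println(expand("Bj\u00f6rn").equals(expand("Bjoern")));
    }
}
```

Because both spellings map to the same string, a search for either form can
match a document indexed with the other, which is the goal stated above.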





[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2042 - Failure

2012-03-22 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2042/

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message:
ERROR: SolrIndexSearcher opens=93 closes=92

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=93 closes=92
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner$3.addError(JUnitTestRunner.java:974)
at junit.framework.TestResult.addError(TestResult.java:38)
at junit.framework.JUnit4TestAdapterCache$1.testFailure(JUnit4TestAdapterCache.java:51)
at org.junit.runner.notification.RunNotifier$4.notifyListener(RunNotifier.java:100)
at org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:41)
at org.junit.runner.notification.RunNotifier.fireTestFailure(RunNotifier.java:97)
at org.junit.internal.runners.model.EachTestNotifier.addFailure(EachTestNotifier.java:26)
at org.junit.runners.ParentRunner.run(ParentRunner.java:306)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
Caused by: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=93 closes=92
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:211)
at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:36)
at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37)
at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
... 4 more




Build Log (for compile errors):
[...truncated 9494 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2983) Unable to load custom MergePolicy

2012-03-22 Thread Tommaso Teofili (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235413#comment-13235413
 ] 

Tommaso Teofili commented on SOLR-2983:
---

I just noticed that the toIndexWriter method should also be explicitly tested; 
I'm going to work on it and attach a new patch.

 Unable to load custom MergePolicy
 -

 Key: SOLR-2983
 URL: https://issues.apache.org/jira/browse/SOLR-2983
 Project: Solr
  Issue Type: Bug
Reporter: Mathias Herberts
Assignee: Tommaso Teofili
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: SOLR-2983.patch


 As part of a recent upgrade to Solr 3.5.0 we encountered an error related to 
 our use of LinkedIn's ZoieMergePolicy.
 It seems the code that loads a custom MergePolicy was at some point moved 
 into SolrIndexConfig.java from SolrIndexWriter.java, but as this code was 
 copied verbatim it now contains a bug:
 try {
   policy = (MergePolicy) schema.getResourceLoader().newInstance(
       mpClassName, null, new Class[]{IndexWriter.class}, new Object[]{this});
 } catch (Exception e) {
   policy = (MergePolicy) schema.getResourceLoader().newInstance(mpClassName);
 }
 'this' is no longer an IndexWriter but a SolrIndexConfig, therefore the call 
 to newInstance will always throw an exception and the catch clause will be 
 executed. If the custom MergePolicy does not have a default constructor 
 (which is the case of ZoieMergePolicy), the second attempt to create the 
 MergePolicy will also fail and Solr won't start.
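
The failure mode can be reproduced with plain reflection. The following is a
sketch under stated assumptions, not Solr code: FakeIndexWriter, FakeConfig,
WriterOnlyPolicy, and the helper methods are hypothetical stand-ins for
IndexWriter, SolrIndexConfig, a policy such as ZoieMergePolicy, and the
resource loader. It only mirrors the try-then-fall-back pattern quoted above.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-ins; none of these types exist in Solr or Lucene.
final class MergePolicyLoadingSketch {
    public static final class FakeIndexWriter {}
    public static final class FakeConfig {}

    /** A policy with only a writer-taking constructor and no default constructor. */
    public static final class WriterOnlyPolicy {
        final FakeIndexWriter writer;
        public WriterOnlyPolicy(FakeIndexWriter writer) { this.writer = writer; }
    }

    /** Mirrors the reported pattern: try the one-argument constructor, then the no-arg one. */
    static Object newInstance(Class<?> clazz, Class<?> argType, Object arg) throws Exception {
        try {
            Constructor<?> ctor = clazz.getConstructor(argType);
            return ctor.newInstance(arg); // IllegalArgumentException if arg has the wrong type
        } catch (Exception e) {
            // Fallback: NoSuchMethodException if there is no default constructor.
            return clazz.getConstructor().newInstance();
        }
    }

    /** Returns true if the policy could be instantiated with the given argument. */
    static boolean canLoad(Class<?> policyClass, Object constructorArg) {
        try {
            newInstance(policyClass, FakeIndexWriter.class, constructorArg);
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Passing a real writer works; passing the config object (the bug) does not.
        System.out.println(canLoad(WriterOnlyPolicy.class, new FakeIndexWriter()));
        System.out.println(canLoad(WriterOnlyPolicy.class, new FakeConfig()));
    }
}
```

In the sketch, handing over the config object instead of a writer makes the
first attempt fail on the argument type and the fallback fail on the missing
default constructor, matching the two-step failure described in the report.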




[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235422#comment-13235422
 ] 

Dawid Weiss commented on LUCENE-3877:
-

bq. I have seen it not work in the past for obscure reasons

Most likely the reasons were incorrect pointcut definitions? These can be 
tricky, I agree. Nonetheless, I've been using AspectJ for a long time and it 
always fits my needs and expectations. I'm not saying it doesn't have any bugs 
-- I'm sure it has. But the right tool for the right job; it took me about 5 
mins to write and apply that aspect (with follow ups, I sent an e-mail to the 
mailing list, JIRA didn't work at the time).

I'm not advocating for any tool, really. To me aspectj is a fast tool for 
expressing where I want a given snippet of code to be injected (or what I want 
excluded) and for such tasks I don't see a faster or more pleasant to use 
alternative. Oh, I've been using asmlib too; extensively in fact; so it's not 
lack of knowledge about the tool itself.





 Lucene should not call System.out.println
 -

 Key: LUCENE-3877
 URL: https://issues.apache.org/jira/browse/LUCENE-3877
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
 Fix For: 3.6, 4.0

 Attachments: IllegalSystemTest.java, IllegalSystemTest.java, 
 SystemPrintCheck.java


 We seem to have accumulated a few random sops...
 Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least.
 Can we somehow detect (eg, have a test failure) if we accidentally leave 
 errant System.out.println's (leftover from debugging)...?
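
One way to turn stray prints into a test failure is a runtime check that
swaps System.out for a capturing stream while the code under test runs. This
is a different technique from the static approaches discussed in this thread
(AspectJ weaving, bytecode scanning, a FindBugs rule), and it only catches
prints on code paths that actually execute. The class SysOutGuardSketch and
its method are hypothetical names; only JDK calls are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical helper, not part of the Lucene test framework.
final class SysOutGuardSketch {
    /** Runs the task with System.out redirected and returns how many bytes it wrote there. */
    static int bytesWrittenToSystemOut(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        PrintStream capture = new PrintStream(captured, true);
        System.setOut(capture);
        try {
            task.run();
        } finally {
            capture.flush();
            System.setOut(original); // always restore the real stream
        }
        return captured.size();
    }

    public static void main(String[] args) {
        int noisy = bytesWrittenToSystemOut(() -> System.out.println("debug"));
        int quiet = bytesWrittenToSystemOut(() -> { });
        System.out.println("noisy wrote bytes: " + (noisy > 0) + ", quiet wrote bytes: " + (quiet > 0));
    }
}
```

A test could assert that the returned count is zero for library code, with
an explicit exclusion list for classes that print legitimately.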




[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235424#comment-13235424
 ] 

Dawid Weiss commented on LUCENE-3877:
-

My aspectj experiments from yesterday when JIRA was dead.

I applied that aspect just to see what happens.
{noformat}
ajc -sourceroots aspects \
   -inpath lucene-core-3.6-SNAPSHOT.jar \
   -d none \
   -cp aspectjrt.jar \
   -showWeaveInfo
{noformat}
Here's what I got:
{noformat}
Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.analysis.PorterStemmer'
(PorterStemmer.java:529) advised by before advice from
'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.analysis.PorterStemmer'
(PorterStemmer.java:534) advised by before advice from
'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.analysis.PorterStemmer'
(PorterStemmer.java:542) advised by before advice from
'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:989)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:996)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1003)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1012)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1013)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1038)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1043)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1047)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1056)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1057)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1062)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1071)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1073)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1074)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1077)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1079)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1081)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1082)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1085)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream 

[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235428#comment-13235428
 ] 

Dawid Weiss commented on LUCENE-3877:
-

Oh, btw. I think a FindBugs rule for detecting sysouts/syserrs would be a great 
addition to FindBugs -- you should definitely file it as an improvement there. 
In reality at least class-level exclusions will be needed to avoid legitimate 
matches like the ones shown above (main methods, exception handlers), but these 
can be lived with.





[jira] [Created] (LUCENE-3901) Add katakana filter to better deal with katakana spelling variants

2012-03-22 Thread Christian Moen (Created) (JIRA)
Add katakana filter to better deal with katakana spelling variants
--

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
 Fix For: 3.6, 4.0


Many Japanese katakana words end in a long sound that is sometimes optional.

For example, パーティー and パーティ are both perfectly valid for party.  Similarly we 
have センター and センタ that are variants of center as well as サーバー and サーバ for 
server.

I'm proposing that we add a katakana stemmer that removes this long sound if 
the terms are longer than a configurable length.  It's also possible to add the 
variant as a synonym, but I think stemming is preferred from a ranking point of 
view.
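
The proposal can be sketched in a few lines of Java. This is an illustration
only, not the patch for this issue: the class KatakanaStemSketch, the stem
method, and the minimumLength parameter are hypothetical, and treating the
threshold as inclusive (terms of at least minimumLength characters are
stemmed) is an assumption. The long sound is the katakana-hiragana prolonged
sound mark, U+30FC; the source is kept ASCII-only by using Unicode escapes.

```java
// Hypothetical sketch, not the LUCENE-3901 patch.
final class KatakanaStemSketch {
    /** Katakana-hiragana prolonged sound mark (U+30FC). */
    static final char PROLONGED_SOUND_MARK = '\u30FC';

    /**
     * Removes one trailing prolonged sound mark from terms that are at least
     * minimumLength characters long; shorter terms are returned unchanged.
     */
    static String stem(String term, int minimumLength) {
        if (!term.isEmpty()
                && term.length() >= minimumLength
                && term.charAt(term.length() - 1) == PROLONGED_SOUND_MARK) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        String partyLong = "\u30D1\u30FC\u30C6\u30A3\u30FC";  // "party" with the trailing long sound
        String partyShort = "\u30D1\u30FC\u30C6\u30A3";       // "party" without it
        // Both variants reduce to the same term.
        System.out.println(stem(partyLong, 4).equals(stem(partyShort, 4)));
    }
}
```

The length threshold keeps short words, where the long sound is more likely
to be significant, from being stemmed.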






[jira] [Updated] (LUCENE-3901) Add katakana stem filter to better deal with certain katakana spelling variants

2012-03-22 Thread Christian Moen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Moen updated LUCENE-3901:
---

Summary: Add katakana stem filter to better deal with certain katakana 
spelling variants  (was: Add katakana filter to better deal with katakana 
spelling variants)

 Add katakana stem filter to better deal with certain katakana spelling 
 variants
 ---

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
 Fix For: 3.6, 4.0


 Many Japanese katakana words end in a long sound that is sometimes optional.
 For example, パーティー and パーティ are both perfectly valid for party.  Similarly 
 we have センター and センタ that are variants of center as well as サーバー and サーバ 
 for server.
 I'm proposing that we add a katakana stemmer that removes this long sound if 
 the terms are longer than a configurable length.  It's also possible to add 
 the variant as a synonym, but I think stemming is preferred from a ranking 
 point of view.




[JENKINS] Solr-trunk - Build # 1801 - Still Failing

2012-03-22 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Solr-trunk/1801/

1 tests failed.
FAILED:  org.apache.solr.TestDistributedSearch.testDistribSearch

Error Message:
Uncaught exception by thread: Thread[Thread-662,5,]

Stack Trace:
org.apache.lucene.util.UncaughtExceptionsRule$UncaughtExceptionsInBackgroundThread: Uncaught exception by thread: Thread[Thread-662,5,]
at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:84)
at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:37)
at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:39)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
Caused by: java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: http://localhost:53923/solr
at org.apache.solr.TestDistributedSearch$1.run(TestDistributedSearch.java:374)
Caused by: org.apache.solr.client.solrj.SolrServerException: http://localhost:53923/solr
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:496)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:312)
at org.apache.solr.TestDistributedSearch$1.run(TestDistributedSearch.java:369)
Caused by: org.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 100 ms
at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:155)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:426)
... 4 more
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
at java.net.Socket.connect(Socket.java:546)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at 

[jira] [Commented] (LUCENE-3897) KuromojiTokenizer fails with large docs

2012-03-22 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235493#comment-13235493
 ] 

Michael McCandless commented on LUCENE-3897:


Thanks Christian!

 KuromojiTokenizer fails with large docs
 ---

 Key: LUCENE-3897
 URL: https://issues.apache.org/jira/browse/LUCENE-3897
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/analysis
Reporter: Robert Muir
Assignee: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3897.patch


 just shoving largeish random docs triggers asserts like:
 {noformat}
 [junit] Caused by: java.lang.AssertionError: backPos=4100 vs lastBackTracePos=5120
 [junit]   at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.backtrace(KuromojiTokenizer.java:907)
 [junit]   at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.parse(KuromojiTokenizer.java:756)
 [junit]   at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.incrementToken(KuromojiTokenizer.java:403)
 [junit]   at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:404)
 {noformat}
 But, you get no seed...
 I'll commit the test case and @Ignore it.




[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235494#comment-13235494
 ] 

Dawid Weiss commented on LUCENE-3867:
-

I've been experimenting a bit with the new code. Field offsets for three 
classes in a hierarchy with unalignable fields (byte, long combinations at all 
levels). Note unaligned reordering of byte field in JRockit - nice.

{noformat}
JVM: [JVM: HotSpot, Sun Microsystems Inc., 1.6.0_31] (compressed OOPs)
@12  4 Super.superByte
@16  8 Super.subLong
@24  8 Sub.subLong
@32  4 Sub.subByte
@36  4 SubSub.subSubByte
@40  8 SubSub.subSubLong
@48    sizeOf(SubSub.class instance)

JVM: [JVM: HotSpot, Sun Microsystems Inc., 1.6.0_31] (normal OOPs)
@16  8 Super.subLong
@24  8 Super.superByte
@32  8 Sub.subLong
@40  8 Sub.subByte
@48  8 SubSub.subSubLong
@56  8 SubSub.subSubByte
@64    sizeOf(SubSub.class instance)

JVM: [JVM: J9, IBM Corporation, 1.6.0]
@24  8 Super.subLong
@32  4 Super.superByte
@36  4 Sub.subByte
@40  8 Sub.subLong
@48  8 SubSub.subSubLong
@56  8 SubSub.subSubByte
@64    sizeOf(SubSub.class instance)

JVM: [JVM: JRockit, Oracle Corporation, 1.6.0_26] (64-bit JVM!)
@ 8  8 Super.subLong
@16  1 Super.superByte
@17  7 Sub.subByte
@24  8 Sub.subLong
@32  8 SubSub.subSubLong
@40  8 SubSub.subSubByte
@48    sizeOf(SubSub.class instance)
{noformat}

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed as 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While at it, I wrote a sizeOf(String) impl, and I wonder how people feel 
 about including such helper methods in RUE as static, stateless methods. 
 It's not perfect, and I'm sure there's some room for improvement; here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).
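
As a rough illustration of what such array helpers might look like, here is a
sketch in Java. It is not RamUsageEstimator code: the class ArraySizeSketch
and every constant value in it are assumptions chosen to match the quoted
page (an 8-byte object header plus a 4-byte length, with no object reference
in the array header) and an assumed 8-byte object alignment. Real JVMs differ,
as the layouts reported in this thread show.

```java
// Hypothetical sketch; constants are assumed values, not measured ones.
final class ArraySizeSketch {
    static final int NUM_BYTES_OBJECT_HEADER = 8; // assumption
    static final int NUM_BYTES_INT = 4;
    static final int NUM_BYTES_CHAR = 2;
    // Header plus the array length, without NUM_BYTES_OBJECT_REF.
    static final int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT;
    static final int OBJECT_ALIGNMENT = 8;        // assumption

    /** Rounds a size up to the next multiple of the assumed object alignment. */
    static long alignObjectSize(long size) {
        return (size + OBJECT_ALIGNMENT - 1) / OBJECT_ALIGNMENT * OBJECT_ALIGNMENT;
    }

    /** Approximate shallow size of a char[]. */
    static long sizeOf(char[] arr) {
        return alignObjectSize(NUM_BYTES_ARRAY_HEADER + (long) NUM_BYTES_CHAR * arr.length);
    }

    /** Approximate shallow size of an int[]. */
    static long sizeOf(int[] arr) {
        return alignObjectSize(NUM_BYTES_ARRAY_HEADER + (long) NUM_BYTES_INT * arr.length);
    }

    public static void main(String[] args) {
        System.out.println(sizeOf(new char[3])); // 12 + 6 = 18, rounded up to 24
        System.out.println(sizeOf(new int[2]));  // 12 + 8 = 20, rounded up to 24
    }
}
```

Rounding up to the alignment replaces the fixed "+ 6" safety margin used in
the sizeOf(String) snippet above; whether that is accurate for a given JVM is
exactly the kind of question this issue is about.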




[jira] [Commented] (LUCENE-3901) Add katakana stem filter to better deal with certain katakana spelling variants

2012-03-22 Thread Christian Moen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235497#comment-13235497
 ] 

Christian Moen commented on LUCENE-3901:


Patch for this coming up shortly.

 Add katakana stem filter to better deal with certain katakana spelling 
 variants
 ---

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
 Fix For: 3.6, 4.0


 Many Japanese katakana words end in a long sound that is sometimes optional.
 For example, パーティー and パーティ are both perfectly valid for party.  Similarly 
 we have センター and センタ that are variants of center as well as サーバー and サーバ 
 for server.
 I'm proposing that we add a katakana stemmer that removes this long sound if 
 the terms are longer than a configurable length.  It's also possible to add 
 the variant as a synonym, but I think stemming is preferred from a ranking 
 point of view.
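
A minimal sketch of the stemming rule proposed above, assuming the "long sound" is the katakana prolonged sound mark (U+30FC) and that the configurable length is a simple character-count threshold. The class name, method name, and threshold parameter are hypothetical and are not taken from any patch on this issue.

```java
class KatakanaStemSketch {
  // Katakana-hiragana prolonged sound mark (assumed to be the "long sound").
  static final char PROLONGED_SOUND_MARK = '\u30FC';

  /**
   * Removes one trailing prolonged sound mark if the term is longer than
   * the given threshold (in chars); otherwise returns the term unchanged.
   */
  static String stem(String term, int threshold) {
    if (term.length() > threshold
        && term.charAt(term.length() - 1) == PROLONGED_SOUND_MARK) {
      return term.substring(0, term.length() - 1);
    }
    return term;
  }

  public static void main(String[] args) {
    // "party" with trailing long sound (5 chars): trailing mark is removed.
    String party = "\u30D1\u30FC\u30C6\u30A3\u30FC";
    // "copy" (3 chars): not longer than the threshold, so left alone.
    String copy = "\u30B3\u30D4\u30FC";
    System.out.println(stem(party, 3).length()); // 4
    System.out.println(stem(copy, 3).length());  // 3
  }
}
```

The threshold keeps short words intact, where dropping the final mark is more likely to change the meaning.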




[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235500#comment-13235500
 ] 

Uwe Schindler commented on LUCENE-3867:
---

Thanks for the insight.

When thinking about the reordering, I am a little bit afraid about the 
optimization in the shallow sizeOf(Class<?>). This optimization does not 
recurse to superclasses, as it assumes that all field offsets are greater than 
those of the superclass, so finding the maximum does not need to recurse up (so 
it exits early).

This is generally true (also in the above printout), but not guaranteed. E.g. 
JRockit does it partly (it reuses space inside the superclass area to locate 
the byte from the subclass). In the above example the order of fields is still 
always Super-Sub-SubSub, but what if the ordering in the JRockit example were 
like:

{noformat}
@ 8  1 Super.superByte
@ 9  7 Sub.subByte
@16  8 Super.subLong
@24  8 Sub.subLong
@32  8 SubSub.subSubLong
@40  8 SubSub.subSubByte
@48    sizeOf(SubSub.class instance)
{noformat}

The only thing the JVM cannot change is field offsets between sub classes (so 
the field offset of the superclass is inherited), but it could happen that 
*new* fields are located between super's fields (see above - it's unused 
space). This would also allow casting and so on (it's unused space in 
superclass). Unfortunately with that reordering the maximum field offset in the 
subclass is no longer guaranteed to be greater.

I would suggest that we remove the optimization in the shallow class size 
method. It's too risky in my opinion, as we could underestimate the size when 
the maximum offset in the subclass is less than the maximum offset in the 
superclass.
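
To illustrate the risk, here is a small sketch that models an object layout as a plain list of field slots instead of querying a real JVM. The layout, the 8-byte header, the 8-byte alignment, and all class and method names are assumptions made up for the example; the leaf-only method is a simplified stand-in for the early exit, not the actual RamUsageEstimator code.

```java
import java.util.Arrays;
import java.util.List;

class ShallowSizeSketch {
  /** One instance field: declaring class, byte offset, and size in bytes. */
  static final class FieldSlot {
    final String declaringClass;
    final long offset;
    final int size;

    FieldSlot(String declaringClass, long offset, int size) {
      this.declaringClass = declaringClass;
      this.offset = offset;
      this.size = size;
    }
  }

  static final int ALIGNMENT = 8; // assumed object alignment

  static long align(long size) {
    return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
  }

  /** Early-exit model: only fields declared by the leaf class are inspected. */
  static long sizeLeafOnly(List<FieldSlot> layout, String leafClass, long headerSize) {
    long end = headerSize;
    for (FieldSlot f : layout) {
      if (f.declaringClass.equals(leafClass)) {
        end = Math.max(end, f.offset + f.size);
      }
    }
    return align(end);
  }

  /** Always-recurse model: fields of every class in the hierarchy count. */
  static long sizeAllClasses(List<FieldSlot> layout, long headerSize) {
    long end = headerSize;
    for (FieldSlot f : layout) {
      end = Math.max(end, f.offset + f.size);
    }
    return align(end);
  }

  public static void main(String[] args) {
    // Hypothetical layout: Sub adds a single byte that the VM packs into
    // the padding hole right after Super.superByte.
    List<FieldSlot> layout = Arrays.asList(
        new FieldSlot("Super", 8, 1),   // Super.superByte
        new FieldSlot("Super", 16, 8),  // Super.superLong
        new FieldSlot("Sub", 9, 1));    // Sub.subByte
    System.out.println(sizeLeafOnly(layout, "Sub", 8)); // 16 (too small)
    System.out.println(sizeAllClasses(layout, 8));      // 24
  }
}
```

In this made-up layout the subclass declares a field, so an early exit would stop there and report 16 bytes, while walking the whole hierarchy gives 24.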

I hope my explanation was understandable... :-)

Dawid, what do you think, should we remove the optimization? Patch is easy.

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel 
 about including such helper methods in RUE, as static, stateless, methods? 
 It's not perfect, there's some room for improvement I'm sure, here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).
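
The proposed primitive-array helpers could look roughly like the sketch below. The constants are local assumptions (8-byte object header, 4-byte int, 8-byte alignment), chosen so that the array header comes out at the 12 bytes mentioned in the quoted page; they are not the values RamUsageEstimator actually computes, and the class name is hypothetical.

```java
class ArraySizeSketch {
  // Assumed values for illustration only, not read from a real JVM.
  static final int NUM_BYTES_OBJECT_HEADER = 8;
  static final int NUM_BYTES_INT = 4;
  static final int NUM_BYTES_OBJECT_ALIGNMENT = 8;
  // Object header plus the int array length, without an object reference.
  static final int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT;

  /** Rounds a size up to the next multiple of the assumed alignment. */
  static long alignObjectSize(long size) {
    long a = NUM_BYTES_OBJECT_ALIGNMENT;
    return (size + a - 1) / a * a;
  }

  static long sizeOf(byte[] arr) {
    return alignObjectSize((long) NUM_BYTES_ARRAY_HEADER + arr.length);
  }

  static long sizeOf(int[] arr) {
    return alignObjectSize((long) NUM_BYTES_ARRAY_HEADER + 4L * arr.length);
  }

  static long sizeOf(long[] arr) {
    return alignObjectSize((long) NUM_BYTES_ARRAY_HEADER + 8L * arr.length);
  }

  static long sizeOf(double[] arr) {
    return alignObjectSize((long) NUM_BYTES_ARRAY_HEADER + 8L * arr.length);
  }

  public static void main(String[] args) {
    System.out.println(sizeOf(new byte[0]));   // 16
    System.out.println(sizeOf(new int[4]));    // 32
    System.out.println(sizeOf(new long[1]));   // 24
    System.out.println(sizeOf(new double[2])); // 32
  }
}
```

Returning long rather than int avoids overflow for very large arrays; that is a choice made for this sketch only.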


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235501#comment-13235501
 ] 

Dawid Weiss commented on LUCENE-3867:
-

bq. I hope my explanation was understandable... 

Perfectly well. Yes, I agree, it's possible to fill in the holes packing them 
with fields from subclasses. It would be a nice vm-level optimization in fact! 

I'm still experimenting on this code and cleaning/ adding javadocs -- I'll 
patch this and provide a complete patch once I'm done, ok?




[jira] [Issue Comment Edited] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Uwe Schindler (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235502#comment-13235502
 ] 

Uwe Schindler edited comment on LUCENE-3867 at 3/22/12 11:12 AM:
-

OK. All you have to remove is the if (fieldFound && useUnsafe) check and always 
recurse. fieldFound itself can also be removed.

  was (Author: thetaphi):
OK. All you have to remove is the if (fieldFound || useUnsafe) check and 
always recurse. fieldFound itself can also be removed.
  



[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235502#comment-13235502
 ] 

Uwe Schindler commented on LUCENE-3867:
---

OK. All you have to remove is the if (fieldFound || useUnsafe) check and always 
recurse. fieldFound itself can also be removed.




[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235505#comment-13235505
 ] 

Uwe Schindler commented on LUCENE-3867:
---

JRockit could even compress like this, it would still allow casting as all 
holes are solely used by one sub-class:

{noformat}
@ 8  1 Super.superByte
@ 9  1 Sub.subByte
@10  6 SubSub.subSubByte
@16  8 Super.subLong
@24  8 Sub.subLong
@32  8 SubSub.subSubLong
@40    sizeOf(SubSub.class instance)
{noformat}




[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235506#comment-13235506
 ] 

Dawid Weiss commented on LUCENE-3867:
-

Maybe it does such things already. I didn't check extensively.




[jira] [Reopened] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Uwe Schindler (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reopened LUCENE-3867:
---


We have to remove the shallow size optimization in 3.x and trunk.




[jira] [Commented] (LUCENE-3897) KuromojiTokenizer fails with large docs

2012-03-22 Thread Christian Moen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235517#comment-13235517
 ] 

Christian Moen commented on LUCENE-3897:


Committed revision 1303739 on {{trunk}}.  Backporting to {{branch_3x}}. 

 KuromojiTokenizer fails with large docs
 ---

 Key: LUCENE-3897
 URL: https://issues.apache.org/jira/browse/LUCENE-3897
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/analysis
Reporter: Robert Muir
Assignee: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3897.patch


 just shoving largeish random docs triggers asserts like:
 {noformat}
 [junit] Caused by: java.lang.AssertionError: backPos=4100 vs 
 lastBackTracePos=5120
 [junit]   at 
 org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.backtrace(KuromojiTokenizer.java:907)
 [junit]   at 
 org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.parse(KuromojiTokenizer.java:756)
 [junit]   at 
 org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.incrementToken(KuromojiTokenizer.java:403)
 [junit]   at 
 org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:404)
 {noformat}
 But, you get no seed...
 I'll commit the test case and @Ignore it.




3.6 branching

2012-03-22 Thread Robert Muir
Hello,

I propose for 3.6 that we don't create a release branch but just use
our branch_3x as the release branch. We can 'svn mv' it to
'lucene_solr_3_6' when the release is ready.

Normally we would branch and open up branch_3x as 3.7 for changes, but
from previous discussions we intend to release 4.0 next (and put 3.x
in maintenance mode).

As Hossman noted in his last email: we are doing some JIRA
reorganization etc to get things organized. Also related to this:
because we intend for this to be the last 3.x release, I want to make
sure people have a few more days to get their changes in.

New features are fine, of course bugfixes, tests, and docs, but since
we are trying to get things in shape I only ask a few extra things at
this stage:
* please ensure any new classes have at least one sentence as the class javadocs
* please ensure any new packages have a package.html with at least a
description of what the package is
* please ensure any added files have the apache license header

thoughts? objections?

-- 
lucidimagination.com




Re: 3.6 branching

2012-03-22 Thread Martijn v Groningen

> I propose for 3.6 that we don't create a release branch but just use
> our branch_3x as the release branch. We can 'svn mv' it to
> 'lucene_solr_3_6' when the release is ready.
>
> Normally we would branch and open up branch_3x as 3.7 for changes, but
> from previous discussions we intend to release 4.0 next (and put 3.x
> in maintenance mode).

+1 This is fine with me.


[jira] [Updated] (SOLR-3255) OpenExchangeRates.Org Exchange Rate Provider

2012-03-22 Thread Jan Høydahl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-3255:
--

Attachment: SOLR-3255.patch

Here's the provider implementation with tests. See 
http://wiki.apache.org/solr/CurrencyField for documentation. Highlights:

* Uses open, free exchange rates REST API
* Plugs into CurrencyField in schema.xml
* Can load rates json from any URL or through ResourceLoader
* Configurable refresh of rates, refreshing at most once every 60 min (since 
that's the update rate of the API)

This patch also changes the ExchangeRateProvider interface slightly:
* Instead of listCurrencies() returning FROM,TO pairs (which would be 25,000 
lines for all available pairs for this provider), it takes an argument, so that 
listCurrencies(false) returns a list of supported currencies, while 
listCurrencies(true) returns a list of pairs
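
The two listing modes can be sketched as follows. This is only an illustration of the idea: the class name, the constructor, and the return type are hypothetical, and the real ExchangeRateProvider interface in the patch may look different.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CurrencyListingSketch {
  private final Set<String> codes;

  CurrencyListingSketch(Collection<String> codes) {
    this.codes = new TreeSet<>(codes); // sorted, no duplicates
  }

  /**
   * With asPairs == false, returns the supported currency codes.
   * With asPairs == true, returns every ordered "FROM,TO" pair.
   */
  List<String> listCurrencies(boolean asPairs) {
    List<String> result = new ArrayList<>();
    if (!asPairs) {
      result.addAll(codes);
      return result;
    }
    for (String from : codes) {
      for (String to : codes) {
        if (!from.equals(to)) {
          result.add(from + "," + to);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> input = new ArrayList<>();
    input.add("USD");
    input.add("EUR");
    input.add("NOK");
    CurrencyListingSketch provider = new CurrencyListingSketch(input);
    System.out.println(provider.listCurrencies(false)); // [EUR, NOK, USD]
    System.out.println(provider.listCurrencies(true).size()); // 6
  }
}
```

The pair list grows as n * (n - 1), which is why listing plain codes is the cheaper mode for a provider with many currencies.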

Known limitations/questions:
* The reflection for the providerClass param uses Class.forName() to 
instantiate the provider. But then the solr.MyClass alias does not work. How to 
solve this?
* Is the correct location o.a.s.schema for these providers or should we make a 
new package somewhere else?

 OpenExchangeRates.Org Exchange Rate Provider
 

 Key: SOLR-3255
 URL: https://issues.apache.org/jira/browse/SOLR-3255
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 3.6, 4.0
Reporter: Jan Høydahl
Assignee: Jan Høydahl
  Labels: CurrencyField
 Fix For: 3.6, 4.0

 Attachments: SOLR-3255.patch


 An exchange rate provider for CurrencyField using the freely available feed 
 from http://openexchangerates.org/




Re: 3.6 branching

2012-03-22 Thread Erick Erickson
+1, makes my life easier

On Thu, Mar 22, 2012 at 7:51 AM, Martijn v Groningen
martijn.v.gronin...@gmail.com wrote:
 I propose for 3.6 that we don't create a release branch but just use
 our branch_3x as the release branch. We can 'svn mv' it to
 'lucene_solr_3_6' when the release is ready.

 Normally we would branch and open up branch_3x as 3.7 for changes, but
 from previous discussions we intend to release 4.0 next (and put 3.x
 in maintenance mode).

 +1 This is fine with me.




[jira] [Resolved] (LUCENE-3897) KuromojiTokenizer fails with large docs

2012-03-22 Thread Christian Moen (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Moen resolved LUCENE-3897.


Resolution: Fixed

Thanks a lot, Mike and Robert!

 KuromojiTokenizer fails with large docs
 ---

 Key: LUCENE-3897
 URL: https://issues.apache.org/jira/browse/LUCENE-3897
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/analysis
Reporter: Robert Muir
Assignee: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3897.patch


 just shoving largeish random docs triggers asserts like:
 {noformat}
 [junit] Caused by: java.lang.AssertionError: backPos=4100 vs lastBackTracePos=5120
 [junit]   at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.backtrace(KuromojiTokenizer.java:907)
 [junit]   at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.parse(KuromojiTokenizer.java:756)
 [junit]   at org.apache.lucene.analysis.kuromoji.KuromojiTokenizer.incrementToken(KuromojiTokenizer.java:403)
 [junit]   at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:404)
 {noformat}
 But, you get no seed...
 I'll commit the test case and @Ignore it.




[jira] [Commented] (LUCENE-3897) KuromojiTokenizer fails with large docs

2012-03-22 Thread Christian Moen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235531#comment-13235531
 ] 

Christian Moen commented on LUCENE-3897:


Committed revision 1303744 on {{branch_3x}}.




[jira] [Updated] (SOLR-3255) OpenExchangeRates.Org Exchange Rate Provider

2012-03-22 Thread Jan Høydahl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-3255:
--

Attachment: SOLR-3255.patch

Slightly improved Noggit JSON parsing loop. Removed a few unnecessary imports. 
Fixed order of assertEquals() params.

 OpenExchangeRates.Org Exchange Rate Provider
 

 Key: SOLR-3255
 URL: https://issues.apache.org/jira/browse/SOLR-3255
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 3.6, 4.0
Reporter: Jan Høydahl
Assignee: Jan Høydahl
  Labels: CurrencyField
 Fix For: 3.6, 4.0

 Attachments: SOLR-3255.patch, SOLR-3255.patch


 An exchange rate provider for CurrencyField using the freely available feed 
 from http://openexchangerates.org/




[jira] [Commented] (LUCENE-3897) KuromojiTokenizer fails with large docs

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235540#comment-13235540
 ] 

Robert Muir commented on LUCENE-3897:
-

Thanks guys! The last of the fallout from LUCENE-3894 I think :)

I ran 'ant test -Dtests.nightly=true -Dtests.multiplier=5 -Dtests.iter=10' to 
simulate 10 nightly builds
and (after 2 hours) everything looks ok :)





[jira] [Updated] (LUCENE-3887) 'ant javadocs' should fail if a package is missing a package.html

2012-03-22 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3887:
---

Attachment: LUCENE-3887.patch

Another iteration, this time working I think :)

 'ant javadocs' should fail if a package is missing a package.html
 -

 Key: LUCENE-3887
 URL: https://issues.apache.org/jira/browse/LUCENE-3887
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
 Attachments: LUCENE-3887.patch, LUCENE-3887.patch


 While reviewing the javadocs I noticed many packages are missing a basic 
 package.html.
 For 3.x I committed some package.html files where they were missing (I will 
 port forward to trunk).
 I think all packages should have this... really all public/protected 
 classes/methods/constants,
 but this would be a good step.




[jira] [Commented] (LUCENE-3887) 'ant javadocs' should fail if a package is missing a package.html

2012-03-22 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235543#comment-13235543
 ] 

Uwe Schindler commented on LUCENE-3887:
---

Some unrelated fixes in the patch, otherwise ok for smokeTesting. I would just 
disagree with adding Python requirements to our official ant script...




[jira] [Commented] (LUCENE-3887) 'ant javadocs' should fail if a package is missing a package.html

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235546#comment-13235546
 ] 

Robert Muir commented on LUCENE-3887:
-

Uwe, well, we can discuss integration into the official ant build later?

For now, personally I would like to have an automated check in the smokeTester 
script, which would help me
clean the stuff up rather than manually eyeballing everything. It's a step.




[jira] [Commented] (LUCENE-3887) 'ant javadocs' should fail if a package is missing a package.html

2012-03-22 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235549#comment-13235549
 ] 

Uwe Schindler commented on LUCENE-3887:
---

Did I say anything else?




[jira] [Created] (LUCENE-3902) public classes with no javadocs

2012-03-22 Thread Robert Muir (Created) (JIRA)
public classes with no javadocs
---

 Key: LUCENE-3902
 URL: https://issues.apache.org/jira/browse/LUCENE-3902
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/javadocs
Reporter: Robert Muir


Here is a list of public classes with no javadocs.

I think even some simple javadocs can be valuable for all javadocs classes:
* in various summaries, we don't see an empty summary for what the class does
* easier to work with the source in various IDEs that present this stuff on 
hover, etc
* better documentation for developers to know what all these classes do.

Maybe we don't have time to fix this for 3.x, but it would be great if anybody
has good knowledge of these classes and could commit any useful stuff to the 
javadocs.

Here is the list from Mike's tool on LUCENE-3887

{noformat}
rmuir@beast:~/workspace/lucene-branch3x2/dev-tools/scripts$ python 
checkJavaDocs.py ../../lucene/build/docs/api

Check...

../../lucene/build/docs/api/all/org/tartarus/snowball/package-summary.html
  missing: Among
  missing: TestApp

../../lucene/build/docs/api/all/org/apache/lucene/spatial/tier/package-summary.html
  missing: DistanceHandler.Precision

../../lucene/build/docs/api/all/org/apache/lucene/index/package-summary.html
  missing: MergePolicy.MergeAbortedException

../../lucene/build/docs/api/all/org/apache/lucene/index/pruning/package-summary.html
  missing: CarmelTopKTermPruningPolicy.ByDocComparator
  missing: CarmelUniformTermPruningPolicy.ByDocComparator

../../lucene/build/docs/api/all/org/apache/lucene/util/package-summary.html
  missing: ByteBlockPool.Allocator
  missing: ByteBlockPool.DirectAllocator
  missing: ByteBlockPool.DirectTrackingAllocator
  missing: BytesRefHash.BytesStartArray
  missing: BytesRefHash.DirectBytesStartArray
  missing: BytesRefIterator.EmptyBytesRefIterator
  missing: DoubleBarrelLRUCache.CloneableKey
  missing: English
  missing: OpenBitSetDISI
  missing: PagedBytes.Reader
  missing: StoreClassNameRule
  missing: SystemPropertiesInvariantRule
  missing: UncaughtExceptionsRule.UncaughtExceptionEntry
  missing: UnicodeUtil.UTF16Result
  missing: UnicodeUtil.UTF8Result

../../lucene/build/docs/api/all/org/apache/lucene/queryParser/core/nodes/package-summary.html
  missing: TextableQueryNode
  missing: PathQueryNode.QueryText
  missing: PhraseSlopQueryNode
  missing: ProximityQueryNode.ProximityType
  missing: ModifierQueryNode.Modifier
  missing: ParametricQueryNode.CompareOperator
  missing: ProximityQueryNode.Type

../../lucene/build/docs/api/all/org/apache/lucene/queryParser/core/parser/package-summary.html
  missing: EscapeQuerySyntax.Type

../../lucene/build/docs/api/all/org/apache/lucene/queryParser/standard/builders/package-summary.html
  missing: AnyQueryNodeBuilder

../../lucene/build/docs/api/all/org/apache/lucene/queryParser/standard/config/package-summary.html
  missing: FuzzyConfig
  missing: StandardQueryConfigHandler.ConfigurationKeys
  missing: DefaultOperatorAttribute.Operator
  missing: StandardQueryConfigHandler.Operator

../../lucene/build/docs/api/all/org/apache/lucene/queryParser/standard/parser/package-summary.html
  missing: EscapeQuerySyntaxImpl
  missing: StandardSyntaxParser

../../lucene/build/docs/api/all/org/apache/lucene/queryParser/surround/query/package-summary.html
  missing: DistanceSubQuery
  missing: SimpleTerm.MatchingTermVisitor
  missing: AndQuery
  missing: BasicQueryFactory
  missing: ComposedQuery
  missing: DistanceQuery
  missing: FieldsQuery
  missing: NotQuery
  missing: OrQuery
  missing: SimpleTerm
  missing: SpanNearClauseFactory
  missing: SrndPrefixQuery
  missing: SrndQuery
  missing: SrndTermQuery
  missing: SrndTruncQuery
  missing: TooManyBasicQueries

../../lucene/build/docs/api/all/org/apache/lucene/store/package-summary.html
  missing: FSDirectory.FSIndexOutput
  missing: NativePosixUtil
  missing: NIOFSDirectory.NIOFSIndexInput
  missing: RAMFile
  missing: SimpleFSDirectory.SimpleFSIndexInput
  missing: SimpleFSDirectory.SimpleFSIndexInput.Descriptor
  missing: WindowsDirectory.WindowsIndexInput
  missing: MockDirectoryWrapper.Throttling

../../lucene/build/docs/api/all/org/apache/lucene/xmlparser/package-summary.html
  missing: FilterBuilder
  missing: CorePlusExtensionsParser
  missing: DOMUtils
  missing: FilterBuilderFactory
  missing: QueryBuilderFactory
  missing: ParserException

../../lucene/build/docs/api/all/org/apache/lucene/xmlparser/builders/package-summary.html
  missing: SpanQueryBuilder
  missing: BooleanFilterBuilder
  missing: BooleanQueryBuilder
  missing: BoostingQueryBuilder
  missing: BoostingTermBuilder
  missing: ConstantScoreQueryBuilder
  missing: DuplicateFilterBuilder
  missing: FilteredQueryBuilder
  missing: FuzzyLikeThisQueryBuilder
  missing: LikeThisQueryBuilder
  missing: MatchAllDocsQueryBuilder
  missing: RangeFilterBuilder
  missing: SpanBuilderBase
  missing: 

Re: 3.6 branching

2012-03-22 Thread Jan Høydahl
+1. Keep it simple

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 22. mars 2012, at 12:48, Robert Muir wrote:

 Hello,
 
 I propose for 3.6 that we don't create a release branch but just use
 our branch_3x as the release branch. We can 'svn mv' it to
 'lucene_solr_3_6' when the release is ready.
 
 Normally we would branch and open up branch_3x as 3.7 for changes, but
 from previous discussions we intend to release 4.0 next (and put 3.x
 in maintenance mode).
 
 As Hossman noted in his last email: we are doing some JIRA
 reorganization etc to get things organized. Also related to this:
 because we intend for this to be the last 3.x release, I want to make
 sure people have a few more days to get their changes in.
 
 New features are fine, of course bugfixes, tests, and docs, but since
 we are trying to get things in shape I only ask a few extra things at
 this stage:
 * please ensure any new classes have at least one sentence as the class 
 javadocs
 * please ensure any new packages have a package.html with at least a
 description of what the package is
 * please ensure any added files have the apache license header
 
 thoughts? objections?
 
 -- 
 lucidimagination.com
 



[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235570#comment-13235570
 ] 

Dawid Weiss commented on LUCENE-3867:
-

I confirmed that this packing indeed takes place. Wrote a pseudo-random test 
with lots of classes and fields. Here's an offender on J9 for example 
(Wild_{inheritance-level}_{field-number}):
{noformat}
@24  4 Wild_0_92.fld_0_0_92
@28  4 Wild_0_92.fld_1_0_92
@32  4 Wild_0_92.fld_2_0_92
@36  4 Wild_0_92.fld_3_0_92
@40  4 Wild_0_92.fld_4_0_92
@44  4 Wild_0_92.fld_5_0_92
@48  4 Wild_0_92.fld_6_0_92
@52  4 Wild_2_5.fld_0_2_5
@56  8 Wild_1_85.fld_0_1_85
@64  8 Wild_1_85.fld_1_1_85
@72    sizeOf(Wild_2_5 instance)
{noformat}

HotSpot and JRockit don't seem to do this (at least it didn't fail on the 
example).


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like this: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While at it, I wrote a sizeOf(String) impl, and I wonder how people feel 
 about including such helper methods in RUE, as static, stateless, methods? 
 It's not perfect, there's some room for improvement I'm sure, here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).
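 The difference between the two header computations can be sketched as follows. The constants are assumptions chosen to match the quoted page (8-byte object header, 4-byte int and reference, 8-byte alignment); real values depend on the JVM and its settings, and the class is hypothetical rather than the actual RamUsageEstimator code.

```java
/** Hypothetical constants for illustration; real values depend on JVM, bitness and compressed oops. */
final class ArrayHeaderSketch {
  static final int OBJECT_HEADER = 8; // assumed plain object header size
  static final int INT_BYTES = 4;     // array length field
  static final int OBJECT_REF = 4;    // assumed reference size
  static final int ALIGNMENT = 8;     // assumed object alignment

  /** Header as the issue says it is currently computed (includes an extra reference). */
  static int arrayHeaderWithRef() {
    return OBJECT_HEADER + INT_BYTES + OBJECT_REF;
  }

  /** Header as the quoted page describes it: object header plus a four-byte length. */
  static int arrayHeader() {
    return OBJECT_HEADER + INT_BYTES;
  }

  /** Rounds a size up to the next multiple of ALIGNMENT. */
  static long alignUp(long size) {
    return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
  }

  /** Approximate size of an int[]: header plus 4 bytes per element, aligned. */
  static long sizeOf(int[] arr) {
    return alignUp(arrayHeader() + (long) INT_BYTES * arr.length);
  }
}
```

 Under these assumptions the header is 12 bytes without the reference and 16 with it, so the extra reference overestimates every array by 4 bytes before alignment.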




[jira] [Commented] (LUCENE-3887) 'ant javadocs' should fail if a package is missing a package.html

2012-03-22 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235574#comment-13235574
 ] 

Michael McCandless commented on LUCENE-3887:


You can also just run the javadoc checker directly in a source checkout, like 
this:
{noformat}
  python -u dev-tools/scripts/checkJavaDocs.py /lucene/3x/lucene/build
{noformat}

You have to run 'ant javadocs' first yourself.

Right now it only checks for missing sentences in the package-summary.html... 
I'll see if I can fix it to also detect missing package.html's...
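
The missing package.html case could also be checked directly against a source tree. The sketch below is a different approach from the Python script, which reads the generated HTML: it assumes a package is any directory that directly contains .java files, and the class name is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical checker: lists source directories that hold .java files but no package.html. */
final class PackageHtmlCheckSketch {
  static List<Path> packagesMissingPackageHtml(Path sourceRoot) {
    List<Path> missing = new ArrayList<>();
    try (Stream<Path> walk = Files.walk(sourceRoot)) {
      // Visit every directory under the root in a stable, sorted order.
      List<Path> dirs = walk.filter(Files::isDirectory).sorted().collect(Collectors.toList());
      for (Path dir : dirs) {
        if (containsJavaFile(dir) && !Files.exists(dir.resolve("package.html"))) {
          missing.add(dir);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return missing;
  }

  /** True if the directory directly contains at least one .java file. */
  private static boolean containsJavaFile(Path dir) throws IOException {
    try (DirectoryStream<Path> javaFiles = Files.newDirectoryStream(dir, "*.java")) {
      return javaFiles.iterator().hasNext();
    }
  }
}
```

A build step could fail when the returned list is not empty, which is the behavior the issue title asks for.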

Here's what it reports on 3.x right now:
{noformat}
/lucene/3x/lucene/build/docs/api/contrib-highlighter/org/apache/lucene/search/highlight/package-summary.html
  missing: TokenStreamFromTermPositionVector

/lucene/3x/lucene/build/docs/api/contrib-highlighter/org/apache/lucene/search/vectorhighlight/package-summary.html
  missing: BoundaryScanner
  missing: BaseFragmentsBuilder
  missing: FieldFragList.WeightedFragInfo
  missing: FieldFragList.WeightedFragInfo.SubInfo
  missing: FieldPhraseList.WeightedPhraseInfo
  missing: FieldPhraseList.WeightedPhraseInfo.Toffs
  missing: FieldQuery.QueryPhraseMap
  missing: FieldTermStack.TermInfo
  missing: ScoreOrderFragmentsBuilder.ScoreComparator
  missing: SimpleBoundaryScanner

/lucene/3x/lucene/build/docs/api/contrib-spatial/org/apache/lucene/spatial/tier/package-summary.html
  missing: DistanceHandler.Precision

/lucene/3x/lucene/build/docs/api/contrib-spellchecker/org/apache/lucene/search/suggest/package-summary.html
  missing: Lookup.LookupPriorityQueue

/lucene/3x/lucene/build/docs/api/contrib-spellchecker/org/apache/lucene/search/suggest/jaspell/package-summary.html
  missing: JaspellLookup

/lucene/3x/lucene/build/docs/api/contrib-spellchecker/org/apache/lucene/search/suggest/tst/package-summary.html
  missing: TSTAutocomplete
  missing: TSTLookup

/lucene/3x/lucene/build/docs/api/contrib-pruning/org/apache/lucene/index/pruning/package-summary.html
  missing: CarmelTopKTermPruningPolicy.ByDocComparator
  missing: CarmelUniformTermPruningPolicy.ByDocComparator

/lucene/3x/lucene/build/docs/api/contrib-facet/org/apache/lucene/facet/taxonomy/writercache/lru/package-summary.html
  missing: LruTaxonomyWriterCache.LRUType

/lucene/3x/lucene/build/docs/api/contrib-facet/org/apache/lucene/facet/index/package-summary.html
  missing: FacetsPayloadProcessorProvider.FacetsDirPayloadProcessor

/lucene/3x/lucene/build/docs/api/core/org/apache/lucene/store/package-summary.html
  missing: FSDirectory.FSIndexOutput
  missing: NIOFSDirectory.NIOFSIndexInput
  missing: RAMFile
  missing: SimpleFSDirectory.SimpleFSIndexInput
  missing: SimpleFSDirectory.SimpleFSIndexInput.Descriptor

/lucene/3x/lucene/build/docs/api/core/org/apache/lucene/index/package-summary.html
  missing: MergePolicy.MergeAbortedException

/lucene/3x/lucene/build/docs/api/core/org/apache/lucene/search/package-summary.html
  missing: FieldCache.CreationPlaceholder
  missing: FieldComparator.NumericComparator<T extends Number>
  missing: FieldValueHitQueue.Entry
  missing: QueryTermVector
  missing: ScoringRewrite<Q extends Query>
  missing: SpanFilterResult.PositionInfo
  missing: SpanFilterResult.StartEnd
  missing: TimeLimitingCollector.TimerThread

/lucene/3x/lucene/build/docs/api/core/org/apache/lucene/util/package-summary.html
  missing: ByteBlockPool.Allocator
  missing: ByteBlockPool.DirectAllocator
  missing: ByteBlockPool.DirectTrackingAllocator
  missing: BytesRefHash.BytesStartArray
  missing: BytesRefHash.DirectBytesStartArray
  missing: BytesRefIterator.EmptyBytesRefIterator
  missing: DoubleBarrelLRUCache.CloneableKey
  missing: OpenBitSetDISI
  missing: PagedBytes.Reader
  missing: UnicodeUtil.UTF16Result
  missing: UnicodeUtil.UTF8Result

/lucene/3x/lucene/build/docs/api/contrib-analyzers/org/tartarus/snowball/package-summary.html
  missing: Among
  missing: TestApp

/lucene/3x/lucene/build/docs/api/contrib-xml-query-parser/org/apache/lucene/xmlparser/package-summary.html
  missing: FilterBuilder
  missing: CorePlusExtensionsParser
  missing: DOMUtils
  missing: FilterBuilderFactory
  missing: QueryBuilderFactory
  missing: ParserException

/lucene/3x/lucene/build/docs/api/contrib-xml-query-parser/org/apache/lucene/xmlparser/builders/package-summary.html
  missing: SpanQueryBuilder
  missing: BooleanFilterBuilder
  missing: BooleanQueryBuilder
  missing: BoostingQueryBuilder
  missing: BoostingTermBuilder
  missing: ConstantScoreQueryBuilder
  missing: DuplicateFilterBuilder
  missing: FilteredQueryBuilder
  missing: FuzzyLikeThisQueryBuilder
  missing: LikeThisQueryBuilder
  missing: MatchAllDocsQueryBuilder
  missing: RangeFilterBuilder
  missing: SpanBuilderBase
  missing: SpanFirstBuilder
  missing: SpanNearBuilder
  missing: SpanNotBuilder
  missing: SpanOrBuilder
  missing: SpanOrTermsBuilder
  missing: SpanQueryBuilderFactory
  missing: SpanTermBuilder
  

[jira] [Created] (SOLR-3265) TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance

2012-03-22 Thread Erick Erickson (Created) (JIRA)
TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance
-

 Key: SOLR-3265
 URL: https://issues.apache.org/jira/browse/SOLR-3265
 Project: Solr
  Issue Type: Test
Affects Versions: 3.6, 4.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor


When running ant test from the command line in 3.x, if you have a Solr server 
running then TestSolrEntityProcessorEndToEnd fails since it uses the default 
port (stack trace with address already in use). This should use some other 
port, especially as 3.x ant test is taking 50+ minutes and I often open up a 
server to look at something else.

In 4.0, some of the cloud tests also use 8983 as a port. Should these be 
changed too?

And just to make my life *especially* interesting, at least one test puts the 
string 8983 in a document, which doesn't have to be changed <G>...

Of course you can start your local server on a different port, but this seems 
trappy.
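
One common way for a test to avoid a fixed port is to ask the OS for a free one. A minimal sketch, using only the JDK; the class name is hypothetical, and how the test's server would be told to use the port is not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper: asks the OS for a currently unused TCP port. */
final class FreePortSketch {
  static int findFreePort() {
    // Binding to port 0 lets the OS pick any free port.
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The socket is closed before the port is reused, so another process could grab it in between. Letting the test server itself bind to port 0 and then reading back the bound port would close that gap.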




[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235577#comment-13235577
 ] 

Uwe Schindler commented on LUCENE-3867:
---

Thanks, in that case shallowSizeOf(Wild_2_5.class) would incorrectly return 56 
because of the short-circuit - so let's fix this.
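
To illustrate why the short-circuit breaks here, the sketch below replays the field layout from the J9 dump Dawid posted. It assumes the third column of each entry is the inheritance level of the declaring class (with Wild_2_5 as the most-derived class) and an 8-byte alignment; the class is hypothetical and is not the RamUsageEstimator code.

```java
/** Hypothetical replay of the posted J9 layout: {offset, size, inheritance level}. */
final class ShallowSizeSketch {
  static final int ALIGNMENT = 8; // assumed object alignment in bytes

  static final int[][] FIELDS = {
    {24, 4, 0}, {28, 4, 0}, {32, 4, 0}, {36, 4, 0}, {40, 4, 0}, {44, 4, 0}, {48, 4, 0},
    {52, 4, 2}, // subclass field packed in before the level-1 fields
    {56, 8, 1}, {64, 8, 1},
  };

  /** Rounds a size up to the next multiple of ALIGNMENT. */
  static long alignUp(long size) {
    return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
  }

  /** Size derived only from fields declared at minLevel or deeper in the hierarchy. */
  static long sizeFromLevel(int minLevel) {
    long end = 0;
    for (int[] field : FIELDS) {
      if (field[2] >= minLevel) {
        end = Math.max(end, field[0] + field[1]);
      }
    }
    return alignUp(end);
  }
}
```

Looking only at the most-derived class, sizeFromLevel(2), gives 56, while walking the whole hierarchy, sizeFromLevel(0), gives 72, which matches the reported sizeOf.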

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel 
 about including such helper methods in RUE, as static, stateless, methods? 
 It's not perfect, there's some room for improvement I'm sure, here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).
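
 The array helpers proposed above could look roughly like the following. This
 is only a sketch, not the patch attached to the issue: the class name, the
 alignObjectSize helper, the 8-byte object header and the 8-byte alignment are
 all assumptions, chosen to match the 12-byte array header described in the
 quoted page (object header plus a four-byte length, with no object
 reference added).

```java
/** Hypothetical sketch; the constants are assumptions, not RamUsageEstimator's real values. */
final class ArraySizeSketch {
    static final int NUM_BYTES_OBJECT_HEADER = 8; // assumed object header size
    static final int NUM_BYTES_INT = 4;
    static final int NUM_BYTES_LONG = 8;
    // Array header = object header + int length, without NUM_BYTES_OBJECT_REF.
    static final int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT;
    static final int ALIGNMENT = 8; // assumed object alignment in bytes

    /** Rounds a raw size up to the next multiple of ALIGNMENT. */
    static long alignObjectSize(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    static long sizeOf(byte[] arr) {
        return alignObjectSize(NUM_BYTES_ARRAY_HEADER + (long) arr.length);
    }

    static long sizeOf(int[] arr) {
        return alignObjectSize(NUM_BYTES_ARRAY_HEADER + (long) NUM_BYTES_INT * arr.length);
    }

    static long sizeOf(long[] arr) {
        return alignObjectSize(NUM_BYTES_ARRAY_HEADER + (long) NUM_BYTES_LONG * arr.length);
    }

    public static void main(String[] args) {
        System.out.println(sizeOf(new int[2]));
    }
}
```

 Rounding up to the alignment replaces the fixed "+ 6" safety margin used in
 the sizeOf(String) draft; whether that is accurate depends on the JVM.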




Re: 3.6 branching

2012-03-22 Thread Tommaso Teofili
+1, much easier.
Tommaso

2012/3/22 Robert Muir rcm...@gmail.com

 Hello,

 I propose for 3.6 that we don't create a release branch but just use
 our branch_3x as the release branch. We can 'svn mv' it to
 'lucene_solr_3_6' when the release is ready.

 Normally we would branch and open up branch_3x as 3.7 for changes, but
 from previous discussions we intend to release 4.0 next (and put 3.x
 in maintenance mode).

 As Hossman noted in his last email: we are doing some JIRA
 reorganization etc to get things organized. Also related to this:
 because we intend for this to be the last 3.x release, I want to make
 sure people have a few more days to get their changes in.

 New features are fine, of course bugfixes, tests, and docs, but since
 we are trying to get things in shape I only ask a few extra things at
 this stage:
 * please ensure any new classes have at least one sentence as the class
 javadocs
 * please ensure any new packages have a package.html with at least a
 description of what the package is
 * please ensure any added files have the apache license header

 thoughts? objections?

 --
 lucidimagination.com

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[jira] [Commented] (SOLR-3265) TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance

2012-03-22 Thread Martijn van Groningen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235580#comment-13235580
 ] 

Martijn van Groningen commented on SOLR-3265:
-

This is trappy! This should be changed.

 TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance
 -

 Key: SOLR-3265
 URL: https://issues.apache.org/jira/browse/SOLR-3265
 Project: Solr
  Issue Type: Test
Affects Versions: 3.6, 4.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor

 When running ant test from the command line in 3.x, if you have a Solr 
 server running then TestSolrEntityProcessorEndToEnd fails since it uses the 
 default port (stack trace with address already in use). This should use 
 some other port, especially as 3.x ant test is taking 50+ minutes and I 
 often open up a server to look at something else.
 In 4.0, some of the cloud tests also use 8983 as a port. Should these be 
 changed too?
 And just to make my life *especially* interesting, at least one test puts the 
 string 8983 in a document, which doesn't have to be changed G...
 Of course one can start your local server on a different port, but this seems 
 trappy.
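
 One way a test could avoid the fixed default port is to ask the operating
 system for a free one. The sketch below is an illustration only, not what the
 Solr test framework does: the class and method names are hypothetical. It
 relies on the documented behaviour of java.net.ServerSocket that port 0
 means "pick any free port".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper; not part of Solr's test framework. */
final class FreePortSketch {
    /** Asks the OS for a currently unused TCP port by binding to port 0. */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Free port: " + findFreePort());
    }
}
```

 There is a small race here: another process may grab the port between the
 socket being closed and the test server binding to it, so letting the test
 server itself bind to port 0 is safer where that is possible.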




[jira] [Commented] (SOLR-3265) TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235581#comment-13235581
 ] 

Robert Muir commented on SOLR-3265:
---

{quote}
especially as 3.x ant test is taking 50+ minutes
{quote}

Erick do you have a 386?

 TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance
 -

 Key: SOLR-3265
 URL: https://issues.apache.org/jira/browse/SOLR-3265
 Project: Solr
  Issue Type: Test
Affects Versions: 3.6, 4.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor

 When running ant test from the command line in 3.x, if you have a Solr 
 server running then TestSolrEntityProcessorEndToEnd fails since it uses the 
 default port (stack trace with address already in use). This should use 
 some other port, especially as 3.x ant test is taking 50+ minutes and I 
 often open up a server to look at something else.
 In 4.0, some of the cloud tests also use 8983 as a port. Should these be 
 changed too?
 And just to make my life *especially* interesting, at least one test puts the 
 string 8983 in a document, which doesn't have to be changed G...
 Of course one can start your local server on a different port, but this seems 
 trappy.




Re: [jira] [Commented] (SOLR-3265) TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance

2012-03-22 Thread Erick Erickson
No, I have an OS X about 3 years old. Sometimes it only feels like
a 386 G...

On Thu, Mar 22, 2012 at 9:56 AM, Robert Muir (Commented) (JIRA)
j...@apache.org wrote:

    [ 
 https://issues.apache.org/jira/browse/SOLR-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235581#comment-13235581
  ]

 Robert Muir commented on SOLR-3265:
 ---

 {quote}
 especially as 3.x ant test is taking 50+ minutes
 {quote}

 Erick do you have a 386?

 TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance
 -

                 Key: SOLR-3265
                 URL: https://issues.apache.org/jira/browse/SOLR-3265
             Project: Solr
          Issue Type: Test
    Affects Versions: 3.6, 4.0
            Reporter: Erick Erickson
            Assignee: Erick Erickson
            Priority: Minor

 When running ant test from the command line in 3.x, if you have a Solr 
 server running then TestSolrEntityProcessorEndToEnd fails since it uses the 
 default port (stack trace with address already in use). This should use 
 some other port, especially as 3.x ant test is taking 50+ minutes and I 
 often open up a server to look at something else.
 In 4.0, some of the cloud tests also use 8983 as a port. Should these be 
 changed too?
 And just to make my life *especially* interesting, at least one test puts 
 the string 8983 in a document, which doesn't have to be changed G...
 Of course one can start your local server on a different port, but this 
 seems trappy.

 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA 
 administrators: 
 https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
 For more information on JIRA, see: http://www.atlassian.com/software/jira



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235588#comment-13235588
 ] 

Dawid Weiss commented on LUCENE-3867:
-

Yep, that assumption was wrong -- indeed:
{noformat}
WildClasses.Wild_2_5 wc = new WildClasses.Wild_2_5();
wc.fld_6_0_92 = 0x1122;
wc.fld_0_2_5 = Float.intBitsToFloat(0xa1a2a3a4);
wc.fld_0_1_85 = Double.longBitsToDouble(0xb1b2b3b4b5b6b7L);
System.out.println(ExpMemoryDumper.dumpObjectMem(wc));
{noformat}
results in:
{noformat}
0x0000 b0 3d 6f 01 00 00 00 00 0e 80 79 01 00 00 00 00
0x0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0030 22 11 00 00 a4 a3 a2 a1 b7 b6 b5 b4 b3 b2 b1 00
0x0040 00 00 00 00 00 00 00 00
{noformat}
And you can see they are reordered and longs are aligned.

I'll provide a cumulative patch of changes in the evening, there's one more 
thing I wanted to add (cache of fields) because this affects processing speed.
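
The "cache of fields" mentioned above could be sketched as follows. This is a
guess at the idea, not the code from the patch: the class and method names are
hypothetical, and it assumes Java 8 or later for computeIfAbsent. Reflection
over the class hierarchy happens once per class, and later lookups reuse the
stored list.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of a per-class cache of instance fields. */
final class FieldCacheSketch {
    private static final ConcurrentMap<Class<?>, List<Field>> CACHE = new ConcurrentHashMap<>();

    /** Small example class used only to demonstrate the cache. */
    static class Sample {
        static int ignoredStatic;
        int a;
        long b;
    }

    /** Returns the non-static fields of clazz and its superclasses, cached per class. */
    static List<Field> instanceFields(Class<?> clazz) {
        return CACHE.computeIfAbsent(clazz, FieldCacheSketch::collect);
    }

    private static List<Field> collect(Class<?> clazz) {
        List<Field> fields = new ArrayList<>();
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    fields.add(f); // static fields do not occupy space in the instance
                }
            }
        }
        return Collections.unmodifiableList(fields);
    }

    public static void main(String[] args) {
        System.out.println(instanceFields(Sample.class));
    }
}
```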

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel 
 about including such helper methods in RUE, as static, stateless, methods? 
 It's not perfect, there's some room for improvement I'm sure, here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).




Re: [jira] [Commented] (SOLR-3265) TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance

2012-03-22 Thread Martijn v Groningen
:-) If I run the tests on my personal macbook it also takes a very very
very long time to complete. This macbook is 5 years old... Luckily I do
have another faster machine.

On 22 March 2012 15:00, Erick Erickson erickerick...@gmail.com wrote:

 No, I have an OS X about 3 years old. Sometimes it only feels like
 a 386 G...

 On Thu, Mar 22, 2012 at 9:56 AM, Robert Muir (Commented) (JIRA)
 j...@apache.org wrote:
 
 [
 https://issues.apache.org/jira/browse/SOLR-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235581#comment-13235581]
 
  Robert Muir commented on SOLR-3265:
  ---
 
  {quote}
  especially as 3.x ant test is taking 50+ minutes
  {quote}
 
  Erick do you have a 386?
 
  TestSolrEntityProcessorEndToEnd fails if you have a running Solr
 instance
 
 -
 
  Key: SOLR-3265
  URL: https://issues.apache.org/jira/browse/SOLR-3265
  Project: Solr
   Issue Type: Test
 Affects Versions: 3.6, 4.0
 Reporter: Erick Erickson
 Assignee: Erick Erickson
 Priority: Minor
 
  When running ant test from the command line in 3.x, if you have a
 Solr server running then TestSolrEntityProcessorEndToEnd fails since it
 uses the default port (stack trace with address already in use). This
 should use some other port, especially as 3.x ant test is taking 50+
 minutes and I often open up a server to look at something else.
  In 4.0, some of the cloud tests also use 8983 as a port. Should these
 be changed too?
  And just to make my life *especially* interesting, at least one test
 puts the string 8983 in a document, which doesn't have to be changed
 G...
  Of course one can start your local server on a different port, but this
 seems trappy.
 
  --
  This message is automatically generated by JIRA.
  If you think it was sent incorrectly, please contact your JIRA
 administrators:
 https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
  For more information on JIRA, see:
 http://www.atlassian.com/software/jira
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
Met vriendelijke groet,

Martijn van Groningen


[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235592#comment-13235592
 ] 

Robert Muir commented on LUCENE-3847:
-

Strangely, I trip the timezone issue when running any Solr tests from Eclipse, 
but not Lucene tests?

E.g. if I run TestDemo from Lucene it's fine, but if I run TestRussianFilter 
(org.apache.solr.analysis) then I hit:

{noformat}
java.lang.AssertionError: System properties invariant violated.
Different values:
  [old]user.timezone=
  [new]user.timezone=America/New_York

at 
org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:46)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at 
org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at 
org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
{noformat}



 LuceneTestCase should check for modifications on System properties
 --

 Key: LUCENE-3847
 URL: https://issues.apache.org/jira/browse/LUCENE-3847
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3847.patch


 - fail the test if changes have been detected.
 - revert the state of system properties before the suite.
 - cleanup after the suite.
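
 The three points above can be sketched in plain Java as below. This is not
 the SystemPropertiesInvariantRule from the patch, only a minimal illustration
 of the same idea; the class and method names are hypothetical. It snapshots
 the system properties, runs the test body, fails if anything changed, and
 restores the snapshot in either case.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical sketch of a system-properties invariant check. */
final class SystemPropertiesGuardSketch {
    /** Copies all string-valued system properties into a sorted map. */
    static Map<String, String> snapshot() {
        Map<String, String> copy = new TreeMap<>();
        Properties props = System.getProperties();
        for (String name : props.stringPropertyNames()) {
            String value = props.getProperty(name);
            if (value != null) {
                copy.put(name, value);
            }
        }
        return copy;
    }

    /** Removes properties added since the snapshot and resets the rest. */
    static void restore(Map<String, String> before) {
        for (String name : System.getProperties().stringPropertyNames()) {
            if (!before.containsKey(name)) {
                System.clearProperty(name);
            }
        }
        for (Map.Entry<String, String> e : before.entrySet()) {
            System.setProperty(e.getKey(), e.getValue());
        }
    }

    /** Runs the body, failing if it leaves system properties modified. */
    static void runChecked(Runnable body) {
        Map<String, String> before = snapshot();
        try {
            body.run();
            if (!before.equals(snapshot())) {
                throw new AssertionError("System properties invariant violated.");
            }
        } finally {
            restore(before); // clean up whether or not the check failed
        }
    }
}
```

 A property that the JVM sets lazily as a side effect of a test would also be
 flagged by a check like this.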




[jira] [Updated] (LUCENE-3883) Analysis for Irish

2012-03-22 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3883:


Attachment: LUCENE-3883.patch

Same patch but with the solr pieces too (factory/test for the lowercasefilter, 
text_ga fieldtype, resources synced, etc).


 Analysis for Irish
 --

 Key: LUCENE-3883
 URL: https://issues.apache.org/jira/browse/LUCENE-3883
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Jim Regan
Assignee: Robert Muir
Priority: Trivial
  Labels: analysis, newbie
 Attachments: LUCENE-3883.patch, LUCENE-3883.patch, LUCENE-3883.patch, 
 irish.sbl


 Adds analysis for Irish.
 The stemmer is generated from a snowball stemmer. I've sent it to Martin 
 Porter, who says it will be added during the week.




[jira] [Commented] (SOLR-3260) Improve exception handling / logging for ScriptTransformer.init()

2012-03-22 Thread Steven Rowe (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235600#comment-13235600
 ] 

Steven Rowe commented on SOLR-3260:
---

James, the trunk Maven build is still unhappy:

{noformat}
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/433/

1 tests failed.
FAILED:  org.apache.solr.handler.dataimport.TestScriptTransformer.testOneparam

Error Message:
Cannot load Script Engine for language: JavaScript

Stack Trace:
org.apache.solr.handler.dataimport.DataImportHandlerException: Cannot load 
Script Engine for language: JavaScript
at 
org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:76)
{noformat}

 Improve exception handling / logging for ScriptTransformer.init()
 -

 Key: SOLR-3260
 URL: https://issues.apache.org/jira/browse/SOLR-3260
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.5, 4.0
Reporter: James Dyer
Assignee: James Dyer
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: SOLR-3260.patch


 This came up on the user-list.  ScriptTransformer logs the same "need a >=1.6 
 jre" message for several problems, making debugging difficult for users.
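
 The kind of separation being asked for can be illustrated with the standard
 javax.script API. The sketch below is not the SOLR-3260 patch: the class
 name, method name and exception types are assumptions. It only shows that a
 missing language setting and an unavailable engine can be reported with
 different messages, relying on ScriptEngineManager.getEngineByName returning
 null when no engine matches.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

/** Hypothetical sketch; not the actual ScriptTransformer code. */
final class ScriptEngineLoaderSketch {
    /** Looks up a script engine and reports each failure cause separately. */
    static ScriptEngine load(ScriptEngineManager manager, String language) {
        if (language == null || language.trim().isEmpty()) {
            throw new IllegalArgumentException("No script language configured");
        }
        ScriptEngine engine = manager.getEngineByName(language);
        if (engine == null) {
            // getEngineByName returns null when no registered engine matches the name.
            throw new IllegalStateException(
                "Cannot load Script Engine for language: " + language);
        }
        return engine;
    }
}
```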




[jira] [Commented] (SOLR-3260) Improve exception handling / logging for ScriptTransformer.init()

2012-03-22 Thread James Dyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235609#comment-13235609
 ] 

James Dyer commented on SOLR-3260:
--

I missed one.  Sorry about that.  Should be fixed now.

 Improve exception handling / logging for ScriptTransformer.init()
 -

 Key: SOLR-3260
 URL: https://issues.apache.org/jira/browse/SOLR-3260
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.5, 4.0
Reporter: James Dyer
Assignee: James Dyer
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: SOLR-3260.patch


 This came up on the user-list.  ScriptTransformer logs the same "need a >=1.6 
 jre" message for several problems, making debugging difficult for users.




Re: 3.6 branching

2012-03-22 Thread Christian Moen
Robert,

I think this is a very good idea.  +1.


Christian
http://atilika.com

On Mar 22, 2012, at 8:48 PM, Robert Muir wrote:

 Hello,
 
 I propose for 3.6 that we don't create a release branch but just use
 our branch_3x as the release branch. We can 'svn mv' it to
 'lucene_solr_3_6' when the release is ready.
 
 Normally we would branch and open up branch_3x as 3.7 for changes, but
 from previous discussions we intend to release 4.0 next (and put 3.x
 in maintenance mode).
 
 As Hossman noted in his last email: we are doing some JIRA
 reorganization etc to get things organized. Also related to this:
 because we intend for this to be the last 3.x release, I want to make
 sure people have a few more days to get their changes in.
 
 New features are fine, of course bugfixes, tests, and docs, but since
 we are trying to get things in shape I only ask a few extra things at
 this stage:
 * please ensure any new classes have at least one sentence as the class 
 javadocs
 * please ensure any new packages have a package.html with at least a
 description of what the package is
 * please ensure any added files have the apache license header
 
 thoughts? objections?
 
 -- 
 lucidimagination.com
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 





RE: 3.6 branching

2012-03-22 Thread Uwe Schindler
Yeah, renaming the branch after release is a good idea. We should use the 
current 3.x branch to work on the release and there should be no 3.7 anymore. 
Technically: branching 3.6 and deleting the branch_3x is not different in the 
sense of SVN to renaming (a rename is an atomic copy/add + delete).

If we have major bugs in 3.6, we can still release 3.6.1, but this would be 
simply a new TAG in the 3.6 branch.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Robert Muir [mailto:rcm...@gmail.com]
 Sent: Thursday, March 22, 2012 12:48 PM
 To: dev@lucene.apache.org
 Subject: 3.6 branching
 
 Hello,
 
 I propose for 3.6 that we don't create a release branch but just use our
 branch_3x as the release branch. We can 'svn mv' it to 'lucene_solr_3_6' when
 the release is ready.
 
 Normally we would branch and open up branch_3x as 3.7 for changes, but from
 previous discussions we intend to release 4.0 next (and put 3.x in maintenance
 mode).
 
 As Hossman noted in his last email: we are doing some JIRA reorganization etc
 to get things organized. Also related to this:
 because we intend for this to be the last 3.x release, I want to make sure
 people have a few more days to get their changes in.
 
 New features are fine, of course bugfixes, tests, and docs, but since we are
 trying to get things in shape I only ask a few extra things at this stage:
 * please ensure any new classes have at least one sentence as the class
 javadocs
 * please ensure any new packages have a package.html with at least a
 description of what the package is
 * please ensure any added files have the apache license header
 
 thoughts? objections?
 
 --
 lucidimagination.com
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org





[jira] [Created] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Robert Muir (Created) (JIRA)
javadocs very very ugly if you generate with java7
--

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir


Java7 changes its javadocs to look much nicer, but this involves different CSS 
styles.

Lucene overrides the CSS with stylesheet+prettify.css which is a combination of 
java5/6 stylesheet + google prettify:
but there are problems because java7 has totally different styles.

So if you generate javadocs with java7, it's like you have no stylesheet at all.

A solution might be to make stylesheet7+prettify.css and conditionalize a 
property in ant based on java version.
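
The selection logic behind such a conditional property is small. The issue
proposes doing it in Ant; the sketch below only illustrates the decision in
Java, with hypothetical class and method names. It assumes the
java.specification.version format ("1.6", "1.7", or a plain major number on
later releases) and uses the stylesheet file names mentioned above.

```java
/** Hypothetical sketch of choosing a javadoc stylesheet by Java version. */
final class JavadocStylesheetSketch {
    /** Parses "1.6" as 6, "1.7" as 7, and "11" as 11. */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Legacy versions are written "1.x"; the real major number is the second part.
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    /** Java 7 and later get the (proposed) java7-specific stylesheet. */
    static String stylesheetFor(String specVersion) {
        return majorVersion(specVersion) >= 7
            ? "stylesheet7+prettify.css"
            : "stylesheet+prettify.css";
    }

    public static void main(String[] args) {
        System.out.println(stylesheetFor(System.getProperty("java.specification.version")));
    }
}
```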




[jira] [Commented] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235628#comment-13235628
 ] 

Robert Muir commented on LUCENE-3903:
-

I really think we should fix this for 3.6: it's not just that it's ugly, it 
actually looks broken.

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir

 Java7 changes its javadocs to look much nicer, but this involves different 
 CSS styles.
 Lucene overrides the CSS with stylesheet+prettify.css which is a combination 
 of java5/6 stylesheet + google prettify:
 but there are problems because java7 has totally different styles.
 So if you generate javadocs with java7, its like you have no stylesheet at 
 all.
 A solution might be to make stylesheet7+prettify.css and conditionalize a 
 property in ant based on java version.




RE: svn commit: r1303828 - in /lucene/dev/trunk: lucene/ modules/analysis/ modules/benchmark/ modules/facet/ modules/grouping/ modules/join/ modules/queries/ modules/queryparser/ modules/suggest/ solr

2012-03-22 Thread Uwe Schindler
Oh it's already 2012?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: rm...@apache.org [mailto:rm...@apache.org]
 Sent: Thursday, March 22, 2012 4:21 PM
 To: comm...@lucene.apache.org
 Subject: svn commit: r1303828 - in /lucene/dev/trunk: lucene/
 modules/analysis/ modules/benchmark/ modules/facet/ modules/grouping/
 modules/join/ modules/queries/ modules/queryparser/ modules/suggest/ solr/
 
 Author: rmuir
 Date: Thu Mar 22 15:21:17 2012
 New Revision: 1303828
 
 URL: http://svn.apache.org/viewvc?rev=1303828&view=rev
 Log:
 happy new year
 
 Modified:
 lucene/dev/trunk/lucene/NOTICE.txt
 lucene/dev/trunk/modules/analysis/NOTICE.txt
 lucene/dev/trunk/modules/benchmark/NOTICE.txt
 lucene/dev/trunk/modules/facet/NOTICE.txt
 lucene/dev/trunk/modules/grouping/NOTICE.txt
 lucene/dev/trunk/modules/join/NOTICE.txt
 lucene/dev/trunk/modules/queries/NOTICE.txt
 lucene/dev/trunk/modules/queryparser/NOTICE.txt
 lucene/dev/trunk/modules/suggest/NOTICE.txt
 lucene/dev/trunk/solr/NOTICE.txt
 
 Modified: lucene/dev/trunk/lucene/NOTICE.txt
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/NOTICE.txt?rev=1303828&r1=1303827&r2=1303828&view=diff
 
 ==
 --- lucene/dev/trunk/lucene/NOTICE.txt (original)
 +++ lucene/dev/trunk/lucene/NOTICE.txt Thu Mar 22 15:21:17 2012
 @@ -1,5 +1,5 @@
  Apache Lucene
 -Copyright 2011 The Apache Software Foundation
 +Copyright 2012 The Apache Software Foundation
 
  This product includes software developed by  The Apache Software Foundation
 (http://www.apache.org/).
 
 Modified: lucene/dev/trunk/modules/analysis/NOTICE.txt
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/analysis/NOTICE.txt?rev=1303828&r1=1303827&r2=1303828&view=diff
 
 ==
 --- lucene/dev/trunk/modules/analysis/NOTICE.txt (original)
 +++ lucene/dev/trunk/modules/analysis/NOTICE.txt Thu Mar 22 15:21:17 2012
 @@ -1,5 +1,5 @@
  Apache Lucene
 -Copyright 2011 The Apache Software Foundation
 +Copyright 2012 The Apache Software Foundation
 
  This product includes software developed by  The Apache Software Foundation
 (http://www.apache.org/).
 
 Modified: lucene/dev/trunk/modules/benchmark/NOTICE.txt
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/benchmark/NOTICE.txt?rev=1303828&r1=1303827&r2=1303828&view=diff
 
 ==
 --- lucene/dev/trunk/modules/benchmark/NOTICE.txt (original)
 +++ lucene/dev/trunk/modules/benchmark/NOTICE.txt Thu Mar 22 15:21:17 2012
 @@ -1,5 +1,5 @@
  Apache Lucene Benchmark
 -Copyright 2011 The Apache Software Foundation
 +Copyright 2012 The Apache Software Foundation
 
  This product includes software developed by  The Apache Software Foundation
 (http://www.apache.org/).
 
 Modified: lucene/dev/trunk/modules/facet/NOTICE.txt
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/NOTICE.txt?rev=1303828&r1=1303827&r2=1303828&view=diff
 
 ==
 --- lucene/dev/trunk/modules/facet/NOTICE.txt (original)
 +++ lucene/dev/trunk/modules/facet/NOTICE.txt Thu Mar 22 15:21:17 2012
 @@ -1,5 +1,5 @@
  Apache Lucene Facets
 -Copyright 2011 The Apache Software Foundation
 +Copyright 2012 The Apache Software Foundation
 
  This product includes software developed by  The Apache Software Foundation
 (http://www.apache.org/).
 
 Modified: lucene/dev/trunk/modules/grouping/NOTICE.txt
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/grouping/NOTICE.txt?rev=1303828&r1=1303827&r2=1303828&view=diff
 
 ==
 --- lucene/dev/trunk/modules/grouping/NOTICE.txt (original)
 +++ lucene/dev/trunk/modules/grouping/NOTICE.txt Thu Mar 22 15:21:17 2012
 @@ -1,5 +1,5 @@
  Apache Lucene Grouping
 -Copyright 2011 The Apache Software Foundation
 +Copyright 2012 The Apache Software Foundation
 
  This product includes software developed by  The Apache Software Foundation
 (http://www.apache.org/).
 
 Modified: lucene/dev/trunk/modules/join/NOTICE.txt
 URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/join/NOTICE.txt?rev=1303828&r1=1303827&r2=1303828&view=diff
 
 ==
 --- lucene/dev/trunk/modules/join/NOTICE.txt (original)
 +++ lucene/dev/trunk/modules/join/NOTICE.txt Thu Mar 22 15:21:17 2012
 @@ -1,5 +1,5 @@
  Apache Lucene Join
 -Copyright 2011 The Apache Software Foundation
 +Copyright 2012 The Apache Software Foundation
 
  This product includes software developed by  The Apache Software Foundation
 (http://www.apache.org/).
 
 Modified: lucene/dev/trunk/modules/queries/NOTICE.txt
 URL:
 

[jira] [Commented] (LUCENE-3887) 'ant javadocs' should fail if a package is missing a package.html

2012-03-22 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235635#comment-13235635
 ] 

Michael McCandless commented on LUCENE-3887:


OK I committed the basic checking for smoke tester...

I'll leave this open for having ant javadocs fail when things are missing...

 'ant javadocs' should fail if a package is missing a package.html
 -

 Key: LUCENE-3887
 URL: https://issues.apache.org/jira/browse/LUCENE-3887
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
 Attachments: LUCENE-3887.patch, LUCENE-3887.patch


 While reviewing the javadocs I noticed many packages are missing a basic 
 package.html.
 For 3.x I committed some package.html files where they were missing (I will 
 port forward to trunk).
 I think all packages should have this... really all public/protected 
 classes/methods/constants,
 but this would be a good step.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3265) TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance

2012-03-22 Thread Luca Cavanna (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235642#comment-13235642
 ] 

Luca Cavanna commented on SOLR-3265:


Looks like this was already fixed on trunk some time ago. Erick, if you 
haven't started working on this yet, I can provide a patch for 3x soon.

 TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance
 -

 Key: SOLR-3265
 URL: https://issues.apache.org/jira/browse/SOLR-3265
 Project: Solr
  Issue Type: Test
Affects Versions: 3.6, 4.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor

 When running ant test from the command line in 3.x, if you have a Solr 
 server running then TestSolrEntityProcessorEndToEnd fails since it uses the 
 default port (stack trace with address already in use). This should use 
 some other port, especially as 3.x ant test is taking 50+ minutes and I 
 often open up a server to look at something else.
 In 4.0, some of the cloud tests also use 8983 as a port. Should these be 
 changed too?
 And just to make my life *especially* interesting, at least one test puts the 
 string 8983 in a document, which doesn't have to be changed <G>...
 Of course one can start a local server on a different port, but this seems 
 trappy.
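
For illustration only (this is not taken from the patch attached to this issue): one common way to avoid a hard-coded port in a test is to bind to port 0 and let the OS hand out a free one. The class and method names below are hypothetical, and there is a small window between releasing the probe socket and starting the real server in which another process could grab the port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Sketch: ask the OS for a free port instead of hard-coding 8983. */
final class FreePortSketch {

    /** Binds a probe socket to port 0 so the OS picks an unused port, then releases it. */
    static int findFreePort() {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        } catch (IOException e) {
            // Rethrow unchecked so callers in test setup code stay simple.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = findFreePort();
        // A test would pass this port to the server it starts itself.
        System.out.println("Test server could listen on port " + port);
    }
}
```

A test that starts its own server would call findFreePort() during setup and build its URLs from the returned value rather than from the default port.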




[jira] [Updated] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3903:


Attachment: java7docs.jpg

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
 Attachments: java7docs.jpg


 Java7 changes its javadocs to look much nicer, but this involves different 
 CSS styles.
 Lucene overrides the CSS with stylesheet+prettify.css which is a combination 
 of java5/6 stylesheet + google prettify:
 but there are problems because java7 has totally different styles.
 So if you generate javadocs with java7, it's like you have no stylesheet at 
 all.
 A solution might be to make stylesheet7+prettify.css and conditionalize a 
 property in ant based on java version.




[jira] [Updated] (SOLR-3265) TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance

2012-03-22 Thread Luca Cavanna (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Cavanna updated SOLR-3265:
---

Attachment: SOLR-3265.patch

Patch against 3.x branch to solve the TestSolrEntityProcessorEndToEnd port 
problem.
Trunk is already ok.

 TestSolrEntityProcessorEndToEnd fails if you have a running Solr instance
 -

 Key: SOLR-3265
 URL: https://issues.apache.org/jira/browse/SOLR-3265
 Project: Solr
  Issue Type: Test
Affects Versions: 3.6, 4.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-3265.patch


 When running ant test from the command line in 3.x, if you have a Solr 
 server running then TestSolrEntityProcessorEndToEnd fails since it uses the 
 default port (stack trace with address already in use). This should use 
 some other port, especially as 3.x ant test is taking 50+ minutes and I 
 often open up a server to look at something else.
 In 4.0, some of the cloud tests also use 8983 as a port. Should these be 
 changed too?
 And just to make my life *especially* interesting, at least one test puts the 
 string 8983 in a document, which doesn't have to be changed <G>...
 Of course one can start a local server on a different port, but this seems 
 trappy.




[jira] [Resolved] (LUCENE-3778) Create a grouping convenience class

2012-03-22 Thread Martijn van Groningen (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen resolved LUCENE-3778.
---

   Resolution: Fixed
Lucene Fields:   (was: New)

Committed to trunk. Feature work (distributed grouping, grouped facets etc.) 
will be done in new issues.

 Create a grouping convenience class
 ---

 Key: LUCENE-3778
 URL: https://issues.apache.org/jira/browse/LUCENE-3778
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/grouping
Reporter: Martijn van Groningen
 Fix For: 4.0

 Attachments: LUCENE-3778.patch, LUCENE-3778.patch, LUCENE-3778.patch, 
 LUCENE-3778.patch


 Currently the grouping module has many collector classes with a lot of 
 different options per class. I think it would be a good idea to have a 
 GroupUtil (Or another name?) convenience class. I think this could be a 
 builder, because of the many options 
 (sort,sortWithinGroup,groupOffset,groupCount and more) and implementations 
 (term/dv/function) grouping has.
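
To make the builder idea concrete, here is a rough Java sketch. All names in it (GroupingRequestSketch, term(), the string-valued sort fields, the defaults) are invented for illustration; it is not the class that was committed and it does not touch any real collector.

```java
/**
 * Hypothetical sketch of a builder that gathers grouping options in one place.
 * Names and defaults are assumptions, not the committed Lucene API.
 */
final class GroupingRequestSketch {
    final String implementation;   // "term", "dv" or "function"
    final String groupField;
    final String groupSort;
    final String sortWithinGroup;
    final int groupOffset;
    final int groupCount;

    private GroupingRequestSketch(Builder b) {
        this.implementation = b.implementation;
        this.groupField = b.groupField;
        this.groupSort = b.groupSort;
        this.sortWithinGroup = b.sortWithinGroup;
        this.groupOffset = b.groupOffset;
        this.groupCount = b.groupCount;
    }

    /** Entry point for term-based grouping on the given field. */
    static Builder term(String field) {
        return new Builder("term", field);
    }

    static final class Builder {
        private final String implementation;
        private final String groupField;
        private String groupSort = "score";        // assumed default
        private String sortWithinGroup = "score";  // assumed default
        private int groupOffset = 0;
        private int groupCount = 10;

        private Builder(String implementation, String groupField) {
            this.implementation = implementation;
            this.groupField = groupField;
        }

        Builder groupSort(String sort) { this.groupSort = sort; return this; }
        Builder sortWithinGroup(String sort) { this.sortWithinGroup = sort; return this; }
        Builder groupOffset(int offset) { this.groupOffset = offset; return this; }
        Builder groupCount(int count) { this.groupCount = count; return this; }

        /** Validates the collected options and freezes them. */
        GroupingRequestSketch build() {
            if (groupOffset < 0 || groupCount < 1) {
                throw new IllegalArgumentException("groupOffset must be >= 0 and groupCount >= 1");
            }
            return new GroupingRequestSketch(this);
        }
    }

    public static void main(String[] args) {
        GroupingRequestSketch request = GroupingRequestSketch.term("author")
                .sortWithinGroup("date").groupOffset(20).groupCount(10).build();
        System.out.println(request.groupField + " offset=" + request.groupOffset);
    }
}
```

The point of the builder is that callers name only the options they care about, and the choice of collector classes can stay hidden behind build().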




[jira] [Commented] (SOLR-3260) Improve exception handling / logging for ScriptTransformer.init()

2012-03-22 Thread Steven Rowe (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235688#comment-13235688
 ] 

Steven Rowe commented on SOLR-3260:
---

bq. I missed one. Sorry about that. Should be fixed now.

Thanks James, I think it's fixed - just now in the console output from the 
Jenkins Maven trunk job (still running as I write this), I saw:

{noformat}
Running org.apache.solr.handler.dataimport.TestScriptTransformer
NOTE: Assume failed in 
'testCheckScript(org.apache.solr.handler.dataimport.TestScriptTransformer)' 
(ignored): got: org.apache.lucene.util.InternalAssumptionViolatedException: 
failed assumption: This JVM does not have Rhino installed.  Test Skipped., 
expected: null
NOTE: Assume failed in 
'testBasic(org.apache.solr.handler.dataimport.TestScriptTransformer)' 
(ignored): got: org.apache.lucene.util.InternalAssumptionViolatedException: 
failed assumption: This JVM does not have Rhino installed.  Test Skipped., 
expected: null
NOTE: Assume failed in 
'testOneparam(org.apache.solr.handler.dataimport.TestScriptTransformer)' 
(ignored): got: org.apache.lucene.util.InternalAssumptionViolatedException: 
failed assumption: This JVM does not have Rhino installed.  Test Skipped., 
expected: null
Tests run: 4, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 0.023 sec
{noformat}

 Improve exception handling / logging for ScriptTransformer.init()
 -

 Key: SOLR-3260
 URL: https://issues.apache.org/jira/browse/SOLR-3260
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.5, 4.0
Reporter: James Dyer
Assignee: James Dyer
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: SOLR-3260.patch


 This came up on the user-list.  ScriptTransformer logs the same "need a >=1.6 
 jre" message for several problems, making debugging difficult for users.




[jira] [Assigned] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Uwe Schindler (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-3903:
-

Assignee: Uwe Schindler

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Assignee: Uwe Schindler
 Attachments: LUCENE-3903.patch, java7docs.jpg


 Java7 changes its javadocs to look much nicer, but this involves different 
 CSS styles.
 Lucene overrides the CSS with stylesheet+prettify.css which is a combination 
 of java5/6 stylesheet + google prettify:
 but there are problems because java7 has totally different styles.
 So if you generate javadocs with java7, it's like you have no stylesheet at 
 all.
 A solution might be to make stylesheet7+prettify.css and conditionalize a 
 property in ant based on java version.




[jira] [Updated] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3903:
--

Attachment: LUCENE-3903.patch

Patch that fixes the issue(s):
- Simply append the prettify.css to the one created by javadocs itself (as 
post-javadoc-task <concat/>)
- Fix javascript issues with Java 7: The code that triggered prettyprint was 
relying on an implementation-specific javascript function name that no longer 
exists in Java 7. I changed the window.onload handler to dynamically append 
the 2nd handler.

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3903.patch, java7docs.jpg


 Java7 changes its javadocs to look much nicer, but this involves different 
 CSS styles.
 Lucene overrides the CSS with stylesheet+prettify.css which is a combination 
 of java5/6 stylesheet + google prettify:
 but there are problems because java7 has totally different styles.
 So if you generate javadocs with java7, it's like you have no stylesheet at 
 all.
 A solution might be to make stylesheet7+prettify.css and conditionalize a 
 property in ant based on java version.




[jira] [Resolved] (SOLR-2382) DIH Cache Improvements

2012-03-22 Thread James Dyer (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-2382.
--

Resolution: Fixed

Committed to 3.x: r1303792 (r1303822 - license headers)

 DIH Cache Improvements
 --

 Key: SOLR-2382
 URL: https://issues.apache.org/jira/browse/SOLR-2382
 Project: Solr
  Issue Type: New Feature
  Components: contrib - DataImportHandler
Reporter: James Dyer
Assignee: James Dyer
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: SOLR-2382-dihwriter.patch, SOLR-2382-dihwriter.patch, 
 SOLR-2382-dihwriter.patch, SOLR-2382-dihwriter.patch, 
 SOLR-2382-dihwriter.patch, SOLR-2382-dihwriter_standalone.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-entities.patch, SOLR-2382-entities.patch, 
 SOLR-2382-properties.patch, SOLR-2382-properties.patch, 
 SOLR-2382-solrwriter-verbose-fix.patch, SOLR-2382-solrwriter.patch, 
 SOLR-2382-solrwriter.patch, SOLR-2382-solrwriter.patch, SOLR-2382.patch, 
 SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, 
 SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382_3x.patch, 
 TestCachedSqlEntityProcessor.java-break-where-clause.patch, 
 TestCachedSqlEntityProcessor.java-fix-where-clause-by-adding-cachePk-and-lookup.patch,
  
 TestCachedSqlEntityProcessor.java-wrong-pk-detected-due-to-lack-of-where-support.patch,
  TestThreaded.java.patch


 Functionality:
  1. Provide a pluggable caching framework for DIH so that users can choose a 
 cache implementation that best suits their data and application.
  
  2. Provide a means to temporarily cache a child Entity's data without 
 needing to create a special cached implementation of the Entity Processor 
 (such as CachedSqlEntityProcessor).
  
  3. Provide a means to write the final (root entity) DIH output to a cache 
 rather than to Solr.  Then provide a way for a subsequent DIH call to use the 
 cache as an Entity input.  Also provide the ability to do delta updates on 
 such persistent caches.
  
  4. Provide the ability to partition data across multiple caches that can 
 then be fed back into DIH and indexed either to varying Solr Shards, or to 
 the same Core in parallel.
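
As a rough illustration of what "pluggable" could mean in point 1, here is a minimal Java sketch of a cache interface with one in-memory, sorted-map-backed implementation selected by name. The interface, its methods, and the factory are hypothetical and much smaller than the DIHCache interface in the patch; only the general shape is intended.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical cache contract: rows are stored under a key and read back by key. */
interface RowCacheSketch {
    void add(String key, Map<String, Object> row);
    List<Map<String, Object>> lookup(String key);
    void close();
}

/** In-memory implementation backed by a sorted map, several rows per key. */
final class SortedMapRowCacheSketch implements RowCacheSketch {
    private final SortedMap<String, List<Map<String, Object>>> rows = new TreeMap<>();

    @Override
    public void add(String key, Map<String, Object> row) {
        List<Map<String, Object>> bucket = rows.get(key);
        if (bucket == null) {
            bucket = new ArrayList<>();
            rows.put(key, bucket);
        }
        bucket.add(row);
    }

    @Override
    public List<Map<String, Object>> lookup(String key) {
        List<Map<String, Object>> bucket = rows.get(key);
        if (bucket == null) {
            return Collections.<Map<String, Object>>emptyList();
        }
        return Collections.unmodifiableList(bucket);
    }

    @Override
    public void close() {
        rows.clear();
    }
}

/** Picks an implementation from a configuration string (an assumed mechanism). */
final class RowCacheFactorySketch {
    static RowCacheSketch create(String cacheImpl) {
        if ("SortedMapRowCacheSketch".equals(cacheImpl)) {
            return new SortedMapRowCacheSketch();
        }
        throw new IllegalArgumentException("Unknown cacheImpl: " + cacheImpl);
    }
}
```

A disk-backed implementation would implement the same interface, so the code that joins child rows to parent entities would not need to know which one is in use.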
 Use Cases:
  1. We needed a flexible & scalable way to temporarily cache child-entity 
 data prior to joining to parent entities.
   - Using SqlEntityProcessor with Child Entities can cause an n+1 select 
 problem.
   - CachedSqlEntityProcessor only supports an in-memory HashMap as a Caching 
 mechanism and does not scale.
   - There is no way to cache non-SQL inputs (ex: flat files, xml, etc).
  
  2. We needed the ability to gather data from long-running entities by a 
 process that runs separate from our main indexing process.
   
  3. We wanted the ability to do a delta import of only the entities that 
 changed.
   - Lucene/Solr requires entire documents to be re-indexed, even if only a 
 few fields changed.
   - Our data comes from 50+ complex sql queries and/or flat files.
   - We do not want to incur overhead re-gathering all of this data if only 1 
 entity's data changed.
   - Persistent DIH caches solve this problem.
   
  4. We want the ability to index several documents in parallel (using 1.4.1, 
 which did not have the threads parameter).
  
  5. In the future, we may need to use Shards, creating a need to easily 
 partition our source data into Shards.
 Implementation Details:
  1. De-couple EntityProcessorBase from caching.  
   - Created a new interface, DIHCache, & two implementations:  
     - SortedMapBackedCache - An in-memory cache, used as default with 
 CachedSqlEntityProcessor (now deprecated).
     - BerkleyBackedCache - A disk-backed cache, dependent on bdb-je, tested 
 with je-4.1.6.jar
       - NOTE: the existing Lucene Contrib db project uses je-3.3.93.jar.  
 I believe this may be incompatible due to Generic Usage.
       - NOTE: I did not modify the ant script to automatically get this jar, 
 so to use or evaluate this patch, download bdb-je from 
 http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html 
  
  2. Allow Entity Processors to take a cacheImpl parameter to cause the 
 entity data to be cached (see EntityProcessorBase & DIHCacheProperties).
  
  3. Partially De-couple SolrWriter from DocBuilder
   - Created a new interface DIHWriter, & two implementations:
     - SolrWriter (refactored)
     - DIHCacheWriter (allows DIH to write ultimately to a Cache).

  4. Create a new Entity Processor, DIHCacheProcessor, which reads a 
 persistent Cache as DIH Entity Input.
  
  5. Support a partition parameter with both DIHCacheWriter and 
 DIHCacheProcessor to allow for easy partitioning of source entity 

[jira] [Updated] (LUCENE-3901) Add katakana stem filter to better deal with certain katakana spelling variants

2012-03-22 Thread Christian Moen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Moen updated LUCENE-3901:
---

Attachment: LUCENE-3901.patch

 Add katakana stem filter to better deal with certain katakana spelling 
 variants
 ---

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3901.patch


 Many Japanese katakana words end in a long sound that is sometimes optional.
 For example, パーティー and パーティ are both perfectly valid for party.  Similarly 
 we have センター and センタ that are variants of center as well as サーバー and サーバ 
 for server.
 I'm proposing that we add a katakana stemmer that removes this long sound if 
 the terms are longer than a configurable length.  It's also possible to add 
 the variant as a synonym, but I think stemming is preferred from a ranking 
 point of view.
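
A minimal Java sketch of the proposed rule, for illustration only. It is not the code in the attached patch: the class name is made up, and treating the threshold as "at least minimumLength characters" is an assumption. The prolonged sound mark is U+30FC; the code uses Unicode escapes so it compiles regardless of source encoding.

```java
/** Sketch: drop a trailing katakana prolonged sound mark from long enough terms. */
final class KatakanaStemSketch {
    /** KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC). */
    static final char PROLONGED_SOUND_MARK = '\u30FC';

    /** Returns the term without its trailing U+30FC if it qualifies, else unchanged. */
    static String stem(String term, int minimumLength) {
        if (term.length() >= minimumLength
                && term.charAt(term.length() - 1) == PROLONGED_SOUND_MARK
                && isKatakana(term)) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    /** True if every char is in the full-width Katakana block (U+30A0 to U+30FF). */
    static boolean isKatakana(String term) {
        for (int i = 0; i < term.length(); i++) {
            if (Character.UnicodeBlock.of(term.charAt(i)) != Character.UnicodeBlock.KATAKANA) {
                return false;
            }
        }
        return !term.isEmpty();
    }

    public static void main(String[] args) {
        // U+30D1 U+30FC U+30C6 U+30A3 U+30FC is the five-character spelling of "party".
        String party = "\u30D1\u30FC\u30C6\u30A3\u30FC";
        System.out.println(stem(party, 4).length()); // one char shorter than the input
    }
}
```

With a minimum length of 4, both spellings of party, center and server from the description reduce to the same term, while a three-character word ending in the long sound is left alone.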




[jira] [Resolved] (LUCENENET-477) NullReferenceException in ThreadLocal when Lucene.Net compiled for .Net 2.0

2012-03-22 Thread Christopher Currens (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Currens resolved LUCENENET-477.
---

   Resolution: Fixed
Fix Version/s: Lucene.Net 3.0.3

Thanks for the patch.  It's been applied to trunk for version 3.0.3.

 NullReferenceException in ThreadLocal when Lucene.Net compiled for .Net 2.0
 ---

 Key: LUCENENET-477
 URL: https://issues.apache.org/jira/browse/LUCENENET-477
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.4g
 Environment: .Net 2.0
Reporter: Andrew Sampson
 Fix For: Lucene.Net 3.0.3

 Attachments: CloseableThreadLocal.cs.patch


 A NullReferenceException occurs in Lucene.Net.Util.ThreadLocal. This class is 
 only included when Lucene is compiled for .Net 2.0. 
 The cause is that the threadstatic slots variable is lazily-initialized, 
 but there is no null-check in the dispose.





[jira] [Closed] (LUCENENET-179) SnowballFilter speed improvment

2012-03-22 Thread Christopher Currens (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Currens closed LUCENENET-179.
-

Resolution: Invalid

It's been so long since this patch was submitted (2009), that it's no longer 
needed.  The new version of the SnowballFilter from 3.0.3 only uses reflection 
in the constructor to create the filter (as does the patch).  It's too bad this 
didn't make it into 2.9.4, where it could have really been used.

 SnowballFilter speed improvment
 ---

 Key: LUCENENET-179
 URL: https://issues.apache.org/jira/browse/LUCENENET-179
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.2
Reporter: Arian Bär
 Fix For: Lucene.Net 3.0.3

 Attachments: FailOverSnowballFilter.cs


 I'm using Lucene.Net along with snowball stemming to index text from a 
 database. The class Lucene.Net.Analysis.Snowball.SnowballFilter uses the 
 reflection API and the invoke method to call the stem methods of snowball. I 
 have written a Snowball filter which creates a delegate and uses this 
 delegate to stem the words afterwards. This approach improves the indexing 
 speed of my indexing program by about 10%. I would be happy if you include 
 this code into lucene.net.





[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235710#comment-13235710
 ] 

Dawid Weiss commented on LUCENE-3847:
-

Well... something is changing it, the question is what it is. I'll take a look.

 LuceneTestCase should check for modifications on System properties
 --

 Key: LUCENE-3847
 URL: https://issues.apache.org/jira/browse/LUCENE-3847
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3847.patch


 - fail the test if changes have been detected.
 - revert the state of system properties before the suite.
 - cleanup after the suite.
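
The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the attached patch: the class is hypothetical and only covers taking a snapshot, detecting changes, and restoring the earlier values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: snapshot system properties, report changes, and put them back. */
final class SystemPropertiesGuardSketch {
    private final Map<String, String> snapshot = new HashMap<>();

    /** Records the string-valued system properties as they are right now. */
    SystemPropertiesGuardSketch() {
        for (String name : System.getProperties().stringPropertyNames()) {
            snapshot.put(name, System.getProperty(name));
        }
    }

    /** Names of properties that were added, removed or changed since the snapshot. */
    Set<String> changedKeys() {
        Set<String> names = new TreeSet<>(snapshot.keySet());
        names.addAll(System.getProperties().stringPropertyNames());
        Set<String> changed = new TreeSet<>();
        for (String name : names) {
            String before = snapshot.get(name);
            String after = System.getProperty(name);
            if (before == null ? after != null : !before.equals(after)) {
                changed.add(name);
            }
        }
        return changed;
    }

    /** Restores every changed property to its snapshot value (or removes it). */
    void restore() {
        for (String name : changedKeys()) {
            String before = snapshot.get(name);
            if (before == null) {
                System.clearProperty(name);
            } else {
                System.setProperty(name, before);
            }
        }
    }
}
```

A test base class could create the guard before the suite, fail the test if changedKeys() is non-empty afterwards, and call restore() during cleanup.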




[jira] [Closed] (LUCENENET-372) NLS pack for Lucene.NET: BR, CJK, CN, CZ, DE, FR, NL, RU analyzers

2012-03-22 Thread Christopher Currens (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Currens closed LUCENENET-372.
-

Resolution: Won't Fix
  Assignee: (was: Prescott Nasser)

We're not doing a separate contrib release for these, and it's already ported 
into 3.0.3.  Closing issue as won't fix.  

I apologize that this didn't make it into the official release of 2.9.4.  I 
hope this doesn't discourage you from contributing in the future.

 NLS pack for Lucene.NET: BR, CJK, CN, CZ, DE, FR, NL, RU analyzers
 --

 Key: LUCENENET-372
 URL: https://issues.apache.org/jira/browse/LUCENENET-372
 Project: Lucene.Net
  Issue Type: New Feature
  Components: Lucene.Net Contrib
Reporter: Pasha Bizhan
Priority: Minor
  Labels: Analyzers
 Attachments: lucene-net-nls.zip


 Port of the Java analyzers. Sorry for the 1.4 version; it's from 2005.
 Updated for 2.9.2/2.9.4 compatibility for the 2.9.4 release. 





[jira] [Updated] (LUCENE-3901) Add katakana stem filter to better deal with certain katakana spelling variants

2012-03-22 Thread Christian Moen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Moen updated LUCENE-3901:
---

Attachment: LUCENE-3901.patch

 Add katakana stem filter to better deal with certain katakana spelling 
 variants
 ---

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3901.patch, LUCENE-3901.patch


 Many Japanese katakana words end in a long sound that is sometimes optional.
 For example, パーティー and パーティ are both perfectly valid for party.  Similarly 
 we have センター and センタ that are variants of center as well as サーバー and サーバ 
 for server.
 I'm proposing that we add a katakana stemmer that removes this long sound if 
 the terms are longer than a configurable length.  It's also possible to add 
 the variant as a synonym, but I think stemming is preferred from a ranking 
 point of view.




Re: svn commit: r1303792 [1/2] - in /lucene/dev/branches/branch_3x/solr: contrib/dataimporthandler-extras/src/test/org/apache/solr/handler/dataimport/ contrib/dataimporthandler/src/java/org/apache/sol

2012-03-22 Thread Dawid Weiss
 I'm a little worried about doing anything automated (I think it would
 be bad to stamp a wrong license on something or whatever).

That's why that task doesn't touch anything it cannot recognize and reports it.

 I used ant rat-sources to find these problems though, so detecting
 them is automated...

Ok.

D.




[jira] [Commented] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235735#comment-13235735
 ] 

Robert Muir commented on LUCENE-3903:
-

+1

Tested on branch_3x with Java5, 6, and 7 (just patch --merge)

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Assignee: Uwe Schindler
 Attachments: LUCENE-3903.patch, java7docs.jpg


 Java7 changes its javadocs to look much nicer, but this involves different 
 CSS styles.
 Lucene overrides the CSS with stylesheet+prettify.css which is a combination 
 of java5/6 stylesheet + google prettify:
 but there are problems because java7 has totally different styles.
 So if you generate javadocs with java7, it's like you have no stylesheet at 
 all.
 A solution might be to make stylesheet7+prettify.css and conditionalize a 
 property in ant based on java version.




[JENKINS] Lucene-Solr-tests-only-3.x - Build # 12842 - Failure

2012-03-22 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/12842/

All tests passed

Build Log (for compile errors):
[...truncated 22095 lines...]




[jira] [Commented] (LUCENE-3901) Add katakana stem filter to better deal with certain katakana spelling variants

2012-03-22 Thread Christian Moen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235746#comment-13235746
 ] 

Christian Moen commented on LUCENE-3901:


Find attached a patch for this.

The stemming is done by {{KuromojiKatakanaStemFilter}}, which has been added to 
{{KuromojiAnalyzer}} and a corresponding {{KuromojiKatakanaStemFilterFactory}} 
has been added to the {{text_ja}} field type in {{schema.xml}}.

Note that this stemming is now turned on by default and I think it makes good 
sense to do so.  The minimum length of a token considered for stemming is 
configurable and I've made the default of 4 explicit in {{schema.xml}} to 
convey that it's there.

The stemmer only supports full-width katakana and should be used in combination 
with a {{CJKWidthFilter}} if stemming half-width characters is required and 
you're doing your own wiring.  Both {{text_ja}} and {{KuromojiAnalyzer}} take care 
of this, and the default overall processing is the same.

There are some test cases in {{TestKuromojiKatakanaStemFilter}}, but I've added 
a case to {{TestKuromojiAnalyzer}} that demonstrates how the stemming works in 
combination with katakana compound splitting.

In Japanese, manager can be written both as マネージャー and マネージャ (and probably 
also マネジャー), and for the compound シニアプロジェクトマネージャー (senior project manager), we 
now get tokens シニア (senior) プロジェクト (project) マネージャ (manager), and we've stemmed 
the last token by removing the trailing ー.  Kuromoji also makes the compound 
シニアプロジェクトマネージャ a synonym to シニア, and ー is also removed for the synonym compound.

Tests pass and I've also tested this end-to-end in a Solr trunk build.
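
For readers following along, the stemming rule described above can be sketched 
in plain Java without any Lucene classes.  This is only an illustration and not 
the code in the attached patch: the class name KatakanaStemSketch, the stem 
method and the inclusive treatment of the minimum length are assumptions made 
for this example.

```java
final class KatakanaStemSketch {
  /** KATAKANA-HIRAGANA PROLONGED SOUND MARK, the trailing long-sound character. */
  static final char PROLONGED_SOUND_MARK = '\u30FC';

  /** True if every char is in the Unicode Katakana block (full-width forms only). */
  static boolean isKatakana(String term) {
    for (int i = 0; i < term.length(); i++) {
      if (Character.UnicodeBlock.of(term.charAt(i)) != Character.UnicodeBlock.KATAKANA) {
        return false;
      }
    }
    return true;
  }

  /**
   * Removes one trailing prolonged sound mark from an all-katakana term.
   * Assumption: terms of at least minimumLength chars are stemmed.
   */
  static String stem(String term, int minimumLength) {
    if (!term.isEmpty()
        && term.length() >= minimumLength
        && term.charAt(term.length() - 1) == PROLONGED_SOUND_MARK
        && isKatakana(term)) {
      return term.substring(0, term.length() - 1);
    }
    return term;
  }

  public static void main(String[] args) {
    // "manager" written with a trailing long-sound mark (6 chars): gets stemmed
    System.out.println(stem("\u30DE\u30CD\u30FC\u30B8\u30E3\u30FC", 4));
    // "copy" (3 chars): shorter than the minimum length, left unchanged
    System.out.println(stem("\u30B3\u30D4\u30FC", 4));
  }
}
```

Half-width katakana live in a different Unicode block, so the sketch leaves 
them alone, which mirrors the note above about needing a width filter first.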

 Add katakana stem filter to better deal with certain katakana spelling 
 variants
 ---

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3901.patch, LUCENE-3901.patch


 Many Japanese katakana words end in a long sound that is sometimes optional.
 For example, パーティー and パーティ are both perfectly valid for party.  Similarly 
 we have センター and センタ that are variants of center as well as サーバー and サーバ 
 for server.
 I'm proposing that we add a katakana stemmer that removes this long sound if 
 the terms are longer than a configurable length.  It's also possible to add 
 the variant as a synonym, but I think stemming is preferred from a ranking 
 point of view.




[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235749#comment-13235749
 ] 

Dawid Weiss commented on LUCENE-3847:
-

I know what's changing it. Eh. So -- there is a warning being printed:
{noformat}
Mar 22, 2012 6:20:33 PM org.apache.solr.core.Config parseLuceneVersionString
WARNING: You should not use LUCENE_CURRENT as luceneMatchVersion property: if 
you use this setting, and then Solr upgrades to a newer release of Lucene, 
sizable changes may happen. If precise back compatibility is important then you 
should instead explicitly specify an actual Lucene version.
Mar 22, 2012 6:20:33 PM org.apache.solr.analysis.BaseTokenStreamFactory 
warnDeprecated
WARNING: RussianLetterTokenizerFactory is deprecated. Use 
StandardTokenizerFactory instead.
{noformat}

These warnings go through Java logging, which is localized (date format, 
warning info, etc.). The localization asks for the default TimeZone, which in 
turn sets the system property (I mentioned it a while ago).

I suggest that we just ignore user.timezone as it is triggered from multiple 
locations and doesn't seem that important?
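
A rough sketch of how such a check with an ignore list could look, assuming a 
simple snapshot/compare/restore approach.  The names SystemPropertiesGuard, 
snapshot, changedKeys and restore are hypothetical and chosen for this example; 
this is not the code from the attached patch or from LuceneTestCase.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class SystemPropertiesGuard {
  /** Copies the current system properties into an independent map. */
  static Map<String, String> snapshot() {
    Map<String, String> copy = new TreeMap<>();
    Properties props = System.getProperties();
    for (String name : props.stringPropertyNames()) {
      copy.put(name, props.getProperty(name));
    }
    return copy;
  }

  /** Returns the keys that were added, removed or modified, skipping ignored keys. */
  static Set<String> changedKeys(Map<String, String> before,
                                 Map<String, String> after,
                                 Set<String> ignored) {
    Set<String> keys = new TreeSet<>(before.keySet());
    keys.addAll(after.keySet());
    Set<String> changed = new TreeSet<>();
    for (String key : keys) {
      if (!ignored.contains(key) && !Objects.equals(before.get(key), after.get(key))) {
        changed.add(key);
      }
    }
    return changed;
  }

  /** Reverts system properties to the snapshot: drops new keys, resets old values. */
  static void restore(Map<String, String> before) {
    for (String name : System.getProperties().stringPropertyNames()) {
      if (!before.containsKey(name)) {
        System.clearProperty(name);
      }
    }
    for (Map.Entry<String, String> e : before.entrySet()) {
      if (!e.getKey().isEmpty()) { // System.setProperty rejects empty keys
        System.setProperty(e.getKey(), e.getValue());
      }
    }
  }
}
```

A test rule could take a snapshot before the suite, call changedKeys with 
user.timezone in the ignored set afterwards, fail if the result is non-empty, 
and call restore in either case.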



 LuceneTestCase should check for modifications on System properties
 --

 Key: LUCENE-3847
 URL: https://issues.apache.org/jira/browse/LUCENE-3847
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3847.patch


 - fail the test if changes have been detected.
 - revert the state of system properties before the suite.
 - cleanup after the suite.




[jira] [Commented] (LUCENE-3901) Add katakana stem filter to better deal with certain katakana spelling variants

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235753#comment-13235753
 ] 

Robert Muir commented on LUCENE-3901:
-

patch looks great!

 Add katakana stem filter to better deal with certain katakana spelling 
 variants
 ---

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3901.patch, LUCENE-3901.patch





[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235755#comment-13235755
 ] 

Robert Muir commented on LUCENE-3847:
-

{quote}
I suggest that we just ignore user.timezone as it is triggered from multiple 
locations and doesn't seem that important?
{quote}

+1, we know it's a side effect of our testcase itself randomizing the 
locale/timezone...

 LuceneTestCase should check for modifications on System properties
 --

 Key: LUCENE-3847
 URL: https://issues.apache.org/jira/browse/LUCENE-3847
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3847.patch





[jira] [Updated] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3903:
--

Attachment: LUCENE-3903.patch

Minor tweaks:
- Moved the javascript into bottom, as it's then not duplicated multiple times
- Fixed attribute corruption in the CDATA.

I will commit this later!

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Assignee: Uwe Schindler
 Attachments: LUCENE-3903.patch, LUCENE-3903.patch, java7docs.jpg





java7-style docs for the website for 3.6?

2012-03-22 Thread Robert Muir
Hello,

After Uwe fixes https://issues.apache.org/jira/browse/LUCENE-3903, it's
possible to build the nice-looking java7-style javadocs for lucene.

We could pass the java5 bootclasspath so that it's all linked up with
java5 and not confusing in any way; it just looks nicer (less
geocities-like).

Any opinions?

-- 
lucidimagination.com




Proposal - a high performance Key-Value store based on Lucene APIs/concepts

2012-03-22 Thread mark harwood
I've been spending quite a bit of time recently benchmarking various Key-Value 
stores for a demanding project and have been largely disappointed with the results.
However, I have developed a promising implementation based on these concepts:  
http://www.slideshare.net/MarkHarwood/lucene-kvstore

The code needs some packaging before I can release it but the slide deck should 
give a good overview of the design.


Is this something that is likely to be of interest as a contrib module here?
I appreciate this is a departure from the regular search focus but it builds on 
some common ground in Lucene core and may have some applications here.

Cheers,
Mark





[jira] [Updated] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3903:
--

Attachment: LUCENE-3903.patch

Robert and I noticed a small issue: Javadoc does not regenerate the 
stylesheet if it's already there. This leads to appending the same 
prettyprint.css all the time. I added a delete for this file before running 
javadocs, so it's regenerated.

Now it's final :-)

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Assignee: Uwe Schindler
 Attachments: LUCENE-3903.patch, LUCENE-3903.patch, LUCENE-3903.patch, 
 java7docs.jpg





[jira] [Updated] (SOLR-3011) DIH MultiThreaded bug

2012-03-22 Thread James Dyer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-3011:
-

Attachment: SOLR-3011.patch

Here is a cleaned-up version of the last patch.

- Simplified TestThreaded.
- Added a logged deprecation warning that threads will be removed in a future 
release.
- Ran the DIH tests a few times and everything passed.

I will commit this shortly to the 3.x branch.

 DIH MultiThreaded bug
 -

 Key: SOLR-3011
 URL: https://issues.apache.org/jira/browse/SOLR-3011
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 3.5
Reporter: Mikhail Khludnev
Priority: Minor
 Fix For: 3.6

 Attachments: SOLR-3011.patch, SOLR-3011.patch, SOLR-3011.patch, 
 SOLR-3011.patch, SOLR-3011.patch, 
 patch-3011-EntityProcessorBase-iterator.patch, 
 patch-3011-EntityProcessorBase-iterator.patch


 The current DIH design is not thread safe; see the last comments at SOLR-2382 
 and SOLR-2947. I'm going to provide a patch that makes the DIH core thread 
 safe. Mostly it's a SOLR-2947 patch from 28th Dec.




Re: Proposal - a high performance Key-Value store based on Lucene APIs/concepts

2012-03-22 Thread Ryan McKinley
+1

The one potential problem is the use of Trove for primitives


On Thu, Mar 22, 2012 at 10:42 AM, mark harwood markharw...@yahoo.co.uk wrote:
 I've been spending quite a bit of time recently benchmarking various 
 Key-Value stores for a demanding project and been largely disappointed with 
 results
 However, I have developed a promising implementation based on these concepts: 
  http://www.slideshare.net/MarkHarwood/lucene-kvstore

 The code needs some packaging before I can release it but the slide deck 
 should give a good overview of the design.


 Is this something that it is likely to be of interest as a contrib module 
 here?
 I appreciate this is a departure from the regular search focus but it builds 
 on some common ground in Lucene core and may have some applications here.

 Cheers,
 Mark





Re: java7-style docs for the website for 3.6?

2012-03-22 Thread Robert Muir
I created this (java7 using bootclasspath of java5) and uploaded it
here so you can see:

http://people.apache.org/~rmuir/java7-style-javadocs/

On Thu, Mar 22, 2012 at 1:36 PM, Robert Muir rcm...@gmail.com wrote:
 Hello,

 After Uwe fixes https://issues.apache.org/jira/browse/LUCENE-3903, its
 possible to build the nice looking java7-style javadocs for lucene.

 we could pass the java5 bootclasspath so that its all linked up with
 java5 and not confusing in any way, just looks nicer (less
 geocities-like).

 Any opinions?

 --
 lucidimagination.com



-- 
lucidimagination.com




[jira] [Resolved] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Uwe Schindler (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-3903.
---

   Resolution: Fixed
Fix Version/s: 4.0
   3.6

Committed trunk revision: 1303916
Committed 3.x revision: 1303922

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3903.patch, LUCENE-3903.patch, LUCENE-3903.patch, 
 java7docs.jpg





[jira] [Commented] (SOLR-3011) DIH MultiThreaded bug

2012-03-22 Thread Mikhail Khludnev (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235799#comment-13235799
 ] 

Mikhail Khludnev commented on SOLR-3011:


James,

I'm glad to hear it. Let me know if you'd like me to refresh the patches at 
SOLR-2961 and SOLR-2804. They are also blockers for using threads.

 DIH MultiThreaded bug
 -

 Key: SOLR-3011
 URL: https://issues.apache.org/jira/browse/SOLR-3011
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 3.5
Reporter: Mikhail Khludnev
Priority: Minor
 Fix For: 3.6

 Attachments: SOLR-3011.patch, SOLR-3011.patch, SOLR-3011.patch, 
 SOLR-3011.patch, SOLR-3011.patch, 
 patch-3011-EntityProcessorBase-iterator.patch, 
 patch-3011-EntityProcessorBase-iterator.patch





[jira] [Commented] (LUCENE-3903) javadocs very very ugly if you generate with java7

2012-03-22 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235812#comment-13235812
 ] 

Robert Muir commented on LUCENE-3903:
-

Thanks Uwe!

 javadocs very very ugly if you generate with java7
 --

 Key: LUCENE-3903
 URL: https://issues.apache.org/jira/browse/LUCENE-3903
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3903.patch, LUCENE-3903.patch, LUCENE-3903.patch, 
 java7docs.jpg





[jira] [Updated] (SOLR-2921) Make any Filters, Tokenizers and CharFilters implement MultiTermAwareComponent if they should

2012-03-22 Thread Erick Erickson (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-2921:
-

Attachment: SOLR-2921-trunk.patch
SOLR-2921-3x.patch

3x r:1303937
Trunk r: 1303939

 Make any Filters, Tokenizers and CharFilters implement 
 MultiTermAwareComponent if they should
 -

 Key: SOLR-2921
 URL: https://issues.apache.org/jira/browse/SOLR-2921
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 3.6, 4.0
 Environment: All
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: SOLR-2921-3x.patch, SOLR-2921-3x.patch, 
 SOLR-2921-3x.patch, SOLR-2921-trunk.patch


 SOLR-2438 creates a new MultiTermAwareComponent interface. This allows Solr 
 to automatically assemble a multiterm analyzer that does the right thing 
 vis-a-vis transforming the individual terms of a multi-term query at query 
 time. Examples are: lower casing, folding accents, etc. Currently 
 (27-Nov-2011), the following classes implement MultiTermAwareComponent:
  * ASCIIFoldingFilterFactory
  * LowerCaseFilterFactory
  * LowerCaseTokenizerFactory
  * MappingCharFilterFactory
  * PersianCharFilterFactory
 When users put any of the above in their query analyzer, Solr will do the 
 right thing at query time and the perennial question users have, "why didn't 
 my wildcard query automatically lower-case (or accent fold or) my terms?", 
 will be gone. Die question die!
 But taking a quick look, for instance, at the various FilterFactories that 
 exist, there are a number of possibilities that *might* be good candidates 
 for implementing MultiTermAwareComponent. But I really don't understand the 
 correct behavior here well enough to know whether these should implement the 
 interface or not. And this doesn't include other CharFilters or Tokenizers.
 Actually implementing the interface is often trivial, see the classes above 
 for examples. Note that LowerCaseTokenizerFactory returns a *Filter*, which 
 is the right thing in this case.
 Here is a quick cull of the Filters that, just from their names, might be 
 candidates. If anyone wants to take any of them on, that would be great. If 
 all you can do is provide test cases, I could probably do the code part, just 
 let me know.
 ArabicNormalizationFilterFactory
 GreekLowerCaseFilterFactory
 HindiNormalizationFilterFactory
 ICUFoldingFilterFactory
 ICUNormalizer2FilterFactory
 ICUTransformFilterFactory
 IndicNormalizationFilterFactory
 ISOLatin1AccentFilterFactory
 PersianNormalizationFilterFactory
 RussianLowerCaseFilterFactory
 TurkishLowerCaseFilterFactory
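
 To make the idea concrete, here is a small self-contained sketch of the 
 multi-term behaviour described above.  It deliberately does not use the real 
 Solr classes: Component, MultiTermAware, LowerCase and PluralStripper are 
 hypothetical stand-ins invented for this example, and the real 
 MultiTermAwareComponent interface may look different.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class MultiTermSketch {
  /** Hypothetical stand-in for one step of an analysis chain. */
  interface Component {
    String apply(String term);
  }

  /** Hypothetical marker: the step is safe to run on wildcard/prefix terms. */
  interface MultiTermAware extends Component {}

  /** Lower-casing works on partial terms, so it is marked multi-term aware. */
  static final class LowerCase implements MultiTermAware {
    public String apply(String term) {
      return term.toLowerCase(Locale.ROOT);
    }
  }

  /** A toy stemmer; it would mangle partial terms, so it is not marked. */
  static final class PluralStripper implements Component {
    public String apply(String term) {
      return term.endsWith("s") ? term.substring(0, term.length() - 1) : term;
    }
  }

  /** Normal query analysis: every step of the chain runs. */
  static String analyze(List<Component> chain, String term) {
    String out = term;
    for (Component c : chain) {
      out = c.apply(out);
    }
    return out;
  }

  /** Multi-term analysis: only the steps marked as aware are applied. */
  static String analyzeMultiTerm(List<Component> chain, String term) {
    String out = term;
    for (Component c : chain) {
      if (c instanceof MultiTermAware) {
        out = c.apply(out);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Component> chain = Arrays.asList(new LowerCase(), new PluralStripper());
    System.out.println(analyze(chain, "Cars"));           // full chain
    System.out.println(analyzeMultiTerm(chain, "Cars*")); // aware steps only
  }
}
```

 The point of the marker is that a wildcard term is lower-cased like indexed 
 terms are, while steps that only make sense on whole tokens are skipped.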




[jira] [Resolved] (SOLR-2921) Make any Filters, Tokenizers and CharFilters implement MultiTermAwareComponent if they should

2012-03-22 Thread Erick Erickson (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-2921.
--

   Resolution: Fixed
Fix Version/s: 4.0
   3.6

Let's open up any further issues in a new JIRA?

 Make any Filters, Tokenizers and CharFilters implement 
 MultiTermAwareComponent if they should
 -

 Key: SOLR-2921
 URL: https://issues.apache.org/jira/browse/SOLR-2921
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 3.6, 4.0
 Environment: All
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: SOLR-2921-3x.patch, SOLR-2921-3x.patch, 
 SOLR-2921-3x.patch, SOLR-2921-trunk.patch





[jira] [Commented] (SOLR-3011) DIH MultiThreaded bug

2012-03-22 Thread James Dyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235824#comment-13235824
 ] 

James Dyer commented on SOLR-3011:
--

That would be great if you can.  Lucene/Solr 3.6 is going to be the last 3.x 
release and it is closing for new functionality soon.  SOLR-2804 for sure looks 
like something that should be there.  Is SOLR-2961 just for Tika?

 DIH MultiThreaded bug
 -

 Key: SOLR-3011
 URL: https://issues.apache.org/jira/browse/SOLR-3011
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 3.5
Reporter: Mikhail Khludnev
Priority: Minor
 Fix For: 3.6

 Attachments: SOLR-3011.patch, SOLR-3011.patch, SOLR-3011.patch, 
 SOLR-3011.patch, SOLR-3011.patch, 
 patch-3011-EntityProcessorBase-iterator.patch, 
 patch-3011-EntityProcessorBase-iterator.patch





Re: Proposal - a high performance Key-Value store based on Lucene APIs/concepts

2012-03-22 Thread J. Delgado
Mark, can you share more on which K-V (NoSQL) stores you've been
benchmarking and what the results have been?

Did you try all the well known ones?
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

-- J

On Thu, Mar 22, 2012 at 10:42 AM, mark harwood markharw...@yahoo.co.uk wrote:

 I've been spending quite a bit of time recently benchmarking various
 Key-Value stores for a demanding project and been largely disappointed with
 results
 However, I have developed a promising implementation based on these
 concepts:  http://www.slideshare.net/MarkHarwood/lucene-kvstore

 The code needs some packaging before I can release it but the slide deck
 should give a good overview of the design.


 Is this something that it is likely to be of interest as a contrib module
 here?
 I appreciate this is a departure from the regular search focus but it
 builds on some common ground in Lucene core and may have some applications
 here.

 Cheers,
 Mark






[jira] [Commented] (SOLR-3011) DIH MultiThreaded bug

2012-03-22 Thread Mikhail Khludnev (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235829#comment-13235829
 ] 

Mikhail Khludnev commented on SOLR-3011:


bq.  Is SOLR-2961 just for Tika?

yep. it seems so. Why do you ask, we don't need to support it further?

 DIH MultiThreaded bug
 -

 Key: SOLR-3011
 URL: https://issues.apache.org/jira/browse/SOLR-3011
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 3.5
Reporter: Mikhail Khludnev
Priority: Minor
 Fix For: 3.6

 Attachments: SOLR-3011.patch, SOLR-3011.patch, SOLR-3011.patch, 
 SOLR-3011.patch, SOLR-3011.patch, 
 patch-3011-EntityProcessorBase-iterator.patch, 
 patch-3011-EntityProcessorBase-iterator.patch





[jira] [Resolved] (SOLR-3011) DIH MultiThreaded bug

2012-03-22 Thread James Dyer (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-3011.
--

Resolution: Fixed
  Assignee: James Dyer

Committed branch_3x (only): r1303949

Thank you Mikhail!  I realize this took a lot of patience and unforgiving work 
on your part.

 DIH MultiThreaded bug
 -

 Key: SOLR-3011
 URL: https://issues.apache.org/jira/browse/SOLR-3011
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 3.5
Reporter: Mikhail Khludnev
Assignee: James Dyer
Priority: Minor
 Fix For: 3.6

 Attachments: SOLR-3011.patch, SOLR-3011.patch, SOLR-3011.patch, 
 SOLR-3011.patch, SOLR-3011.patch, 
 patch-3011-EntityProcessorBase-iterator.patch, 
 patch-3011-EntityProcessorBase-iterator.patch


 The current DIH design is not thread-safe; see the last comments on SOLR-2382
 and SOLR-2947. I'm going to provide a patch that makes the DIH core
 thread-safe. It is mostly the SOLR-2947 patch from 28 Dec.




[jira] [Commented] (SOLR-2961) DIH with threads and TikaEntityProcessor JDBC Issue

2012-03-22 Thread James Dyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235842#comment-13235842
 ] 

James Dyer commented on SOLR-2961:
--

{quote}
Mikhail Khludnev commented on SOLR-3011:


bq.  Is SOLR-2961 just for Tika?

Yep, it seems so. Why do you ask? Don't we need to support it further?
{quote}

I don't think we have to support _threads_ with everything.  (This is one
reason why I want to remove threads on Trunk.  It's going to be very difficult
to support every use case.)  On the other hand, if you or someone else puts up
a good patch in the very near term, I will try to get it into 3.6.

 DIH with threads and TikaEntityProcessor JDBC Issue
 ---

 Key: SOLR-2961
 URL: https://issues.apache.org/jira/browse/SOLR-2961
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.4, 3.5
 Environment: Windows Server 2008, Apache Tomcat 6, Oracle 11g, ojdbc 
 11.2.0.1
Reporter: David Webb
  Labels: dih, tika
 Attachments: SOLR-2961.patch, data-config.xml


 I have a DIH configuration that works great when I don't specify threads=X
 in the root entity.  As soon as I give a value for threads, I get the
 following error messages with stack traces.  Please advise.
 SEVERE: JdbcDataSource was not closed prior to finalize(), indicates a bug -- 
 POSSIBLE RESOURCE LEAK!!!
 Dec 10, 2011 1:18:33 PM org.apache.solr.handler.dataimport.JdbcDataSource 
 closeConnection
 SEVERE: Ignoring Error when closing connection
 java.sql.SQLRecoverableException: IO Error: Socket closed
   at oracle.jdbc.driver.T4CConnection.logoff(T4CConnection.java:511)
   at 
 oracle.jdbc.driver.PhysicalConnection.close(PhysicalConnection.java:3931)
   at 
 org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:401)
   at 
 org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:392)
   at 
 org.apache.solr.handler.dataimport.JdbcDataSource.finalize(JdbcDataSource.java:380)
   at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
   at java.lang.ref.Finalizer.runFinalizer(Unknown Source)
   at java.lang.ref.Finalizer.access$100(Unknown Source)
   at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)
 Caused by: java.net.SocketException: Socket closed
   at java.net.SocketOutputStream.socketWrite(Unknown Source)
   at java.net.SocketOutputStream.write(Unknown Source)
   at oracle.net.ns.DataPacket.send(DataPacket.java:199)
   at oracle.net.ns.NetOutputStream.flush(NetOutputStream.java:211)
   at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:227)
   at oracle.net.ns.NetInputStream.read(NetInputStream.java:175)
   at oracle.net.ns.NetInputStream.read(NetInputStream.java:100)
   at oracle.net.ns.NetInputStream.read(NetInputStream.java:85)
   at 
 oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:123)
   at 
 oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:79)
   at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1122)
   at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1099)
   at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:288)
   at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:191)
   at oracle.jdbc.driver.T4C7Ocommoncall.doOLOGOFF(T4C7Ocommoncall.java:61)
   at oracle.jdbc.driver.T4CConnection.logoff(T4CConnection.java:498)
   ... 8 more
 Dec 10, 2011 1:18:34 PM 
 org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper nextRow
 SEVERE: Exception in entity : null
 org.apache.solr.handler.dataimport.DataImportHandlerException: Failed to 
 initialize DataSource: f2
   at 
 org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
   at 
 org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:333)
   at 
 org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl.java:99)
   at 
 org.apache.solr.handler.dataimport.ThreadedContext.getDataSource(ThreadedContext.java:66)
   at 
 org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:101)
   at 
 org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper.nextRow(ThreadedEntityProcessorWrapper.java:84)
   at 
 org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:446)
   at 
 org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.run(DocBuilder.java:399)
   at 
 

Re: Proposal - a high performance Key-Value store based on Lucene APIs/concepts

2012-03-22 Thread Mark Harwood


 Mark, can you share more on which K-V (NoSQL) stores you've been
 benchmarking and what the results have been?
 

Mongo, Cassandra, Krati, BDB, a Java version of Bitcask, Lucene, MySQL

I was interested in benchmarking the single-server stores rather than a 
distributed setup because your choice of store could be plugged into the likes 
of Voldemort for scale out. 

The design is similar to the Bitcask paper but keeps only hashes of keys in RAM,
not the full keys.

My implementation was the only store that didn't degrade noticeably as you get 
into 10s of millions of keys in the store. 
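
To make the index layout concrete, here is a minimal Java sketch of the idea described above: an append-only log of records plus an in-RAM index that holds only the hash of each key. It is not the implementation from the slide deck. The class name HashedKeyStoreSketch, the in-memory list standing in for the data file, and the use of String.hashCode() are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the code behind the benchmark results in this thread.
final class HashedKeyStoreSketch {

    /** One appended record; stands in for a key/value entry in an append-only file. */
    private static final class Record {
        final String key;
        final String value;

        Record(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Stand-in for the on-disk log; a list position plays the role of a file offset.
    private final List<Record> log = new ArrayList<>();

    // In-RAM index: key hash -> log positions of records whose key has that hash.
    // Full keys live only in the log, never in this map.
    private final Map<Integer, List<Integer>> index = new HashMap<>();

    void put(String key, String value) {
        log.add(new Record(key, value));
        index.computeIfAbsent(key.hashCode(), h -> new ArrayList<>())
             .add(log.size() - 1);
    }

    String get(String key) {
        List<Integer> positions = index.get(key.hashCode());
        if (positions == null) {
            return null;
        }
        // Walk newest to oldest so the latest write wins, and compare the
        // full key read from the log to rule out hash collisions.
        for (int i = positions.size() - 1; i >= 0; i--) {
            Record record = log.get(positions.get(i));
            if (record.key.equals(key)) {
                return record.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        HashedKeyStoreSketch store = new HashedKeyStoreSketch();
        store.put("Aa", "first");   // "Aa" and "BB" share the same hashCode()
        store.put("BB", "second");
        System.out.println(store.get("Aa"));
        System.out.println(store.get("BB"));
    }
}
```

Because only hashes are held in RAM, a lookup has to read the candidate record and compare the full key. That is the price of the smaller index, and it is how the sketch tells "Aa" and "BB" apart even though both hash to the same value.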





 Did you try all the well known ones?
 http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
 
 -- J
 
 On Thu, Mar 22, 2012 at 10:42 AM, mark harwood markharw...@yahoo.co.uk 
 wrote:
 I've been spending quite a bit of time recently benchmarking various
 Key-Value stores for a demanding project and have been largely disappointed
 with the results.
 However, I have developed a promising implementation based on these concepts: 
  http://www.slideshare.net/MarkHarwood/lucene-kvstore
 
 The code needs some packaging before I can release it but the slide deck 
 should give a good overview of the design.
 
 
 Is this something that is likely to be of interest as a contrib module
 here?
 I appreciate this is a departure from the regular search focus but it builds 
 on some common ground in Lucene core and may have some applications here.
 
 Cheers,
 Mark
 
 
 
 


Re: Proposal - a high performance Key-Value store based on Lucene APIs/concepts

2012-03-22 Thread Thomas Matthijs
On Thu, Mar 22, 2012 at 7:29 PM, Mark Harwood markharw...@yahoo.co.uk wrote:



 Mark, can you share more on which K-V (NoSQL) stores you've been
 benchmarking and what the results have been?


 Mongo, Cassandra, Krati, BDB, a Java version of Bitcask, Lucene, MySQL

 I was interested in benchmarking the single-server stores rather than a
 distributed setup because your choice of store could be plugged into the
 likes of Voldemort for scale out.

 The design is similar to the Bitcask paper but keeps only hashes of keys
 in RAM, not the full keys.

 My implementation was the only store that didn't degrade noticeably as you
 get into 10s of millions of keys in the store.



Random question: Do you basically end up with something very similar to
LevelDB, which many people were talking about a few weeks ago?


[jira] [Assigned] (LUCENE-3901) Add katakana stem filter to better deal with certain katakana spelling variants

2012-03-22 Thread Christian Moen (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Moen reassigned LUCENE-3901:
--

Assignee: Christian Moen

 Add katakana stem filter to better deal with certain katakana spelling 
 variants
 ---

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
Assignee: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3901.patch, LUCENE-3901.patch


 Many Japanese katakana words end in a long sound that is sometimes optional.
 For example, パーティー and パーティ are both perfectly valid for party.  Similarly 
 we have センター and センタ that are variants of center as well as サーバー and サーバ 
 for server.
 I'm proposing that we add a katakana stemmer that removes this long sound if 
 the terms are longer than a configurable length.  It's also possible to add 
 the variant as a synonym, but I think stemming is preferred from a ranking 
 point of view.
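
The proposed rule can be sketched in a few lines of Java. This is only an illustration of the description above, not the attached patch: the class name KatakanaStemSketch, the method names, and the choice to stem only terms made up entirely of katakana are assumptions. The long sound is the prolonged sound mark ー (U+30FC), written as a Unicode escape in the code.

```java
// Hypothetical sketch of the proposed stemming rule; not the LUCENE-3901 patch.
final class KatakanaStemSketch {

    /** U+30FC, the katakana-hiragana prolonged sound mark. */
    static final char PROLONGED_SOUND_MARK = '\u30FC';

    /**
     * Removes one trailing prolonged sound mark when the term is all katakana
     * and strictly longer than minimumLength; otherwise returns the term as is.
     */
    static String stem(String term, int minimumLength) {
        int length = term.length();
        if (length > 0
                && length > minimumLength
                && term.charAt(length - 1) == PROLONGED_SOUND_MARK
                && isKatakana(term)) {
            return term.substring(0, length - 1);
        }
        return term;
    }

    /** True when every char of the term is in the Unicode Katakana block. */
    static boolean isKatakana(String term) {
        for (int i = 0; i < term.length(); i++) {
            if (Character.UnicodeBlock.of(term.charAt(i))
                    != Character.UnicodeBlock.KATAKANA) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String party = "\u30D1\u30FC\u30C6\u30A3\u30FC"; // "party", 5 chars
        String copy = "\u30B3\u30D4\u30FC";              // "copy", 3 chars
        System.out.println(stem(party, 3)); // trailing mark removed
        System.out.println(stem(copy, 3));  // too short, left unchanged
    }
}
```

The length threshold is what keeps short words from being stemmed: with a minimum length of 3, the three-character コピー keeps its final ー, while パーティー, センター and サーバー lose theirs.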




[jira] [Commented] (LUCENE-3901) Add katakana stem filter to better deal with certain katakana spelling variants

2012-03-22 Thread Christian Moen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235877#comment-13235877
 ] 

Christian Moen commented on LUCENE-3901:


Thanks a lot, Robert.

I'll do some more testing and hopefully I can commit this to {{trunk}} and 
{{branch_3x}} tomorrow.

 Add katakana stem filter to better deal with certain katakana spelling 
 variants
 ---

 Key: LUCENE-3901
 URL: https://issues.apache.org/jira/browse/LUCENE-3901
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Christian Moen
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3901.patch, LUCENE-3901.patch


 Many Japanese katakana words end in a long sound that is sometimes optional.
 For example, パーティー and パーティ are both perfectly valid for party.  Similarly 
 we have センター and センタ that are variants of center as well as サーバー and サーバ 
 for server.
 I'm proposing that we add a katakana stemmer that removes this long sound if 
 the terms are longer than a configurable length.  It's also possible to add 
 the variant as a synonym, but I think stemming is preferred from a ranking 
 point of view.




trunk javadoc failures?

2012-03-22 Thread Chris Hostetter


I think there must be something wonky with the javadoc classpath (or 
whatever it's called in javadoc) on trunk when using the java 6 javadoc. 
I'm seeing solr/contrib/uima complain a lot about packages/files not 
existing when using ant javadoc (either at the top level or just in 
solr).


is anyone else seeing this?...


  [javadoc] Constructing Javadoc information...
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMAAnnotationsTokenizerFactory.java:21: 
package org.apache.lucene.analysis.uima does not exist
  [javadoc] import 
org.apache.lucene.analysis.uima.UIMAAnnotationsTokenizer;

  [javadoc]   ^
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMATypeAwareAnnotationsTokenizerFactory.java:21: 
package org.apache.lucene.analysis.uima does not exist
  [javadoc] import 
org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer;

  [javadoc]   ^
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/processor/UIMAUpdateRequestProcessor.java:26: 
package org.apache.lucene.analysis.uima.ae does not exist

  [javadoc] import org.apache.lucene.analysis.uima.ae.AEProvider;
  [javadoc]  ^
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/processor/UIMAUpdateRequestProcessor.java:27: 
package org.apache.lucene.analysis.uima.ae does not exist

  [javadoc] import org.apache.lucene.analysis.uima.ae.AEProviderFactory;
  [javadoc]  ^
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/processor/UIMAUpdateRequestProcessor.java:51: 
cannot find symbol

  [javadoc] symbol  : class AEProvider
  [javadoc] location: class 
org.apache.solr.uima.processor.UIMAUpdateRequestProcessor

  [javadoc]   private AEProvider aeProvider;
  [javadoc]   ^
  [javadoc] Standard Doclet version 1.6.0_24
  [javadoc] Building tree for all the packages and classes...
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMAAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: UIMAAnnotationsTokenizer
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMATypeAwareAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: 
UIMATypeAwareAnnotationsTokenizer
  [javadoc] Generating 
/home/hossman/lucene/dev/solr/build/docs/api/org/apache/solr/uima/processor/exception//FieldMappingException.html...
  [javadoc] Copying file 
/home/hossman/lucene/dev/solr/core/src/java/doc-files/tutorial.html to 
directory /home/hossman/lucene/dev/solr/build/docs/api/doc-files...
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMAAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: UIMAAnnotationsTokenizer
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMATypeAwareAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: 
UIMATypeAwareAnnotationsTokenizer
  [javadoc] Generating 
/home/hossman/lucene/dev/solr/build/docs/api/org/apache/solr/util//package-summary.html...
  [javadoc] Copying file 
/home/hossman/lucene/dev/solr/core/src/java/org/apache/solr/util/doc-files/min-should-match.html 
to directory 
/home/hossman/lucene/dev/solr/build/docs/api/org/apache/solr/util/doc-files...
  [javadoc] Generating 
/home/hossman/lucene/dev/solr/build/docs/api/serialized-form.html...
  [javadoc] Copying file 
/home/hossman/lucene/dev/lucene/tools/prettify/stylesheet+prettify.css to 
file 
/home/hossman/lucene/dev/solr/build/docs/api/stylesheet+prettify.css...
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMAAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: UIMAAnnotationsTokenizer
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMATypeAwareAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: 
UIMATypeAwareAnnotationsTokenizer
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMAAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: UIMAAnnotationsTokenizer
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMATypeAwareAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: 
UIMATypeAwareAnnotationsTokenizer

  [javadoc] Building index for all the packages and classes...
  [javadoc] 
/home/hossman/lucene/dev/solr/contrib/uima/src/java/org/apache/solr/uima/analysis/UIMAAnnotationsTokenizerFactory.java:30: 
warning - Tag @link: reference not found: 

RE: trunk javadoc failures?

2012-03-22 Thread Uwe Schindler
This is, as far as I remember, a bug in the build scripts. Building Javadocs
from inside a contrib seems to be broken...

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -----Original Message-----
 From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
 Sent: Thursday, March 22, 2012 8:09 PM
 To: Lucene Dev
 Subject: trunk javadoc failures?
 
 
 I think there must be something wonky with the javadoc classpath (or
 whatever it's called in javadoc) on trunk when using the java 6 javadoc.
 I'm seeing solr/contrib/uima complain a lot about packages/files not
existing
 when using ant javadoc (either at the top level or just in solr).
 
 is anyone else seeing this?...
 
 

[JENKINS] Lucene-Solr-tests-only-3.x - Build # 12843 - Still Failing

2012-03-22 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/12843/

All tests passed

Build Log (for compile errors):
[...truncated 22106 lines...]



