[jira] [Updated] (SOLR-3161) Use of 'qt' should be restricted to searching and should not start with a '/'
[ https://issues.apache.org/jira/browse/SOLR-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-3161: --- Attachment: SOLR-3161_handleSelect=false_and_register_slash-select_handler.patch Attached is a patch to solrconfig.xml with comments based on what Hoss suggested. I'll commit these to 4x and trunk in ~24 hours if no further comment is given. Use of 'qt' should be restricted to searching and should not start with a '/' - Key: SOLR-3161 URL: https://issues.apache.org/jira/browse/SOLR-3161 Project: Solr Issue Type: Improvement Components: search, web gui Reporter: David Smiley Assignee: David Smiley Fix For: 3.6, 4.0 Attachments: SOLR-3161-disable-qt-by-default.patch, SOLR-3161-dispatching-request-handler.patch, SOLR-3161-dispatching-request-handler.patch, SOLR-3161_handleSelect=false_and_register_slash-select_handler.patch, SOLR-3161_limit_qt=_to_refer_to_SearchHandlers,_and_shards_qt_likewise.patch, SOLR-3161_make_the_slash-select_request_handler_the_default.patch I haven't yet looked at the code involved for suggestions here; I'm speaking based on how I think things should work and not work, based on intuitiveness and security. In general I feel it is best practice to use '/'-leading request handler names and not use qt, but I don't hate it enough when used in limited (search-only) circumstances to propose its demise. But if someone proposes its deprecation, then I am +1 for that. Here is my proposal:
* Solr should error if the parameter qt is supplied with a leading '/'. (trunk only)
* Solr should only honor qt if the target request handler extends solr.SearchHandler.
* The new admin UI should only use 'qt' when it has to. For the query screen, it could present a little pop-up menu of handlers to choose from, including /select?qt=mycustom for handlers that aren't named with a leading '/'. This choice should be positioned at the top.
And before I forget, I or someone else should investigate whether there are any similar security problems with the shards.qt parameter. Perhaps shards.qt can abide by the same rules outlined above. Does anyone foresee any problems with this proposal? On a related subject, I think the notion of a default request handler is bad - the default=true thing. Honestly I'm not sure what it does, since I noticed Solr trunk redirects '/solr/' to the new admin UI at '/solr/#/'. Assuming it doesn't do anything useful anymore, I think it would be clearer to use <requestHandler name="/select" class="solr.SearchHandler"> instead of what's there now. The delta is to put the leading '/' on this request handler name, and remove the default attribute. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
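The dispatch rules proposed above can be sketched as a small predicate. Everything here is a hypothetical illustration of the proposal, not Solr's actual dispatch code; the class and method names are invented.

```java
// Hypothetical sketch of the proposed 'qt' rules: reject a leading '/',
// and only honor qt when the target handler is a SearchHandler.
public class QtValidator {

    /** Returns true if the given qt value should be honored under the proposal. */
    public static boolean isAllowed(String qt, boolean targetIsSearchHandler) {
        if (qt == null || qt.isEmpty()) {
            return true; // no qt supplied: nothing to restrict
        }
        if (qt.startsWith("/")) {
            return false; // proposal: error out on a leading '/'
        }
        return targetIsSearchHandler; // only honor qt for SearchHandler subclasses
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("spellcheck", true));  // allowed
        System.out.println(isAllowed("/update", true));     // rejected: leading '/'
        System.out.println(isAllowed("custom", false));     // rejected: not a SearchHandler
    }
}
```

Under these rules, /select?qt=mycustom still works for search handlers registered without a leading '/', while qt can no longer reach update or admin handlers.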
Re: Welcome Adrien Grand as a new Lucene/Solr committer
Welcome! On Fri, Jun 8, 2012 at 5:47 PM, Sami Siren ssi...@gmail.com wrote: Welcome Adrien! -- Sami Siren On Thu, Jun 7, 2012 at 9:11 PM, Michael McCandless luc...@mikemccandless.com wrote: I'm pleased to announce that Adrien Grand has joined our ranks as a committer. He has been contributing various patches to Lucene/Solr, recently to Lucene's packed ints implementation, giving a nice performance gain in some cases. For example check out http://people.apache.org/~mikemccand/lucenebench/TermTitleSort.html (look for annotation U). Adrien, it's tradition that you introduce yourself with a brief bio. As soon as your SVN access is set up, you should then be able to add yourself to the committers list on the website as well. Congratulations! Mike McCandless http://blog.mikemccandless.com -- Chris Male | Software Developer | DutchWorks | www.dutchworks.nl
[jira] [Updated] (SOLR-3177) Excluding tagged filter in StatsComponent
[ https://issues.apache.org/jira/browse/SOLR-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mathias H. updated SOLR-3177: - Affects Version/s: 4.0 Excluding tagged filter in StatsComponent - Key: SOLR-3177 URL: https://issues.apache.org/jira/browse/SOLR-3177 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 3.5, 3.6, 4.0 Reporter: Mathias H. Priority: Minor Labels: localparams, stats, statscomponent It would be useful to exclude the effects of some fq params from the set of documents used to compute stats -- similar to how you can exclude tagged filters when generating facet counts: https://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters So that it's possible to do something like this: http://localhost:8983/solr/select?fq={!tag=priceFilter}price:[1 TO 20]&q=*:*&stats=true&stats.field={!ex=priceFilter}price If you want to create a price slider this is very useful, because then you can filter the price ([1 TO 20]) and nevertheless get the lower and upper bound of the unfiltered price (min=0, max=100): {noformat} |-[---]--| $0 $1 $20 $100 {noformat}
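As a rough illustration of the request being proposed, here is how such a URL could be assembled. The helper class and method are made up for illustration; only the {!tag=...}/{!ex=...} local-params syntax and the parameter names come from the issue.

```java
// Illustrative only: builds the stats-with-filter-exclusion request described
// above. StatsExcludeUrl is an invented name, not part of Solr or SolrJ.
public class StatsExcludeUrl {

    public static String build(String host, String filterTag, String field, String range) {
        return host + "/solr/select"
            + "?q=*:*"
            + "&fq={!tag=" + filterTag + "}" + field + ":" + range   // tag the filter
            + "&stats=true"
            + "&stats.field={!ex=" + filterTag + "}" + field;        // exclude it for stats
    }

    public static void main(String[] args) {
        // Stats for 'price' would be computed as if the price filter were absent:
        System.out.println(build("http://localhost:8983", "priceFilter", "price", "[1 TO 20]"));
    }
}
```

(In a real request the local-params braces and spaces would be URL-encoded; they are left readable here.)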
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291600#comment-13291600 ] Simon Willnauer commented on LUCENE-4087: - After talking about this to other committers during the conference, I think this is really a bit more controversial than it seemed. Except for the DocValues behavior, all of this is pre-existing behavior. The discussion is similar to changing norms through IR, and removing that capability did bring up some hard discussions. Yet, I think we should only solve the DocValues issue in the least intrusive way and discuss the omitNorms/IndexOptions behavior in a different issue. If we make all of this throw exceptions, we almost introduce a schema here, which makes Lucene 4.0 very different in terms of RT behavior compared to previous versions. Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Currently IW fails late and inconsistently on illegal field metadata changes, like an already defined DocValues type or un-omitting norms. We can approach this similarly to how we handle consistent field numbers and:
* throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge
* same with omitNorms
* same with norms types and docvalues types
* still keeping field numbers consistent
This way we could eliminate all these traps and just give an exception instead.
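The fail-fast idea in the issue can be sketched in miniature: record a field's options the first time it is declared, and throw on any later conflicting declaration instead of silently merging it away. The registry class and its Options enum are invented stand-ins, not Lucene's FieldInfos.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of "fail fast on conflicting field metadata". Names are
// illustrative; Lucene's real bookkeeping lives in FieldInfos/IndexWriter.
public class FieldMetadataRegistry {
    public enum Options { DOCS_ONLY, DOCS_AND_FREQS, DOCS_AND_FREQS_AND_POSITIONS }

    private final Map<String, Options> seen = new HashMap<>();

    public void declare(String field, Options options) {
        Options previous = seen.putIfAbsent(field, options);
        if (previous != null && previous != options) {
            // Instead of dropping positions on merge, reject the change up front:
            throw new IllegalArgumentException("field \"" + field + "\" already declared with "
                + previous + ", cannot redeclare as " + options);
        }
    }

    public static void main(String[] args) {
        FieldMetadataRegistry reg = new FieldMetadataRegistry();
        reg.declare("body", Options.DOCS_AND_FREQS_AND_POSITIONS);
        reg.declare("body", Options.DOCS_AND_FREQS_AND_POSITIONS); // same options: fine
        try {
            reg.declare("body", Options.DOCS_ONLY); // conflict: throws
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```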
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291635#comment-13291635 ] Christian Moen commented on SOLR-3524: -- Hiraga-san, there are different views on how punctuation characters are best handled by tokenizers. Punctuation characters generally don't convey much meaning useful for text search, so they are generally removed in Lucene. (A different point of view is that tokenizers shouldn't remove punctuation and that filters should do this.) The ability to keep punctuation was left as an expert feature in JapaneseTokenizer, and I think we can expose this as an expert feature in Solr as well. Could you share some details on your use-case, just so that I get a better idea of the background and importance of this? Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like to have an option so that I can configure this behavior in the fieldtype definition in schema.xml.
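What the issue asks for amounts to reading one optional boolean attribute in the factory and defaulting to the current behavior. A minimal sketch, assuming a schema.xml attribute named discardPunctuation (the attribute name and helper class are illustrative, not necessarily what the patch uses):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of exposing discard-punctuation as a factory init argument. The real
// JapaneseTokenizerFactory receives its args from the fieldtype definition in
// schema.xml; this standalone helper only demonstrates the lookup and default.
public class DiscardPunctuationArg {

    /** Default is true, matching the current behavior: punctuation is removed. */
    public static boolean discardPunctuation(Map<String, String> args) {
        String v = args.get("discardPunctuation");
        return v == null ? true : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Map<String, String> cfg = new HashMap<>();
        System.out.println(discardPunctuation(cfg));  // default: discard
        cfg.put("discardPunctuation", "false");
        System.out.println(discardPunctuation(cfg));  // expert option: keep punctuation
    }
}
```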
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291637#comment-13291637 ] Robert Muir commented on SOLR-3520: --- +1, it's dangerous when these don't have tests: there could be very simple bugs or patches in the future that break things and we won't notice. We should also keep an eye on https://builds.apache.org/job/Solr-trunk/clover/org/apache/solr/analysis/pkg-summary.html which makes it very easy to see which ones are missing tests. Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} don't have any tests and it would be good to have some.
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291639#comment-13291639 ] Christian Moen commented on SOLR-3520: -- Thanks, Robert. We have them in Lucene, but not adding some for Solr was an oversight on my part. Very good idea to keep an eye on the Clover reports. I'll commit this shortly. Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} don't have any tests and it would be good to have some.
[jira] [Created] (LUCENE-4120) FST should use packed integer arrays
Adrien Grand created LUCENE-4120: Summary: FST should use packed integer arrays Key: LUCENE-4120 URL: https://issues.apache.org/jira/browse/LUCENE-4120 Project: Lucene - Java Issue Type: Improvement Components: core/FSTs Affects Versions: 5.0 Reporter: Adrien Grand Priority: Minor Fix For: 5.0 There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of: * FST.nodeAddress (GrowableWriter) * FST.inCounts (GrowableWriter) * FST.nodeRefToAddress (read-only Reader) The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}.
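To see why replacing an int[] with packed storage helps, here is a minimal fixed-bit packed array: n values of b bits each are stored in ceil(n*b/64) longs instead of n ints. This is a simplified stand-in written for illustration, not Lucene's actual PackedInts implementation.

```java
// Minimal packed integer array: values of bitsPerValue bits (1..63) are laid
// out back-to-back in long blocks; a value may straddle two adjacent longs.
public class PackedIntArray {
    private final long[] blocks;
    private final int bitsPerValue;

    public PackedIntArray(int size, int bitsPerValue) {
        this.bitsPerValue = bitsPerValue;
        this.blocks = new long[(size * bitsPerValue + 63) / 64];
    }

    public void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long mask = (1L << bitsPerValue) - 1;
        blocks[block] = (blocks[block] & ~(mask << shift)) | ((value & mask) << shift);
        int spill = shift + bitsPerValue - 64;       // bits that overflow into the next long
        if (spill > 0) {
            long highMask = (1L << spill) - 1;
            blocks[block + 1] = (blocks[block + 1] & ~highMask)
                | ((value & mask) >>> (bitsPerValue - spill));
        }
    }

    public long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long mask = (1L << bitsPerValue) - 1;
        long value = (blocks[block] >>> shift) & mask;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            value |= (blocks[block + 1] & ((1L << spill) - 1)) << (bitsPerValue - spill);
        }
        return value;
    }

    public static void main(String[] args) {
        PackedIntArray arr = new PackedIntArray(100, 20); // 100 values of 20 bits each
        for (int i = 0; i < 100; i++) arr.set(i, i * 7919L);
        for (int i = 0; i < 100; i++) {
            if (arr.get(i) != i * 7919L) throw new AssertionError("mismatch at " + i);
        }
        System.out.println("100 x 20-bit values stored in " + arr.blocks.length + " longs");
    }
}
```

For 20-bit values this uses 32 longs (256 bytes) where an int[100] needs 400 bytes, and the gap widens as values shrink relative to 32 bits.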
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291643#comment-13291643 ] Christian Moen commented on SOLR-3524: -- Ohtani-san, thanks for the patch! I've tried it on {{trunk}} and applying it fails because an {{InitializationException}} is thrown instead of a {{SolrException}}. I'll correct this shortly. We also need some tests here... Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like to have an option so that I can configure this behavior in the fieldtype definition in schema.xml.
[jira] [Created] (SOLR-3525) Per-field similarity should display used impl. in debug output broken
Markus Jelsma created SOLR-3525: --- Summary: Per-field similarity should display used impl. in debug output broken Key: SOLR-3525 URL: https://issues.apache.org/jira/browse/SOLR-3525 Project: Solr Issue Type: Bug Components: search Affects Versions: 4.0 Reporter: Markus Jelsma Priority: Minor Fix For: 4.0 When using per-field similarity, debugQuery should display the used similarity implementation for each match. Right now it's broken and displays empty brackets: 112.33515 = (MATCH) weight(content:blah in 273) [], result of:
[jira] [Created] (SOLR-3526) Remove classfile dependency on ZooKeeper from CoreContainer
Michael Froh created SOLR-3526: -- Summary: Remove classfile dependency on ZooKeeper from CoreContainer Key: SOLR-3526 URL: https://issues.apache.org/jira/browse/SOLR-3526 Project: Solr Issue Type: Wish Components: SolrCloud Affects Versions: 4.0 Reporter: Michael Froh We are using Solr as a library embedded within an existing application, and are currently developing toward using 4.0 when it is released. We are currently instantiating SolrCores with null CoreDescriptors (and hence no CoreContainer), since we don't need SolrCloud functionality (and do not want to depend on ZooKeeper). A couple of months ago, SearchHandler was modified to try to retrieve a ShardHandlerFactory from the CoreContainer. I was able to work around this by specifying a dummy ShardHandlerFactory in the config. Now UpdateRequestProcessorChain is inserting a DistributedUpdateProcessor into my chains, again triggering an NPE when trying to dereference the CoreDescriptor. I would happily place the SolrCores in CoreContainers, except that CoreContainer imports and references org.apache.zookeeper.KeeperException, which we do not have (and do not want) in our classpath. Therefore, I get a ClassNotFoundException when loading the CoreContainer class. Ideally (IMHO), ZkController should isolate the ZooKeeper dependency, and simply rethrow KeeperExceptions as org.apache.solr.common.cloud.ZooKeeperException (or some Solr-hosted checked exception). Then CoreContainer could remove the offending import/references.
[jira] [Commented] (SOLR-3525) Per-field similarity should display used impl. in debug output broken
[ https://issues.apache.org/jira/browse/SOLR-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291651#comment-13291651 ] Robert Muir commented on SOLR-3525: --- note: it's an impl detail of PerFieldSimilarityWrapper that it does different things for different fields. The reason you probably get blank brackets is because the weight uses "[" + similarity.getClass().getSimpleName() + "]". In the Solr case this is an anonymous class. If we want to keep this (I just added it for debugging; we could also just remove it), it would probably be better to instead print the class of what's scoring the documents: so you would see ExactBM25DocScorer or SloppyTFIDFDocScorer. Per-field similarity should display used impl. in debug output broken - Key: SOLR-3525 URL: https://issues.apache.org/jira/browse/SOLR-3525 Project: Solr Issue Type: Bug Components: search Affects Versions: 4.0 Reporter: Markus Jelsma Priority: Minor Fix For: 4.0 When using per-field similarity, debugQuery should display the used similarity implementation for each match. Right now it's broken and displays empty brackets: 112.33515 = (MATCH) weight(content:blah in 273) [], result of:
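The empty "[]" falls out of a Java corner case: Class.getSimpleName() returns the empty string for anonymous classes. A tiny demo of the behavior Robert describes (the Similarity interface here is a stand-in, not Lucene's actual class):

```java
// Demonstrates why the debug tag collapses to "[]" when the similarity is an
// anonymous class: getSimpleName() of an anonymous class is the empty string.
public class AnonymousClassName {
    public interface Similarity { float score(); }

    /** Mimics the tag the weight builds: "[" + simple class name + "]". */
    public static String explainTag(Similarity sim) {
        return "[" + sim.getClass().getSimpleName() + "]";
    }

    public static void main(String[] args) {
        Similarity anonymous = new Similarity() {
            public float score() { return 1f; }
        };
        // Anonymous classes have no simple name, so this prints "[]":
        System.out.println(explainTag(anonymous));
    }
}
```

Printing the name of the concrete scorer class instead, as suggested, sidesteps the problem because those scorers are ordinary named classes.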
[jira] [Assigned] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand reassigned LUCENE-4120: Assignee: Adrien Grand FST should use packed integer arrays Key: LUCENE-4120 URL: https://issues.apache.org/jira/browse/LUCENE-4120 Project: Lucene - Java Issue Type: Improvement Components: core/FSTs Affects Versions: 5.0 Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of: * FST.nodeAddress (GrowableWriter) * FST.inCounts (GrowableWriter) * FST.nodeRefToAddress (read-only Reader) The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}.
[JENKINS] Solr-trunk - Build # 1879 - Failure
Build: https://builds.apache.org/job/Solr-trunk/1879/ 3 tests failed. REGRESSION: org.apache.solr.cloud.FullSolrCloudTest.testDistribSearch Error Message: Timeout occured while waiting response from server at: http://localhost:13109/solr/collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://localhost:13109/solr/collection1 at __randomizedtesting.SeedInfo.seed([CCF7D390CC98B64:8D29F3217B96EB58]:0) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:405) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117) at org.apache.solr.cloud.FullSolrCloudTest.index_specific(FullSolrCloudTest.java:498) at org.apache.solr.cloud.FullSolrCloudTest.brindDownShardIndexSomeDocsAndRecover(FullSolrCloudTest.java:713) at org.apache.solr.cloud.FullSolrCloudTest.doTest(FullSolrCloudTest.java:550) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:680) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at
Packaging dependencies for Ivy and license validation
Hi- I would like to get the OpenNLP project's packages in shape so that the Lucene build will accept them. What has to be done in a third-party package for the license validation to pass? Is there a Maven cheatsheet somewhere for the right pom.xml snippet? -- Lance Norskog goks...@gmail.com
[jira] [Commented] (SOLR-3178) Versioning - optimistic locking
[ https://issues.apache.org/jira/browse/SOLR-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291668#comment-13291668 ] Per Steffensen commented on SOLR-3178: -- {quote} Regarding error handling, I tracked down the original issue: SOLR-445 {quote} Yes, SOLR-445 is solved by my patch - the nice way. On certain kinds of errors (PartialError subclasses) during the handling of a particular document in a multi-document/batch update, the processing of subsequent documents will continue. The client will receive a response describing all errors (wrapped in PartialErrors) that happened during the processing of the entire update request (multi-document/batch). Please have a look at http://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics#Multi_document_updates {quote} It's just a guess, but I think it unlikely any committers would feel comfortable tackling this big patch, or even have time to understand all of the different aspects. They may agree with some parts but disagree with other parts {quote} Of course that is up to you, but I believe Solr has a problem being a real Open Source project receiving contributions from many semi-related organisations around the world if you do not trust your test suite. Basically, when taking in a patch a committer does not need to understand everything down to every detail.
It should be enough (if you trust your test suite) to:
* Verify that all existing tests are still green - and haven't been hacked
* Verify that all new tests seem to be meaningful and cover the features described in the corresponding Jira (and in my case the associated Wiki page), indicating that the new features are useful and well tested
* Scan through the new code to see if it is breaking any design principles etc., and in general if it seems to be doing the right thing the right way
As long as a patch does not break any existing functionality, and seems to bring nice new functionality (you should be able to see that from the added tests), a patch cannot be that harmful - you can always refactor if you realize that you disagree with some parts. It all depends on trusting your test suite. Don't you agree, in principle at least? Regards, Per Steffensen Versioning - optimistic locking --- Key: SOLR-3178 URL: https://issues.apache.org/jira/browse/SOLR-3178 Project: Solr Issue Type: New Feature Components: update Affects Versions: 3.5 Environment: All Reporter: Per Steffensen Assignee: Per Steffensen Labels: RDBMS, insert, locking, nosql, optimistic, uniqueKey, update, versioning Fix For: 4.0 Attachments: SOLR-3173_3178_3382_3428_plus.patch, SOLR-3178.patch, SOLR_3173_3178_3382_plus.patch Original Estimate: 168h Remaining Estimate: 168h In order to increase the ability of Solr to be used as a NoSql database (lots of concurrent inserts, updates, deletes and queries in the entire lifetime of the index) instead of just a search index (first: everything indexed (in one thread), after: only queries), I would like Solr to support versioning to be used for optimistic locking. When my intent (see SOLR-3173) is to update an existing document, I will need to provide a version-number equal to the version number I got when I fetched the existing document for update plus one.
If this provided version-number does not correspond to the newest version-number of that document at the time of update plus one, I will get a VersionConflict error. If it does correspond, the document will be updated with the new one, so that the newest version-number of that document is NOW one higher than before the update. Correct but efficient concurrency handling. When my intent (see SOLR-3173) is to insert a new document, the version number provided will not be used - instead a version-number 0 will be used. According to SOLR-3173, insert will only succeed if a document with the same value on the uniqueKey-field does not already exist. In general, when talking about different versions of the same document, of course we need to be able to identify when a document is the same - that, per definition, is when the values of the uniqueKey-fields are equal. The functionality provided by this issue is only really meaningful when you run with updateLog activated. This issue might be solved more or less at the same time as SOLR-3173, and only one single SVN patch might be given to cover both issues.
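The version-check semantics described above can be sketched in a few lines: an update must carry the last-seen version plus one, and an insert only succeeds for a previously unseen uniqueKey. The store and method names below are invented for illustration; Solr's real implementation lives in the update processor chain and the updateLog.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the proposed optimistic-locking rules, keyed by uniqueKey.
public class OptimisticStore {
    private final Map<String, Long> versions = new HashMap<>();

    /** Insert: succeeds only if no document with this uniqueKey exists yet. */
    public synchronized boolean insert(String uniqueKey) {
        return versions.putIfAbsent(uniqueKey, 1L) == null;
    }

    /** Update: succeeds only if providedVersion == currentVersion + 1,
     *  i.e. the client saw the newest version when it fetched the document. */
    public synchronized boolean update(String uniqueKey, long providedVersion) {
        Long current = versions.get(uniqueKey);
        if (current == null || providedVersion != current + 1) {
            return false; // the proposal's VersionConflict error
        }
        versions.put(uniqueKey, providedVersion);
        return true;
    }

    public synchronized long version(String uniqueKey) {
        return versions.getOrDefault(uniqueKey, 0L);
    }

    public static void main(String[] args) {
        OptimisticStore store = new OptimisticStore();
        System.out.println(store.insert("doc1"));    // new document: succeeds
        System.out.println(store.insert("doc1"));    // duplicate key: fails
        System.out.println(store.update("doc1", 2)); // 1 + 1: succeeds
        System.out.println(store.update("doc1", 2)); // stale version: conflict
    }
}
```

Two clients that fetched the same version race to update; exactly one wins and the other gets a conflict it must resolve by re-fetching.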
Confusing method names to get the size of objects
Hi, Lucene and Solr have a few classes that expose the size of their instances, but with different method names. There are at least ramBytesUsed (packed ints), sizeInBytes (FST, RamDirectory) and memSize (Solr DocSets) that provide an estimation of the memory used in bytes. The confusing thing is that sizeInBytes is sometimes also used for on-disk sizes (SegmentInfo for example). I think it would improve readability to stick to only two method names, one for the in-memory size and one for the on-disk size. Or maybe these methods have different meanings that I am missing? What do you think? -- Adrien
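The suggestion amounts to putting the two size concepts behind two named contracts. A minimal sketch of what that could look like; the interface names and the 16-byte header estimate are invented for illustration, not what Lucene actually adopted.

```java
// Sketch: one interface per size concept, so a class can never accidentally
// reuse the same method name for heap bytes and on-disk bytes.
public class Accounting {
    interface MemoryAccountable { long ramBytesUsed(); }
    interface DiskAccountable { long diskBytesUsed(); }

    /** Example implementor: a long[] wrapper reporting its estimated heap cost. */
    static class LongBuffer implements MemoryAccountable {
        private final long[] data;
        LongBuffer(int size) { this.data = new long[size]; }
        public long ramBytesUsed() {
            return 16 /* rough object/array header estimate */ + 8L * data.length;
        }
    }

    public static void main(String[] args) {
        MemoryAccountable buf = new LongBuffer(1024);
        System.out.println(buf.ramBytesUsed() + " bytes on heap");
    }
}
```

With such contracts, callers summing cache sizes can accept MemoryAccountable and never pick up an on-disk figure by mistake.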
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291670#comment-13291670 ] Robert Muir commented on LUCENE-4115: - Another possibility (didn't investigate if it has options that would work for us) is the sync=true option for retrieve: http://ant.apache.org/ivy/history/trunk/use/retrieve.html Just at a glance there could be some problems: sha1/license/notice files, and solr/lib which is 'shared' across solrj and core dependencies. But maybe we could still utilize this... JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve. - Key: LUCENE-4115 URL: https://issues.apache.org/jira/browse/LUCENE-4115 Project: Lucene - Java Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Trivial Fix For: 4.0, 5.0 Attachments: LUCENE-4111.patch I think we should add the following target deps: ant clean [depends on] clean-jars ant resolve [depends on] clean-jars ant eclipse [depends on] resolve, clean-jars ant idea [depends on] resolve, clean-jars This eliminates the need to remember about cleaning up stale jars, which users complain about (and I think they're right about it). The overhead will be minimal since resolve is only going to copy jars from the cache. Eclipse won't have a problem with updated JARs if they end up at the same location. If there are no objections I will fix this in a few hours.
[jira] [Commented] (LUCENE-4119) SegmentInfoFormat.getSegmentInfos{Reader,Writer} should be singular
[ https://issues.apache.org/jira/browse/LUCENE-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291671#comment-13291671 ] Robert Muir commented on LUCENE-4119: - Thanks for cleaning this up! SegmentInfoFormat.getSegmentInfos{Reader,Writer} should be singular --- Key: LUCENE-4119 URL: https://issues.apache.org/jira/browse/LUCENE-4119 Project: Lucene - Java Issue Type: Bug Components: core/codecs Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 Attachments: LUCENE-4119.patch Left-over from SegmentInfos refactoring. The name should be singular, we don't have SegmentInfosWriter/Reader anymore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-861) SOLRJ Client does not release connections 'nicely' by default
[ https://issues.apache.org/jira/browse/SOLR-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291678#comment-13291678 ] Sami Siren commented on SOLR-861: - bq. it's not clear to me if this has already been addressed by the new client in SOLR-2020 - can you please triage for 4.0? I have not done anything specific to address this issue. Since opening this issue a shutdown() method was added in HttpSolrServer that should take care of releasing the resources, if that's not working then there's a bug. SOLRJ Client does not release connections 'nicely' by default - Key: SOLR-861 URL: https://issues.apache.org/jira/browse/SOLR-861 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.3 Environment: linux Reporter: Ian Holsman Assignee: Sami Siren Fix For: 4.0 Attachments: SimpleClient.patch as-is the SolrJ Commons HttpServer uses the multi-threaded http connection manager. This manager seems to keep the connection alive for the client and does not close it when the object is dereferenced. When you keep on opening new CommonsHttpSolrServer instances it results in a socket that is stuck in the CLOSE_WAIT state. Eventually this will use up all your available file handles, causing your client to die a painful death. The solution I propose is that it uses a 'Simple' HttpConnectionManager which is set to not reuse connections if you don't specify a HttpClient. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291681#comment-13291681 ] Dawid Weiss commented on LUCENE-4115: - I've checked that -- sync on retrieve deletes everything from a folder (there is no exclusion pattern to be applied). Besides, it won't solve the locking problem on Windows (assuming something keeps a lock on a jar to be deleted, it'd fail anyway). A truly nice solution would be to revisit the issue so that classpaths are constructed against the ivy cache directly (they're always correct then) and copying is only used for packaging.
[jira] [Commented] (SOLR-2526) Grouping on multiple fields
[ https://issues.apache.org/jira/browse/SOLR-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291698#comment-13291698 ] Olav Frengstad commented on SOLR-2526: -- What's the status of this? As [LUCENE-3099] and [LUCENE-2883] are fixed, what would it take to fix this? I would gladly try implementing this; any pointers on where to start would be appreciated. Grouping on multiple fields --- Key: SOLR-2526 URL: https://issues.apache.org/jira/browse/SOLR-2526 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.0 Reporter: Arian Karbasi Priority: Minor Grouping on multiple fields and/or ranges should be an option, e.g. (X,Y) groupings.
[jira] [Comment Edited] (SOLR-3178) Versioning - optimistic locking
[ https://issues.apache.org/jira/browse/SOLR-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291668#comment-13291668 ] Per Steffensen edited comment on SOLR-3178 at 6/8/12 11:13 AM: --- {quote} Regarding error handling, I tracked down the original issue: SOLR-445 {quote} Yes, SOLR-445 is solved by my patch - the nice way. On certain kinds of errors (PartialError subclasses) during the handling of a particular document in a multidocument/batch update, the processing of subsequent documents will continue. The client will receive a response describing all errors (wrapped in PartialErrors) that happened during the processing of the entire update request (multidocument/batch). Please have a look at http://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics#Multi_document_updates {quote} It's just a guess, but I think it unlikely any committers would feel comfortable tackling this big patch, or even have time to understand all of the different aspects. They may agree with some parts but disagree with other parts {quote} Of course that is up to you, but I believe Solr has a problem being a real open-source project receiving contributions from many semi-related organisations around the world if you do not trust your test suite. Basically, when taking in a patch, a committer does not need to understand everything down to every detail. It should be enough (if you trust your test suite) to
* Verify that all existing tests are still green - and haven't been hacked
* Verify that all new tests seem to be meaningful and cover the features described in the corresponding Jira (and in my case the associated Wiki page), indicating that the new features are useful and well tested (in order to be able to trust that the test suite will reveal if future commits ruin this new feature)
* Scan through the new code to see if it breaks any design principles etc., and in general whether it seems to be doing the right thing the right way
As long as a patch does not break any existing functionality, and seems to bring nice new functionality (you should be able to see that from the added tests), a patch cannot be that harmful - you can always refactor if you realize that you disagree with some parts. It all depends on trusting your test suite. Don't you agree, in principle at least? Regards, Per Steffensen Versioning - optimistic locking --- Key: SOLR-3178 URL: https://issues.apache.org/jira/browse/SOLR-3178 Project: Solr Issue Type: New Feature Components: update Affects Versions: 3.5 Environment: All Reporter: Per Steffensen Assignee: Per Steffensen Labels:
[jira] [Commented] (SOLR-2352) TermVectorComponent fails with Undefined Field errors for score, *, or any Solr 4x pseudo-fields used in the fl param.
[ https://issues.apache.org/jira/browse/SOLR-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291726#comment-13291726 ] Robert Muir commented on SOLR-2352: --- {quote} ...the last item seemingly a relic from when the code used to use the TermVectorMapper interface to walk the vectors of the various fields, and used different code paths depending on whether all fields were requested, or just specific ones. {quote} I didn't look at the patch, or the issue, but maybe in the case where only specific fields are returned you could just wrap the Fields returned by getTermVectors with a FilteredFields so you only have one codepath: http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene/index/FieldFilterAtomicReader.java TermVectorComponent fails with Undefined Field errors for score, *, or any Solr 4x pseudo-fields used in the fl param. -- Key: SOLR-2352 URL: https://issues.apache.org/jira/browse/SOLR-2352 Project: Solr Issue Type: Bug Components: SearchComponents - other Affects Versions: 3.1 Environment: Ubuntu 10.04/Arch solr 3.x branch r1058326 Reporter: Jed Glazner Assignee: Hoss Man Fix For: 4.0 Attachments: SOLR-2352.patch When searching using the term vector component and setting fl=*,score the result is an HTTP 400 error 'undefined field: *'. If you disable the TVC the search works properly. Example bad request... {code}http://localhost:8983/solr/select/?qt=tvrh&q=includes:[*+TO+*]&fl=*{code} 3.1 stack trace: {noformat} SEVERE: org.apache.solr.common.SolrException: undefined field: * at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142) ... {noformat} The workaround is to explicitly use the tv.fl param when using pseudo-fields in the fl... {code}http://localhost:8983/solr/select/?qt=tvrh&q=includes:[*+TO+*]&fl=*&tv.fl=includes{code} -- This message is automatically generated by JIRA.
Re: Confusing method names to get the size of objects
+1 to standardize on two names. It is confusing now! Mike McCandless http://blog.mikemccandless.com
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291742#comment-13291742 ] Michael McCandless commented on LUCENE-4087: I think this is a good baby step for 4.0. But I think it's important the javadocs make it clear that if you change up the DV type for a given field, the behavior is undefined and we are free to improve it in the future. Ideally I think apps should get clear exceptions on attempting to index a doc with an incompatible change to anything that is our effective schema (omitNorms, indexOptions, DV types, etc.). For example, if a given field already omits norms and you try to add a doc with that field not omitting norms, you should get a clear exception (it can only be an app bug, because on merge the norms will silently go away). But let's open a separate issue for that... Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Attachments: LUCENE-4087.patch Currently IW fails late and inconsistently on illegal field metadata changes, like redefining an already defined DocValues type or un-omitting norms. We can approach this similarly to how we handle consistent field numbers and: * throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge * same with omitNorms * same with norms types and docvalues types * still keeping field numbers consistent This way we could eliminate all these traps and just give an exception instead. -- This message is automatically generated by JIRA.
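The fail-fast behavior discussed in LUCENE-4087 can be illustrated with a small self-contained sketch: record each field's options the first time the field is seen, and throw a clear exception when a later document disagrees, instead of silently losing data on merge. The class and method names here are hypothetical, not Lucene's actual FieldInfos code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of "throw an exception on conflicting field metadata".
// Hypothetical names; not Lucene's implementation.
class FieldSchemaTracker {
    static final class Options {
        final boolean omitNorms;
        final String docValuesType; // null = no doc values
        Options(boolean omitNorms, String docValuesType) {
            this.omitNorms = omitNorms;
            this.docValuesType = docValuesType;
        }
        boolean conflictsWith(Options other) {
            return omitNorms != other.omitNorms
                || (docValuesType == null ? other.docValuesType != null
                                          : !docValuesType.equals(other.docValuesType));
        }
    }

    private final Map<String, Options> seen = new HashMap<>();

    /** Called once per indexed field occurrence; fails fast on an incompatible change. */
    void check(String field, Options options) {
        Options prior = seen.putIfAbsent(field, options);
        if (prior != null && prior.conflictsWith(options)) {
            throw new IllegalArgumentException("inconsistent options for field: " + field);
        }
    }
}
```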
[JENKINS] Lucene-Solr-4.x-Windows-Java7-64 - Build # 18 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java7-64/18/ 1 tests failed. REGRESSION: org.apache.solr.spelling.suggest.SuggesterTSTTest.testReload Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([476D2EB48E0F2244:809D56B7444CDA56]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:459) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:426) at org.apache.solr.spelling.suggest.SuggesterTest.testReload(SuggesterTest.java:91) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='spellcheck']/lst[@name='suggestions']/lst[@name='ac']/int[@name='numFound'][.='2'] xml response was:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst><lst name="spellcheck"><lst name="suggestions"/></lst>
</response>
request was: q=ac&spellcheck.count=2&qt=/suggest_tst&spellcheck.onlyMorePopular=true at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:452) ...
[jira] [Commented] (LUCENE-4120) FST should use packed integer arrays
[ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291747#comment-13291747 ] Michael McCandless commented on LUCENE-4120: +1 FST should use packed integer arrays Key: LUCENE-4120 URL: https://issues.apache.org/jira/browse/LUCENE-4120 Project: Lucene - Java Issue Type: Improvement Components: core/FSTs Affects Versions: 5.0 Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 There are some places where an int[] could be advantageously replaced with a packed integer array. I am thinking (at least) of: * FST.nodeAddress (GrowableWriter) * FST.inCounts (GrowableWriter) * FST.nodeRefToAddress (read-only Reader) The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
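To illustrate why replacing an int[]/long[] with a packed integer array saves memory, here is a toy fixed-bit-width packer that stores n-bit values back to back inside a long[]. This is a simplified sketch of the idea only; Lucene's PackedInts is considerably more elaborate (multiple storage strategies, serialization support, etc.):

```java
// Toy fixed-bit-width packed array: valueCount values of bitsPerValue bits
// each, stored contiguously in a long[] (so 20-bit values cost 20 bits, not 64).
// Illustration only - not Lucene's PackedInts implementation.
class PackedLongArray {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    PackedLongArray(int valueCount, int bitsPerValue) {
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        // round up to the number of 64-bit blocks needed
        this.blocks = new long[(valueCount * bitsPerValue + 63) / 64];
    }

    void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);  // which long
        int shift = (int) (bitPos & 63);   // offset within that long
        blocks[block] |= (value & mask) << shift;
        int spill = shift + bitsPerValue - 64; // bits overflowing into the next block
        if (spill > 0) {
            blocks[block + 1] |= (value & mask) >>> (bitsPerValue - spill);
        }
    }

    long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            value |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return value & mask;
    }
}
```

For 20-bit values this needs roughly a third of the memory of a long[] per value, which is the kind of saving LUCENE-4120 targets for FST.nodeAddress and friends.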
[jira] [Updated] (LUCENE-4101) Remove XXXField.TYPE_STORED
[ https://issues.apache.org/jira/browse/LUCENE-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4101: --- Attachment: LUCENE-4101.patch New patch, adding StoredStringField and StoredTextField (instead of StringField.TYPE_STORED / TextField.TYPE_STORED). I think it's ready. Remove XXXField.TYPE_STORED --- Key: LUCENE-4101 URL: https://issues.apache.org/jira/browse/LUCENE-4101 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Priority: Blocker Fix For: 4.0, 5.0 Attachments: LUCENE-4101.patch, LUCENE-4101.patch Spinoff from LUCENE-3312. For 4.0 I think we should simplify the sugar field APIs by requiring that you add a StoredField if you want to store the field. Expert users can still make a custom FieldType that both stores and indexes... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291787#comment-13291787 ] Jun Ohtani commented on SOLR-3524: -- Hi Christian, Sorry, I created the patch based on ver. 3.6.0. Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like an option so I can configure this behavior in the fieldtype definition in schema.xml.
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291792#comment-13291792 ] Christian Moen commented on SOLR-3524: -- No trouble. I'll provide a new patch shortly for {{trunk}} and {{branch_4x}} with a test as well.
[jira] [Commented] (LUCENE-4101) Remove XXXField.TYPE_STORED
[ https://issues.apache.org/jira/browse/LUCENE-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291797#comment-13291797 ] Robert Muir commented on LUCENE-4101: - Thinking about this issue a bit, I think it's bad if you have to use the Field/FieldType API just to store a field. So I agree this should be fixed. Separately, we should also make it easy to have a stored-only (not indexed) field. I felt like both of these things were easy with the old document API. {quote} A third option is to add boolean isStored to each of XXXFields? So, it's not stored by default, but then you can do: {quote} I don't like that we are making our APIs hard to use just because Java doesn't have named parameter passing or something. I think the old API was great here: it had an enum for Stored so it was totally obvious from your code if it was stored or not, or indexed or not. I think if we don't like booleans for this silly reason, then we should just use an enum like the old API! Extra Stored* classes for each field are just overwhelming. {quote} I can't see a situation where having to add the same field twice with different flags is good from a usability standpoint. {quote} We can never force that. People who are experts or committers are free to add the field twice if they want to (nothing stops them), but I don't want to see this forced in our APIs; it's too difficult. -- This message is automatically generated by JIRA.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291803#comment-13291803 ] Jason Rutherglen commented on SOLR-2242: Terrance, can you post a patch to the Jira? It makes sense to start this Jira off non-distributed, and add a distributed version in another Jira issue... Get distinct count of names for a facet field - Key: SOLR-2242 URL: https://issues.apache.org/jira/browse/SOLR-2242 Project: Solr Issue Type: New Feature Components: Response Writers Affects Versions: 4.0 Reporter: Bill Bell Priority: Minor Fix For: 4.0 Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch When returning facet.field=name of field you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called namedistinct. Here is an example: http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price This currently only works on facet.field.
{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int>
    <int name="11.5">1</int>
    <int name="19.95">1</int>
    <int name="74.99">1</int>
    <int name="92.0">1</int>
    <int name="179.99">1</int>
    <int name="185.0">1</int>
    <int name="279.95">1</int>
    <int name="329.95">1</int>
    <int name="350.0">1</int>
    <int name="399.0">1</int>
    <int name="479.95">1</int>
    <int name="649.99">1</int>
    <int name="2199.0">1</int>
  </lst>
</lst>
{code}
Several people use this to get the group.field count (the # of groups). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
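The semantics of numFacetTerms can be sketched independently of Solr: it is simply the number of distinct facet terms whose count passes facet.mincount. A minimal illustration in plain Java (the map below is a hypothetical term-to-count map, not real Solr API calls):

```java
import java.util.Map;

public class NumFacetTerms {
    // Sketch: the patch's numFacetTerms is just the number of distinct
    // facet terms whose count passes facet.mincount.
    public static long numFacetTerms(Map<String, Integer> facetCounts, int mincount) {
        return facetCounts.values().stream().filter(c -> c >= mincount).count();
    }

    public static void main(String[] args) {
        // Hypothetical price facet: three docs at 0.0, one each at 11.5 and 19.95.
        Map<String, Integer> price = Map.of("0.0", 3, "11.5", 1, "19.95", 1);
        System.out.println(numFacetTerms(price, 1)); // prints 3
    }
}
```

With facet.limit=-1 and facet.mincount=1, as the description recommends, this counts every distinct value that occurs at least once.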
[jira] [Updated] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen updated SOLR-3524: - Attachment: SOLR-3524.patch Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: SOLR-3524.patch, kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like an option to configure this behavior in the fieldtype definition in schema.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291807#comment-13291807 ] Christian Moen commented on SOLR-3524: -- New patch with tests and documentation changes attached. Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: SOLR-3524.patch, kuromoji_discard_punctuation.patch.txt JapaneseTokenizer (Kuromoji) doesn't provide a configuration option to preserve punctuation in Japanese text, although it has a parameter to change this behavior. JapaneseTokenizerFactory always sets the third parameter, which controls this behavior, to true to remove punctuation. I would like an option to configure this behavior in the fieldtype definition in schema.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
Adrien Grand created LUCENE-4121: Summary: Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
[ https://issues.apache.org/jira/browse/LUCENE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291819#comment-13291819 ] Adrien Grand commented on LUCENE-4121: -- I am currently thinking of {{memSize}} for the in-memory size and {{diskSize}} for the on-disk size. Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
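The naming question can be made concrete with a small interface sketch. The names below ({{memSize}}/{{diskSize}}) are only the candidates floated in this comment, not a committed Lucene API:

```java
// Hypothetical unified size-accounting interface; the method names follow the
// memSize/diskSize proposal in the comment above, not any released Lucene API.
interface SizeEstimator {
    long memSize();   // estimated in-memory footprint, in bytes
    long diskSize();  // estimated on-disk footprint, in bytes
}

// Trivial implementation used only to demonstrate the contract.
class FixedSize implements SizeEstimator {
    private final long mem, disk;
    FixedSize(long mem, long disk) { this.mem = mem; this.disk = disk; }
    public long memSize()  { return mem; }
    public long diskSize() { return disk; }
}
```

The point of the standardization is that every size-reporting class would implement the same two methods instead of the current mix of ramBytesUsed/sizeInBytes/memSize.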
Re: Confusing method names to get the size of objects
On Fri, Jun 8, 2012 at 2:12 PM, Michael McCandless luc...@mikemccandless.com wrote: +1 to standardize on two names. It is confusing now! Thanks Mike for your feedback, I created LUCENE-4121 [1] to address this issue. [1] https://issues.apache.org/jira/browse/LUCENE-4121 -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4087: Attachment: LUCENE-4087.patch here is a new patch adding documentation to DocValues.java and a reference to all DV Fields. I added a bunch of tests, including a verification test that changing norms types actually fails. I extended the type promoter a little to actually promote INT_16 / INT_8 to FLOAT_32 if needed, as well as INT_32 to FLOAT_64. I think it's ready Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Attachments: LUCENE-4087.patch, LUCENE-4087.patch Currently IW fails late and inconsistently on illegal field metadata changes, like changing an already defined DocValues type or un-omitting norms. We can approach this similar to how we handle consistent field numbers and: * throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge * same with omitNorms * same with norms types and docvalues types * still keeping field numbers consistent this way we could eliminate all these traps and just give an exception instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
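The promotion rule Simon describes (INT_8/INT_16 promote to FLOAT_32, INT_32 to FLOAT_64) can be sketched as a small lookup. This illustrates the rule as stated in the comment, not the actual TypePromoter code; the INT_64 case is my assumption, since the comment doesn't cover it:

```java
// Hypothetical sketch of the int-to-float docvalues promotion rule.
enum DVType { INT_8, INT_16, INT_32, INT_64, FLOAT_32, FLOAT_64 }

class Promote {
    // Narrow ints fit losslessly in a float's 24-bit mantissa;
    // INT_32 needs a double (53-bit mantissa) to avoid losing precision.
    static DVType promoteIntToFloat(DVType type) {
        switch (type) {
            case INT_8:
            case INT_16:
                return DVType.FLOAT_32;
            case INT_32:
            case INT_64: // assumption: not stated in the comment
                return DVType.FLOAT_64;
            default:
                return type; // already a float type
        }
    }
}
```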
[jira] [Commented] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
[ https://issues.apache.org/jira/browse/LUCENE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291831#comment-13291831 ] Simon Willnauer commented on LUCENE-4121: - Adrien I think we can do that for 4.0 too though Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
[ https://issues.apache.org/jira/browse/LUCENE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4121: - Fix Version/s: 4.0 Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.0, 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4121) Standardize ramBytesUsed/sizeInBytes/memSize
[ https://issues.apache.org/jira/browse/LUCENE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291834#comment-13291834 ] Adrien Grand commented on LUCENE-4121: -- Updated fix version. Standardize ramBytesUsed/sizeInBytes/memSize Key: LUCENE-4121 URL: https://issues.apache.org/jira/browse/LUCENE-4121 Project: Lucene - Java Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.0, 5.0 We should standardize the names of the methods we use to estimate the sizes of objects in memory and on disk. (cf. discussion on dev@lucene http://search-lucene.com/m/VbXSx1BP60G). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3524) Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory
[ https://issues.apache.org/jira/browse/SOLR-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291836#comment-13291836 ] Kazuaki Hiraga commented on SOLR-3524: -- Thank you guys! Christian, Since some documents have keywords that consists of alphabet and punctuation such as c++, c# and so on, We want to match those keywords with the keyword that unchanged form. Of course, we will discard punctuation in many cases but some cases, especially short text, we want to preserve punctuation. Therefore, I want to have an option that I can control this behaviour. Ohtani-san, thank you for your early reply and patch! Make discard-punctuation feature in Kuromoji configurable from JapaneseTokenizerFactory --- Key: SOLR-3524 URL: https://issues.apache.org/jira/browse/SOLR-3524 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 3.6 Reporter: Kazuaki Hiraga Priority: Minor Attachments: SOLR-3524.patch, kuromoji_discard_punctuation.patch.txt JapaneseTokenizer, Kuromoji doesn't provide configuration option to preserve punctuation in Japanese text, although It has a parameter to change this behavior. JapaneseTokenizerFactory always set third parameter, which controls this behavior, to true to remove punctuation. I would like to have an option I can configure this behavior by fieldtype definition in schema.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen updated SOLR-3520: - Attachment: SOLR-3520.patch Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291837#comment-13291837 ] Christian Moen commented on SOLR-3520: -- Updated patch with a case that also deals with short katakana terms that shouldn't be stemmed by default. Will commit shortly. Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291839#comment-13291839 ] Christian Moen commented on SOLR-3520: -- Committed r1348134 on {{trunk}}. Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2357) Reduce transient RAM usage while merging by using packed ints array for docID re-mapping
[ https://issues.apache.org/jira/browse/LUCENE-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291840#comment-13291840 ] Adrien Grand commented on LUCENE-2357: -- I am going to commit this change next week unless someone objects. Reduce transient RAM usage while merging by using packed ints array for docID re-mapping Key: LUCENE-2357 URL: https://issues.apache.org/jira/browse/LUCENE-2357 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 4.0, 5.0 Attachments: LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch We allocate this int[] to remap docIDs due to compaction of deleted ones. This uses a lot of RAM for large segment merges, and can fail to allocate due to fragmentation on 32-bit JREs. Now that we have packed ints, a simple fix would be to use a packed int array... and maybe instead of storing the abs docID in the mapping, we could store the number of del docs seen so far (so the remap would do a lookup then a subtract). This may add some CPU cost to merging but should bring down transient RAM usage quite a bit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
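The "lookup then subtract" remap described in the issue can be sketched with a plain int[]; the actual patch's point is to back this array with packed ints (since the stored values are small del-counts rather than full docIDs, they need far fewer bits per entry). Class and field names here are illustrative:

```java
public class DocMap {
    // delCountBefore[d] = number of deleted docs with old docID <= d.
    // In the real patch this would be a packed ints array, not an int[].
    private final int[] delCountBefore;

    public DocMap(boolean[] deleted) {
        delCountBefore = new int[deleted.length];
        int del = 0;
        for (int d = 0; d < deleted.length; d++) {
            if (deleted[d]) del++;
            delCountBefore[d] = del;
        }
    }

    // Remap a surviving old docID to its compacted new docID:
    // one lookup, one subtract.
    public int newDocID(int oldDocID) {
        return oldDocID - delCountBefore[oldDocID];
    }
}
```

For example, with deletions at old docIDs 1 and 3, the survivors 0, 2, 4 remap to 0, 1, 2.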
[jira] [Created] (LUCENE-4122) Replace Payload with BytesRef
Andrzej Bialecki created LUCENE-4122: - Summary: Replace Payload with BytesRef Key: LUCENE-4122 URL: https://issues.apache.org/jira/browse/LUCENE-4122 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 The Payload class offers very similar functionality to BytesRef. The code internally uses BytesRef-s to represent payloads, and on indexing and on retrieval this data is repackaged from/to Payload. This seems wasteful. I propose to remove the Payload class and use BytesRef instead, thus avoiding this re-wrapping and reducing the API footprint. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4122) Replace Payload with BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated LUCENE-4122: -- Attachment: LUCENE-4122.patch Patch for trunk. All tests pass. Replace Payload with BytesRef - Key: LUCENE-4122 URL: https://issues.apache.org/jira/browse/LUCENE-4122 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 Attachments: LUCENE-4122.patch The Payload class offers a very similar functionality to BytesRef. The code internally uses BytesRef-s to represent payloads, and on indexing and on retrieval this data is repackaged from/to Payload. This seems wasteful. I propose to remove the Payload class and use BytesRef instead, thus avoid this re-wrapping and reducing the API footprint. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
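The redundancy the issue targets is easy to see: both Payload and BytesRef are, at heart, a byte[] plus offset/length slice, so wrapping one in the other on every index/retrieve round-trip buys nothing. A minimal slice type in that spirit (an illustration, not Lucene's actual BytesRef):

```java
// Minimal byte-slice in the spirit of BytesRef: a byte[] plus offset and length.
// Illustrates why a second Payload wrapper around the same shape is redundant.
public class ByteSlice {
    public final byte[] bytes;
    public final int offset, length;

    public ByteSlice(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // Read the i-th byte of the slice (0 <= i < length).
    public byte byteAt(int i) {
        return bytes[offset + i];
    }
}
```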
[jira] [Commented] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291853#comment-13291853 ] Christian Moen commented on SOLR-3520: -- Committed r1348148 on {{branch_4x}} Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291853#comment-13291853 ] Christian Moen edited comment on SOLR-3520 at 6/8/12 4:42 PM: -- Committed r1348148 on {{branch_4x}}. was (Author: cm): Committed r1348148 on {{branch_4x}} Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen reassigned SOLR-3520: Assignee: Christian Moen Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Assignee: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-3520) Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Moen resolved SOLR-3520. -- Resolution: Fixed Add missing tests for JapaneseReadingFormFilterFactory and JapaneseKatakanaStemFilterFactory Key: SOLR-3520 URL: https://issues.apache.org/jira/browse/SOLR-3520 Project: Solr Issue Type: Test Affects Versions: 4.0, 5.0 Reporter: Christian Moen Assignee: Christian Moen Priority: Minor Attachments: SOLR-3520.patch, SOLR-3520.patch {{JapaneseReadingFormFilterFactory}} and {{JapaneseKatakanaStemFilterFactory}} doesn't have any tests and it would be good to have some. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-4123) Add CachingRAMDirectory
Michael McCandless created LUCENE-4123: -- Summary: Add CachingRAMDirectory Key: LUCENE-4123 URL: https://issues.apache.org/jira/browse/LUCENE-4123 Project: Lucene - Java Issue Type: Bug Components: core/store Reporter: Michael McCandless Assignee: Michael McCandless The directory is very simple and useful if you have an index that you know fully fits into available RAM. You could also use FileSwitchDir if you want to leave some files (eg stored fields or term vectors) on disk. It wraps any other Directory and delegates all writing (IndexOutput) to it, but for reading (IndexInput), it allocates a single byte[] and fully reads the file in and then serves requests off that single byte[]. It's more GC friendly than RAMDir since it only allocates a single array per file. It has a few nocommits still, but all tests pass if I wrap the delegate inside MockDirectoryWrapper using this. I tested with a 1M Wikipedia English index (would like to test w/ 10M docs but I don't have enough RAM...); it seems to give a nice speedup:
{noformat}
Task                QPS base   StdDev base   QPS cached   StdDev cached    Pct diff
Respell               197.00          7.27       203.19            8.17   -4% - 11%
PKLookup              121.12          2.80       125.46            3.20    -1% - 8%
Fuzzy2                 66.62          2.62        69.91            2.85   -3% - 13%
Fuzzy1                206.20          6.47       222.21            6.52    1% - 14%
TermGroup100K         160.14          6.62       175.71            3.79    3% - 16%
Phrase                 34.85          0.40        38.75            0.61    8% - 14%
TermBGroup100K        363.75         15.74       406.98           13.23    3% - 20%
SpanNear               53.08          1.11        59.53            2.94    4% - 20%
TermBGroup100K1P      222.53          9.78       252.86            5.96    6% - 21%
SloppyPhrase           70.36          2.05        79.95            4.48    4% - 23%
Wildcard              238.10          4.29       272.78            4.97   10% - 18%
OrHighMed             123.49          4.85       149.32            4.66   12% - 29%
Prefix3               288.46          8.10       350.40            5.38   16% - 26%
OrHighHigh             76.46          3.27        93.13            2.96   13% - 31%
IntNRQ                 92.25          2.12       113.47            5.74   14% - 32%
Term                  757.12         39.03       958.62           22.68   17% - 36%
AndHighHigh           103.03          4.48       133.89            3.76   21% - 39%
AndHighMed            376.36         16.58       493.99           10.00   23% - 40%
{noformat}
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
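The core trick described above (on open, read the whole file into one byte[], then serve all reads from that single array) can be sketched with plain java.io. The real patch wraps Lucene's Directory/IndexInput; that plumbing is omitted here and the class name is illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the per-file caching idea behind CachingRAMDirectory:
// one up-front read into a single byte[], then array reads only.
public class CachedFile {
    private final byte[] data; // whole file in one array: GC-friendlier than many small blocks
    private int pos;

    public CachedFile(Path path) throws IOException {
        this.data = Files.readAllBytes(path); // the single up-front read
    }

    public byte readByte() { return data[pos++]; }

    public void seek(int newPos) { pos = newPos; }

    public int length() { return data.length; }
}
```

A RAMDirectory-style store chops files into many small buffers; holding each file as one array is what makes this variant cheaper for the garbage collector.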
[jira] [Updated] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4123: --- Attachment: LUCENE-4123.patch Add CachingRAMDirectory --- Key: LUCENE-4123 URL: https://issues.apache.org/jira/browse/LUCENE-4123 Project: Lucene - Java Issue Type: Bug Components: core/store Reporter: Michael McCandless Assignee: Michael McCandless Attachments: LUCENE-4123.patch The directory is very simple and useful if you have an index that you know fully fits into available RAM. You could also use FileSwitchDir if you want to leave some files (eg stored fields or term vectors) on disk. It wraps any other Directory and delegates all writing (IndexOutput) to it, but for reading (IndexInput), it allocates a single byte[] and fully reads the file in and then serves requests off that single byte[]. It's more GC friendly than RAMDir since it only allocates a single array per file. It has a few nocommits still, but all tests pass if I wrap the delegate inside MockDirectoryWrapper using this. 
I tested with 1M Wikipedia english index (would like to test w/ 10M docs but I don't have enough RAM...); it seems to give a nice speedup: {noformat} TaskQPS base StdDev base QPS cachedStdDev cached Pct diff Respell 197.007.27 203.198.17 -4% - 11% PKLookup 121.122.80 125.463.20 -1% - 8% Fuzzy2 66.622.62 69.912.85 -3% - 13% Fuzzy1 206.206.47 222.216.521% - 14% TermGroup100K 160.146.62 175.713.793% - 16% Phrase 34.850.40 38.750.618% - 14% TermBGroup100K 363.75 15.74 406.98 13.233% - 20% SpanNear 53.081.11 59.532.944% - 20% TermBGroup100K1P 222.539.78 252.865.966% - 21% SloppyPhrase 70.362.05 79.954.484% - 23% Wildcard 238.104.29 272.784.97 10% - 18% OrHighMed 123.494.85 149.324.66 12% - 29% Prefix3 288.468.10 350.405.38 16% - 26% OrHighHigh 76.463.27 93.132.96 13% - 31% IntNRQ 92.252.12 113.475.74 14% - 32% Term 757.12 39.03 958.62 22.68 17% - 36% AndHighHigh 103.034.48 133.893.76 21% - 39% AndHighMed 376.36 16.58 493.99 10.00 23% - 40% {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291857#comment-13291857 ] Simon Willnauer commented on LUCENE-4123: - bq.I tested with 1M Wikipedia english index (would like to test w/ 10M docs but I don't have enough RAM...); it seems to give a nice speedup: #fail! :) Add CachingRAMDirectory --- Key: LUCENE-4123 URL: https://issues.apache.org/jira/browse/LUCENE-4123 Project: Lucene - Java Issue Type: Bug Components: core/store Reporter: Michael McCandless Assignee: Michael McCandless Attachments: LUCENE-4123.patch The directory is very simple and useful if you have an index that you know fully fits into available RAM. You could also use FileSwitchDir if you want to leave some files (eg stored fields or term vectors) on disk. It wraps any other Directory and delegates all writing (IndexOutput) to it, but for reading (IndexInput), it allocates a single byte[] and fully reads the file in and then serves requests off that single byte[]. It's more GC friendly than RAMDir since it only allocates a single array per file. It has a few nocommits still, but all tests pass if I wrap the delegate inside MockDirectoryWrapper using this. 
in case you run into Author: [name] not defined in .git/authors.txt file
In case you are using git-svn and see the following error after running git svn rebase on a branch: "Author: [name] not defined in .git/authors.txt file", cd into the .git directory and add the user to the authors.txt file, where [name] is the Apache user name: [name] = Full Name <[name]@apache.org>. Save the file, then run git svn rebase again.
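For example, the entry might look like this ("jdoe" and "John Doe" are placeholders for the Apache user name and full name from the error message; the authors file maps one Subversion committer name to a Git author per line):

{noformat}
jdoe = John Doe <jdoe@apache.org>
{noformat}

If git-svn is not already pointed at the file, git config svn.authorsfile .git/authors.txt (or the --authors-file option) tells it where to look.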
[jira] [Created] (SOLR-3527) Optimize ignores maxSegments in distributed environment
Andy Laird created SOLR-3527: Summary: Optimize ignores maxSegments in distributed environment Key: SOLR-3527 URL: https://issues.apache.org/jira/browse/SOLR-3527 Project: Solr Issue Type: Bug Components: SearchComponents - other Affects Versions: 4.0 Reporter: Andy Laird Send the following command to a Solr server with many segments in a multi-shard, multi-server environment: curl "http://localhost:8080/solr/update?optimize=true&waitFlush=true&maxSegments=6&distrib=false" The local server will end up with the number of segments at 6, as requested, but all other shards in the index will be optimized with maxSegments=1, which takes far longer to complete. All shards should be optimized to the requested value of 6. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291860#comment-13291860 ] Robert Muir commented on LUCENE-4123: - I don't think it buys anything to duplicate the readVInt/readVLong code here; it should be compiled to the same code. E.g. MMapDirectory doesn't do this.
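Robert's point is that the generic VInt loop, written once against readByte(), should JIT to the same machine code as a hand-copied version. For reference, a self-contained sketch of the standard VInt encoding (7 bits per byte, high bit as continuation flag), written against a plain byte[] rather than Lucene's DataInput API; the names here are illustrative:

```java
import java.util.Arrays;

public class VIntDemo {
    // Decode a variable-length int: 7 payload bits per byte,
    // high bit set on every byte except the last.
    static int readVInt(byte[] buf, int[] pos) {
        byte b = buf[pos[0]++];
        int value = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) {  // continuation bit set -> more bytes follow
            b = buf[pos[0]++];
            value |= (b & 0x7F) << shift;
            shift += 7;
        }
        return value;
    }

    // Encode, for round-trip testing.
    static byte[] writeVInt(int value) {
        byte[] tmp = new byte[5];
        int i = 0;
        while ((value & ~0x7F) != 0) {
            tmp[i++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        tmp[i++] = (byte) value;
        return Arrays.copyOf(tmp, i);
    }

    public static void main(String[] args) {
        byte[] encoded = writeVInt(16385);
        if (readVInt(encoded, new int[]{0}) != 16385) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Whether the byte source is a heap array, a mapped buffer, or a cached copy, only readByte differs; the decoding loop itself is identical, which is why duplicating it buys nothing.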
[jira] [Commented] (LUCENE-4122) Replace Payload with BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291862#comment-13291862 ] Robert Muir commented on LUCENE-4122: - +1 Replace Payload with BytesRef - Key: LUCENE-4122 URL: https://issues.apache.org/jira/browse/LUCENE-4122 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 Attachments: LUCENE-4122.patch The Payload class offers a very similar functionality to BytesRef. The code internally uses BytesRef-s to represent payloads, and on indexing and on retrieval this data is repackaged from/to Payload. This seems wasteful. I propose to remove the Payload class and use BytesRef instead, thus avoid this re-wrapping and reducing the API footprint. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3527) Optimize ignores maxSegments in distributed environment
[ https://issues.apache.org/jira/browse/SOLR-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291864#comment-13291864 ] Andy Laird commented on SOLR-3527: -- One additional data point: distrib=false makes no difference with the current behavior. It seems that with distrib=false only the local server should be optimized (to the requested value), and with distrib=true (the default) all shards in the index should be optimized to N max segments.
[jira] [Commented] (LUCENE-4122) Replace Payload with BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291865#comment-13291865 ] Michael McCandless commented on LUCENE-4122: +1
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: (was: BloomFilterPostings40.patch) Segment-level Bloom filters for a 2 x speed up on rare term searches Key: LUCENE-4069 URL: https://issues.apache.org/jira/browse/LUCENE-4069 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 3.6, 4.0 Reporter: Mark Harwood Priority: Minor Fix For: 4.0, 3.6.1 Attachments: MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip An addition to each segment which stores a Bloom filter for selected fields in order to give fast-fail to term searches, helping avoid wasted disk access. Best suited for low-frequency fields e.g. primary keys on big indexes with many segments but also speeds up general searching in my tests. Overview slideshow here: http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU Patch based on 3.6 codebase attached. There are no 3.6 API changes currently - to play just add a field with _blm on the end of the name to invoke special indexing/querying capability. Clearly a new Field or schema declaration(!) would need adding to APIs to configure the service properly. Also, a patch for Lucene4.0 codebase introducing a new PostingsFormat -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: BloomFilterPostingsBranch4x.patch Updated as follows:
* Extracted the Bloom filter functionality into a new oal.util.FuzzySet class. The name is changed because Bloom filtering is one application of a FuzzySet, fuzzy count-distincts being another.
* BloomFilterPostingsFormat now takes a factory that can tailor the choice of Bloom filter per field (bitset size/saturation settings and choice of hash algorithm). A default factory implementation is provided.
* All unit tests pass now that I have a test PostingsFormat class that uses very small bitsets, where before the many-field unit tests would cause OOM.
Will follow up with benchmarks when I have more time to run and document them. Initial results from my large-scale tests on growing indexes show a nice flat line in the face of a growing index, whereas a non-Bloomed index saw-tooths upwards as segments grow/merge.
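The fast-fail behavior described in this issue is the classic Bloom filter property: a bitset plus a few hash functions can answer "definitely absent" or "maybe present", so a primary-key lookup for a term that was never indexed can skip the terms-dictionary and disk access entirely. A toy illustration (the hash choices, sizes, and names below are arbitrary, not those of the FuzzySet in the patch):

```java
import java.util.BitSet;

public class BloomSketch {
    private final BitSet bits;
    private final int size;

    BloomSketch(int size) { this.size = size; this.bits = new BitSet(size); }

    // Two cheap hash positions per key; real implementations choose the
    // number of hashes and the bitset size from a target saturation.
    private int h1(String key) { return Math.floorMod(key.hashCode(), size); }
    private int h2(String key) { return Math.floorMod(key.hashCode() * 31 + key.length(), size); }

    void add(String key) { bits.set(h1(key)); bits.set(h2(key)); }

    // false -> key was definitely never added (fast-fail: skip the disk seek)
    // true  -> key may have been added (must consult the real terms dictionary)
    boolean mightContain(String key) { return bits.get(h1(key)) && bits.get(h2(key)); }

    public static void main(String[] args) {
        BloomSketch filter = new BloomSketch(1 << 16);
        filter.add("doc-42");
        System.out.println(filter.mightContain("doc-42"));  // always true for an added key
    }
}
```

False positives are possible but bounded by the bitset saturation, which is why a per-field factory controlling size and saturation (as in the update above) matters: low-frequency fields like primary keys get the biggest win.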
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: (was: PrimaryKey40PerformanceTestSrc.zip)
Spurious JFlex warning from build
I happened to notice the following message in an ant test today: compile-test: [echo] Building analyzers-common... jflex-uptodate-check: jflex-notice: [echo] One or more of the JFlex .jflex files is newer than its corresponding [echo] .java file. Run the jflex target to regenerate the artifacts. It is a spurious warning/directive because HTMLCharacterEntities.jflex doesn’t have a matching .java file since it is a “macro” referenced by HTMLStripCharFilter.jflex. I am wondering if it makes sense to rename HTMLCharacterEntities.jflex to HTMLCharacterEntities.jflex-macro (like HTMLStripCharFilter.SUPPLEMENTARY.jflex-macro) to avoid the misleading build warning/directive. -- Jack Krupansky
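One way the check could work after the rename, sketched here as a generic Ant <uptodate> pattern rather than the actual Lucene build file: with a glob mapper from *.jflex to *.java, a macro-only file renamed to .jflex-macro no longer matches the includes pattern and so can never trigger the notice.

{noformat}
<!-- Sketch only: compare each .jflex file against the .java file it generates.
     Macro files renamed to .jflex-macro are skipped by the includes pattern. -->
<uptodate property="jflex.files.uptodate">
  <srcfiles dir="src/java" includes="**/*.jflex"/>
  <mapper type="glob" from="*.jflex" to="*.java"/>
</uptodate>
{noformat}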
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069: - Attachment: PrimaryKeyPerfTest40.java Benchmark tool adapted from Mike's original Pulsing codec benchmark. Now includes Bloom postings example.
[jira] [Created] (LUCENE-4124) factor ByteBufferIndexInput out of MMapDirectory
Robert Muir created LUCENE-4124: --- Summary: factor ByteBufferIndexInput out of MMapDirectory Key: LUCENE-4124 URL: https://issues.apache.org/jira/browse/LUCENE-4124 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler I think we should factor a ByteBufferIndexInput out of MMapDir, leaving only the mmap/unmapping in mmapdir. Its a cleaner separation and would allow it to be used for other purposes (e.g. direct or array-backed buffers) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
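The point of the proposed factoring is that reads expressed against java.nio.ByteBuffer behave identically whether the buffer is memory-mapped, direct, or heap/array-backed; only the buffer's origin differs. A minimal illustration with a heap buffer (class and method names here are hypothetical, not the patch's API):

```java
import java.nio.ByteBuffer;

public class BufferInputSketch {
    private final ByteBuffer buffer;

    // Any ByteBuffer works: ByteBuffer.wrap(byte[]), allocateDirect(),
    // or a MappedByteBuffer from FileChannel.map() as MMapDirectory uses.
    BufferInputSketch(ByteBuffer buffer) { this.buffer = buffer; }

    byte readByte() { return buffer.get(); }

    int readInt() { return buffer.getInt(); }  // big-endian by default

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.wrap(new byte[]{0, 0, 0, 42});
        BufferInputSketch in = new BufferInputSketch(heap);
        System.out.println(in.readInt());
    }
}
```

Keeping only the map/unmap logic in MMapDirectory and moving the read logic behind this abstraction is what lets the same IndexInput serve direct or array-backed buffers.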
[jira] [Updated] (LUCENE-4124) factor ByteBufferIndexInput out of MMapDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4124: Attachment: LUCENE-4124.patch
[jira] [Commented] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291884#comment-13291884 ] Michael McCandless commented on LUCENE-4123: Results for 5M doc index: {noformat}
Task               QPS base  StdDev base  QPS cached  StdDev cached   Pct diff
Respell              104.06         7.63      108.59           7.55  -9% - 20%
TermGroup1M           57.94         1.59       60.70           0.30   1% -  8%
TermBGroup1M         103.28         2.54      108.51           2.54   0% - 10%
Fuzzy2                43.07         2.96       45.32           3.06  -8% - 20%
Fuzzy1                72.64         4.73       76.92           4.38  -6% - 19%
TermBGroup1M1P        90.14         3.03       95.95           3.81  -1% - 14%
IntNRQ                16.01         0.95       17.17           0.33   0% - 16%
PKLookup              86.21         2.51       92.55           2.59   1% - 13%
Wildcard              65.51         3.13       71.00           1.45   1% - 16%
OrHighMed             21.64         1.83       23.56           1.24  -4% - 25%
Prefix3              105.33         4.94      114.75           2.46   1% - 16%
OrHighHigh            17.39         1.45       18.97           0.95  -4% - 24%
AndHighHigh           30.05         1.14       33.42           0.88   4% - 18%
Term                 243.13         9.03      273.92           8.26   5% - 20%
SloppyPhrase          15.80         0.28       17.84           0.78   6% - 19%
SpanNear              10.52         0.14       11.97           0.25   9% - 17%
AndHighMed           117.60         3.54      135.91           2.49  10% - 21%
Phrase                20.15         0.78       24.22           0.26  14% - 26%
{noformat}
[jira] [Commented] (LUCENE-4123) Add CachingRAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291886#comment-13291886 ] Michael McCandless commented on LUCENE-4123: bq. I dont think it buys anything to code dup the readVint/vlong here. it should be compiled to the same code. e.g. mmapdir doesnt do this.
I think you're right! Here are the results w/ the code dup removed (same static seed as the previous 5M doc results): {noformat}
Task               QPS base  StdDev base  QPS cached  StdDev cached   Pct diff
IntNRQ                16.36         0.86       16.92           0.75  -6% - 14%
TermBGroup1M1P        91.71         3.03       95.07           3.94  -3% - 11%
TermGroup1M           58.14         1.00       60.38           1.53   0% -  8%
TermBGroup1M         103.11         1.76      108.14           2.63   0% -  9%
Prefix3              108.83         0.97      115.05           2.89   2% -  9%
Wildcard              67.27         0.72       71.22           1.71   2% -  9%
Respell              102.29         7.78      109.08           7.22  -7% - 23%
Fuzzy2                42.46         2.95       45.51           3.31  -7% - 23%
Fuzzy1                72.46         3.55       77.96           4.51  -3% - 19%
Term                 247.45        17.73      268.17          12.28  -3% - 22%
OrHighMed             22.38         1.19       24.47           1.64  -3% - 23%
OrHighHigh            18.01         0.92       19.71           1.20  -2% - 22%
AndHighHigh           30.79         0.35       33.80           0.37   7% - 12%
PKLookup              84.71         2.40       93.95           2.32   5% - 16%
SpanNear              10.54         0.13       12.02           0.13  11% - 16%
AndHighMed           119.18         1.05      136.64           1.80  12% - 17%
SloppyPhrase          15.50         0.15       18.26           0.30  14% - 20%
Phrase                20.64         0.12       24.94           0.48  17% - 23%
{noformat} So I'll remove the code dup.
RE: Spurious JFlex warning from build
+1 From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Friday, June 08, 2012 1:48 PM To: Lucene/Solr Dev Subject: Spurious JFlex warning from build
[jira] [Commented] (SOLR-3527) Optimize ignores maxSegments in distributed environment
[ https://issues.apache.org/jira/browse/SOLR-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291890#comment-13291890 ] Mark Miller commented on SOLR-3527: --- Sounds right Andy - thanks for the report.
[jira] [Commented] (LUCENE-4124) factor ByteBufferIndexInput out of MMapDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291899#comment-13291899 ] Uwe Schindler commented on LUCENE-4124: --- Thanks for assigning me. The patch looks good as a first step. The hashCode and equals methods in the (now abstract) base class must be final; this was not done before, as the class itself was final.
lucene-highlighter 3.6 No highlight for 3 letter words
Hi, How can I highlight 3-letter words? Everything is working except for this; what setting do I need to change? I'm using lucene-highlighter-3.6.0.jar and lucene-core-3.6.0.jar.
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
QueryParser parser = new QueryParser(Version.LUCENE_30, , analyzer);
parser.setAllowLeadingWildcard(true);
SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter(,);
Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(parser.parse(pQuery)));
highlighter.setTextFragmenter(new NullFragmenter());
highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
String text = highlighter.getBestFragment(analyzer, , pText);
Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/lucene-highlighter-3-6-No-highlight-for-3-letter-words-tp3988464.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14636 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14636/ All tests passed Build Log: [...truncated 21120 lines...] rat-sources-typedef: rat-sources: [echo] [echo] * [echo] Summary [echo] --- [echo] Generated at: 2012-06-08T19:01:23+00:00 [echo] Notes: 0 [echo] Binaries: 0 [echo] Archives: 0 [echo] Standards: 6 [echo] [echo] Apache Licensed: 6 [echo] Generated Documents: 0 [echo] [echo] JavaDocs are generated and so license header is optional [echo] Generated files do not required license headers [echo] [echo] 0 Unknown Licenses [echo] [echo] *** [echo] [echo] Unapproved licenses: [echo] [echo] [echo] *** [echo] [echo] Archives: [echo] [echo] * [echo] Files with Apache License headers will be marked AL [echo] Binary files (which do not require AL headers) will be marked B [echo] Compressed archives will be marked A [echo] Notices, licenses etc will be marked N [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/PageTool.java [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/SolrParamResourceLoader.java [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/SolrVelocityResourceLoader.java [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/java/overview.html [echo] AL /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/velocity/src/test/org/apache/solr/velocity/VelocityResponseWriterTest.java [echo] [echo] * [echo] Printing headers for files without AL header... 
[echo] [echo] BUILD SUCCESSFUL Total time: 8 seconds + [ -z '' ] + cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene + /home/hudson/tools/ant/supported18/bin/ant -Djavadoc.link=/home/hudson/tools/java/api6 javadocs-lint Buildfile: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build.xml check-lucene-core-javadocs-uptodate: javadocs-lucene-core: javadocs: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/docs/core download-java6-javadoc-packagelist: [copy] Copying 1 file to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/docs/core [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.appending... [javadoc] Loading source files for package org.apache.lucene.codecs.intblock... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40.values... [javadoc] Loading source files for package org.apache.lucene.codecs.memory... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.codecs.pulsing... [javadoc] Loading source files for package org.apache.lucene.codecs.sep... [javadoc] Loading source files for package org.apache.lucene.codecs.simpletext... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... 
[javadoc] Loading source files for package org.apache.lucene.search... [javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... [javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... [javadoc]
[jira] [Created] (SOLR-3528) Analysis UI should stack tokens at the same position
Ryan McKinley created SOLR-3528: --- Summary: Analysis UI should stack tokens at the same position Key: SOLR-3528 URL: https://issues.apache.org/jira/browse/SOLR-3528 Project: Solr Issue Type: Improvement Components: web gui Reporter: Ryan McKinley Attachments: position-stach.png The old UI would display tokens that had the same position in the same column. The new one adds a new column for each position, making it less clear what is happening with position offsets (especially in the non-verbose output). I think it should be reworked as:
{code}
<tr>
  <td>Tokenizer</td>
  <td>
    <div>stuff at pos 0</div>
    <div>stuff at pos 0</div>
    <div>stuff at pos 0</div>
  </td>
  <td>
    <div>stuff at pos 1</div>
  </td>
</tr>
{code}
Using a table would also force the layout wide rather than wrapping.
[jira] [Updated] (SOLR-3528) Analysis UI should stack tokens at the same position
[ https://issues.apache.org/jira/browse/SOLR-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley updated SOLR-3528: Attachment: position-stach.png Current view: !position-stach.png!
[jira] [Commented] (SOLR-3528) Analysis UI should stack tokens at the same position
[ https://issues.apache.org/jira/browse/SOLR-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291917#comment-13291917 ] Ryan McKinley commented on SOLR-3528: - synonyms and path hierarchy are good examples of tokenizers/filters that stack positions
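[Editor's note] The stacking Ryan describes boils down to grouping tokens by position and rendering one table column per position. A minimal sketch of that grouping step (Python for brevity; the actual new admin UI is JavaScript, and the token dicts here are illustrative, not Solr's real analysis response format):

```python
from collections import defaultdict

def stack_by_position(tokens):
    """Group analysis tokens into columns, one column per position.

    Tokens that share a position (e.g. synonyms, path-hierarchy
    output) end up stacked in the same column instead of each
    getting a column of its own.
    """
    columns = defaultdict(list)
    for tok in tokens:
        columns[tok["position"]].append(tok["text"])
    # One list of texts per position, in position order.
    return [columns[pos] for pos in sorted(columns)]

# Synonym-style stream: "fast" and "quick" share position 1.
tokens = [
    {"text": "the", "position": 0},
    {"text": "fast", "position": 1},
    {"text": "quick", "position": 1},
    {"text": "fox", "position": 2},
]
```

Rendering would then emit one `<td>` per inner list, with a `<div>` per stacked token, matching the {code} markup in the issue description.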
[JENKINS] Lucene-Solr-trunk-Linux-Java6-64 - Build # 818 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/818/ All tests passed Build Log: [...truncated 21105 lines...] rat-sources-typedef: rat-sources: [echo] [echo] * [echo] Summary [echo] --- [echo] Generated at: 2012-06-08T19:18:24+00:00 [echo] Notes: 0 [echo] Binaries: 0 [echo] Archives: 0 [echo] Standards: 6 [echo] [echo] Apache Licensed: 6 [echo] Generated Documents: 0 [echo] [echo] JavaDocs are generated and so license header is optional [echo] Generated files do not required license headers [echo] [echo] 0 Unknown Licenses [echo] [echo] *** [echo] [echo] Unapproved licenses: [echo] [echo] [echo] *** [echo] [echo] Archives: [echo] [echo] * [echo] Files with Apache License headers will be marked AL [echo] Binary files (which do not require AL headers) will be marked B [echo] Compressed archives will be marked A [echo] Notices, licenses etc will be marked N [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/PageTool.java [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/SolrParamResourceLoader.java [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/SolrVelocityResourceLoader.java [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/java/overview.html [echo] AL /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/solr/contrib/velocity/src/test/org/apache/solr/velocity/VelocityResponseWriterTest.java [echo] [echo] * [echo] Printing headers for files without AL header... 
[echo] [echo] BUILD SUCCESSFUL Total time: 7 seconds + [ -z ] + cd /var/lib/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/lucene + /var/lib/jenkins/tools/ant/supported18/bin/ant -Djavadoc.link=/var/lib/jenkins/tools/java/api6 javadocs-lint Buildfile: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/lucene/build.xml check-lucene-core-javadocs-uptodate: javadocs-lucene-core: javadocs: [mkdir] Created dir: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/lucene/build/docs/core download-java6-javadoc-packagelist: [copy] Copying 1 file to /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java6-64/checkout/lucene/build/docs/core [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.appending... [javadoc] Loading source files for package org.apache.lucene.codecs.intblock... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40.values... [javadoc] Loading source files for package org.apache.lucene.codecs.memory... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.codecs.pulsing... [javadoc] Loading source files for package org.apache.lucene.codecs.sep... [javadoc] Loading source files for package org.apache.lucene.codecs.simpletext... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... [javadoc] Loading source files for package org.apache.lucene.search... 
[javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... [javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... [javadoc] Loading source files for package org.apache.lucene.util.fst...
[jira] [Commented] (SOLR-3528) Analysis UI should stack tokens at the same position
[ https://issues.apache.org/jira/browse/SOLR-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291921#comment-13291921 ] Christian Moen commented on SOLR-3528: -- I agree. It's great if the new analysis UI can stack tokens.
[jira] [Resolved] (SOLR-2569) Enable facile moving of cores
[ https://issues.apache.org/jira/browse/SOLR-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen resolved SOLR-2569. Resolution: Won't Fix Enable facile moving of cores - Key: SOLR-2569 URL: https://issues.apache.org/jira/browse/SOLR-2569 Project: Solr Issue Type: Improvement Components: multicore, replication (java) Affects Versions: 4.0 Reporter: Jason Rutherglen Spin-off from this thread: http://search-lucene.com/m/5CO7Z1oOrh6/elastic+searchsubj=Solr+vs+ElasticSearch
Re: lucene-highlighter 3.6 No highlight for 3 letter words
You might ask this question on the user list to get a better response. Can you also provide the text and query you want to highlight? simon On Fri, Jun 8, 2012 at 1:17 PM, gerryjun gerry...@yahoo.com wrote: Hi, How can I highlight 3 letter words? Everything is working except for this; what setting do I need to change? I'm using lucene-highlighter-3.6.0.jar and lucene-core-3.6.0.jar. Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30); QueryParser parser = new QueryParser(Version.LUCENE_30, , analyzer); parser.setAllowLeadingWildcard(true); SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter(,); Highlighter highlighter = new Highlighter(htmlFormatter,new QueryScorer(parser.parse(pQuery))); highlighter.setTextFragmenter(new NullFragmenter()); highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE); String text = highlighter.getBestFragment(analyzer, , pText); Thanks
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291937#comment-13291937 ] Michael McCandless commented on LUCENE-4087: Patch looks good. +1 I think this patch now means that, depending on when flushes kick in, you can sometimes apparently succeed in changing DV type for a field (though on merge something strange can happen, eg suddenly upgrading to a BYTES_XXX type) and other times hit an exception? Like the error checking is now intermittent as seen from the app? You might think everything is OK, push to production, and later (in production) hit a new exception... I think that's actually OK for now (this is all best effort)... but I think we should clean this up (can come after 4.0) so that the checking is consistent. Can we shorten the javadoc to simply state "Changing the DocValue type for a given field is not supported"? Sure we make a best effort to recover today but I don't think we should detail particulars of the specific best effort we're doing in 4.0? Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Attachments: LUCENE-4087.patch, LUCENE-4087.patch Currently IW fails late and inconsistently on illegal field metadata changes, like an already defined DocValues type or un-omitting norms. We can approach this similar to how we handle consistent field numbers and: * throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge * same with omitNorms * same with norms types and docvalues types * still keeping field numbers consistent This way we could eliminate all these traps and just give an exception instead.
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291940#comment-13291940 ] Robert Muir commented on LUCENE-4087: - nit: loosing -> losing in DocValues.java javadocs
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291946#comment-13291946 ] Simon Willnauer commented on LUCENE-4087: - {quote} Can we shorten the javadoc to simply state "Changing the DocValue type for a given field is not supported"? Sure we make a best effort to recover today but I don't think we should detail particulars of the specific best effort we're doing in 4.0? {quote} I am not sure we should say that, since it's not true. You can safely change a float into a double, and if you reindex all documents you will eventually converge to double. The same is true for Sorted, going from fixed to variable, or extending the precision of an integer. I think it's only fair to document that. Whether we can change it in future releases is a different thing.
[jira] [Resolved] (LUCENE-4122) Replace Payload with BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki resolved LUCENE-4122. --- Resolution: Fixed Committed in rev. 1348171 to trunk and in rev. 1348227 to branch_4x. Replace Payload with BytesRef - Key: LUCENE-4122 URL: https://issues.apache.org/jira/browse/LUCENE-4122 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0, 5.0 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 4.0, 5.0 Attachments: LUCENE-4122.patch The Payload class offers very similar functionality to BytesRef. The code internally uses BytesRef-s to represent payloads, and on indexing and on retrieval this data is repackaged from/to Payload. This seems wasteful. I propose to remove the Payload class and use BytesRef instead, thus avoiding this re-wrapping and reducing the API footprint.
[jira] [Commented] (LUCENE-2357) Reduce transient RAM usage while merging by using packed ints array for docID re-mapping
[ https://issues.apache.org/jira/browse/LUCENE-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291967#comment-13291967 ] Simon Willnauer commented on LUCENE-2357: - s/(Adrien Grand via Mike McCandless)/(Adrien Grand) otherwise +1 Reduce transient RAM usage while merging by using packed ints array for docID re-mapping Key: LUCENE-2357 URL: https://issues.apache.org/jira/browse/LUCENE-2357 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 4.0, 5.0 Attachments: LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch, LUCENE-2357.patch We allocate this int[] to remap docIDs due to compaction of deleted ones. This uses a lot of RAM for large segment merges, and can fail to allocate due to fragmentation on 32 bit JREs. Now that we have packed ints, a simple fix would be to use a packed int array... and maybe instead of storing abs docID in the mapping, we could store the number of del docs seen so far (so the remap would do a lookup then a subtract). This may add some CPU cost to merging but should bring down transient RAM usage quite a bit.
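[Editor's note] The "number of del docs seen so far" idea in the issue description can be illustrated with a small sketch (Python here; Lucene's real implementation uses a packed-ints array, and all names below are illustrative): instead of storing the new docID for every old docID, store the running count of deleted docs, so that remapping a live doc is a lookup followed by a subtraction.

```python
def build_del_counts(deleted, max_doc):
    """counts[d] = number of deleted docIDs <= d.

    In Lucene this would be a packed ints array sized by the
    number of deletes, not a plain Python list.
    """
    deleted = set(deleted)
    counts, seen = [], 0
    for d in range(max_doc):
        if d in deleted:
            seen += 1
        counts.append(seen)
    return counts

def remap(old_doc, counts):
    # Lookup then subtract: the docID after deleted docs are
    # compacted out. Only meaningful for live (non-deleted) docs.
    return old_doc - counts[old_doc]

# Segment of 5 docs with docs 1 and 3 deleted: live docs
# 0, 2, 4 compact down to new IDs 0, 1, 2.
counts = build_del_counts({1, 3}, 5)
```

The CPU cost Mike mentions is the extra array read per remapped docID, traded against the much smaller transient allocation.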
[jira] [Commented] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291972#comment-13291972 ] Michael McCandless commented on LUCENE-4087: OK I guess that makes sense. Basically we sign up, now, to allow certain DV type changes in the schema, just like how we allow omitNorms to change from false to true, but not vice/versa.
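[Editor's note] The one-way rule Michael describes (omitNorms may flip from false to true, never back) can be modeled as a tiny transition check; this is a toy model of the discussion, not Lucene's actual API:

```python
def omit_norms_change_ok(old_omit: bool, new_omit: bool) -> bool:
    # Norms can be dropped (False -> True) but never resurrected
    # (True -> False): once omitted, the per-document norm data
    # is gone, so the only illegal transition is True -> False.
    return new_omit or not old_omit
```

An analogous check per field, run when a document declares a conflicting type, is what turns the "silent drop on merge" behavior into an up-front exception.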
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14637 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14637/ All tests passed Build Log: [...truncated 24254 lines...] [...truncated 24254 lines...] [...truncated 24254 lines...] [...truncated 24201 lines...] javadocs-lint: [exec] [exec] Crawl/parse... [exec] [exec] build/docs/core/org/apache/lucene/store/package-use.html [exec] WARNING: anchor ../../../../org/apache/lucene/store/subclasses appears more than once [exec] [exec] Verify... [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/IdentityEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/package-summary.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/class-use/AbstractEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/IntegerEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/FloatEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/PayloadEncoder.html [exec] BROKEN LINK: 
build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] build/docs/analyzers-common/org/apache/lucene/analysis/payloads/class-use/PayloadEncoder.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] BROKEN LINK: build/docs/core/org/apache/lucene/index/Payload.html [exec] [exec] Broken javadocs links were found! BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build.xml:194: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/common-build.xml:1613: exec returned: 1 Total time: 3 minutes 16 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Publishing Clover coverage report... No Clover report will be published due to a Build Failure Email was triggered for: Failure Sending email for trigger: Failure [...truncated 24254 lines...] [...truncated 24254 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291980#comment-13291980 ] Hoss Man commented on LUCENE-4115: -- bq. A true nice solution would be to revisit the issue where classpaths are constructed to ivy cache directly (they're always correct then) and just use copying for packaging. seems like that might introduce some risk of the classpath(s) used by developers/jenkins for running tests deviating from the ones people would get if they use the binary distributions (particularly solr users who don't know/understand java classpaths and just copy the example lib dirs as a starting point). JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve. - Key: LUCENE-4115 URL: https://issues.apache.org/jira/browse/LUCENE-4115 Project: Lucene - Java Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Trivial Fix For: 4.0, 5.0 Attachments: LUCENE-4111.patch I think we should add the following target deps: ant clean [depends on] clean-jars ant resolve [depends on] clean-jars ant eclipse [depends on] resolve, clean-jars ant idea [depends on] resolve, clean-jars This eliminates the need to remember about cleaning up stale jars, which users complain about (and I think they're right about it). The overhead will be minimal since resolve is only going to copy jars from cache. Eclipse won't have a problem with updated JARs if they end up at the same location. If there are no objections I will fix this in a few hours.
[JENKINS] Lucene-Solr-tests-only-4.x - Build # 31 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-4.x/31/ All tests passed Build Log: [...truncated 9163 lines...] jar-misc: check-spatial-uptodate: jar-spatial: check-grouping-uptodate: jar-grouping: check-queries-uptodate: jar-queries: check-queryparser-uptodate: jar-queryparser: prep-lucene-jars: resolve-example: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/home/hudson/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/build/solr-core/classes/java [javac] Compiling 572 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/build/solr-core/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:29: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:22: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:36: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:276: cannot find symbol [javac] symbol: class Payload [javac] if (value instanceof Payload) { [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:174: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:251: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:440: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-4.x/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:501: cannot find symbol [javac] symbol : class Payload 
[javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 10 errors [...truncated 18
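All ten of these "cannot find symbol: class Payload" errors trace back to one cause: org.apache.lucene.index.Payload no longer exists on the 4.x branch, where payloads are carried as BytesRef via PayloadAttribute. A rough before/after sketch of the kind of change the failing Solr callers needed (illustrative only, not the actual fix commit; p and att stand for the token and attribute variables from the files above):

```java
// before (pre-4.x API, now removed):
//   import org.apache.lucene.index.Payload;
//   p.setPayload(new Payload(data));
//   Payload payload = ((PayloadAttribute) att).getPayload();

// after (4.x API, payloads represented as BytesRef):
import org.apache.lucene.util.BytesRef;

p.setPayload(new BytesRef(data));
BytesRef payload = ((PayloadAttribute) att).getPayload();
```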
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291989#comment-13291989 ] Dawid Weiss commented on LUCENE-4115: - Why would this be so? I mean -- isn't the risk of users messing up their classpath with lib/*.jar pretty much the same whether the build uses an ivy classpath straight from the cache, or an ivy classpath copied from the cache to lib/ at distribution time?
[jira] [Resolved] (LUCENE-4087) Provide consistent IW behavior for illegal meta data changes
[ https://issues.apache.org/jira/browse/LUCENE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-4087. - Resolution: Fixed Assignee: Simon Willnauer Lucene Fields: New,Patch Available (was: New) committed to trunk and 4x Provide consistent IW behavior for illegal meta data changes Key: LUCENE-4087 URL: https://issues.apache.org/jira/browse/LUCENE-4087 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0, 4.1, 5.0 Attachments: LUCENE-4087.patch, LUCENE-4087.patch Currently IW fails late and inconsistently on illegal field metadata changes, such as redefining an already defined DocValues type or un-omitting norms. We can approach this similarly to how we handle consistent field numbers and: * throw an exception if indexOptions conflict (e.g. omitTF=true versus false) instead of silently dropping positions on merge * same with omitNorms * same with norms types and docvalues types * still keep field numbers consistent This way we could eliminate all these traps and just give an exception instead.
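The fail-fast policy described in LUCENE-4087 -- record a field's metadata the first time the field is seen, then throw on any later conflicting declaration instead of silently resolving the conflict at merge time -- can be sketched in isolation. FieldMetadataRegistry and its string-valued options below are illustrative stand-ins, not Lucene's actual FieldInfos API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of fail-fast field-metadata enforcement (hypothetical names,
// not Lucene's real classes): the first declaration of a field pins its
// indexOptions; any later conflicting declaration throws immediately.
public class FieldMetadataRegistry {
    private final Map<String, String> indexOptionsByField = new HashMap<>();

    public void declare(String field, String indexOptions) {
        // putIfAbsent returns the previously recorded value, or null on first use.
        String previous = indexOptionsByField.putIfAbsent(field, indexOptions);
        if (previous != null && !previous.equals(indexOptions)) {
            throw new IllegalArgumentException(
                "field \"" + field + "\" was indexed with " + previous
                + " but is now declared with " + indexOptions);
        }
    }
}
```

Re-declaring a field with the same options is a no-op; only a genuine conflict fails, which is the "give an exception instead of a trap" behavior the issue asks for.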
[jira] [Commented] (LUCENE-4078) PatternReplaceCharFilter assertion error
[ https://issues.apache.org/jira/browse/LUCENE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291996#comment-13291996 ] Dawid Weiss commented on LUCENE-4078: - Follow-up discussion on core-libs-dev. The bottom line: this is the expected behavior... http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-June/010455.html PatternReplaceCharFilter assertion error Key: LUCENE-4078 URL: https://issues.apache.org/jira/browse/LUCENE-4078 Project: Lucene - Java Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.0 Attachments: LUCENE-4078.patch Build: https://builds.apache.org/job/Lucene-trunk/1942/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testRandomStrings Error Message: Stack Trace: java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([8E91A6AC395FEED9:618A6129A5BB9EC]:0) at org.apache.lucene.analysis.MockTokenizer.readCodePoint(MockTokenizer.java:153) at org.apache.lucene.analysis.MockTokenizer.incrementToken(MockTokenizer.java:123) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:558) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:488) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:430) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:424) at org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testRandomStrings(TestPatternReplaceCharFilter.java:323) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(Randomized -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux-Java7-64 - Build # 29 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java7-64/29/ All tests passed Build Log: [...truncated 10051 lines...] check-spatial-uptodate: jar-spatial: check-grouping-uptodate: jar-grouping: check-queries-uptodate: jar-queries: check-queryparser-uptodate: jar-queryparser: prep-lucene-jars: resolve-example: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/var/lib/jenkins/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/build/solr-core/classes/java [javac] Compiling 572 source files to /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/build/solr-core/classes/java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:29: error: cannot find symbol [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] symbol: class Payload [javac] location: package org.apache.lucene.index [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:22: error: cannot find symbol [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] symbol: class Payload [javac] location: package org.apache.lucene.index [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:36: error: cannot find symbol [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] symbol: class Payload [javac] location: package org.apache.lucene.index [javac] 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:276: error: cannot find symbol [javac] if (value instanceof Payload) { [javac]^ [javac] symbol: class Payload [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: error: cannot find symbol [javac] final Payload p = (Payload) value; [javac] ^ [javac] symbol: class Payload [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: error: cannot find symbol [javac] final Payload p = (Payload) value; [javac]^ [javac] symbol: class Payload [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:174: error: cannot find symbol [javac] p.setPayload(new Payload(data)); [javac]^ [javac] symbol: class Payload [javac] location: class JsonPreAnalyzedParser [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:251: error: cannot find symbol [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] symbol: class Payload [javac] location: class JsonPreAnalyzedParser [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:440: error: cannot find symbol [javac] p.setPayload(new Payload(data)); [javac]^ [javac] symbol: class Payload [javac] location: class SimplePreAnalyzedParser [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java7-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:501: error: cannot find symbol [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] symbol: class Payload [javac] location: 
class SimplePreAnalyzedParser [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 10 errors [...truncated 17
Re: [JENKINS] Lucene-Solr-tests-only-4.x - Build # 31 - Failure
I think the merge done here in 1348227 must have been done from the lucene/ directory. I merged the rest (r1348248) On Fri, Jun 8, 2012 at 4:58 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-4.x/31/ All tests passed Build Log: [...quoted build log truncated; identical to the Build # 31 failure output above...]
[JENKINS] Lucene-Solr-4.x-Windows-Java6-64 - Build # 25 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java6-64/25/ All tests passed Build Log: [...truncated 9091 lines...] jar-misc: check-spatial-uptodate: jar-spatial: check-grouping-uptodate: jar-grouping: check-queries-uptodate: jar-queries: check-queryparser-uptodate: jar-queryparser: prep-lucene-jars: resolve-example: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\build\solr-core\classes\java [javac] Compiling 572 source files to C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\build\solr-core\classes\java [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\handler\AnalysisRequestHandlerBase.java:29: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\JsonPreAnalyzedParser.java:22: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\SimplePreAnalyzedParser.java:36: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\handler\AnalysisRequestHandlerBase.java:276: cannot find symbol 
[javac] symbol: class Payload [javac] if (value instanceof Payload) { [javac]^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\handler\AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\handler\AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac]^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\JsonPreAnalyzedParser.java:174: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\JsonPreAnalyzedParser.java:251: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\SimplePreAnalyzedParser.java:440: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] C:\Jenkins\workspace\Lucene-Solr-4.x-Windows-Java6-64\solr\core\src\java\org\apache\solr\schema\SimplePreAnalyzedParser.java:501: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. 
[javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 10 errors [...truncated 15 lines...] [...truncated 9207 lines...] [...truncated 9207 lines...] [...truncated 9207 lines...] [...truncated 9207 lines...] [...truncated 9207 lines...] - To
[JENKINS] Lucene-Solr-4.x-Linux-Java6-64 - Build # 36 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java6-64/36/ All tests passed Build Log: [...truncated 9223 lines...] jar-misc: check-spatial-uptodate: jar-spatial: check-grouping-uptodate: jar-grouping: check-queries-uptodate: jar-queries: check-queryparser-uptodate: jar-queryparser: prep-lucene-jars: resolve-example: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/var/lib/jenkins/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/build/solr-core/classes/java [javac] Compiling 572 source files to /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/build/solr-core/classes/java [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:29: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:22: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:36: cannot find symbol [javac] symbol : class Payload [javac] location: package org.apache.lucene.index [javac] import org.apache.lucene.index.Payload; [javac] ^ [javac] 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:276: cannot find symbol [javac] symbol: class Payload [javac] if (value instanceof Payload) { [javac]^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac] ^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/handler/AnalysisRequestHandlerBase.java:277: cannot find symbol [javac] symbol: class Payload [javac] final Payload p = (Payload) value; [javac]^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:174: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/JsonPreAnalyzedParser.java:251: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.JsonPreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:440: cannot find symbol [javac] symbol : class Payload [javac] location: class org.apache.solr.schema.SimplePreAnalyzedParser [javac] p.setPayload(new Payload(data)); [javac]^ [javac] /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux-Java6-64/checkout/solr/core/src/java/org/apache/solr/schema/SimplePreAnalyzedParser.java:501: cannot find symbol [javac] symbol : class Payload [javac] location: class 
org.apache.solr.schema.SimplePreAnalyzedParser [javac] Payload p = ((PayloadAttribute)att).getPayload(); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 10 errors [...truncated 16 lines...] [...truncated 9340 lines...] [...truncated 9340 lines...] [...truncated 9340
[jira] [Commented] (LUCENE-4115) JAR resolution/ cleanup should be done automatically for ant clean/ eclipse/ resolve.
[ https://issues.apache.org/jira/browse/LUCENE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292004#comment-13292004 ] Hoss Man commented on LUCENE-4115: -- I'm not sure i'm following you. right now, we know that when you run ant test (or java -jar start.jar in solr) you're getting the exact same classpath and set of jars as someone who downloads a binary dist you might build from your checkout -- because the classpath you are using comes from the lib dirs, built by ivy. there is only one place jars are copied from the ivy cache to the lib dir(s). if instead we have ant classpath/ declarations that use ivy features to build up classpaths pointed directly at jars in the ivy cache, and independently we have copy directives building up the lib/ dirs that make it into the binary dists, then isn't there a risk that (over time) those will diverge in a way we might not notice, because all the testing will only be done with the ivy-generated classpaths? (maybe i'm wrong ... maybe the ivy classpath stuff works differently than i understand ... i'm just raising it as a potential concern)
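The contrast Hoss raises -- classpaths resolved straight from the ivy cache versus jars retrieved into lib/ and shipped -- can be sketched with the two standard Ivy Ant tasks. This is an illustrative fragment under assumed path ids and confs, not either project's actual build files:

```xml
<!-- (a) a classpath pointed directly at jars in the ivy cache: -->
<ivy:cachepath pathid="classpath.from.cache" conf="compile"/>

<!-- (b) jars copied out of the cache into lib/, the dirs that ship in the binary dist: -->
<ivy:retrieve conf="compile" pattern="lib/[artifact]-[revision].[ext]"/>
<path id="classpath.from.lib">
  <fileset dir="lib" includes="*.jar"/>
</path>
```

The divergence risk is that tests would run against (a) while binary-dist users run against (b): if the retrieve pattern or conf ever drifted from the cachepath settings, CI would keep passing and nothing would notice.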