[jira] [Created] (SOLR-2592) Pluggable shard lookup mechanism for SolrCloud
Pluggable shard lookup mechanism for SolrCloud -- Key: SOLR-2592 URL: https://issues.apache.org/jira/browse/SOLR-2592 Project: Solr Issue Type: New Feature Components: SolrCloud Affects Versions: 4.0 Reporter: Noble Paul If the data in a cloud can be partitioned on some criteria (say range, hash, attribute value, etc.) it will be easy to narrow down the search to a smaller subset of shards and in effect achieve more efficient search. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
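The partitioning idea above can be sketched as a small pluggable interface. This is purely illustrative: the names (`ShardLookup`, `RangeShardLookup`) are made up for this sketch and do not come from any patch on the issue.

```java
import java.util.List;
import java.util.TreeMap;

// Hypothetical pluggable lookup: given a partitioning key, return the
// subset of shards that can hold matching documents, so a query need
// not fan out to every shard.
interface ShardLookup {
    List<String> shardsFor(String key);
}

// Range partitioning example: keys are assigned to shards by
// lexicographic range, one of the criteria mentioned in the issue.
public class RangeShardLookup implements ShardLookup {
    private final TreeMap<String, String> lowerBoundToShard = new TreeMap<>();

    public void addRange(String lowerBound, String shard) {
        lowerBoundToShard.put(lowerBound, shard);
    }

    @Override
    public List<String> shardsFor(String key) {
        // floorEntry finds the range whose lower bound is <= key.
        return List.of(lowerBoundToShard.floorEntry(key).getValue());
    }

    public static void main(String[] args) {
        RangeShardLookup lookup = new RangeShardLookup();
        lookup.addRange("a", "shard1"); // keys a..m
        lookup.addRange("n", "shard2"); // keys n..z
        System.out.println(lookup.shardsFor("quick")); // prints [shard2]
    }
}
```

A hash-based or attribute-value-based policy would plug in behind the same interface.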
[jira] [Commented] (SOLR-2583) Make external scoring more efficient (ExternalFileField, FileFloatSource)
[ https://issues.apache.org/jira/browse/SOLR-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049674#comment-13049674 ] Martin Grotzke commented on SOLR-2583: -- The test that produced this output can be found in my lucene-solr fork on github: https://github.com/magro/lucene-solr/commit/b9af87b1 The test method that was executed was testCompareMemoryUsage; for measuring memory usage I used http://code.google.com/p/memory-measurer/ and ran the test/jvm with -Xmx1G -javaagent:solr/lib/object-explorer.jar (just from eclipse). I just added another test that uses a fixed size and an increasing number of puts (testCompareMemoryUsageWithFixSizeAndIncreasingNumPuts, https://github.com/magro/lucene-solr/blob/trunk/solr/src/test/org/apache/solr/search/function/FileFloatSourceMemoryTest.java#L56), with the following results:
{noformat}
Size: 1.000.000
NumPuts     1.000 (  0,1%), CompactFloatArray   918.616, float[] 4.000.016, HashMap     72.128
NumPuts    10.000 (  1,0%), CompactFloatArray 3.738.712, float[] 4.000.016, HashMap    701.696
NumPuts    50.000 (  5,0%), CompactFloatArray 4.016.472, float[] 4.000.016, HashMap  3.383.104
NumPuts    55.000 (  5,5%), CompactFloatArray 4.016.472, float[] 4.000.016, HashMap  3.949.120
NumPuts    60.000 (  6,0%), CompactFloatArray 4.016.472, float[] 4.000.016, HashMap  4.254.848
NumPuts   100.000 ( 10,0%), CompactFloatArray 4.016.472, float[] 4.000.016, HashMap  6.622.272
NumPuts   500.000 ( 50,0%), CompactFloatArray 4.016.472, float[] 4.000.016, HashMap 27.262.976
NumPuts 1.000.000 (100,0%), CompactFloatArray 4.016.472, float[] 4.000.016, HashMap 44.649.664
{noformat}
It seems that the HashMap is the most efficient solution up to ~5.5%. Starting from this threshold CompactFloatArray and float[] use less memory, while the CompactFloatArray has no advantage over float[] for puts > 5%.
Therefore I'd suggest that we use an adaptive strategy: use a HashMap while the number of scores is below ~5.5% of numDocs, and from this threshold on fall back to the original float[] approach. What do you say? Make external scoring more efficient (ExternalFileField, FileFloatSource) - Key: SOLR-2583 URL: https://issues.apache.org/jira/browse/SOLR-2583 Project: Solr Issue Type: Improvement Components: search Reporter: Martin Grotzke Priority: Minor Attachments: FileFloatSource.java.patch, patch.txt External scoring eats much memory, depending on the number of documents in the index. The ExternalFileField (used for external scoring) uses FileFloatSource, where one FileFloatSource is created per external scoring file. FileFloatSource creates a float array with the size of the number of docs (this is also done if the file to load is not found). If there are far fewer entries in the scoring file than there are docs in total, the big float array wastes much memory. This could be optimized by using a map of doc -> score, so that the map contains as many entries as there are scoring entries in the external file, but not more. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
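For illustration, the adaptive strategy proposed above could look roughly like this. A sketch only, with a hypothetical `ScoreStore` class name; the real change would live inside FileFloatSource, and for brevity this ignores FileFloatSource's configurable default value once the dense array takes over (absent docs read as 0.0 after migration).

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed adaptive storage: keep doc->score entries in a
// HashMap while they are sparse, and fall back to the original float[]
// once more than ~5.5% of maxDoc has a score (the crossover observed in
// the measurements above).
public class ScoreStore {
    private static final double DENSE_THRESHOLD = 0.055;

    private final int maxDoc;
    private Map<Integer, Float> sparse = new HashMap<>();
    private float[] dense; // null while we are still sparse

    public ScoreStore(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    public void put(int doc, float score) {
        if (dense != null) {
            dense[doc] = score;
        } else if (sparse.size() + 1 > maxDoc * DENSE_THRESHOLD) {
            // Crossed the threshold: migrate everything to the flat array.
            dense = new float[maxDoc];
            for (Map.Entry<Integer, Float> e : sparse.entrySet()) {
                dense[e.getKey()] = e.getValue();
            }
            sparse = null;
            dense[doc] = score;
        } else {
            sparse.put(doc, score);
        }
    }

    public float get(int doc, float defaultScore) {
        if (dense != null) return dense[doc];
        Float v = sparse.get(doc);
        return v != null ? v : defaultScore;
    }

    public static void main(String[] args) {
        ScoreStore s = new ScoreStore(1_000_000);
        s.put(7, 1.5f);
        System.out.println(s.get(7, 0f)); // prints 1.5
    }
}
```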
[jira] [Updated] (SOLR-2551) Check dataimport.properties for write access before starting import
[ https://issues.apache.org/jira/browse/SOLR-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-2551: Fix Version/s: 4.0 3.3 Summary: Check dataimport.properties for write access before starting import (was: Checking dataimport.properties for write access during startup) Check dataimport.properties for write access before starting import --- Key: SOLR-2551 URL: https://issues.apache.org/jira/browse/SOLR-2551 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 1.4.1, 3.1 Reporter: C S Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 3.3, 4.0 Attachments: SOLR-2551.patch A common mistake is that the /conf directory (respectively the dataimport.properties file) is not writable for Solr. It would be great if that were detected on starting a dataimport job. Currently an import might grind away for days and then fail if it can't write its timestamp to the dataimport.properties file. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2593) A new command 'split' for splitting index
A new command 'split' for splitting index - Key: SOLR-2593 URL: https://issues.apache.org/jira/browse/SOLR-2593 Project: Solr Issue Type: New Feature Reporter: Noble Paul If an index is too large/hot it would be desirable to split it out to another core. There can be multiple strategies: * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2551) Check dataimport.properties for write access before starting import
[ https://issues.apache.org/jira/browse/SOLR-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-2551. - Resolution: Fixed Committed revision 1135954 on trunk and 1135956 on branch_3x. Check dataimport.properties for write access before starting import --- Key: SOLR-2551 URL: https://issues.apache.org/jira/browse/SOLR-2551 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 1.4.1, 3.1 Reporter: C S Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 3.3, 4.0 Attachments: SOLR-2551.patch A common mistake is that the /conf directory (respectively the dataimport.properties file) is not writable for Solr. It would be great if that were detected on starting a dataimport job. Currently an import might grind away for days and then fail if it can't write its timestamp to the dataimport.properties file. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2593) A new core admin command 'split' for splitting index
[ https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-2593: - Summary: A new core admin command 'split' for splitting index (was: A new command 'split' for splitting index) A new core admin command 'split' for splitting index Key: SOLR-2593 URL: https://issues.apache.org/jira/browse/SOLR-2593 Project: Solr Issue Type: New Feature Reporter: Noble Paul If an index is too large/hot it would be desirable to split it out to another core. This core may eventually be replicated out to another host. There can be multiple strategies: * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2593) A new command 'split' for splitting index
[ https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-2593: - Description: If an index is too large/hot it would be desirable to split it out to another core. This core may eventually be replicated out to another host. There can be multiple strategies: * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index was: If an index is too large/hot it would be desirable to split it out to another core. There can be multiple strategies: * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index A new command 'split' for splitting index - Key: SOLR-2593 URL: https://issues.apache.org/jira/browse/SOLR-2593 Project: Solr Issue Type: New Feature Reporter: Noble Paul If an index is too large/hot it would be desirable to split it out to another core. This core may eventually be replicated out to another host. There can be multiple strategies: * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
SolrCloud: Automatic master failover
Hello, What is the status for automatic master failover (leader election) in SolrCloud? Is there an issue open? I'm interested in this and I've some time to take it up. -- Regards, Shalin Shekhar Mangar.
[jira] [Updated] (SOLR-2355) simple distrib update processor
[ https://issues.apache.org/jira/browse/SOLR-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-2355: Component/s: update SolrCloud simple distrib update processor --- Key: SOLR-2355 URL: https://issues.apache.org/jira/browse/SOLR-2355 Project: Solr Issue Type: New Feature Components: SolrCloud, update Reporter: Yonik Seeley Priority: Minor Fix For: 3.3 Attachments: DistributedUpdateProcessorFactory.java, TestDistributedUpdate.java Here's a simple update processor for distributed indexing that I implemented years ago. It implements a simple hash(id) MOD nservers and just fails if any servers are down. Given the recent activity in distributed indexing, I thought this might be at least a good source for ideas. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
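The `hash(id) MOD nservers` scheme the issue describes fits in a few lines. A minimal sketch with made-up names, not code from the attached patch:

```java
import java.util.List;

// Minimal sketch of the hash(id) MOD nservers routing described above:
// each document id maps to exactly one server, and in this simple
// scheme the request just fails if that server is down (no failover).
public class SimpleDistribRouter {
    private final List<String> servers;

    public SimpleDistribRouter(List<String> servers) {
        this.servers = servers;
    }

    public String serverFor(String docId) {
        // Mask off the sign bit instead of Math.abs, which overflows
        // for Integer.MIN_VALUE.
        return servers.get((docId.hashCode() & 0x7fffffff) % servers.size());
    }

    public static void main(String[] args) {
        SimpleDistribRouter r =
            new SimpleDistribRouter(List.of("solr1:8983", "solr2:8983"));
        System.out.println(r.serverFor("doc-1"));
    }
}
```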
[jira] [Updated] (SOLR-2341) Shard distribution policy
[ https://issues.apache.org/jira/browse/SOLR-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-2341: Component/s: SolrCloud Fix Version/s: 4.0 Shard distribution policy - Key: SOLR-2341 URL: https://issues.apache.org/jira/browse/SOLR-2341 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: William Mayor Priority: Minor Fix For: 4.0 Attachments: SOLR-2341.patch, SOLR-2341.patch A first crack at creating policies to be used for determining to which of a list of shards a document should go. See discussion on Distributed Indexing on dev-list. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2358) Distributing Indexing
[ https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-2358: Component/s: update SolrCloud Fix Version/s: 4.0 Distributing Indexing - Key: SOLR-2358 URL: https://issues.apache.org/jira/browse/SOLR-2358 Project: Solr Issue Type: New Feature Components: SolrCloud, update Reporter: William Mayor Priority: Minor Fix For: 4.0 Attachments: SOLR-2358.patch The first steps towards creating distributed indexing functionality in Solr -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2593) A new core admin command 'split' for splitting index
[ https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-2593: Fix Version/s: 4.0 A new core admin command 'split' for splitting index Key: SOLR-2593 URL: https://issues.apache.org/jira/browse/SOLR-2593 Project: Solr Issue Type: New Feature Reporter: Noble Paul Fix For: 4.0 If an index is too large/hot it would be desirable to split it out to another core. This core may eventually be replicated out to another host. There can be multiple strategies: * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2594) Make Replication Handler cloud aware
Make Replication Handler cloud aware Key: SOLR-2594 URL: https://issues.apache.org/jira/browse/SOLR-2594 Project: Solr Issue Type: Improvement Components: replication (java), SolrCloud Reporter: Shalin Shekhar Mangar Fix For: 4.0 Replication handler should be cloud aware. It should be possible to switch roles from slave to master as well as switch masterUrls based on the cluster topology and state. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2595) Split and migrate indexes
Split and migrate indexes - Key: SOLR-2595 URL: https://issues.apache.org/jira/browse/SOLR-2595 Project: Solr Issue Type: New Feature Components: multicore, replication (java), SolrCloud Reporter: Shalin Shekhar Mangar Fix For: 4.0 When a shard's index grows too large or a shard becomes too loaded, it should be possible to split off part of the shard's index and migrate/merge it to a less loaded node. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2596) Enhance CoreAdmin mergeindexes to use a core's index as the source
Enhance CoreAdmin mergeindexes to use a core's index as the source -- Key: SOLR-2596 URL: https://issues.apache.org/jira/browse/SOLR-2596 Project: Solr Issue Type: Improvement Components: multicore, update Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 4.0 Enhance CoreAdmin mergeindexes to use a core's index as the source. Right now the mergeindexes command accepts a list of index directories on the local disk which is not very convenient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2595) Split and migrate indexes
[ https://issues.apache.org/jira/browse/SOLR-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049703#comment-13049703 ] Shalin Shekhar Mangar commented on SOLR-2595: - Example: Let's say you have a core C1 on host H1 which you want to split, moving part of the index to core C2 on host H2. The sequence of operations can be: # Use SOLR-2593 to split C1 and move the part to be migrated into a temporary core, say S # Create a temporary core on host H2, say T # Assign T to be a slave of S # When replication completes, use SOLR-2596 to merge T into C2, perhaps updating some ZK flags as well. Some details still need to be figured out, e.g.: * What strategy to use for splitting? * How to delete the migrated part from the source index? * How to update the shard lookup and distributed indexing schemes for the migrated part? * What happens to writes during the migration? Should we disallow them? Split and migrate indexes - Key: SOLR-2595 URL: https://issues.apache.org/jira/browse/SOLR-2595 Project: Solr Issue Type: New Feature Components: multicore, replication (java), SolrCloud Reporter: Shalin Shekhar Mangar Fix For: 4.0 When a shard's index grows too large or a shard becomes too loaded, it should be possible to split off part of the shard's index and migrate/merge it to a less loaded node. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1431) CommComponent abstracted
[ https://issues.apache.org/jira/browse/SOLR-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049705#comment-13049705 ] Noble Paul commented on SOLR-1431: -- Jason, the configuration which I have specified lets you do ShardHandler-specific configuration. It goes well with the general Solr configuration.
{code:xml}
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <!-- other params go here -->
  <shardHandler class="HttpShardHandler"> <!-- To be implemented -->
    <int name="httpReadTimeOut">1000</int>
    <int name="httpConnTimeOut">5000</int>
  </shardHandler>
</requestHandler>
{code}
Creating a new instance per request is not wise. CommComponent abstracted Key: SOLR-1431 URL: https://issues.apache.org/jira/browse/SOLR-1431 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0 Reporter: Jason Rutherglen Assignee: Mark Miller Fix For: 4.0 Attachments: SOLR-1431.patch, SOLR-1431.patch, SOLR-1431.patch, SOLR-1431.patch, SOLR-1431.patch, SOLR-1431.patch, SOLR-1431.patch, SOLR-1431.patch, SOLR-1431.patch We'll abstract CommComponent in this issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2583) Make external scoring more efficient (ExternalFileField, FileFloatSource)
[ https://issues.apache.org/jira/browse/SOLR-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049706#comment-13049706 ] Robert Muir commented on SOLR-2583: --- Are you sure real floats are actually needed? Why not use compactbytearray with smallfloat encoding? It would also be good to measure performance... doesn't a hashmap have to box *per-docid* into an Integer for lookup? Make external scoring more efficient (ExternalFileField, FileFloatSource) - Key: SOLR-2583 URL: https://issues.apache.org/jira/browse/SOLR-2583 Project: Solr Issue Type: Improvement Components: search Reporter: Martin Grotzke Priority: Minor Attachments: FileFloatSource.java.patch, patch.txt External scoring eats much memory, depending on the number of documents in the index. The ExternalFileField (used for external scoring) uses FileFloatSource, where one FileFloatSource is created per external scoring file. FileFloatSource creates a float array with the size of the number of docs (this is also done if the file to load is not found). If there are far fewer entries in the scoring file than there are docs in total, the big float array wastes much memory. This could be optimized by using a map of doc -> score, so that the map contains as many entries as there are scoring entries in the external file, but not more. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
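Robert's boxing point can be illustrated with a primitive-keyed map: `HashMap<Integer,Float>` boxes the docid on every lookup and allocates two wrapper objects plus an entry per mapping, while an open-addressing int-to-float map touches only primitive arrays. A minimal sketch (the class name is made up, and there is no resizing: the map is sized up-front):

```java
// Minimal open-addressing int->float map with linear probing. Unlike
// HashMap<Integer,Float>, get() never boxes the docid into an Integer,
// and each entry costs 9 bytes of array space (plus load-factor slack)
// instead of two boxed objects plus a HashMap.Entry.
public class IntFloatMap {
    private final int[] keys;
    private final float[] values;
    private final boolean[] used;

    public IntFloatMap(int expectedSize) {
        // Power-of-two capacity at least 2x the expected entries; this
        // sketch never resizes, so do not insert more than expectedSize.
        int cap = Integer.highestOneBit(Math.max(16, expectedSize * 2) - 1) << 1;
        keys = new int[cap];
        values = new float[cap];
        used = new boolean[cap];
    }

    private int slot(int key) {
        int h = key * 0x9E3779B9;          // golden-ratio scramble
        return (h ^ (h >>> 16)) & (keys.length - 1);
    }

    public void put(int key, float value) {
        int i = slot(key);
        while (used[i] && keys[i] != key) i = (i + 1) & (keys.length - 1);
        used[i] = true;
        keys[i] = key;
        values[i] = value;
    }

    public float get(int key, float defaultValue) {
        int i = slot(key);
        while (used[i]) {
            if (keys[i] == key) return values[i];
            i = (i + 1) & (keys.length - 1);
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        IntFloatMap m = new IntFloatMap(100);
        m.put(42, 1.5f);
        System.out.println(m.get(42, 0f)); // prints 1.5
    }
}
```

The values array could further be narrowed to a byte[] with a SmallFloat-style lossy encoding, per Robert's suggestion, at the cost of score precision.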
[jira] [Commented] (SOLR-2583) Make external scoring more efficient (ExternalFileField, FileFloatSource)
[ https://issues.apache.org/jira/browse/SOLR-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049709#comment-13049709 ] Robert Muir commented on SOLR-2583: --- bq. that uses a fixed size and an increasing number of puts I'm not certain how realistic that is; remember behind the scenes compactbytearray uses blocks, and if you touch every one (by putting every Kth docid or something) then you are just testing the worst case. Make external scoring more efficient (ExternalFileField, FileFloatSource) - Key: SOLR-2583 URL: https://issues.apache.org/jira/browse/SOLR-2583 Project: Solr Issue Type: Improvement Components: search Reporter: Martin Grotzke Priority: Minor Attachments: FileFloatSource.java.patch, patch.txt External scoring eats much memory, depending on the number of documents in the index. The ExternalFileField (used for external scoring) uses FileFloatSource, where one FileFloatSource is created per external scoring file. FileFloatSource creates a float array with the size of the number of docs (this is also done if the file to load is not found). If there are far fewer entries in the scoring file than there are docs in total, the big float array wastes much memory. This could be optimized by using a map of doc -> score, so that the map contains as many entries as there are scoring entries in the external file, but not more. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2593) A new core admin command 'split' for splitting index
[ https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049713#comment-13049713 ] Koji Sekiguchi commented on SOLR-2593: -- CoreAdminHandler uses action, not command. A new core admin command 'split' for splitting index Key: SOLR-2593 URL: https://issues.apache.org/jira/browse/SOLR-2593 Project: Solr Issue Type: New Feature Reporter: Noble Paul Fix For: 4.0 If an index is too large/hot it would be desirable to split it out to another core. This core may eventually be replicated out to another host. There can be multiple strategies: * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3205) remove MultiTermQuery get/inc/clear totalNumberOfTerms
remove MultiTermQuery get/inc/clear totalNumberOfTerms -- Key: LUCENE-3205 URL: https://issues.apache.org/jira/browse/LUCENE-3205 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-3205.patch This method is not correct if the index has more than one segment. It's also not thread safe, and it means calling query.rewrite() modifies the original query. All of these things add up to confusion. I think we should remove this from MultiTermQuery; the only thing that uses it is the NRQ tests, which conditionalize all the asserts anyway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3205) remove MultiTermQuery get/inc/clear totalNumberOfTerms
[ https://issues.apache.org/jira/browse/LUCENE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3205: Attachment: LUCENE-3205.patch remove MultiTermQuery get/inc/clear totalNumberOfTerms -- Key: LUCENE-3205 URL: https://issues.apache.org/jira/browse/LUCENE-3205 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-3205.patch This method is not correct if the index has more than one segment. Its also not thread safe, and it means calling query.rewrite() modifies the original query. All of these things add up to confusion, I think we should remove this from multitermquery, the only thing that uses it is the NRQ tests, which conditionalizes all the asserts anyway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2593) A new core admin command 'split' for splitting index
[ https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049727#comment-13049727 ] Peter Sturge commented on SOLR-2593: This is a really great idea, thanks! If it's possible, it would be cool to have config parameters to: * create a new core * overwrite an existing core * rename an existing core, then create (rolling backup) * merge with an existing core (ever-growing, but kind of an accessible 'archive' index) A new core admin command 'split' for splitting index Key: SOLR-2593 URL: https://issues.apache.org/jira/browse/SOLR-2593 Project: Solr Issue Type: New Feature Reporter: Noble Paul Fix For: 4.0 If an index is too large/hot it would be desirable to split it out to another core. This core may eventually be replicated out to another host. There can be multiple strategies: * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3205) remove MultiTermQuery get/inc/clear totalNumberOfTerms
[ https://issues.apache.org/jira/browse/LUCENE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3205: Fix Version/s: 4.0 3.3 remove MultiTermQuery get/inc/clear totalNumberOfTerms -- Key: LUCENE-3205 URL: https://issues.apache.org/jira/browse/LUCENE-3205 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Fix For: 3.3, 4.0 Attachments: LUCENE-3205.patch This method is not correct if the index has more than one segment. Its also not thread safe, and it means calling query.rewrite() modifies the original query. All of these things add up to confusion, I think we should remove this from multitermquery, the only thing that uses it is the NRQ tests, which conditionalizes all the asserts anyway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3205) remove MultiTermQuery get/inc/clear totalNumberOfTerms
[ https://issues.apache.org/jira/browse/LUCENE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049733#comment-13049733 ] Uwe Schindler commented on LUCENE-3205: --- I am perfectly fine with removing it. For analysis and debugging of NRQ it would still be good to have something, but I suggest changing the tests (I will simply request a TermsEnum and count terms, possibly on MultiTerms). Should I take the issue and modify my tests? remove MultiTermQuery get/inc/clear totalNumberOfTerms -- Key: LUCENE-3205 URL: https://issues.apache.org/jira/browse/LUCENE-3205 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Fix For: 3.3, 4.0 Attachments: LUCENE-3205.patch This method is not correct if the index has more than one segment. Its also not thread safe, and it means calling query.rewrite() modifies the original query. All of these things add up to confusion, I think we should remove this from multitermquery, the only thing that uses it is the NRQ tests, which conditionalizes all the asserts anyway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3205) remove MultiTermQuery get/inc/clear totalNumberOfTerms
[ https://issues.apache.org/jira/browse/LUCENE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049734#comment-13049734 ] Robert Muir commented on LUCENE-3205: - yes, please do? remove MultiTermQuery get/inc/clear totalNumberOfTerms -- Key: LUCENE-3205 URL: https://issues.apache.org/jira/browse/LUCENE-3205 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Fix For: 3.3, 4.0 Attachments: LUCENE-3205.patch This method is not correct if the index has more than one segment. Its also not thread safe, and it means calling query.rewrite() modifies the original query. All of these things add up to confusion, I think we should remove this from multitermquery, the only thing that uses it is the NRQ tests, which conditionalizes all the asserts anyway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SolrCloud: Automatic master failover
On Wed, Jun 15, 2011 at 5:31 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Hello, What is the status for automatic master failover (leader election) in SolrCloud? Is there an issue open? I'm interested in this and I've some time to take it up. Awesome! I'm hoping to find time next week myself to start doing more on cloud stuff! Do you mean master in the traditional sense (master in a whole index replication sense), or leader (where decisions for changing the configuration of a cluster get made)? -Yonik http://www.lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: XmlCharFilter
Yonik's law of patches states: A half-baked patch in Jira, with no documentation, no tests and no backwards compatibility is better than no patch at all. and what you've described sounds way better than that! Anyway, I doubt you'll *ever* find someone on the dev list *complain* about opening up a JIRA on something when you're willing to attach a patch, especially one with unit tests. Although you might have to nudge people to follow up on it... Best Erick On Tue, Jun 14, 2011 at 9:50 PM, Michael Sokolov soko...@ifactory.com wrote: I work with a lot of XML data sources and have needed to implement an analysis chain for Solr/Lucene that accepts XML. In the course of doing that, I found I needed something very much like HTMLCharFilter, but one that does standard XML parsing (understands XML entities defined in an internal or external DTD, for example). So I wrote XmlCharFilter, which uses the Woodstox XML parser (already used by Solr). I think this could be useful for others, and it would be nice for me if it were committed here, so I'd like to contribute. Should I open a JIRA for this? Is there anybody that can spare the time to review? It is basically one class (plus a factory class) and has a fairly complete set of tests. -Mike Sokolov Engineering Directory iFactory.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SolrCloud: Automatic master failover
On Wed, Jun 15, 2011 at 5:45 PM, Yonik Seeley yo...@lucidimagination.com wrote: Awesome! I'm hoping to find time next week myself to start doing more on cloud stuff! Do you mean master in the traditional sense (master in a whole index replication sense), or leader (where decisions for changing the configuration of a cluster get made)? In this particular case, I meant the traditional replication master. We'll need a cluster leader for a sharded cloud setup, but let's solve that separately. -- Regards, Shalin Shekhar Mangar.
[jira] [Assigned] (LUCENE-3205) remove MultiTermQuery get/inc/clear totalNumberOfTerms
[ https://issues.apache.org/jira/browse/LUCENE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-3205: - Assignee: Uwe Schindler remove MultiTermQuery get/inc/clear totalNumberOfTerms -- Key: LUCENE-3205 URL: https://issues.apache.org/jira/browse/LUCENE-3205 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 3.3, 4.0 Attachments: LUCENE-3205.patch This method is not correct if the index has more than one segment. It's also not thread-safe, and it means calling query.rewrite() modifies the original query. All of these things add up to confusion; I think we should remove this from MultiTermQuery. The only thing that uses it is the NRQ tests, which conditionalize all the asserts anyway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: XmlCharFilter
On Wed, Jun 15, 2011 at 2:24 PM, Erick Erickson erickerick...@gmail.com wrote: Yonik's law of patches states: "A half-baked patch in Jira, with no documentation, no tests and no backwards compatibility is better than no patch at all." +1 simon
Re: XmlCharFilter
OK - thanks for the encouragement, Erick; I'll open a JIRA then. -Mike
[jira] [Created] (SOLR-2597) XmlCharFilter
XmlCharFilter - Key: SOLR-2597 URL: https://issues.apache.org/jira/browse/SOLR-2597 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 4.0 Reporter: Mike Sokolov This CharFilter processes incoming XML using the Woodstox parser, stripping all non-text content and remembering offsets, just like HTMLCharFilter, but respecting XML conventions like XML entities defined in a DTD. XmlCharFilter also provides the ability to exclude (and include) the content of certain named elements. In order to compute character offsets properly when mixed line termination styles are present (\r, \r\n), or when XML character entities (&lt;, &quot;, &amp;) are present, we require a newer version of Woodstox (4.1.1) than is currently in solr/lib. The earlier versions of the parser could not report these entity events, so we couldn't tell the difference between < and &lt;, and the offsets could be wrong. The upgraded version is in the patch.
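The core job such a CharFilter performs, stripping markup while decoding entities, can be illustrated with a small self-contained sketch. This is purely illustrative and is not the code in the patch; it ignores DTDs and the offset-correction bookkeeping that is the hard part of the real filter:

```java
// Toy sketch of the markup-stripping half of an XML CharFilter.
// Hypothetical class; not from the SOLR-2597 patch.
class XmlStripSketch {

    // Returns the text content with tags removed and a few common
    // entities decoded. A real CharFilter would also record, for each
    // output offset, the corresponding input offset so highlighters
    // point back into the raw XML.
    static String stripXml(String xml) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < xml.length()) {
            char c = xml.charAt(i);
            if (c == '<') {                       // skip the whole tag
                int close = xml.indexOf('>', i);
                i = (close < 0) ? xml.length() : close + 1;
            } else if (c == '&') {                // decode common entities
                int semi = xml.indexOf(';', i);
                String ent = (semi < 0) ? "" : xml.substring(i + 1, semi);
                switch (ent) {
                    case "lt":   out.append('<');  i = semi + 1; break;
                    case "gt":   out.append('>');  i = semi + 1; break;
                    case "amp":  out.append('&');  i = semi + 1; break;
                    case "quot": out.append('"');  i = semi + 1; break;
                    default:     out.append(c);    i++;          break;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripXml("<p>5 &lt; 6 &amp; 7 &gt; 2</p>"));
        // prints: 5 < 6 & 7 > 2
    }
}
```

Note that the decoded output is shorter than the input (e.g. `&lt;` becomes one character), which is exactly why the filter needs entity events from the parser to keep offsets correct.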
Re: [jira] [Created] (SOLR-2597) XmlCharFilter
Did you mean Xml*Strip*CharFilter? koji -- http://www.rondhuit.com/en/
[jira] [Commented] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time
[ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049786#comment-13049786 ] Michael McCandless commented on LUCENE-3197: Right, this has been the intended semantics of a background optimize for some time, ie, when it returns it only ensures that whatever was not optimized as of when it was called has been merged away. This already works correctly for newly added docs, meaning if you continue adding docs / flushing new segments while the optimize runs, it knows that the newly flushed segments do not have to be merged away. But for new deletions we are not handling it correctly, which leads to the forever-running merges. Optimize runs forever if you keep deleting docs at the same time Key: LUCENE-3197 URL: https://issues.apache.org/jira/browse/LUCENE-3197 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 3.3, 4.0 Because we cascade merges for an optimize... if you also delete documents while the merges are running, then the merge policy will see the resulting single segment as still not optimized (since it has pending deletes) and do a single-segment merge, and will repeat indefinitely (as long as your app keeps deleting docs).
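A toy model of why the cascade never terminates (hypothetical names, nothing from the actual patch): each single-segment merge produces a segment that, if a delete landed in the meantime, again has pending deletes, so the policy schedules yet another merge:

```java
// Toy model of the LUCENE-3197 loop, not Lucene code: the merge policy
// keeps proposing a single-segment merge as long as the segment has
// pending deletes, so concurrent deletes make optimize spin forever.
class OptimizeLoopSketch {

    // Counts how many single-segment merges run before the index is
    // considered optimized, given how many delete rounds land while
    // merges are in flight. With unbounded deletes this never returns.
    static int mergesUntilOptimized(int concurrentDeleteRounds) {
        int merges = 0;
        boolean pendingDeletes = true;       // optimize sees deletes -> not optimized
        int deleteRounds = concurrentDeleteRounds;
        while (pendingDeletes) {
            merges++;                        // rewrite the single segment
            if (deleteRounds > 0) {
                deleteRounds--;              // a delete landed during the merge,
            } else {                         // so the segment has deletes again
                pendingDeletes = false;      // quiet index: finally optimized
            }
        }
        return merges;
    }

    public static void main(String[] args) {
        System.out.println(mergesUntilOptimized(0)); // prints: 1
        System.out.println(mergesUntilOptimized(3)); // prints: 4
    }
}
```

The fix direction described above amounts to snapshotting which deletions existed when optimize was called, so later deletes do not re-qualify the segment for merging.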
Re: [jira] [Created] (SOLR-2597) XmlCharFilter
Perhaps that name would be more consistent with HTMLStripCharFilter, yes, but it wasn't the one I was using. Also - I meant to post a patch here, but left the important files on a machine which is inaccessible at the moment, so I will post this evening. -Mike On 06/15/2011 09:28 AM, Koji Sekiguchi wrote: Did you mean Xml*Strip*CharFilter? koji
[jira] [Created] (SOLR-2598) exampledocs/books.json should use name instead of title
exampledocs/books.json should use name instead of title --- Key: SOLR-2598 URL: https://issues.apache.org/jira/browse/SOLR-2598 Project: Solr Issue Type: Improvement Reporter: Jan Høydahl Assignee: Jan Høydahl Priority: Minor Fix For: 3.3 The file exampledocs/books.json currently contains two books. But they do not show up in the default solr/browse interface because they use title instead of name, which the Velocity template does not show. Also we should include a few more books
[jira] [Resolved] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved SOLR-2554. Resolution: Fixed Fix Version/s: 3.3 RandomSortField values are cached in the FieldCache --- Key: SOLR-2554 URL: https://issues.apache.org/jira/browse/SOLR-2554 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.1 Reporter: Vadim Geshel Fix For: 3.3 The values of RandomSortField get cached in the FieldCache. When using many RandomSortFields over time, this leads to running out of memory. This may be one of the cases already covered in SOLR- but I'm not sure.
[jira] [Updated] (SOLR-2598) exampledocs/books.json should use name instead of title
[ https://issues.apache.org/jira/browse/SOLR-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-2598: -- Attachment: (was: SOLR-2589.patch) exampledocs/books.json should use name instead of title --- Key: SOLR-2598 URL: https://issues.apache.org/jira/browse/SOLR-2598 Project: Solr Issue Type: Improvement Reporter: Jan Høydahl Assignee: Jan Høydahl Priority: Minor Fix For: 3.3 Attachments: SOLR-2598.patch The file exampledocs/books.json currently contains two books. But they do not show up in the default solr/browse interface because they use title instead of name, which the Velocity template does not show. Also we should include a few more books
[jira] [Updated] (SOLR-2598) exampledocs/books.json should use name instead of title
[ https://issues.apache.org/jira/browse/SOLR-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-2598: -- Attachment: SOLR-2589.patch Attaching patch which changes title to name and adds two more books to the json file exampledocs/books.json should use name instead of title --- Key: SOLR-2598 URL: https://issues.apache.org/jira/browse/SOLR-2598 Project: Solr Issue Type: Improvement Reporter: Jan Høydahl Assignee: Jan Høydahl Priority: Minor Fix For: 3.3 Attachments: SOLR-2598.patch The file exampledocs/books.json currently contains two books. But they do not show up in the default solr/browse interface because they use title instead of name, which the Velocity template does not show. Also we should include a few more books
[jira] [Updated] (SOLR-2598) exampledocs/books.json should use name instead of title
[ https://issues.apache.org/jira/browse/SOLR-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-2598: -- Attachment: SOLR-2598.patch Attaching patch which renames title to name and adds two more books exampledocs/books.json should use name instead of title --- Key: SOLR-2598 URL: https://issues.apache.org/jira/browse/SOLR-2598 Project: Solr Issue Type: Improvement Reporter: Jan Høydahl Assignee: Jan Høydahl Priority: Minor Fix For: 3.3 Attachments: SOLR-2598.patch The file exampledocs/books.json currently contains two books. But they do not show up in the default solr/browse interface because they use title instead of name, which the Velocity template does not show. Also we should include a few more books
[jira] [Resolved] (LUCENE-3190) TestStressIndexing2 testMultiConfig failure
[ https://issues.apache.org/jira/browse/LUCENE-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3190. - Resolution: Fixed Lucene Fields: [New, Patch Available] (was: [New]) fixed in rev 1136086 TestStressIndexing2 testMultiConfig failure --- Key: LUCENE-3190 URL: https://issues.apache.org/jira/browse/LUCENE-3190 Project: Lucene - Java Issue Type: Bug Reporter: selckin Assignee: Simon Willnauer Attachments: LUCENE-3190.patch trunk: r1134311 reproducible {code} [junit] Testsuite: org.apache.lucene.index.TestStressIndexing2 [junit] Tests run: 1, Failures: 2, Errors: 0, Time elapsed: 0.882 sec [junit] [junit] - Standard Error - [junit] java.lang.AssertionError: ram was 460908 expected: 408216 flush mem: 395100 active: 65808 [junit] at org.apache.lucene.index.DocumentsWriterFlushControl.assertMemory(DocumentsWriterFlushControl.java:102) [junit] at org.apache.lucene.index.DocumentsWriterFlushControl.doAfterDocument(DocumentsWriterFlushControl.java:164) [junit] at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:380) [junit] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1473) [junit] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1445) [junit] at org.apache.lucene.index.TestStressIndexing2$IndexingThread.indexDoc(TestStressIndexing2.java:723) [junit] at org.apache.lucene.index.TestStressIndexing2$IndexingThread.run(TestStressIndexing2.java:757) [junit] NOTE: reproduce with: ant test -Dtestcase=TestStressIndexing2 -Dtestmethod=testMultiConfig -Dtests.seed=2571834029692482827:-8116419692655152763 [junit] NOTE: reproduce with: ant test -Dtestcase=TestStressIndexing2 -Dtestmethod=testMultiConfig -Dtests.seed=2571834029692482827:-8116419692655152763 [junit] The following exceptions were thrown by threads: [junit] *** Thread: Thread-0 *** [junit] junit.framework.AssertionFailedError: java.lang.AssertionError: ram was 460908 expected: 408216 flush 
mem: 395100 active: 65808 [junit] at junit.framework.Assert.fail(Assert.java:47) [junit] at org.apache.lucene.index.TestStressIndexing2$IndexingThread.run(TestStressIndexing2.java:762) [junit] NOTE: test params are: codec=RandomCodecProvider: {f33=Standard, f57=MockFixedIntBlock(blockSize=649), f11=Standard, f41=MockRandom, f40=Standard, f62=MockRandom, f75=Standard, f73=MockSep, f29=MockFixedIntBlock(blockSize=649), f83=MockRandom, f66=MockSep, f49=MockVariableIntBlock(baseBlockSize=9), f72=Pulsing(freqCutoff=7), f54=Standard, id=MockFixedIntBlock(blockSize=649), f80=MockRandom, f94=MockSep, f93=Pulsing(freqCutoff=7), f95=Standard}, locale=en_SG, timezone=Pacific/Palau [junit] NOTE: all tests run in this JVM: [junit] [TestStressIndexing2] [junit] NOTE: Linux 2.6.39-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 (64-bit)/cpus=8,threads=1,free=133324528,total=158400512 [junit] - --- [junit] Testcase: testMultiConfig(org.apache.lucene.index.TestStressIndexing2): FAILED [junit] r1.numDocs()=17 vs r2.numDocs()=16 [junit] junit.framework.AssertionFailedError: r1.numDocs()=17 vs r2.numDocs()=16 [junit] at org.apache.lucene.index.TestStressIndexing2.verifyEquals(TestStressIndexing2.java:308) [junit] at org.apache.lucene.index.TestStressIndexing2.verifyEquals(TestStressIndexing2.java:278) [junit] at org.apache.lucene.index.TestStressIndexing2.testMultiConfig(TestStressIndexing2.java:124) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1403) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1321) [junit] [junit] [junit] Testcase: testMultiConfig(org.apache.lucene.index.TestStressIndexing2): FAILED [junit] Some threads threw uncaught exceptions! [junit] junit.framework.AssertionFailedError: Some threads threw uncaught exceptions! 
[junit] at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:603) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1403) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1321) [junit] [junit] [junit] Test org.apache.lucene.index.TestStressIndexing2 FAILED {code}
[jira] [Commented] (LUCENE-3191) Add TopDocs.merge to merge multiple TopDocs
[ https://issues.apache.org/jira/browse/LUCENE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049908#comment-13049908 ] Michael McCandless commented on LUCENE-3191: For 3.x, I think we should make an exception to back-compat and break the API (changing FieldComp.value(..) to return T not Comparable; changing FieldDoc.fields from Comparable[] to Object[]). I'll advertise the break in CHANGES. Add TopDocs.merge to merge multiple TopDocs --- Key: LUCENE-3191 URL: https://issues.apache.org/jira/browse/LUCENE-3191 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.3, 4.0 Attachments: LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch It's not easy today to merge TopDocs, eg produced by multiple shards, supporting arbitrary Sort.
[jira] [Commented] (LUCENE-3191) Add TopDocs.merge to merge multiple TopDocs
[ https://issues.apache.org/jira/browse/LUCENE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049928#comment-13049928 ] Uwe Schindler commented on LUCENE-3191: --- I think this has less impact on users. Two user types: - People using FieldDoc.fields[] would always cast the return type, so a simple recompile should be fine - People writing their own FieldComparators must change the return value of getValue() and maybe add generics (not required) - People that don't implement compareValue() will also be fine, as the default impl casts to Comparable and that will have the same behaviour The 3.x impl just has to fix FieldDocSortedHitQueue to use compareValue() and remove the negation for scores. Add TopDocs.merge to merge multiple TopDocs --- Key: LUCENE-3191 URL: https://issues.apache.org/jira/browse/LUCENE-3191 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.3, 4.0 Attachments: LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch It's not easy today to merge TopDocs, eg produced by multiple shards, supporting arbitrary Sort.
[jira] [Updated] (LUCENE-3191) Add TopDocs.merge to merge multiple TopDocs
[ https://issues.apache.org/jira/browse/LUCENE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3191: --- Attachment: LUCENE-3191-3x.patch Patch for merging back to 3.x. Add TopDocs.merge to merge multiple TopDocs --- Key: LUCENE-3191 URL: https://issues.apache.org/jira/browse/LUCENE-3191 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.3, 4.0 Attachments: LUCENE-3191-3x.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch It's not easy today to merge TopDocs, eg produced by multiple shards, supporting arbitrary Sort.
[jira] [Commented] (LUCENE-3191) Add TopDocs.merge to merge multiple TopDocs
[ https://issues.apache.org/jira/browse/LUCENE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049932#comment-13049932 ] Uwe Schindler commented on LUCENE-3191: --- Patch looks good, let the BackwardsPoliceman think about some possibilities to lower the risk of breaking code. Of course nothing sophisticated... Add TopDocs.merge to merge multiple TopDocs --- Key: LUCENE-3191 URL: https://issues.apache.org/jira/browse/LUCENE-3191 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.3, 4.0 Attachments: LUCENE-3191-3x.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch It's not easy today to merge TopDocs, eg produced by multiple shards, supporting arbitrary Sort.
[Lucene.Net] [jira] [Commented] (LUCENENET-417) implement streams as field values
[ https://issues.apache.org/jira/browse/LUCENENET-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049937#comment-13049937 ] Digy commented on LUCENENET-417: Maybe something like this: doc.Add(new Field(name, ...)) doc.Add(new Field(metadata, ...)) doc.Add(new Field(content, part1 ...)) doc.Add(new Field(content, part2 ...)) doc.Add(new Field(content, partN ...)) DIGY implement streams as field values - Key: LUCENENET-417 URL: https://issues.apache.org/jira/browse/LUCENENET-417 Project: Lucene.Net Issue Type: New Feature Components: Lucene.Net Core Reporter: Christopher Currens Attachments: StreamValues.patch Adding binary values to a field is an expensive operation, as the whole binary data must be loaded into memory and then written to the index. Adding the ability to use a stream instead of a byte array could not only speed up the indexing process, but reduce the memory footprint as well. -Java lucene has the ability to use a TextReader to both analyze and store text in the index.- Lucene.NET lacks the ability to store string data in the index via streams. This should be a feature added into Lucene.NET as well. My thoughts are to add another Field constructor, that is Field(string name, System.IO.Stream stream, System.Text.Encoding encoding), that will allow the text to be analyzed and stored into the index. Comments about this approach are greatly appreciated.
[jira] [Updated] (LUCENE-3191) Add TopDocs.merge to merge multiple TopDocs
[ https://issues.apache.org/jira/browse/LUCENE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3191: --- Attachment: LUCENE-3191.patch Small further patch for trunk: * Simplifies the API by moving shardIndex onto ScoreDoc * Fixes TopDocs.merge to return TopFieldDocs if the Sort != null * A couple FieldComparators must override compareValue because the values may be null. Add TopDocs.merge to merge multiple TopDocs --- Key: LUCENE-3191 URL: https://issues.apache.org/jira/browse/LUCENE-3191 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.3, 4.0 Attachments: LUCENE-3191-3x.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch, LUCENE-3191.patch It's not easy today to merge TopDocs, eg produced by multiple shards, supporting arbitrary Sort.
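The merge itself is essentially a k-way merge of per-shard sorted hit lists, with the shard index recorded on each hit and usable as a tie-break. A self-contained sketch under those assumptions (illustrative names only, not the Lucene implementation):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Toy k-way merge of score-sorted shard results, in the spirit of
// TopDocs.merge. Hit, merge(), and the tie-break rule are illustrative.
class ShardMergeSketch {

    static final class Hit {
        final int shard, doc;
        final float score;
        Hit(int shard, int doc, float score) {
            this.shard = shard; this.doc = doc; this.score = score;
        }
    }

    // hitsPerShard[s] must already be sorted by descending score.
    static List<Hit> merge(int topN, float[][] hitsPerShard) {
        // Queue entries are {shard, position}; highest score first,
        // ties broken by lower shard index.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> {
            float sa = hitsPerShard[a[0]][a[1]];
            float sb = hitsPerShard[b[0]][b[1]];
            if (sa != sb) return Float.compare(sb, sa);
            return Integer.compare(a[0], b[0]);
        });
        for (int s = 0; s < hitsPerShard.length; s++) {
            if (hitsPerShard[s].length > 0) pq.add(new int[]{s, 0});
        }
        List<Hit> out = new ArrayList<>();
        while (!pq.isEmpty() && out.size() < topN) {
            int[] top = pq.poll();
            out.add(new Hit(top[0], top[1], hitsPerShard[top[0]][top[1]]));
            if (top[1] + 1 < hitsPerShard[top[0]].length) {
                pq.add(new int[]{top[0], top[1] + 1});  // advance that shard
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] shards = { {0.9f, 0.4f}, {0.8f, 0.5f} };
        for (Hit h : merge(3, shards)) {
            System.out.println("shard=" + h.shard + " doc=" + h.doc + " score=" + h.score);
        }
    }
}
```

Recording the shard on each hit (as the patch's ScoreDoc.shardIndex change suggests) is what lets the coordinator later fetch stored fields from the right shard.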
[jira] [Updated] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time
[ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3197: --- Attachment: LUCENE-3197.patch Patch. Optimize runs forever if you keep deleting docs at the same time Key: LUCENE-3197 URL: https://issues.apache.org/jira/browse/LUCENE-3197 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3197.patch Because we cascade merges for an optimize... if you also delete documents while the merges are running, then the merge policy will see the resulting single segment as still not optimized (since it has pending deletes) and do a single-segment merge, and will repeat indefinitely (as long as your app keeps deleting docs).
[jira] [Created] (SOLR-2599) FieldCopy Update Processor
FieldCopy Update Processor -- Key: SOLR-2599 URL: https://issues.apache.org/jira/browse/SOLR-2599 Project: Solr Issue Type: New Feature Components: update Reporter: Jan Høydahl Need an UpdateProcessor which can copy and move fields
[jira] [Updated] (SOLR-2597) XmlCharFilter
[ https://issues.apache.org/jira/browse/SOLR-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated SOLR-2597: --- Attachment: SOLR-2597.patch I tried to include the upgraded Woodstox jars, but I don't think I figured out how to put binaries in the patch, actually. What's needed are: http://repository.codehaus.org/org/codehaus/woodstox/woodstox-core-asl/4.1.1/woodstox-core-asl-4.1.1.jar and http://repository.codehaus.org/org/codehaus/woodstox/stax2-api/3.1.1/stax2-api-3.1.1.jar which replace the existing wstx-asl-xxx.jar. XmlCharFilter - Key: SOLR-2597 URL: https://issues.apache.org/jira/browse/SOLR-2597 Project: Solr Issue Type: Improvement Components: Schema and Analysis Affects Versions: 4.0 Reporter: Mike Sokolov Attachments: SOLR-2597.patch This CharFilter processes incoming XML using the Woodstox parser, stripping all non-text content and remembering offsets, just like HTMLCharFilter, but respecting XML conventions like XML entities defined in a DTD. XmlCharFilter also provides the ability to exclude (and include) the content of certain named elements. In order to compute character offsets properly when mixed line termination styles are present (\r, \r\n), or when XML character entities (&lt;, &quot;, &amp;) are present, we require a newer version of Woodstox (4.1.1) than is currently in solr/lib. The earlier versions of the parser could not report these entity events, so we couldn't tell the difference between < and &lt;, and the offsets could be wrong. The upgraded version is in the patch.
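For context, a CharFilter like this would presumably be wired into an analyzer chain in schema.xml along these lines. The factory name XmlCharFilterFactory and its placement are assumptions on my part; the patch may register it differently:

```xml
<!-- Hypothetical wiring for the XmlCharFilter from SOLR-2597;
     the factory class name is assumed, not taken from the patch. -->
<fieldType name="text_xml" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- CharFilters run before the tokenizer, over raw characters -->
    <charFilter class="solr.XmlCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

This mirrors how HTMLStripCharFilterFactory is configured today: the CharFilter cleans the markup and the downstream tokenizer sees only text, with offsets mapped back to the original XML.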
[jira] [Commented] (SOLR-2593) A new core admin command 'split' for splitting index
[ https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050137#comment-13050137 ] Hoss Man commented on SOLR-2593: bq. If it's possible, it would be cool to have config parameters to: ...those seem like they should be discrete actions that can be taken after the split has happened. the simplest thing is to have a split action that _just_ creates a new core with the docs selected either using the fq (or random selection) and then use other CoreAdmin actions for the other stuff: rename, swap, swap+delete (the old one), merge ... merge is really the only one we don't have at a core level yet (i think) A new core admin command 'split' for splitting index Key: SOLR-2593 URL: https://issues.apache.org/jira/browse/SOLR-2593 Project: Solr Issue Type: New Feature Reporter: Noble Paul Fix For: 4.0 If an index is too large/hot it would be desirable to split it out to another core. This core may eventually be replicated out to another host. There can be multiple strategies * random split of x or x% * fq=user:johndoe example: command=split&split=20percent&newcore=my_new_index or command=split&fq=user:johndoe&newcore=john_doe_index
[jira] [Commented] (SOLR-2593) A new core admin command 'split' for splitting index
[ https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050139#comment-13050139 ] Hoss Man commented on SOLR-2593:

one thing to think about when talking about the API is how the implementation will actually work. the fq type option is basically going to require making a full copy of the index and then deleting by query. (unless i'm missing something) but for people who don't care how the index is partitioned a more efficient approach could probably happen by working at the segment level -- let the user say split off a hunk of at least 20% but no more than 50%, and then you can look at individual segments and doc counts and see if it's possible to just move segments around (and maybe only do the copy+deleteByQuery logic on a single segment).
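The segment-level idea above -- pick whole segments whose combined doc count lands between a lower and upper fraction of the index -- can be sketched as a simple greedy selection. This is a hypothetical illustration of the selection logic only (segments are reduced to their doc counts as plain ints); real code would read the index's segment metadata and move segment files:

```java
import java.util.ArrayList;
import java.util.List;

// Greedy sketch of segment-level index splitting: choose whole segments
// whose combined document count falls between minFrac and maxFrac of the
// total. Hypothetical illustration only -- a real implementation would
// inspect Lucene segment metadata, not an int[] of doc counts.
public class SegmentSplitSketch {
    static List<Integer> pickSegments(int[] docCounts, double minFrac, double maxFrac) {
        long total = 0;
        for (int dc : docCounts) total += dc;
        long lo = (long) Math.ceil(total * minFrac);
        long hi = (long) Math.floor(total * maxFrac);
        List<Integer> picked = new ArrayList<>();
        long sum = 0;
        for (int i = 0; i < docCounts.length; i++) {
            if (sum + docCounts[i] <= hi) {   // take this segment only if it keeps us under the cap
                picked.add(i);
                sum += docCounts[i];
                if (sum >= lo) return picked; // enough docs split off: done
            }
        }
        // No combination of whole segments reaches the lower bound; fall back
        // to the copy+deleteByQuery path for (at least) one segment.
        return sum >= lo ? picked : List.of();
    }

    public static void main(String[] args) {
        // five segments, 1000 docs total; ask for a 20%..50% split
        int[] segs = {100, 400, 200, 250, 50};
        System.out.println(SegmentSplitSketch.pickSegments(segs, 0.20, 0.50)); // segments summing into [200, 500]
    }
}
```

An empty result is the signal that no pure segment shuffle satisfies the bounds, which is exactly the case where the comment suggests restricting the expensive copy+deleteByQuery work to a single segment.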
[jira] [Resolved] (SOLR-2596) Enhance CoreAdmin mergeindexes to use a core's index as the source
[ https://issues.apache.org/jira/browse/SOLR-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2596. Resolution: Duplicate Fix Version/s: (was: 4.0) dup of SOLR-1331

Enhance CoreAdmin mergeindexes to use a core's index as the source -- Key: SOLR-2596 URL: https://issues.apache.org/jira/browse/SOLR-2596 Project: Solr Issue Type: Improvement Components: multicore, update Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar

Enhance CoreAdmin mergeindexes to use a core's index as the source. Right now the mergeindexes command accepts a list of index directories on the local disk which is not very convenient.
[jira] [Created] (SOLR-2600) ensure example schema.xml has some mention/explanation of per field similarity vs similarityprovider vs (global) similarity
ensure example schema.xml has some mention/explanation of per field similarity vs similarityprovider vs (global) similarity --- Key: SOLR-2600 URL: https://issues.apache.org/jira/browse/SOLR-2600 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Priority: Blocker Fix For: 4.0

when SOLR-2338 was committed, there wasn't yet a clear understanding of how much the new per-field similarity feature (vs custom similarity provider (vs global similarity factory)) should be advertised in the example configs, and what type of usage should be encouraged/promoted. it's likely that by the time 4.0 is released, new language-specific field types will already demonstrate these features, and no additional artificial usages of them will be needed, but one way or another we should ensure that they are either demoed or mentioned in comments
[JENKINS] Lucene-3.x - Build # 409 - Failure
Build: https://builds.apache.org/job/Lucene-3.x/409/ 1 tests failed. FAILED: org.apache.lucene.util.fst.TestFSTs.testBigSet Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.util.fst.NodeHash.rehash(NodeHash.java:156) at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:126) at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:118) at org.apache.lucene.util.fst.Builder.compilePrevTail(Builder.java:204) at org.apache.lucene.util.fst.Builder.add(Builder.java:321) at org.apache.lucene.util.fst.TestFSTs$FSTTester.doTest(TestFSTs.java:463) at org.apache.lucene.util.fst.TestFSTs$FSTTester.doTest(TestFSTs.java:359) at org.apache.lucene.util.fst.TestFSTs.doTest(TestFSTs.java:211) at org.apache.lucene.util.fst.TestFSTs.testRandomWords(TestFSTs.java:944) at org.apache.lucene.util.fst.TestFSTs.testBigSet(TestFSTs.java:964) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1268) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1186) Build Log (for compile errors): [...truncated 12477 lines...]
[jira] [Commented] (SOLR-2597) XmlCharFilter
[ https://issues.apache.org/jira/browse/SOLR-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050184#comment-13050184 ] Hoss Man commented on SOLR-2597:

Mike: thanks for the patch!

as Koji mentioned on the mailing list, might want to consider naming this XmlStripCharFilter ... that was my first opinion, but reading the docs the include and exclude options definitely make it a bit more generic, so i'm leaning towards the opinion that XmlCharFilter is better. (there's an argument to be made that we should have an XmlStripCharFilter that only removes pi/comments/whitespace and resolves entities, and then a distinct XmlTagCharFilter that does the include/exclude -- but i'm guessing that would be less efficient, since this makes it possible to do it all in one pass, and anyone who wants include/exclude at the tag level is almost certainly going to want the stripping/entities as well)

skimming the patch i'm +1 except for the new Random in the test case ... if you take a look at the existing test cases you'll see how you can hook into the solr test framework to get random values that are consistent with a global seed -- that way if a test fails, it will report which seed was used and people can reproduce it using system properties. would also be nice to have a test of the Factory (using a schema.xml declaration) but that's not nearly as important. and of course: would be great if the xml policeman uwe could review.

bq. I tried to include the upgraded Woodstox jars, but I don't think I figured how to put binaries in the patch actually.

it's not possible, so don't worry about it. the important thing is noting in a comment (like you did) exactly what new/upgraded jars are needed.
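The point about avoiding `new Random()` in tests can be illustrated outside the Solr test framework (whose actual hook differs; this sketch only shows the principle: derive all test randomness from a single seed that is reported on failure, so a failing run can be replayed exactly). The `tests.seed` property name and `randomWord` helper here are illustrative assumptions, not framework API:

```java
import java.util.Random;

// Sketch of seeded randomized testing (not the actual LuceneTestCase API):
// every random value flows from one seed, taken from a system property when
// present, so a reported seed makes a failure reproducible.
public class SeededRandomSketch {
    static final long SEED = Long.getLong("tests.seed", System.nanoTime());

    // Hypothetical helper: build a random lowercase ASCII word of the given length.
    static String randomWord(Random r, int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) sb.append((char) ('a' + r.nextInt(26)));
        return sb.toString();
    }

    public static void main(String[] args) {
        Random r = new Random(SEED);
        String w = randomWord(r, 8);
        // On failure, a framework would print the seed so the run can be
        // replayed, e.g. with -Dtests.seed=<seed>.
        System.out.println("seed=" + SEED + " word=" + w);
        // Same seed => same data, which is what makes failures replayable:
        if (!randomWord(new Random(SEED), 8).equals(w))
            throw new AssertionError("seed did not reproduce the data, seed=" + SEED);
    }
}
```

A bare `new Random()` hides the seed, so a failing run with random data cannot be reproduced; that is the objection raised in the review.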
[JENKINS] Lucene-trunk - Build # 1596 - Still Failing
Build: https://builds.apache.org/job/Lucene-trunk/1596/ No tests ran. Build Log (for compile errors): [...truncated 11265 lines...]
[jira] [Updated] (SOLR-2509) spellcheck: StringIndexOutOfBoundsException: String index out of range: -1
[ https://issues.apache.org/jira/browse/SOLR-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-2509: --- Description:

Hi, I'm a French user of Solr and I've encountered a problem since I installed Solr 3.1. I've got an error with this query: cle_frbr:LYSROUGE1149-73190 *SEE COMMENTS BELOW* I've tested escaping the minus char and the query worked: cle_frbr:LYSROUGE1149(BACKSLASH)-73190 But, strange fact, if I change one letter in my query it works: cle_frbr:LASROUGE1149-73190 I've tested the same query on Solr 1.4 and it works! Can someone test the query on the next line on a Solr 3.1 version and tell me if they have the same problem? yourfield:LYSROUGE1149-73190 Where does the problem come from? Thank you in advance for your help. Tom

was: Hi, I'm a french user of SOLR and i've encountered a problem since i've installed SOLR 3.1. I've got an error with this query : cle_frbr:LYSROUGE1149-73190 The error is : HTTP ERROR 500 Problem accessing /solr/select.
Reason: String index out of range: -1 java.lang.StringIndexOutOfBoundsException: String index out of range: -1 at java.lang.AbstractStringBuilder.replace(AbstractStringBuilder.java:797) at java.lang.StringBuilder.replace(StringBuilder.java:271) at org.apache.solr.spelling.SpellCheckCollator.getCollation(SpellCheckCollator.java:131) at org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:69) at org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:179) at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:157) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at 
org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) I've tested to escape the minus char and the query worked : cle_frbr:LYSROUGE1149(BACKSLASH)-73190 But, strange fact, if i change one letter in my query it works : cle_frbr:LASROUGE1149-73190 I've tested the same query on SOLR 1.4 and it works ! Can someone test the query on next line on a 3.1 SOLR version and tell me if he have the same problem ? yourfield:LYSROUGE1149-73190 Where do the problem come from ? Thank you by advance for your help. Tom Summary: spellcheck: StringIndexOutOfBoundsException: String index out of range: -1 (was: String index out of range: -1) Moved original stack trace out of description for brevity... {noformat} The error is : HTTP ERROR 500 Problem accessing /solr/select. Reason: String index out of range: -1 java.lang.StringIndexOutOfBoundsException: String index out of range: -1 at java.lang.AbstractStringBuilder.replace(AbstractStringBuilder.java:797) at java.lang.StringBuilder.replace(StringBuilder.java:271) at org.apache.solr.spelling.SpellCheckCollator.getCollation(SpellCheckCollator.java:131) at org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:69) at org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:179) at
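The exception in the trace above is what `StringBuilder.replace` throws whenever its start offset is negative, typically the result of an unchecked `indexOf` miss while rebuilding the collation string. A minimal reproduction of that failure mode (hypothetical code, not the actual SpellCheckCollator):

```java
// Hypothetical reproduction: StringBuilder.indexOf returns -1 when a token
// is absent, and feeding that -1 straight into StringBuilder.replace throws
// StringIndexOutOfBoundsException, as in the reported stack trace.
public class CollateOffsetSketch {

    // Guarded variant: skip the replace when the token was not found.
    static String safeReplace(StringBuilder sb, String token, String replacement) {
        int start = sb.indexOf(token);
        if (start < 0) return sb.toString(); // indexOf miss: leave input unchanged
        sb.replace(start, start + token.length(), replacement);
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder collation = new StringBuilder("cle_frbr:LYSROUGE1149-73190");
        int start = collation.indexOf("missing-token"); // -1: token absent
        try {
            collation.replace(start, start + 5, "fixed"); // unguarded: blows up
        } catch (StringIndexOutOfBoundsException e) {
            // Same exception class as the reported "String index out of
            // range: -1" (the exact message text varies by JDK version).
            System.out.println("caught: " + e);
        }
        System.out.println(safeReplace(collation, "LYSROUGE", "LASROUGE"));
    }
}
```

This also explains why escaping the `-` made the query work for the reporter: with the minus escaped, the collator's token offsets line up and no negative index reaches `replace`.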
[jira] [Commented] (SOLR-2509) spellcheck: StringIndexOutOfBoundsException: String index out of range: -1
[ https://issues.apache.org/jira/browse/SOLR-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050212#comment-13050212 ] Hoss Man commented on SOLR-2509:

the stack trace is different, but based on the fact that it has to do with using spellcheck.collate and - in the query, this smells like it might be related to SOLR-1630

spellcheck: StringIndexOutOfBoundsException: String index out of range: -1 -- Key: SOLR-2509 URL: https://issues.apache.org/jira/browse/SOLR-2509 Project: Solr Issue Type: Bug Affects Versions: 3.1 Environment: Debian Lenny JAVA Version 1.6.0_20 Reporter: Thomas Gambier Priority: Blocker
[jira] [Assigned] (SOLR-1331) Support merging multiple cores
[ https://issues.apache.org/jira/browse/SOLR-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-1331: --- Assignee: Shalin Shekhar Mangar

Support merging multiple cores -- Key: SOLR-1331 URL: https://issues.apache.org/jira/browse/SOLR-1331 Project: Solr Issue Type: New Feature Components: multicore Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 3.3

There should be a provision to merge one core with another. It should be possible to create a core, add documents to it, and then just merge it into the main core which is serving requests. This way, the user will not need to know the filesystem layout, as is needed for SOLR-1051.