Re: Errors using the Embedded Solr Server
OK - I figured out the logging. Here is the logging output plus the console output and the stack trace:

[main] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 2050551931
db
/Users/carlroberts/dev/solr-4.10.3/
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating UpdateShardHandler HTTP client with params: socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is org.slf4j.impl.SimpleLoggerFactory
[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] WARN org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory to add to classloader: ../../../contrib/extraction/lib (resolved as: /Users/carlroberts/dev/solr-4.10.3/db/../../../contrib/extraction/lib).
[coreLoadExecutor-5-thread-1] WARN org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory to add to classloader: ../../../dist/ (resolved as: /Users/carlroberts/dev/solr-4.10.3/db/../../../dist).
[coreLoadExecutor-5-thread-1] WARN org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory to add to classloader: ../../../contrib/clustering/lib/ (resolved as: /Users/carlroberts/dev/solr-4.10.3/db/../../../contrib/clustering/lib).
[coreLoadExecutor-5-thread-1] WARN org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory to add to classloader: ../../../dist/ (resolved as: /Users/carlroberts/dev/solr-4.10.3/db/../../../dist).
[coreLoadExecutor-5-thread-1] WARN org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory to add to classloader: ../../../contrib/langid/lib/ (resolved as: /Users/carlroberts/dev/solr-4.10.3/db/../../../contrib/langid/lib).
[coreLoadExecutor-5-thread-1] WARN org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory to add to classloader: ../../../dist/ (resolved as: /Users/carlroberts/dev/solr-4.10.3/db/../../../dist).
[coreLoadExecutor-5-thread-1] WARN org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory to add to classloader: ../../../contrib/velocity/lib (resolved as: /Users/carlroberts/dev/solr-4.10.3/db/../../../contrib/velocity/lib).
[coreLoadExecutor-5-thread-1] WARN org.apache.solr.core.SolrResourceLoader - Can't find (or read) directory to add to classloader: ../../../dist/ (resolved as: /Users/carlroberts/dev/solr-4.10.3/db/../../../dist).
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.update.SolrIndexConfig - IndexWriter infoStream solr logging is enabled
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Using Lucene MatchVersion: 4.10.3
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.Config - Loaded SolrConfig: solrconfig.xml
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.schema.IndexSchema - Reading Solr Schema from /Users/carlroberts/dev/solr-4.10.3/db/conf/schema.xml
Re: Errors using the Embedded Solr Server
On 1/21/2015 9:56 AM, Carl Roberts wrote: BTW - I don't know if this will help also, but here is a screen shot of my classpath in eclipse. The URL in the slf4j error message does describe the problem with logging, but if you know nothing about slf4j, it probably won't help you much. Make sure you're including all the jars from example/lib/ext in the download in your project's lib directory, as well as the log4j.properties file from the resources directory. You will probably need to edit the log4j.properties file to change the location of the logfile. Thanks, Shawn
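As an illustration of Shawn's advice, a minimal log4j.properties along these lines sends Solr's logging to a file. This is only a sketch, not the exact file shipped in the Solr download (which has more settings); the logfile path is a placeholder you would edit, as Shawn says:

```properties
# Minimal log4j configuration - the File path below is a placeholder, edit for your machine
log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=/path/to/logs/solr.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p (%t) [%c] - %m%n
```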
Solr Recovery process
Hello Everyone, I am hitting a few issues with Solr replicas going into recovery and then doing a full index copy. I am trying to understand the Solr recovery process. I have read a few blogs on this and saw that when the leader notifies a replica to recover (in my case it is due to connection resets), it will try to do a peer sync first, and if the missed updates are more than 100 it will do a full index copy from the leader. I am trying to understand what peer sync is and where the tlog comes into the picture. Are tlogs replayed only during server restart? Can someone help me with this? Thanks, Nishanth
Re: AW: AW: AW: transactions@Solr(J)
On 1/21/2015 9:15 AM, Clemens Wyss DEV wrote: What I meant is: if I do SolrServer#rollback after 11 documents were added, will then only 1 or all 11 documents that have been added in the SolrServer transaction/context be rolled back? If autoCommit is set to 10 docs and openSearcher is true, it would roll back one document, assuming that the 11th document didn't make it into the index before the commit actually started. I'm not sure if the autoCommit settings are perfectly atomic, or if there would be enough of a time gap to allow a few more documents to make it in. If you added the documents one at a time, I could be sure that the rollback would be one document. If openSearcher is false, I'm not sure whether it would do one or 11. I just don't know enough about the underlying API. Thanks, Shawn
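For reference, the autoCommit scenario Clemens describes corresponds to a block in solrconfig.xml along these lines (the values are illustrative, matching the example in the discussion, not a recommendation):

```xml
<!-- commit automatically after 10 added documents; open a new searcher on commit -->
<autoCommit>
  <maxDocs>10</maxDocs>
  <openSearcher>true</openSearcher>
</autoCommit>
```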
Re: Errors using the Embedded Solr Server
I had to hardcode the path in solrconfig.xml from this: ${solr.install.dir:} to this: /Users/carlroberts/dev/solr-4.10.3/ to avoid the classloader warnings, but I still get the same error. I am not sure where the ${solr.install.dir:} value gets pulled from, but apparently that is not working. Here is the new output:

[main] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 1023143764
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
db
/Users/carlroberts/dev/solr-4.10.3/
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating UpdateShardHandler HTTP client with params: socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is org.slf4j.impl.SimpleLoggerFactory
[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/icu4j-53.1.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/isoparser-1.0-RC-1.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO
Re: Errors using the Embedded Solr Server
Ah, OK, you need to include a logging jar in your classpath - the log4j and slf4j-log4j jars in the solr distribution will help here. Once you've got some logging set up, then you should be able to work out what's going wrong!

Alan Woodward
www.flax.co.uk

On 21 Jan 2015, at 16:53, Carl Roberts wrote:

So far I have not been able to get the logging to work - here is what I get in the console prior to the exception:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
db
/Users/carlroberts/dev/solr-4.10.3/
false
{}
[]
/Users/carlroberts/dev/solr-4.10.3/

On 1/21/15, 11:50 AM, Alan Woodward wrote:

That certainly looks like it ought to work. Is there log output that you could show us as well?

Alan Woodward
www.flax.co.uk

On 21 Jan 2015, at 16:09, Carl Roberts wrote:

Hi, I have downloaded the code and documentation for Solr version 4.10.3. I am trying to follow the SolrJ Wiki guide and I am running into errors.
The latest error is this one:

Exception in thread "main" org.apache.solr.common.SolrException: No such core: db
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
    at solr.Test.main(Test.java:39)

My code is this:

package solr;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.SolrCore;

public class Test {
    public static void main(String[] args) {
        CoreContainer container = new CoreContainer("/Users/carlroberts/dev/solr-4.10.3");
        System.out.println(container.getDefaultCoreName());
        System.out.println(container.getSolrHome());
        container.load();
        System.out.println(container.isLoaded("db"));
        System.out.println(container.getCoreInitFailures());
        Collection<SolrCore> cores = container.getCores();
        System.out.println(cores);
        EmbeddedSolrServer server = new EmbeddedSolrServer(container, "db");
        SolrInputDocument doc1 = new SolrInputDocument();
        doc1.addField("id", "id1", 1.0f);
        doc1.addField("name", "doc1", 1.0f);
        doc1.addField("price", 10);
        SolrInputDocument doc2 = new SolrInputDocument();
        doc2.addField("id", "id2", 1.0f);
        doc2.addField("name", "doc2", 1.0f);
        doc2.addField("price", 20);
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add(doc1);
        docs.add(doc2);
        try {
            server.add(docs);
            server.commit();
            server.deleteByQuery("*:*");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SolrServerException e) {
            e.printStackTrace();
        }
    }
}

My solr.xml file is this:
<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->
<!--
 This is an example of a simple "solr.xml" file for configuring one or
 more Solr Cores, as well as allowing Cores to be added, removed, and
 reloaded via HTTP requests.

 More information about options available in this configuration file,
 and Solr Core administration can be found online:
 http://wiki.apache.org/solr/CoreAdmin
-->
<solr>
  <cores adminPath="/admin/cores" defaultCoreName="db">
    <core default="true" instanceDir="db/" name="db"/>
  </cores>
</solr>

And my db/conf directory was copied from the example/solr/collection/conf directory and it contains the solrconfig.xml file and schema.xml file. I have noticed that the documentation that shows how to use the EmbeddedSolrServer is outdated as it indicates I should use
Re: Errors using the Embedded Solr Server
Hi, Could there be a bug in the EmbeddedSolrServer that is causing this? Is it still supported in version 4.10.3? If it is, can someone please provide me assistance with this? Regards, Joe

On 1/21/15, 12:18 PM, Carl Roberts wrote:

I had to hardcode the path in solrconfig.xml from this: ${solr.install.dir:} to this: /Users/carlroberts/dev/solr-4.10.3/ to avoid the classloader warnings, but I still get the same error. I am not sure where the ${solr.install.dir:} value gets pulled from, but apparently that is not working. Here is the new output:

[main] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 1023143764
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
db
/Users/carlroberts/dev/solr-4.10.3/
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating UpdateShardHandler HTTP client with params: socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is org.slf4j.impl.SimpleLoggerFactory
[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/icu4j-53.1.jar' to
Solr 4.10.3 start up issue
Hi everyone - I posted a question on stackoverflow but in hindsight this would have been a better place to start. Below is the link. Basically I can't get the example working when using an external ZK cluster and auto-core discovery. Solr 4.10.1 works fine, but the newest release never gets new nodes into the active state. There are no errors or warnings, and compared to the log output of 4.10.1, the difference is that nodes never make it to leader election. Here is the stackoverflow question, along with the full log output: http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs Any help and guidance would be appreciated. Thanks! -- Darren
Is Solr a good candidate to index 100s of nodes in one XML file?
Hi, Is Solr a good candidate to index 100s of nodes in one XML file? I have an RSS feed XML file that has 100s of nodes with several elements in each node that I have to index, so I was planning to parse the XML with StAX and extract the data from each node and add it to Solr. There will always be only one file to start with, and then a second file as the RSS feed supplies updates. I want to return certain fields of each node when I search certain fields of the same node. Is Solr overkill in this case? Should I just use Lucene instead? Regards, Joe
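The StAX extraction step described above can be sketched as follows - a minimal, self-contained example that pulls the title of each item node out of an RSS string (the element names item/title are generic RSS; building SolrInputDocuments from the values is left out so the sketch stays runnable on its own):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class RssItemParser {

    // Walks the XML stream and collects the <title> text of every <item>,
    // ignoring titles that sit outside an <item> (e.g. the channel title).
    public static List<String> itemTitles(String xml) throws XMLStreamException {
        List<String> titles = new ArrayList<>();
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        boolean inItem = false;
        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                if ("item".equals(r.getLocalName())) {
                    inItem = true;
                } else if (inItem && "title".equals(r.getLocalName())) {
                    // getElementText() consumes through the matching end tag
                    titles.add(r.getElementText().trim());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && "item".equals(r.getLocalName())) {
                inItem = false;
            }
        }
        return titles;
    }

    public static void main(String[] args) throws XMLStreamException {
        String rss = "<rss><channel><title>feed</title>"
                + "<item><title>First</title><link>http://a</link></item>"
                + "<item><title>Second</title><link>http://b</link></item>"
                + "</channel></rss>";
        System.out.println(itemTitles(rss)); // prints [First, Second]
    }
}
```

The same loop would extract the other per-node fields and hand each completed node to the Solr client as one document.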
Ignore whitespace, underscore using KeywordTokenizer... EdgeNGramFilter
I am trying to implement a type-ahead suggestion for a single field which should ignore whitespace, underscores, or special characters in autosuggest. It works as suggested by Alex using KeywordTokenizerFactory, but how do I ignore whitespace, underscores, etc.? Example itemName data can be:

ABC E12 : if user types ABCE, the suggestion should be ABC E12
ABCE_12 : if user types ABCE1, the suggestion should be ABCE_12

schema.xml:

<field name="itemName" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="false"/>
<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>
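One possible direction (an untested sketch, not a confirmed fix): put a PatternReplaceCharFilterFactory in front of the KeywordTokenizer, on both the index and query analyzers, so whitespace and underscores are stripped before the edge n-grams are built - then typing ABCE can match both "ABC E12" and "ABCE_12":

```xml
<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- strip whitespace and underscores before tokenizing -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s_]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s_]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Since the stored value is untouched, the suggestion returned to the user keeps its original spaces and underscores.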
Re: Errors using the Embedded Solr Server
Already did. And the logging gets me no closer to fixing the issue. Here is the logging.

[main] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 1727098510
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating UpdateShardHandler HTTP client with params: socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is org.slf4j.impl.SimpleLoggerFactory
[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/icu4j-53.1.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/isoparser-1.0-RC-1.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/jdom-1.0.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding
Re: How to return custom collector info
I was confused because I couldn't believe my jars might be out of sync. But of course they were. I had to create a new eclipse project to sort it out, but that exception has disappeared. Sorry for the confusing post. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-return-custom-collector-info-tp4180502p4180877.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How much maximum data can we hard commit in Solr?
On 1/21/2015 9:13 AM, Nitin Solanki wrote: Thanks. Great Explanation.. One more thing I want to ask. Which is best doing only hard commit or both hard and soft commit? I want to index 21 GB of data. My recommendations for the autoCommit settings are on that URL that I linked - maxTime set to five minutes with openSearcher set to false. It also has a maxDocs setting ... you would need to come up with a reasonable setting for that, or just leave it out. That takes care of all hard commit requirements as they relate to the transaction log. Aside from that, I would recommend using soft commits (either explicit or autoSoftCommit) for document visibility. Hard commits with opensearcher=true work fine, but soft commits have a *little* bit less impact. It would be up to you to decide when and how often to do that, but I wouldn't do it more frequently than once a minute unless you can take steps to make those soft commits happen REALLY fast. Making commits happen faster is a separate discussion. Further reading about commits and the transaction log: http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ Thanks, Shawn
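Shawn's recommended settings would look something like the following in solrconfig.xml (maxTime is in milliseconds; maxDocs is left out, as suggested; the soft-commit interval is only an illustration of the once-a-minute guideline):

```xml
<!-- hard commit: flush the transaction log without opening a new searcher -->
<autoCommit>
  <maxTime>300000</maxTime>        <!-- five minutes -->
  <openSearcher>false</openSearcher>
</autoCommit>
<!-- soft commit: controls document visibility -->
<autoSoftCommit>
  <maxTime>60000</maxTime>         <!-- no more often than once a minute -->
</autoSoftCommit>
```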
Re: How to index data from multiple data source
Hi Yusniel, Solr manages documents as a whole. This means updating an existing document means replacing it. So you should/could index the metadata and the full text in one step, as one Solr document under one unique ID. That would be the simplest case. You could also use nested child documents with block joins (depending on which version of Solr you are using; more info here: http://blog.griddynamics.com/2013/09/solr-block-join-support.html), but in my opinion this would be overkill. We also manage a kind of semantic, linked-data mimic using additional fields (named by real ontology predicate/property names) to join documents that are related; see https://wiki.apache.org/solr/Join. So you could add the full text as an additional document with its own ID and fill a Solr document field with the ID of the parent metadata document. Then at query time you can join them. Joins in Solr always give as result the joined document (TO), not both (it's not like a SQL join, more like an inner query), so we experimented with self joins (the field holding the parent document's ID also holds its own ID), but as you can understand this is in no way optimal. Related: We are using a Digital Objects Repository (Fedora Commons + Islandora) to achieve exactly what you want to do. Our PDF files, and also many other types of data and metadata, are ingested as objects inside the repository, including technical metadata, MODS, DC, binary stream, and full text. Then this whole object (as FOXML) goes through an XSLT transformation and into Solr. If you are interested you can browse Islandora's Google group, https://groups.google.com/forum/#!forum/islandora, and visit Islandora's wiki, https://wiki.duraspace.org/display/ISLANDORA714/Islandora. There is much documentation under the fedoragsearch module that does the real indexing. You can see our schemas and Solr config there. Feel free to write me if you need/want more data.
Cheers Diego Pino Navarro Krayon Media Pedro de Valdivia 575 Pucón - Chile F:+56-45-2442469 On Jan 21, 2015, at 2:43 AM, Yusniel Hidalgo Delgado yhdelg...@uci.cu wrote: Dear Solr community, I am new to Solr and I need help with the following usage scenario. I am working on a project to extract and search bibliographic metadata from PDF files. First, my PDF files are processed to extract bibliographic metadata such as title, authors, affiliations, keywords and abstract. These metadata are stored in a relational database and then indexed in Solr via DIH; however, I also need to index the full text of each PDF and maintain the same ID between the indexed metadata and the indexed PDF full text. How can I do that? How do I configure solrconfig.xml and schema.xml to do it? Thanks in advance. Best regards Yusniel Hidalgo Delgado Semantic Web Research Group University of Informatics Sciences http://gws-uci.blogspot.com/ Havana, Cuba
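The join approach Diego describes can be sketched with Solr's {!join} query parser. The field names here (parent_id holding the metadata document's ID on each full-text document, fulltext holding the extracted PDF text) are hypothetical:

```text
# Hypothetical schema: each full-text doc stores parent_id = ID of its metadata doc.
# This query returns the metadata documents whose attached full text matches "ontology":
q={!join from=parent_id to=id}fulltext:ontology
```

As noted above, the join returns only the TO-side documents, so fields from the full-text document itself are not in the result.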
Errors using the Embedded Solr Server
Hi, I have downloaded the code and documentation for Solr version 4.10.3. I am trying to follow the SolrJ Wiki guide and I am running into errors. The latest error is this one:

Exception in thread "main" org.apache.solr.common.SolrException: No such core: db
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
    at solr.Test.main(Test.java:39)

My code is this:

package solr;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.SolrCore;

public class Test {
    public static void main(String[] args) {
        CoreContainer container = new CoreContainer("/Users/carlroberts/dev/solr-4.10.3");
        System.out.println(container.getDefaultCoreName());
        System.out.println(container.getSolrHome());
        container.load();
        System.out.println(container.isLoaded("db"));
        System.out.println(container.getCoreInitFailures());
        Collection<SolrCore> cores = container.getCores();
        System.out.println(cores);
        EmbeddedSolrServer server = new EmbeddedSolrServer(container, "db");
        SolrInputDocument doc1 = new SolrInputDocument();
        doc1.addField("id", "id1", 1.0f);
        doc1.addField("name", "doc1", 1.0f);
        doc1.addField("price", 10);
        SolrInputDocument doc2 = new SolrInputDocument();
        doc2.addField("id", "id2", 1.0f);
        doc2.addField("name", "doc2", 1.0f);
        doc2.addField("price", 20);
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add(doc1);
        docs.add(doc2);
        try {
            server.add(docs);
            server.commit();
            server.deleteByQuery("*:*");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SolrServerException e) {
            e.printStackTrace();
        }
    }
}

My solr.xml file is this:

<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with this
 work for additional information regarding copyright ownership. The ASF
 licenses this file to You under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance with the
 License. You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable
 law or agreed to in writing, software distributed under the License is
 distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied. See the License for the specific language
 governing permissions and limitations under the License.
-->
<!--
 This is an example of a simple "solr.xml" file for configuring one or more
 Solr Cores, as well as allowing Cores to be added, removed, and reloaded
 via HTTP requests. More information about options available in this
 configuration file, and Solr Core administration can be found online:
 http://wiki.apache.org/solr/CoreAdmin
-->
<solr>
  <cores adminPath="/admin/cores" defaultCoreName="db">
    <core default="true" instanceDir="db/" name="db"/>
  </cores>
</solr>

And my db/conf directory was copied from the example/solr/collection/conf directory, and it contains the solrconfig.xml file and schema.xml file. I have noticed that the documentation showing how to use the EmbeddedSolrServer is outdated, as it indicates I should use the CoreContainer.Initializer class, which doesn't exist, and container.load(path, file), which also doesn't exist. At this point I have no idea why I am getting the No such core error. I have googled it and there seem to be tons of threads showing this error but for different reasons, and I have tried all the suggested resolutions and get nowhere. Can you please help? Regards, Joe
Re: Errors using the Embedded Solr Server
So far I have not been able to get the logging to work - here is what I get in the console prior to the exception:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
db
/Users/carlroberts/dev/solr-4.10.3/
false
{}
[]
/Users/carlroberts/dev/solr-4.10.3/

On 1/21/15, 11:50 AM, Alan Woodward wrote: That certainly looks like it ought to work. Is there log output that you could show us as well? Alan Woodward www.flax.co.uk On 21 Jan 2015, at 16:09, Carl Roberts wrote: [...]
Re: How much maximum data can we hard commit in Solr?
Thanks. Great explanation. One more thing I want to ask: which is better, doing only hard commits, or both hard and soft commits? I want to index 21 GB of data. On Wed, Jan 21, 2015 at 7:48 PM, Shawn Heisey apa...@elyograg.org wrote: On 1/21/2015 6:01 AM, Nitin Solanki wrote: How much data can we commit to Solr using a hard commit, without using a soft commit? maxTime is 1000 in autoCommit. A detailed explanation is on Stack Overflow: http://stackoverflow.com/questions/28067853/how-much-maximum-data-can-we-hard-commit-in-solr The answer to the question you asked: All of it. I suspect you are actually trying to ask a different question. Some additional info; hopefully you can use it to answer what you'd really like to know: You could build your entire index with no commits and then issue a single hard commit and everything would work. The problem with that approach is that if you have the updateLog turned on, then every single one of those documents will be reindexed from the transaction log at Solr startup - it could take a REALLY long time. http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup Hard commits are the only way to close a transaction log and open a new one. Solr keeps enough transaction logs around so that it can re-index a minimum of 100 documents ... but it can't break the transaction logs into parts, so if everything is in one log, then that giant log will be replayed on startup. A maxTime of 1000 on autoCommit or autoSoftCommit is usually way too low. We find that this setting is normally driven by unrealistic requirements from sales or marketing, who say that data must be available within one second of indexing. It is extremely rare for this to be truly required. The autoCommit settings control automatic hard commits, and autoSoftCommit naturally controls automatic soft commits. With a maxTime of 1000, you will be issuing a commit every single second while you index.
Commits are very resource-intensive operations; doing them once a second will keep your hardware VERY busy. Normally a commit operation will take a lot longer than one second to complete, so if you are starting another one a second later, they will overlap, and that can cause a lot of problems. Thanks, Shawn
AW: AW: AW: transactions@Solr(J)
What I meant is: if I do SolrServer#rollback after 11 documents were added, will only 1, or all 11, of the documents added in the SolrServer transaction/context be rolled back? -----Original Message----- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Wednesday, January 21, 2015 15:24 To: solr-user@lucene.apache.org Subject: Re: AW: AW: transactions@Solr(J) On 1/20/2015 11:42 PM, Clemens Wyss DEV wrote: But then what happens if: autocommit is set to 10 docs and I add 11 docs and then decide (due to an exception?) to roll back. Will only one (i.e. the last added) document be rolled back? The way I understand the low-level architecture, yes -- assuming that all 11 documents actually got indexed. If the exception happened because document 5 was badly formed, only documents 1-4 will have been indexed, and in that case, all four of them would get rolled back. Thanks, Shawn
Re: MultiPhraseQuery:Rewrite to BooleanQuery
Tomoko Uchida wrote: Hi, Strictly speaking, a MultiPhraseQuery and a BooleanQuery wrapping PhraseQuerys are not equal. For each query, Query.rewrite() returns a different object (with Lucene 4.10.3). q1.rewrite(reader).toString() returns body:"blueberry chocolate (pie tart)", where q1 is your first multi phrase query. q2.rewrite(reader).toString() returns body:"blueberry chocolate pie" body:"blueberry chocolate tart", where q2 is your second boolean query. In practice... I *think* the two queries may return the same set of documents, but I'm not sure about scoring/ranking. I suggest you ask on the java-user@lucene mailing list about the Lucene API. Regards, Tomoko 2015-01-21 19:12 GMT+09:00 ku3ia <demesg@>: Any ideas? -- View this message in context: http://lucene.472066.n3.nabble.com/MultiPhraseQuery-Rewrite-to-BooleanQuery-tp4180638p4180820.html Sent from the Solr - User mailing list archive at Nabble.com. Thanks, I'll try it. -- View this message in context: http://lucene.472066.n3.nabble.com/MultiPhraseQuery-Rewrite-to-BooleanQuery-tp4180638p4180887.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Errors using the Embedded Solr Server
That certainly looks like it ought to work. Is there log output that you could show us as well? Alan Woodward www.flax.co.uk On 21 Jan 2015, at 16:09, Carl Roberts wrote: [...]
Re: Errors using the Embedded Solr Server
Aha, I think you're being stung by https://issues.apache.org/jira/browse/SOLR-6643. Which will be fixed in the upcoming 5.0 release, or you can patch your system with the patch attached to that issue. Alan Woodward www.flax.co.uk On 21 Jan 2015, at 19:44, Carl Roberts wrote: Already did. And the logging gets me no closer to fixing the issue. Here is the logging. [main] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/' [main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to classloader [main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader [main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to classloader [main] INFO org.apache.solr.core.ConfigSolr - Loading container configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml [main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 1727098510 [main] INFO org.apache.solr.core.CoreContainer - Loading cores into CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/] [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting socketTimeout to: 0 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting urlScheme to: null [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting connTimeout to: 0 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxConnectionsPerHost to: 20 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting corePoolSize to: 0 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maximumPoolSize to: 2147483647 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxThreadIdleTime to: 5 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory - Setting sizeOfQueue to: -1 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting fairnessPolicy to: false [main] INFO org.apache.solr.update.UpdateShardHandler - Creating UpdateShardHandler HTTP client with params: socketTimeout=0connTimeout=0retry=false [main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is org.slf4j.impl.SimpleLoggerFactory [main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured [main] INFO org.apache.solr.core.CoreContainer - Host Name: null [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/db/' [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Adding specified lib dirs to ClassLoader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' to classloader 
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/dom4j-1.6.1.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/fontbox-1.8.4.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/icu4j-53.1.jar' to classloader [coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding
permanently reducing logging levels for Solr
All, How can I reduce the logging level to SEVERE in a way that survives a Tomcat restart or a machine reboot? As you may know, I can change the logging levels from the logging page in the admin console, but those changes are not persistent across a Tomcat server restart or a machine reboot. Following is the information about the Solr version from the Info page in the admin console. Solr Specification Version: 3.2.0 Solr Implementation Version: 3.2.0 1129474 - rmuir - 2011-05-30 23:07:15 Lucene Specification Version: 3.2.0 Lucene Implementation Version: 3.2.0 1129474 - 2011-05-30 23:08:57 Please let me know if there is any other information that you may need. Thank you in advance for your help. Raj
Re: Is Solr a good candidate to index 100s of nodes in one XML file?
On 1/21/2015 12:53 PM, Carl Roberts wrote: Is Solr a good candidate to index 100s of nodes in one XML file? I have an RSS feed XML file that has 100s of nodes with several elements in each node that I have to index, so I was planning to parse the XML with Stax and extract the data from each node and add it to Solr. There will always be only one file to start with and then a second file as the RSS feed supplies updates. I want to return certain fields of each node when I search certain fields of the same node. Is Solr overkill in this case? Should I just use Lucene instead? Effectively, Solr *is* Lucene. You edit configuration files instead of writing Lucene code, because Solr is a fully customizable search server, not a programming API. That also means that it's not as flexible as Lucene ... but it's a lot easier. If you're capable of writing Lucene code, chances are that you'll be able to write an application that is highly tailored to your situation and will have better performance than Solr ... but you'll be writing the entire program yourself. Solr lets you install an existing program and just change the configuration. Thanks, Shawn
boosting by geodist - GC Overhead Limit exceeded
I am running Solr 4.10.2 with geofilt (~20% of docs have 30+ lat/lon points) and everything works hunky-dory. Then I added a bf with geodist along the lines of recip(geodist(),5,20,5). After a few hours of running I end up with an OOM "GC overhead limit exceeded". I've seen https://issues.apache.org/jira/browse/LUCENE-4698 and a few other relevant tickets. Wanted to check if anyone has any successful remedies. Many thanks, Mihran My GC params on an Amazon XL instance: -server -Xmx8g -Xms8g -XX:+HeapDumpOnOutOfMemoryError \ -XX:NewRatio=3 \ -XX:SurvivorRatio=4 \ -XX:TargetSurvivorRatio=90 \ -XX:MaxTenuringThreshold=8 \ -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \ -XX:+CMSScavengeBeforeRemark \ -XX:PretenureSizeThreshold=64m \ -XX:+UseCMSInitiatingOccupancyOnly \ -XX:CMSInitiatingOccupancyFraction=50 \ -XX:CMSMaxAbortablePrecleanTime=6000 \ -XX:+CMSParallelRemarkEnabled \ -XX:+ParallelRefProcEnabled Screenshot from Eclipse MAT
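For reference, Solr's recip function computes recip(x,m,a,b) = a/(m*x+b), so the boost above works out to 20/(5*geodist()+5): 4.0 at distance zero, decaying toward zero with distance. A small sketch of the arithmetic (the class name is made up; this is not Solr code):

```java
public class RecipBoost {
    // Solr's recip(x, m, a, b) function source: a / (m*x + b).
    static double recip(double x, double m, double a, double b) {
        return a / (m * x + b);
    }

    public static void main(String[] args) {
        // recip(geodist(),5,20,5): maximum boost 4.0 at distance 0...
        System.out.println(recip(0, 5, 20, 5)); // 4.0
        // ...falling to 1.0 at a distance of 3 (units are whatever geodist() returns).
        System.out.println(recip(3, 5, 20, 5)); // 1.0
    }
}
```

This shape (a steep drop close in, a long shallow tail further out) is why recip is the usual choice for distance boosts.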
Re: Ignore whitespace, underscore using KeywordTokenizer... EdgeNGramFilter
Hi, Not sure, but I think that the PatternReplaceFilterFactory or the PatternReplaceCharFilterFactory could help you delete those characters. Regards. On Jan 21, 2015 7:59 PM, Vishal Swaroop vishal@gmail.com wrote: I am trying to implement type-ahead suggestions for a single field which should ignore whitespace, underscores, or special characters in autosuggest. It works as suggested by Alex using KeywordTokenizerFactory, but how do I ignore whitespace, underscores...? Example itemName data can be: "ABC E12": if the user types "ABCE" the suggestion should be "ABC E12". "ABCE_12": if the user types "ABCE1" the suggestion should be "ABCE_12". Schema.xml:

<field name="itemName" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="false"/>
<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>
Re: Is Solr a good candidate to index 100s of nodes in one XML file?
Solr is just fine for this. It even ships with an example of how to read an RSS file under the DIH directory. DIH is also most likely what you will use for the first implementation. You don't need to worry about Stax or anything, unless your file format is very weird or has overlapping namespaces (the DIH XML parser does not care about namespaces). Regards, Alex. Sign up for my Solr resources newsletter at http://www.solr-start.com/ On 21 January 2015 at 14:53, Carl Roberts carl.roberts.zap...@gmail.com wrote: Hi, Is Solr a good candidate to index 100s of nodes in one XML file? I have an RSS feed XML file that has 100s of nodes with several elements in each node that I have to index, so I was planning to parse the XML with Stax and extract the data from each node and add it to Solr. There will always be only one file to start with and then a second file as the RSS feed supplies updates. I want to return certain fields of each node when I search certain fields of the same node. Is Solr overkill in this case? Should I just use Lucene instead? Regards, Joe
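If you do end up parsing the feed yourself rather than using DIH, the Stax approach the original post mentions is straightforward with the JDK's built-in javax.xml.stream. This sketch (the class name and the toy feed are made up; a real RSS feed has namespaces and more fields) pulls each item's title out of an RSS string:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class RssTitleExtractor {

    // Streams through the XML once, collecting the text of every <title>
    // element that appears inside an <item> node.
    public static List<String> extractItemTitles(String xml) throws XMLStreamException {
        List<String> titles = new ArrayList<>();
        XMLStreamReader r = XMLInputFactory.newFactory()
                .createXMLStreamReader(new StringReader(xml));
        boolean inItem = false;
        while (r.hasNext()) {
            int ev = r.next();
            if (ev == XMLStreamConstants.START_ELEMENT) {
                if ("item".equals(r.getLocalName())) {
                    inItem = true;
                } else if (inItem && "title".equals(r.getLocalName())) {
                    // getElementText() reads the element body and advances
                    // the cursor to the matching END_ELEMENT.
                    titles.add(r.getElementText());
                }
            } else if (ev == XMLStreamConstants.END_ELEMENT
                    && "item".equals(r.getLocalName())) {
                inItem = false;
            }
        }
        return titles;
    }

    public static void main(String[] args) throws XMLStreamException {
        String feed = "<rss><channel><item><title>CVE-1</title></item>"
                    + "<item><title>CVE-2</title></item></channel></rss>";
        System.out.println(extractItemTitles(feed)); // prints [CVE-1, CVE-2]
    }
}
```

Each extracted item would then become one SolrInputDocument, exactly as in the EmbeddedSolrServer thread above.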
RE: Ignore whitespace, underscore using KeywordTokenizer... EdgeNGramFilter
This is what we use for our autosuggest field in Solr 3.4. It works for us as you describe below.

<fieldType name="autocomplete_edge" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])" replacement=" " replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30" minGramSize="1"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d])" replacement="" replace="all"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])" replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d])" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
  </analyzer>
</fieldType>

-Original Message- From: Vishal Swaroop [mailto:vishal@gmail.com] Sent: Wednesday, January 21, 2015 4:40 PM To: solr-user@lucene.apache.org Subject: Re: Ignore whitespace, underscore using KeywordTokenizer...
EdgeNGramFilter I tried adding PatternReplaceFilterFactory in the index section but it is not working. Example itemName data can be: "ABC E12": if the user types "ABCE" the suggestion should be "ABC E12". "ABCE_12": if the user types "ABCE1" the suggestion should be "ABCE_12".

<field name="itemName" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="false"/>
<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)" replacement="" replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>

On Wed, Jan 21, 2015 at 3:31 PM, Alvaro Cabrerizo topor...@gmail.com wrote: Hi, Not sure, but I think that the PatternReplaceFilterFactory or the PatternReplaceCharFilterFactory could help you delete those characters. Regards. On Jan 21, 2015 7:59 PM, Vishal Swaroop vishal@gmail.com wrote: I am trying to implement type-ahead suggestions for a single field which should ignore whitespace, underscores, or special characters in autosuggest. It works as suggested by Alex using KeywordTokenizerFactory, but how do I ignore whitespace, underscores...
Example itemName data can be: "ABC E12": if the user types "ABCE" the suggestion should be "ABC E12". "ABCE_12": if the user types "ABCE1" the suggestion should be "ABCE_12". Schema.xml: [...]
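One way to get the behavior Vishal describes is to strip whitespace and underscores with a PatternReplaceCharFilterFactory *before* the KeywordTokenizer, on both the index and query analyzers, so that "ABC E12" and the typed query "abce" normalize the same way. A sketch, not tested against his data:

```xml
<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Runs on the raw character stream, before tokenizing:
         "ABC E12" -> "ABCE12", "ABCE_12" -> "ABCE12". -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s_]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <!-- Same normalization on the query side, minus the n-gramming. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s_]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the field is stored, the suggestion shown to the user is still the original "ABC E12" / "ABCE_12"; only the indexed terms are normalized.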
Re: permanently reducing logging levels for Solr
Hi, Just add log4j.logger.org.apache.solr=SEVERE to your log4j properties.

Thanks,
Rajesh
(mobile): 8328789519

On Wed, Jan 21, 2015 at 3:14 PM, Nemani, Raj raj.nem...@turner.com wrote:

All, How can I reduce the logging level to SEVERE in Solr so that it survives a Tomcat restart or a machine reboot? As you may know, I can change the logging levels from the logging page in the admin console, but those changes are not persistent across a Tomcat server restart or machine reboot. Following is the information about the Solr version from the Info page in the admin console.

Solr Specification Version: 3.2.0
Solr Implementation Version: 3.2.0 1129474 - rmuir - 2011-05-30 23:07:15
Lucene Specification Version: 3.2.0
Lucene Implementation Version: 3.2.0 1129474 - 2011-05-30 23:08:57

Please let me know if there is any other information that you may need. Thank you in advance for your help.

Raj
Re: boosting by geodist - GC Overhead Limit exceeded
On Wed, 21 Jan 2015, Mihran Shahinian wrote:

: Date: Wed, 21 Jan 2015 16:06:18 -0600
: From: Mihran Shahinian slowmih...@gmail.com
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: boosting by geodist - GC Overhead Limit exceeded
:
: I am running solr 4.10.2 with geofilt (~20% of docs have 30+ lat/lon
: points) and everything works hunky dory. Then I added a bf with geodist
: along the lines of: recip(geodist(),5,20,5). After a few hours of running
: I end up with OOM: GC overhead limit exceeded. I've seen
: https://issues.apache.org/jira/browse/LUCENE-4698 and a few other
: relevant tickets. Wanted to check if anyone has any successful remedies.
:
: Many thanks,
: Mihran
:
: My gc params on amazon xl instance:
: -server -Xmx8g -Xms8g
: -XX:+HeapDumpOnOutOfMemoryError \
: -XX:NewRatio=3 \
: -XX:SurvivorRatio=4 \
: -XX:TargetSurvivorRatio=90 \
: -XX:MaxTenuringThreshold=8 \
: -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
: -XX:+CMSScavengeBeforeRemark \
: -XX:PretenureSizeThreshold=64m \
: -XX:+UseCMSInitiatingOccupancyOnly \
: -XX:CMSInitiatingOccupancyFraction=50 \
: -XX:CMSMaxAbortablePrecleanTime=6000 \
: -XX:+CMSParallelRemarkEnabled \
: -XX:+ParallelRefProcEnabled
:
: Screenshot from Eclipse Mat
: [image: Inline image 1]

-Hoss
http://www.lucidworks.com/
Re: Solr 4.10.3 start up issue
Hi Darren, Can you please show the contents of the clusterstate.json from ZooKeeper? Please use a GitHub gist or a pastebin-like service. The Admin UI has a dump screen which shows the entire content of ZooKeeper as JSON.

On Wed, Jan 21, 2015 at 6:15 PM, Darren Spehr darre...@gmail.com wrote:

Hi everyone - I posted a question on stackoverflow but in hindsight this would have been a better place to start. Below is the link. Basically I can't get the example working when using an external ZK cluster and auto-core discovery. Solr 4.10.1 works fine, but the newest release never gets new nodes into the active state. There are no errors or warnings, and compared to the log output of 4.10.1, the difference is that nodes never make it to leader election. Here is the stackoverflow question, along with the full log output: http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs Any help and guidance would be appreciated. Thanks!

-- Darren

-- Regards, Shalin Shekhar Mangar.
Re: Errors using the Embedded Solar Server
On 1/21/2015 5:16 PM, Carl Roberts wrote: BTW - it seems that it is very hard to get started with the Embedded server. The doc is out of date. The code seems to be untested and buggy. On 1/21/15, 7:15 PM, Carl Roberts wrote: Hmmm... It looks like FutureTask is calling setException(Throwable t) with this exception which is not making it to the console. What I don't understand is why it is throwing that exception. I made sure that I added the lucene-queries-4.10.3.jar file to the classpath by adding it to the solr home directory. See the new tracing:

I'm pretty sure that all the lucene jars need to be available *before* Solr reaches the point in the log that you have quoted, where it adds jars from ${solr.solr.home}/lib. This would be the same location where the solrj and solr-core jars live. The only kind of jars that should be in the solr home lib directory are extra jars for extra features that you might specify in schema.xml (or some places in solrconfig.xml), like the ICU analysis jars, tika, mysql, etc.

Thanks,
Shawn
Re: Issue with Solr multiple sort
: I'm facing a problem with multiple field sort in Solr. I'm using the
: following fields in sort :
:
: PublishDate asc,DocumentType asc

correction: you are using: PublishDate desc,DocumentType desc

: The sort is only happening on PublishDate, DocumentType seems to be
: completely ignored. Here's my field type definition.

the results you posted are perfectly sorted according to the criteria in
your URL... 2015-01-17, 2014-11-17, 2013-01-17, 2012-10-17, 2012-01-17,
2011-01-17, then 2 docs from 2006-01-17 correctly ordered by secondary
sort: O before H.

...did you not post the query/results you meant to post? what exactly is
it about the result ordering you are getting that you think is incorrect?

: <result name="response" numFound="8" start="0">
:   <doc>
:     <date name="PublishDate">2015-01-17T00:00:00Z</date>
:     <str name="DocumentType">Hotfixes</str>
:   </doc>
:   <doc>
:     <date name="PublishDate">2014-11-17T00:00:00Z</date>
:     <str name="DocumentType">Hotfixes</str>
:   </doc>
:   <doc>
:     <date name="PublishDate">2013-01-17T00:00:00Z</date>
:     <str name="DocumentType">Tutorials</str>
:   </doc>
:   <doc>
:     <date name="PublishDate">2012-10-17T00:00:00Z</date>
:     <str name="DocumentType">Service Packs</str>
:   </doc>
:   <doc>
:     <date name="PublishDate">2012-01-17T00:00:00Z</date>
:     <str name="DocumentType">Tutorials</str>
:   </doc>
:   <doc>
:     <date name="PublishDate">2011-01-17T00:00:00Z</date>
:     <str name="DocumentType">Tutorials </str>
:   </doc>
:   <doc>
:     <date name="PublishDate">2006-01-17T00:00:00Z</date>
:     <str name="DocumentType">Object Enablers</str>
:   </doc>
:   <doc>
:     <date name="PublishDate">2006-01-17T00:00:00Z</date>
:     <str name="DocumentType">Hotfixes</str>
:   </doc>
: </result>
:
: As you can see, the sorting happened only on PublishDate. I'm using Solr
: 4.7.
:
: Not sure what I'm missing here, any pointers will be appreciated.
:
: Thanks,
: Shamik

-Hoss
http://www.lucidworks.com/
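The primary/secondary behavior Hoss describes can be mimicked with an ordinary two-key sort. This is a toy illustration using three of the quoted documents, not Solr's actual comparator code:

```python
# Two-field sort like "PublishDate desc, DocumentType desc": the second
# key only matters for breaking ties on the first.
docs = [
    ("2006-01-17", "Hotfixes"),
    ("2015-01-17", "Hotfixes"),
    ("2006-01-17", "Object Enablers"),
]
ranked = sorted(docs, reverse=True)  # tuples compare field-by-field
# The two 2006 docs tie on date, so "DocumentType desc" orders
# "Object Enablers" before "Hotfixes" ("O" sorts after "H").
for date, doc_type in ranked:
    print(date, doc_type)
```

The 2015 doc leads regardless of its DocumentType, which is exactly why the posted results look "unsorted" on DocumentType alone.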
Re: Solr Recovery process
Hi Nishanth, The recovery happens as follows:

1. PeerSync is attempted first. If the number of new updates on the leader is less than 100 then the missing documents are fetched directly and indexed locally. The tlog tells us the last 100 updates very quickly. Other uses of the tlog are for durability of updates and of course, startup recovery.

2. If the above step fails then replication recovery is attempted. A hard commit is called on the leader and then the leader is polled for the latest index version and generation. If the leader's version and generation are greater than the local index's version/generation then the difference of the index files between leader and replica are fetched and installed.

3. If the above fails (because leader's version/generation is somehow equal or more than local) then a full index recovery happens and the entire index from the leader is fetched and installed locally.

There are some other details involved in this process too but probably not worth going into here.

On Wed, Jan 21, 2015 at 5:13 PM, Nishanth S nishanth.2...@gmail.com wrote:

Hello Everyone, I am hitting a few issues with solr replicas going into recovery and then doing a full index copy. I am trying to understand the solr recovery process. I have read a few blogs on this and saw that when the leader notifies a replica to recover (in my case it is due to connection resets) it will try to do a peer sync first, and if the missed updates are more than 100 it will do a full index copy from the leader. I am trying to understand what peer sync is and where the tlog comes into the picture. Are tlogs replayed only during server restart? Can someone help me with this?

Thanks,
Nishanth

-- Regards, Shalin Shekhar Mangar.
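The three-step fallback above can be sketched as a small decision function. This is a schematic illustration of the order described in this thread, not Solr's actual recovery code; the names are made up, and only the 100-update PeerSync threshold comes from the discussion.

```python
# Illustrative sketch of the recovery order described above (not Solr code).
PEER_SYNC_LIMIT = 100  # PeerSync handles at most ~100 missed updates

def recovery_path(missed_updates, leader_generation, local_generation):
    """Return which recovery strategy a replica would fall back to."""
    # 1. PeerSync: fetch the few missed updates from the leader's tlog.
    if missed_updates < PEER_SYNC_LIMIT:
        return "peersync"
    # 2. Replication recovery: copy only the changed index files, possible
    #    when the leader's index generation is ahead of the local one.
    if leader_generation > local_generation:
        return "replication"
    # 3. Otherwise, fall back to a full index copy from the leader.
    return "full-copy"

print(recovery_path(5, 10, 9))    # peersync
print(recovery_path(500, 10, 9))  # replication
```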
Re: Ignore whitesapce, underscore using KeywordTokenizer... EdgeNGramFilter
Hi Vishal, Maybe the following pattern can help you (the conf attached by David is really nice):

...pattern="(\s)+" replacement="" replace="all"/

Hope it helps.

On Wed, Jan 21, 2015 at 10:57 PM, David M Giannone david.giann...@gm.com wrote:

This is what we use for our autosuggest field in Solr 3.4. It works for us as you describe below.

<fieldType name="autocomplete_edge" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])" replacement="" replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30" minGramSize="1"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d])" replacement="" replace="all"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d])" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
  </analyzer>
</fieldType>

-----Original Message-----
From: Vishal Swaroop [mailto:vishal@gmail.com]
Sent: Wednesday, January 21, 2015 4:40 PM
To: solr-user@lucene.apache.org
Subject: Re: Ignore whitesapce, underscore using KeywordTokenizer...
EdgeNGramFilter

I tried adding *PatternReplaceFilterFactory* in the index section but it is not working. Example itemName data can be:
- ABC E12 : if user types ABCE suggestion should be ABC E12
- ABCE_12 : if user types ABCE1 suggestion should be ABCE_12

<field name="itemName" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="false" />

<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)" replacement="" replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>

On Wed, Jan 21, 2015 at 3:31 PM, Alvaro Cabrerizo topor...@gmail.com wrote:

Hi, Not sure, but I think that the PatternReplaceFilterFactory or the PatternReplaceCharFilterFactory could help you with deleting those characters. Regards.

On Jan 21, 2015 7:59 PM, Vishal Swaroop vishal@gmail.com wrote:

I am trying to implement a type-ahead suggestion for a single field which should ignore whitespace, underscore or special characters in autosuggest. It works as suggested by Alex using KeywordTokenizerFactory, but how to ignore whitespace, underscore...

Example itemName data can be:
ABC E12 : if user types ABCE suggestion should be ABC E12
ABCE_12 : if user types ABCE1 suggestion should be ABCE_12

Schema.xml

<field name="itemName" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="false" />

<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>

Nothing in this message is intended to constitute an electronic signature unless a specific statement to the contrary is included in this message. Confidentiality Note: This message is intended only for the person or entity to which it is addressed. It may contain confidential and/or privileged material. Any review, transmission, dissemination or other use, or taking of any action in reliance upon this message by persons or entities other than the intended recipient is prohibited and may be unlawful. If you received this message in error, please contact the sender and delete it from your computer.
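To see what an index-time chain like the one discussed above produces, here is a rough simulation in Python (this is not Solr code; it only mimics KeywordTokenizer + PatternReplaceFilter + LowerCaseFilter + front EdgeNGram). It also shows why a `(\s+)` pattern alone does not cover the ABCE_12 case: the regex strips whitespace but leaves the underscore, which is why David's config adds extra patterns for punctuation.

```python
import re

def index_analyze(value, min_gram=1, max_gram=15):
    """Mimic the index analyzer: whole value as one token (KeywordTokenizer),
    strip whitespace (PatternReplaceFilter), lowercase, then front edge n-grams."""
    token = re.sub(r"\s+", "", value)   # pattern="(\s+)" replacement=""
    token = token.lower()               # LowerCaseFilterFactory
    top = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, top + 1)]

print("abce" in index_analyze("ABC E12"))    # True  -> prefix "ABCE" matches
print("abce1" in index_analyze("ABCE_12"))   # False -> "_" survives (\s+)
```

A query-side prefix matches when (after lowercasing) it equals one of the indexed grams, so "ABC E12" is suggested for "ABCE", but "ABCE_12" is not suggested for "ABCE1" until the underscore is also stripped.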
Issue with Solr multiple sort
Hi, I'm facing a problem with multiple field sort in Solr. I'm using the following fields in sort:

PublishDate asc,DocumentType asc

The sort is only happening on PublishDate; DocumentType seems to be completely ignored. Here's my field type definition.

<field name="PublishDate" type="tdate" indexed="true" stored="true" default="NOW"/>
<field name="DocumentType" type="string" indexed="true" stored="true" multiValued="false" required="false" omitNorms="true"/>

Here's the sample query:

http://localhost:8983/solr/select?sort=PublishDate+desc%2CDocumentType+desc&q=cat:search&fl=PublishDate,DocumentType&debugQuery=true

Here's the output:

<result name="response" numFound="8" start="0">
  <doc>
    <date name="PublishDate">2015-01-17T00:00:00Z</date>
    <str name="DocumentType">Hotfixes</str>
  </doc>
  <doc>
    <date name="PublishDate">2014-11-17T00:00:00Z</date>
    <str name="DocumentType">Hotfixes</str>
  </doc>
  <doc>
    <date name="PublishDate">2013-01-17T00:00:00Z</date>
    <str name="DocumentType">Tutorials</str>
  </doc>
  <doc>
    <date name="PublishDate">2012-10-17T00:00:00Z</date>
    <str name="DocumentType">Service Packs</str>
  </doc>
  <doc>
    <date name="PublishDate">2012-01-17T00:00:00Z</date>
    <str name="DocumentType">Tutorials</str>
  </doc>
  <doc>
    <date name="PublishDate">2011-01-17T00:00:00Z</date>
    <str name="DocumentType">Tutorials </str>
  </doc>
  <doc>
    <date name="PublishDate">2006-01-17T00:00:00Z</date>
    <str name="DocumentType">Object Enablers</str>
  </doc>
  <doc>
    <date name="PublishDate">2006-01-17T00:00:00Z</date>
    <str name="DocumentType">Hotfixes</str>
  </doc>
</result>

As you can see, the sorting happened only on PublishDate. I'm using Solr 4.7. Not sure what I'm missing here, any pointers will be appreciated.

Thanks,
Shamik
Re: Solr 4.10.3 start up issue
: I posted a question on stackoverflow but in hindsight this would have been
: a better place to start. Below is the link.
:
: Basically I can't get the example working when using an external ZK cluster
: and auto-core discovery. Solr 4.10.1 works fine, but the newest release

your SO URL shows the output of using your custom configs, but not what you got with the example configs -- so it's not clear to me if there is really just one problem, or perhaps 2?

you also mentioned a lot of details about how you are using solr with zk, and what doesn't work, but it's not clear if you tried other simpler steps using your configs -- or the example configs -- and if those simpler steps *did* work (ie: single node solr startup?)

my best guess, based on the logs you did post and the mention of lib/mq/solr-search-ahead-2.0.0.jar in those logs, is that the entire question of zk and cluster state and leaders is a red herring, and what you are running into is: SOLR-6643...

https://issues.apache.org/jira/browse/SOLR-6643

...if i'm right, then simple core discovery with your configs on a single node solr instance w/o any knowledge of ZK will also fail to init the core -- and if you try to use the CoreAdmin API to CREATE a core, you'll get some kind of LinkageError.

: Here is the stackoverflow question, along with the full log output:
: http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs

-Hoss
http://www.lucidworks.com/
Re: permanently reducing logging levels for Solr
On 1/21/2015 1:14 PM, Nemani, Raj wrote: How can I reduce the logging levels to SEVERE that survives a Tomcat restart or a machine reboot in Solr. As you may know, I can change the logging levels from the logging page in admin console but those changes are not persistent across Tomcat server restart or machine reboot. Following is the information about the Solr version from Info page in admin console. Solr Specification Version: 3.2.0 Solr Implementation Version: 3.2.0 1129474 - rmuir - 2011-05-30 23:07:15 Lucene Specification Version: 3.2.0 Lucene Implementation Version: 3.2.0 1129474 - 2011-05-30 23:08:57 Please let me know if there is any other information that you may need. Thank you in advance for your help The Solr 3.x example uses java.util.logging, not the log4j that was introduced in the example for 4.3.0. Your other reply talks about log4j, which may not be the right framework for your install. I have no way to know what container or logging framework you're using. You will need to create a configuration file for whatever slf4j binding is in use on your install and most likely add a system property to your java commandline for startup so that your logging config gets used. If you're using java.util.logging, look for help with the java.util.logging.config.file system property. FYI -- if you reduce the logging level to WARN, a normally functioning Solr will log almost nothing, and you'll be able to see ERROR and WARN messages, which is extremely important for troubleshooting. Dropping the level to SEVERE is not necessary, and will make it impossible to tell what happened when something goes wrong. Thanks, Shawn
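If the install is indeed on java.util.logging, a minimal configuration file along the lines below would apply Shawn's suggestion. The file name, path, and the WARNING level shown here are examples to adapt, not taken from the thread:

```properties
# logging.properties -- point the JVM at it with, e.g.:
#   -Djava.util.logging.config.file=/path/to/logging.properties
handlers = java.util.logging.ConsoleHandler
.level = WARNING
java.util.logging.ConsoleHandler.level = WARNING
# Quiet Solr specifically (WARNING rather than SEVERE, so ERROR and
# WARN messages remain visible for troubleshooting):
org.apache.solr.level = WARNING
```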
Re: permanently reducing logging levels for Solr
On 1/21/2015 7:24 PM, Shawn Heisey wrote: I have no way to know what container or logging framework you're using. Followup on this: Unless you have modified the solr war for version 3.2.0 to change the logging jars, you will definitely be using java.util.logging. Here's some URLs that may offer insight on the config file you'll need: http://www.javapractices.com/topic/TopicAction.do?Id=143 http://tutorials.jenkov.com/java-logging/configuration.html http://www.java2s.com/Code/Java/Language-Basics/ConfiguringLoggerDefaultValueswithaPropertiesFile.htm Thanks, Shawn
Re: Solr 4.10.3 start up issue
Thanks Hoss, this is exactly what I needed. I had previously run the example using nothing more than an external ZK hosting my own configuration. This of course means one of two things - my conf was bad, or Solr was at fault. The conf has been working for ages so I didn't test a replacement (it's amazing how a little frustration can fuel such hubris). I had thought to do this before - and should have; I uploaded the full example collection configuration to ZK just now and tried again. Magic, it worked, which left me feeling a bit glum. Well, happy that it wasn't Solr. Now if you'll excuse me, I have a conf review to perform.

Darren

On Wed, Jan 21, 2015 at 6:48 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

: I posted a question on stackoverflow but in hindsight this would have been
: a better place to start. Below is the link.
:
: Basically I can't get the example working when using an external ZK cluster
: and auto-core discovery. Solr 4.10.1 works fine, but the newest release

your SO URL shows the output of using your custom configs, but not what you got with the example configs -- so it's not clear to me if there is really just one problem, or perhaps 2?

you also mentioned a lot of details about how you are using solr with zk, and what doesn't work, but it's not clear if you tried other simpler steps using your configs -- or the example configs -- and if those simpler steps *did* work (ie: single node solr startup?)

my best guess, based on the logs you did post and the mention of lib/mq/solr-search-ahead-2.0.0.jar in those logs, is that the entire question of zk and cluster state and leaders is a red herring, and what you are running into is: SOLR-6643...

https://issues.apache.org/jira/browse/SOLR-6643

...if i'm right, then simple core discovery with your configs on a single node solr instance w/o any knowledge of ZK will also fail to init the core -- and if you try to use the CoreAdmin API to CREATE a core, you'll get some kind of LinkageError.
: Here is the stackoverflow question, along with the full log output: : http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs -Hoss http://www.lucidworks.com/ -- Darren
If I change schema.xml then reIndex is necessary in Solr or not?
I indexed 2GB of data. Now I want to change the type of a field from textSpell to string in schema.xml. A detailed explanation is on Stack Overflow; below is the link: http://stackoverflow.com/questions/28072109/if-i-change-schema-xml-then-reindex-is-neccessary-in-solr-or-not/28073815#28073815
Re: If I change schema.xml then reIndex is necessary in Solr or not?
On 22 January 2015 at 11:23, Nitin Solanki nitinml...@gmail.com wrote: I indexed 2GB of data. Now I want to change the type of a field from textSpell to string type into

Yes, one would need to reindex.

Regards,
Gora
Re: Errors using the Embedded Solar Server
Hi Shawn, Many thanks for all your help. Moving the lucene JARs from solr.solr.home/lib to the same classpath directory as the solr JARs, plus adding a bunch more dependency JAR files and most of the files from the collection1/conf directory - these ones to be exact - has me a lot closer to my goal:

-rw-r--r--   1 carlroberts  staff     38 Jan 21 20:41 _rest_managed.json
-rw-r--r--   1 carlroberts  staff     56 Jan 21 20:41 _schema_analysis_stopwords_english.json
-rw-r--r--   1 carlroberts  staff   4041 Dec 10 00:37 currency.xml
-rw-r--r--   1 carlroberts  staff   1386 Dec 10 00:37 elevate.xml
drwxr-xr-x  41 carlroberts  staff   1394 Dec 10 00:37 lang
-rw-r--r--   1 carlroberts  staff    894 Dec 10 00:37 protwords.txt
-rw-r--r--@  1 carlroberts  staff  62063 Jan 21 13:02 schema.xml
-rw-r--r--@  1 carlroberts  staff  76821 Jan 21 13:03 solrconfig.xml
-rw-r--r--   1 carlroberts  staff     16 Dec 10 00:37 spellings.txt
-rw-r--r--   1 carlroberts  staff    795 Dec 10 00:37 stopwords.txt
-rw-r--r--   1 carlroberts  staff   1148 Dec 10 00:37 synonyms.txt

I am now getting this:

[main] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 139145087
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating UpdateShardHandler HTTP client with params: socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is org.slf4j.impl.SimpleLoggerFactory
[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/boilerpipe-1.1.0.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/commons-compress-1.7.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO
Re: Errors using the Embedded Solar Server
On 1/21/2015 7:02 PM, Carl Roberts wrote: Got it all working...:) I just replaced the solrconfig.xml and schema.xml files that I was using with the ones from collection1 in one of the examples. I had modified those files to remove certain sections which I thought were not needed and apparently I don't understand those files very well yet...:) Glad you got it working. Here's the problem. In that log you included, the error was: ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: undefined field text Your solrconfig.xml file referenced a field named text (probably in the df parameter of a request handler) ... but your schema.xml did not have that field defined. Thanks, Shawn
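The mismatch Shawn describes looks roughly like the pair of fragments below: a request handler whose df default names a field that schema.xml never defines. These are illustrative fragments, not the exact files from this thread:

```xml
<!-- solrconfig.xml: a handler defaulting searches to the "text" field -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">text</str>
  </lst>
</requestHandler>

<!-- schema.xml: the field named by df must actually be defined,
     otherwise core init fails with "undefined field text" -->
<field name="text" type="text_general" indexed="true" stored="false"
       multiValued="true"/>
```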
Re: Errors using the Embedded Solar Server
Ah - OK - let me try that. BTW - I applied the fix from the bug link you gave me to log the errors and I am now at least getting the actual errors:

default core name=db solr home=/Users/carlroberts/dev/solr-4.10.3/
db is loaded=false
core init failures={db=org.apache.solr.core.CoreContainer$CoreLoadFailure@4d351f9b}
cores=[]
Exception in thread "main" org.apache.solr.common.SolrException: SolrCore 'db' is not available due to init failure: JVM Error creating core [db]: org/apache/lucene/queries/function/ValueSource
	at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:749)
	at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:110)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
	at solr.Test.main(Test.java:38)
Caused by: org.apache.solr.common.SolrException: JVM Error creating core [db]: org/apache/lucene/queries/function/ValueSource
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:508)
	at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
	at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/lucene/queries/function/ValueSource
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:274)
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:484)
	at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:521)
	at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:517)
	at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:81)
	at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
	at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
	at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:486)
	at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:166)
	at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
	at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
	at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:90)
	at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:62)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:489)
	... 6 more
Caused by: java.lang.ClassNotFoundException: org.apache.lucene.queries.function.ValueSource
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 21 more

On 1/21/15, 7:32 PM, Shawn Heisey wrote: On 1/21/2015 5:16 PM, Carl Roberts wrote: BTW - it seems that it is very hard to get started with the Embedded server. The doc is out of date. The code seems to be untested and buggy. On 1/21/15, 7:15 PM, Carl Roberts wrote: Hmmm... It looks like FutureTask is calling setException(Throwable t) with this exception which is not making it to the console. What I don't understand is why it is throwing that exception. I made sure that I added the lucene-queries-4.10.3.jar file to the classpath by adding it to the solr home directory.
See the new tracing: I'm pretty sure that all the lucene jars need to be available *before* Solr reaches the point in the log that you have quoted, where it adds jars from ${solr.solr.home}/lib. This would be the same location where the solrj and solr-core jars live. The only kind of jars that should be in the solr home lib directory are extra jars for extra features that you might specify in schema.xml (or some places in solrconfig.xml), like the ICU analysis jars, tika, mysql, etc. Thanks, Shawn
Re: Issue with Solr multiple sort
Thanks Hoss for clearing up my doubt. I was confused with the ordering. So I guess the first field is always the primary sort field, followed by the secondary. Thanks again. -- View this message in context: http://lucene.472066.n3.nabble.com/Issue-with-Solr-multiple-sort-tp4181056p4181062.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Errors using the Embedded Solar Server
Got it all working...:) I just replaced the solrconfig.xml and schema.xml files that I was using with the ones from collection1 in one of the examples. I had modified those files to remove certain sections which I thought were not needed, and apparently I don't understand those files very well yet...:) Many thanks, Joe

On 1/21/15, 8:47 PM, Carl Roberts wrote: Hi Shawn, Many thanks for all your help. Moving the Lucene JARs from solr.solr.home/lib to the same classpath directory as the Solr JARs, plus adding a bunch more dependency JAR files and most of the files from the collection1/conf directory - these ones to be exact - has me a lot closer to my goal:

-rw-r--r--  1 carlroberts staff    38 Jan 21 20:41 _rest_managed.json
-rw-r--r--  1 carlroberts staff    56 Jan 21 20:41 _schema_analysis_stopwords_english.json
-rw-r--r--  1 carlroberts staff  4041 Dec 10 00:37 currency.xml
-rw-r--r--  1 carlroberts staff  1386 Dec 10 00:37 elevate.xml
drwxr-xr-x 41 carlroberts staff  1394 Dec 10 00:37 lang
-rw-r--r--  1 carlroberts staff   894 Dec 10 00:37 protwords.txt
-rw-r--r--@ 1 carlroberts staff 62063 Jan 21 13:02 schema.xml
-rw-r--r--@ 1 carlroberts staff 76821 Jan 21 13:03 solrconfig.xml
-rw-r--r--  1 carlroberts staff    16 Dec 10 00:37 spellings.txt
-rw-r--r--  1 carlroberts staff   795 Dec 10 00:37 stopwords.txt
-rw-r--r--  1 carlroberts staff  1148 Dec 10 00:37 synonyms.txt

I am now getting this:

[main] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/'
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/commons-logging-1.2.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/servlet-api.jar' to classloader
[main] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/lib/slf4j-simple-1.7.5.jar' to classloader
[main] INFO org.apache.solr.core.ConfigSolr - Loading container configuration from /Users/carlroberts/dev/solr-4.10.3/solr.xml
[main] INFO org.apache.solr.core.CoreContainer - New CoreContainer 139145087
[main] INFO org.apache.solr.core.CoreContainer - Loading cores into CoreContainer [instanceDir=/Users/carlroberts/dev/solr-4.10.3/]
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting socketTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting urlScheme to: null
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting connTimeout to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxConnectionsPerHost to: 20
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting corePoolSize to: 0
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maximumPoolSize to: 2147483647
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting maxThreadIdleTime to: 5
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting sizeOfQueue to: -1
[main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory - Setting fairnessPolicy to: false
[main] INFO org.apache.solr.update.UpdateShardHandler - Creating UpdateShardHandler HTTP client with params: socketTimeout=0&connTimeout=0&retry=false
[main] INFO org.apache.solr.logging.LogWatcher - SLF4J impl is org.slf4j.impl.SimpleLoggerFactory
[main] INFO org.apache.solr.logging.LogWatcher - No LogWatcher configured
[main] INFO org.apache.solr.core.CoreContainer - Host Name: null
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - new SolrResourceLoader for directory: '/Users/carlroberts/dev/solr-4.10.3/db/'
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrConfig - Adding specified lib dirs to ClassLoader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/aspectjrt-1.6.11.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcmail-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO org.apache.solr.core.SolrResourceLoader - Adding 'file:/Users/carlroberts/dev/solr-4.10.3/contrib/extraction/lib/bcprov-jdk15-1.45.jar' to classloader
[coreLoadExecutor-5-thread-1] INFO
Re: Solr Recovery process
Thank you Shalin. So in a system where the indexing rate is more than 5K TPS or so, the replica will never be able to recover through the peer sync process. In my case I have mostly seen step 3, where a full copy happens, and if the index size is huge it takes a very long time for replicas to recover. Is there a way we can configure the number of missed updates for peer sync? Thanks, Nishanth On Wed, Jan 21, 2015 at 4:47 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Hi Nishanth, The recovery happens as follows: 1. PeerSync is attempted first. If the number of new updates on the leader is less than 100, then the missing documents are fetched directly and indexed locally. The tlog tells us the last 100 updates very quickly. Other uses of the tlog are durability of updates and, of course, startup recovery. 2. If the above step fails, then replication recovery is attempted. A hard commit is called on the leader and then the leader is polled for the latest index version and generation. If the leader's version and generation are greater than the local index's version/generation, then the difference in index files between leader and replica is fetched and installed. 3. If the above fails (because the local index's version/generation is somehow equal to or greater than the leader's), then a full index recovery happens and the entire index is fetched from the leader and installed locally. There are some other details involved in this process too, but probably not worth going into here.
On Wed, Jan 21, 2015 at 5:13 PM, Nishanth S nishanth.2...@gmail.com wrote: Hello Everyone, I am hitting a few issues with Solr replicas going into recovery and then doing a full index copy. I am trying to understand the Solr recovery process. I have read a few blogs on this and saw that when the leader notifies a replica to recover (in my case it is due to connection resets) it will try to do a peer sync first, and if the missed updates are more than 100 it will do a full index copy from the leader. I am trying to understand what peer sync is and where the tlog comes into the picture. Are tlogs replayed only during server restart? Can someone help me with this? Thanks, Nishanth -- Regards, Shalin Shekhar Mangar.
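[The peer-sync window Nishanth asks about is tied to how many records the update log retains. On releases that support it (the numRecordsToKeep and maxNumLogsToKeep options landed around the 4.10 timeframe; verify against your release notes before relying on them), a hedged solrconfig.xml sketch would look like:

```xml
<!-- solrconfig.xml: enlarge the update log so PeerSync can cover more
     missed updates before falling back to replication/full recovery.
     The numbers below are illustrative, not recommendations. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
    <!-- keep up to 500 records across retained tlogs (default 100) -->
    <int name="numRecordsToKeep">500</int>
    <!-- retain up to 20 tlog files (default 10) -->
    <int name="maxNumLogsToKeep">20</int>
  </updateLog>
</updateHandler>
```

Note that a larger update log also means more records to replay at startup, so the window should be widened with some care.]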
RE: Field collapsing memory usage
Norgorn [lsunnyd...@mail.ru] wrote: So, as we see, memory, used by first shard to group, wasn't released. Caches are already nearly zero. It should be one or the other: either the memory is released or there is something in the caches. Anyway, DocValues is the way to go, so ensure that it is turned on for your group field. We do grouping on indexes with 250M documents (and 200M+ unique values in the group field) without any significant memory overhead, using DocValues. Caveat: If you ask for very large result sets, the memory usage will be high. But only temporarily. - Toke Eskildsen
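[As a concrete illustration of Toke's advice: enabling DocValues is a single attribute on the field definition in schema.xml. The field name below is invented for the example, and a full reindex is required after adding the attribute:

```xml
<!-- schema.xml: hypothetical group field with DocValues enabled.
     Grouping fields are typically single-valued, non-tokenized types
     such as string. Reindex after changing this attribute. -->
<field name="group_id" type="string" indexed="true" stored="false"
       docValues="true" multiValued="false"/>
```
]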
Field collapsing memory usage
We are trying to run Solr with a big index, using as little RAM as possible. Simple search works nicely for our cases, but field collapsing (group=true) queries fail with OOM. Our setup is several shards per Solr instance, each shard on its own HDD. We've tried the same queries against one specific shard, and those queries worked well (no OOMs). Then we changed the shard being queried and measured RAM usage. We saw that, while only one shard was being queried, used RAM increased significantly. So, as we see, the memory used by the first shard to group wasn't released. Caches are already nearly zero. By changing shards, we've managed to make Solr fall over. My question is: why is it so? What do we need to do to release the memory, so that, in the end, we are able to query shards alternately (because a parallel group query fails nearly always)? -- View this message in context: http://lucene.472066.n3.nabble.com/Field-collapsing-memory-usage-tp4181092.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: If I change schema.xml then reIndex is neccessary in Solr or not?
Ok. Thanx On Thu, Jan 22, 2015 at 11:38 AM, Gora Mohanty g...@mimirtech.com wrote: On 22 January 2015 at 11:23, Nitin Solanki nitinml...@gmail.com wrote: I *indexed* *2GB* of data. Now I want to *change* the *type* of *field* from *textSpell* to *string* type into Yes, one would need to reindex. Regards, Gora
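[For context, the change under discussion is a one-attribute edit in schema.xml (the field name below is illustrative); the point of Gora's answer is that already-indexed documents keep the old representation until they are reindexed:

```xml
<!-- schema.xml: changing a field's type, e.g. from a spellcheck text
     type to a plain string. Field name here is illustrative. -->
<!-- before -->
<field name="suggestion" type="textSpell" indexed="true" stored="true"/>
<!-- after: reindex all documents once this is deployed -->
<field name="suggestion" type="string" indexed="true" stored="true"/>
```
]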
Re: MultiPhraseQuery:Rewrite to BooleanQuery
Any ideas? -- View this message in context: http://lucene.472066.n3.nabble.com/MultiPhraseQuery-Rewrite-to-BooleanQuery-tp4180638p4180820.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Using SolrCloud to implement a kind of federated search
On Tue, 2015-01-20 at 15:41 +0100, Jürgen Wagner (DVT) wrote: [Snip: Valid concerns] 3. Cardinality: there may be rather large collections and some smaller collections in the federation. If you use SolrCloud to obtain results, the ones from smaller collections will get more significance in the result mixing than the ones from the larger collections, as relevance will be relative to each federated source. The math might be solvable or at least fuzzily solvable: SOLR-1632 takes care of unifying term stats, and site-specific boosts, defined in the merger, can compensate somewhat for overall score adjustments from the different sites. 4. Uniqueness: different systems may index the same documents. The idea of having a globally unique identifier should take this into account, i.e., it won't suffice to simply prefix each (locally unique) document id with a source identifier. The federated sources must be aware of being federated and possibly having overlaps. Otherwise, you will get multiple occurrences of very popular documents. Different sources might have different meta-data on the same entity. Some sort of nearly-duplicate-document merge might be preferable. 6. Orchestration: there will be some issues with the orchestration of these services. Zookeeper won't scale to the multiple-datacenter topology, effectively leaving node discovery to some other mechanism yet to be defined. If the nodes are locally run proxies exposed as a Solr shard, the connection details will be de-coupled from ZooKeeper. That would also allow for mapping of field names/values and similar site-specific adjustments of requests/queries. In my experience, there is a clear distinction between technical federated search (possibly something like the tribe nodes) and semantic federated search (requiring special processing of results obtained from different sources, ready to be consolidated).
We have spent a fair amount of time getting semantic federated search (we call it integrated search) to work across our sources. The raw requesting/merging is not too hard: most of the development time has been spent mapping values and adjusting how the merger should order the documents. - Toke Eskildsen, State and University Library, Denmark
Re: shards per disk
On Wed, 2015-01-21 at 09:46 +0100, Toke Eskildsen wrote: Anyway, RAID 0 does really help for random access, [...] Should have been ...does not really help - Toke Eskildsen
Re: shards per disk
On Wed, 2015-01-21 at 07:56 +0100, Nimrod Cohen wrote: In the RAID [0] configuration each shard has data on each one of the 8 disks in the RAID, so on each query for 1K docs, each shard requests data from the one RAID; we get 8 requests for data from all of the disks and we get a queue. Your RAID setup (whether it is hardware or software) should use a parallel queue, so that requests to different physical drives are issued in parallel under the hood. But RAID is not that well-defined, so maybe your controller or your software uses a single sequential queue. In that case, the pattern will be as you describe. Anyway, RAID 0 does really help for random access, when your access pattern is homogeneous across shards. Even if you fix the problem with your current RAID 0 setup, it is unlikely that you would get a noticeable performance advantage over separate drives. It would make it easier to add shards though, as you would not have to purchase a new drive or unbalance your setup by running multiple shards on some drives. Regarding the response time, 2-3 seconds is good for our usage, although faster is always better; if we get faster, we might run the analysis on more than 1K. Limit the number of fields you request and try experimenting with SolrJ and the binary protocol: I have found that the time for serializing the result to XML can be quite high for large responses. If the number of fields needed is very low and the content of those fields is not large, you could try using faceting with DocValues to get the content. - Toke Eskildsen, State and University Library, Denmark
Add user-defined field into suggestions block.
I am working on the Solr spell checker along with the suggester. I am saving documents like this: {ngram:the, count:10} {ngram:the age, count:5} {ngram:the age of, count:3} where *ngram* is the unique key, analyzed with *StandardTokenizer* and *ShingleFilterFactory* (sizes 1 to 5). So, when I search for the word *the* it returns results along with suggestions like: response:{numFound:63, start:0, maxScore:15.783233, docs:[{ count:10, gram:the, _version_:1489726792958738435}], suggestion:[{ word:that, freq:1169}, { word:they, freq:712}]} So, the suggestions provide *word* and *freq* fields. I want to *add one more field* - *count* - to the suggestion block, where *count* should be the same value that is stored in the documents. I don't want to use the *freq* field in the suggestion block; instead I want the *count* field. How can I do that?
How much maximum data can we hard commit in Solr?
How much data can we commit in Solr using hard commits, without using soft commits? maxTime is 1000 in autoCommit. A detailed explanation is on Stack Overflow: http://stackoverflow.com/questions/28067853/how-much-maximum-data-can-we-hard-commit-in-solr .
Re: How to make edge_ngram work with number, underscores, dashes and space
Thanks a lot Alex... It looks like it works as expected... I removed EdgeNGramFilterFactory from the query section and used KeywordTokenizerFactory in index... this is the final version:

<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>

So... when is it right to use <tokenizer class="solr.EdgeNGramTokenizerFactory"/>?...

On Tue, Jan 20, 2015 at 11:46 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: So, try the suggested tokenizers and dump the ngrams from query. See what happens. Ask a separate question with corrected config/output if you still have issues. Regards, Alex. Sign up for my Solr resources newsletter at http://www.solr-start.com/

On 20 January 2015 at 23:08, Vishal Swaroop vishal@gmail.com wrote: Thanks for the response.. a) I am trying to make it non-case-sensitive... itemName data is indexed in upper case b) I am looking to display the result as type-ahead suggestions, which might include spaces, underscores, numbers... - ABC12DE : It does not work as soon as I type 1, i.e. ABC1. Expected output: A, AB, ABC, ABC1... and so on. Data can also have underscores, dashes - ABC_12DE : Expected output: A, AB, ABC, ABC_, ABC_1... and so on. Field name type defined in schema:

<field name="itemName" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="false"/>

<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
</fieldType>

On Tue, Jan 20, 2015 at 9:53 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Were you actually trying to "...divides text at non-letters and converts them to lower case"? Or were you trying to make it non-case-sensitive, which would be KeywordTokenizer and LowerCaseFilter? Also, normally we do not use an NGram filter on both index and query. That just makes things match on common prefixes instead of matching what you are searching for against a prefix of the original word. Regards, Alex. Sign up for my Solr resources newsletter at http://www.solr-start.com/

On 20 January 2015 at 21:47, Vishal Swaroop vishal@gmail.com wrote: Hi, Maybe this is basic, but I am trying to understand which Tokenizer and Filter to use. I followed some examples as mentioned in the Solr wiki but type-ahead does not show the expected suggestions. Example itemName data can be: - ABC12DE : It does not work as soon as I type 1, i.e. ABC1 - ABC_12DE, ABC 12DE - Data can also have underscores, dashes - I am trying ignore-case auto suggest. Field name type defined in schema:

<field name="itemName" type="text_general_edge_ngram" indexed="true" stored="true" multiValued="false"/>

<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
  </analyzer>
</fieldType>
Re: MultiPhraseQuery:Rewrite to BooleanQuery
Hi, Strictly speaking, MultiPhraseQuery and a BooleanQuery wrapping PhraseQuerys are not equal. For each query, Query.rewrite() returns a different object. (With Lucene 4.10.3) q1.rewrite(reader).toString() returns: body:"blueberry chocolate (pie tart)", where q1 is your first multi phrase query. q2.rewrite(reader).toString() returns: body:"blueberry chocolate pie" body:"blueberry chocolate tart", where q2 is your second boolean query. In practice... I *think* the two queries may return the same set of documents, but I'm not sure about scoring/ranking. I suggest you ask on the java-user@lucene mailing list about the Lucene API. Regards, Tomoko 2015-01-21 19:12 GMT+09:00 ku3ia dem...@gmail.com: Any ideas? -- View this message in context: http://lucene.472066.n3.nabble.com/MultiPhraseQuery-Rewrite-to-BooleanQuery-tp4180638p4180820.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to index data from multiple data source
On 1/20/2015 10:43 PM, Yusniel Hidalgo Delgado wrote: I am diving into Solr recently and I need help with the following usage scenario. I am working on a project to extract and search bibliographic metadata from PDF files. Firstly, my PDF files are processed to extract bibliographic metadata such as title, authors, affiliations, keywords, and abstract. These metadata are stored in a relational database and then indexed in Solr via DIH; however, I also need to index the fulltext of the PDF and maintain the same ID between the indexed metadata and the indexed PDF fulltext. How do I do that? How do I configure solrconfig.xml and schema.xml for it? How are you doing the indexing? If it's in a program you wrote yourself, simply extend that program to obtain the information you need and add it to the document that you index. The Apache Tika project is one way to parse rich text documents. If you are using the dataimport handler, you are likely to need a nested entity to gather the additional information and include it in the document that is being indexed in the parent entity. The reply from Alvaro shows one way to integrate Tika into DIH. It looks like those instructions are geared to an extremely old Solr version (3.6.2) and probably won't work as-is on a newer version. Solr 4.x was already available when that blog post was written two years ago, so I don't know why they went with 3.6.2. Thanks, Shawn
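[A rough sketch of the nested-entity approach Shawn mentions, using DIH with TikaEntityProcessor. All table, column, field, and path names below are invented placeholders; check the DIH documentation for your Solr version before copying:

```xml
<!-- data-config.xml sketch (placeholder names throughout): the parent
     entity reads metadata rows from the database; the nested entity
     runs Tika over the PDF referenced by each row and adds its text to
     the same document, so metadata and fulltext share one Solr ID. -->
<dataConfig>
  <dataSource name="db" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/biblio" user="solr" password="***"/>
  <dataSource name="files" type="BinFileDataSource"/>
  <document>
    <entity name="paper" dataSource="db"
            query="SELECT id, title, authors, pdf_path FROM papers">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <field column="authors" name="authors"/>
      <!-- nested entity: extract fulltext from the PDF on disk -->
      <entity name="pdf" processor="TikaEntityProcessor"
              dataSource="files" url="${paper.pdf_path}" format="text">
        <field column="text" name="fulltext"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```
]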
Re: How much maximum data can we hard commit in Solr?
On 1/21/2015 6:01 AM, Nitin Solanki wrote: How much of maximum data we can commit on Solr using hard commit without using Soft commit. maxTime is 1000 in autoCommit Details explanation is on Stackoverflow http://stackoverflow.com/questions/28067853/how-much-maximum-data-can-we-hard-commit-in-solr The answer to the question you asked: All of it. I suspect you are actually trying to ask a different question. Some additional info, hopefully you can use it to answer what you'd really like to know: You could build your entire index with no commits and then issue a single hard commit and everything would work. The problem with that approach is that if you have the updateLog turned on, then every single one of those documents will be reindexed from the transaction log at Solr startup - it could take a REALLY long time. http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup Hard commits are the only way to close a transaction log and open a new one. Solr keeps enough transaction logs around so that it can re-index a minimum of 100 documents ... but it can't break the transaction logs into parts, so if everything is in one log, then that giant log will be replayed on startup. A maxTime of 1000 on autoCommit or autoSoftCommit is usually way too low. We find that this setting is normally driven by unrealistic requirements from sales or marketing, who say that data must be available within one second of indexing. It is extremely rare for this to be truly required. The autoCommit settings control automatic hard commits, and autoSoftCommit naturally controls automatic soft commits. With a maxTime of 1000, you will be issuing a commit every single second while you index. Commits are very resource-intensive operations, doing them once a second will keep your hardware VERY busy. Normally a commit operation will take a lot longer than one second to complete, so if you are starting another one a second later, they will overlap, and that can cause a lot of problems. 
Thanks, Shawn
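[A commonly recommended shape for the settings Shawn describes, as a solrconfig.xml sketch. The intervals are illustrative, not prescriptive: hard commits run on a generous schedule with openSearcher=false, so they only flush to disk and roll the transaction log without changing visibility, while soft commits run at whatever latency the application truly needs:

```xml
<!-- solrconfig.xml sketch; intervals are examples, not recommendations
     for every workload. -->
<autoCommit>
  <!-- hard commit: flush segments and roll the tlog, but do not open
       a new searcher (no change to search visibility) -->
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <!-- soft commit: make newly indexed documents visible to searches -->
  <maxTime>60000</maxTime>
</autoSoftCommit>
```
]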
Re: AW: AW: transactions@Solr(J)
On 1/20/2015 11:42 PM, Clemens Wyss DEV wrote: But then what happens if: autocommit is set to 10 docs, I add 11 docs, and then decide (due to an exception?) to roll back. Will only one (i.e. the last added) document be rolled back? The way I understand the low-level architecture, yes -- assuming that all 11 documents actually got indexed. If the exception happened because document 5 was badly formed, only documents 1-4 will have been indexed, and in that case, all four of them would get rolled back. Thanks, Shawn