deleteByQuery does not work with SolrCloud
Hi, I am using SolrCloud with Solr 4.4, and I am trying to delete from the index with the SolrJ deleteByQuery API, like this:

CloudSolrServer cloudServer = new CloudSolrServer(myZKhost);
cloudServer.connect();
cloudServer.setDefaultCollection(...);
cloudServer.deleteByQuery("indexname:shardTv_20131010");
cloudServer.commit();

It does not seem to work. I have also done some googling, but unfortunately found no help. Am I missing anything? Thanks. Regards
Re: Test mail
Dear Aleksandr, You should not test the mailing list this way. You could describe an issue you are having with Solr instead, and test the mailing list at the same time. :-) All the Best, Darius 2013/10/23 Aleksandr Elbakyan ramal...@yahoo.com Hello Testing if mail works :)
Re: Question about sharding and overlapping
You can't control that if using the compositeIdRouter, because the routing is dependent on the hash function. What you want is custom sharding, i.e. the ability to control the shard to which updates are routed. You should create a collection using the Collections API with a shards param specifying the names of the shards you want to create. Then, when indexing documents, include a shard=X param to route requests directly to that shard. While querying, you can choose to query the entire collection, or again limit the shards using the same shard parameter. On Wed, Oct 23, 2013 at 4:22 AM, yriveiro yago.rive...@gmail.com wrote: Hi, I created a collection with 12 shards and route.field=month (the month field will have values between 1 and 12). I notice that I have shards with more than one month in them. This could leave some shards empty, and I want the documents of one month in each shard. My question is: how do I configure the sharding method to avoid overlaps? /Yago - Best regards -- View this message in context: http://lucene.472066.n3.nabble.com/Question-about-sharding-and-overlapping-tp4097111.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar.
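The routing step described above (picking a shard per document and passing it as the shard=X parameter) can be sketched with a small client-side helper. This is an illustrative sketch only: the shard names month_1 .. month_12 are hypothetical and would have to match the names given in the Collections API shards parameter at creation time.

```java
// Hypothetical client-side router for custom (implicit) sharding:
// choose the target shard name from the document's month field, then
// pass it as the shard=... request parameter when indexing.
class MonthRouter {

    static String shardFor(int month) {
        if (month < 1 || month > 12) {
            throw new IllegalArgumentException("month out of range: " + month);
        }
        // Shard names are assumed to match those used at collection creation.
        return "month_" + month;
    }

    public static void main(String[] args) {
        // a document from October goes to the shard named month_10
        System.out.println(shardFor(10));
    }
}
```

Querying can then either omit the parameter (searching the whole collection) or pass the same shard name to restrict the search to one month.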
Re: shards.tolerant throwing null pointer exception when spellcheck is on
This is a known bug. See https://issues.apache.org/jira/browse/SOLR-5204 Patches welcome. On Wed, Oct 23, 2013 at 8:27 AM, Shamik Bandopadhyay sham...@gmail.com wrote: Hi, I'm trying to simulate a fault-tolerance test where a shard and its replica(s) go down, leaving the other shard(s) running. To test it, I added <str name="shards.tolerant">true</str> in my request handler under the defaults section. This is to make sure that the condition is added to each query running against this request handler. In my test environment, I have 2 shards with a replica each. I brought down Shard 1 and Replica 1, then fired a query using SolrJ CloudSolrServer, which internally talks to the zookeeper ensemble. In my request handler, the spellcheck option is turned on. Due to this, the servers are throwing a null pointer exception. Here's the stack trace. [2013-10-22 20:24:43,875] INFO 482886 [qtp1783079124-15] - org.apache.solr.core.SolrCore.execute(SolrCore.java:1909) - [collection1] webapp=/solr path=/testhtmlhelp params={spellcheck=on&q=xref&wt=xml&fq=TestProductLine:ADT&fq=TestProductRelease:ADT+2014&fq=language:english} hits=157 status=500 QTime=70 [2013-10-22 20:24:43,876] ERROR 482887 [qtp1783079124-15] - org.apache.solr.common.SolrException.log(SolrException.java:119) - null:java.lang.NullPointerException at org.apache.solr.handler.component.SpellCheckComponent.finishStage(SpellCheckComponent.java:323) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at 
java.lang.Thread.run(Thread.java:619) Here's the query detail from the server log; as you can see, spellcheck is on. [collection1] webapp=/solr path=/testhtmlhelp params={facet=on&f.TestCategory.facet.limit=160&tie=0.01&shards.qt=/testhtmlhelp&fl=id,score&facet.field=Source2&fq=TestProductLine:ADT&fq=TestProductRelease:ADT+2014&fq=language:english&rows=150&defType=edismax&start=0&spellcheck=on&shards.tolerant=true&shard.url=localhost:8984/solr/collection1/|localhost:8983/solr/collection1/&q=xref&isShard=true} hits=157 status=0 QTime=15 Now, if I comment out the spellcheck option in the request handler, the query works as expected, even if the other shard and its
Re: Solr 4.5 router.name issue?
The router.name in the collections node is not used. The router specified in clusterstate.json is the one which is actually used. This is a known issue and will be fixed in Solr 4.6. See http://issues.apache.org/jira/browse/SOLR-5319 On Wed, Oct 23, 2013 at 4:10 AM, yriveiro yago.rive...@gmail.com wrote: Hi, I created a collection with this command (Solr 4.5): http://localhost:8983/solr/admin/collections?action=CREATE&name=testDocValues&collection.configName=page-statistics&numShards=12&maxShardsPerNode=12&router.field=month The documentation says that the default router.name is compositeId. clusterstate.json has compositeId written for the testDocValues collection, but the ZooKeeper node /collections/testDocValues says: {"configName":"page-statistics","router":{"name":"implicit"}} Is this correct, or is it some kind of issue? - Best regards -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-5-router-name-issue-tp4097110.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar.
Re: Solr Cloud Distributed IDF
On Wed, 2013-10-23 at 04:26 +0200, dboychuck wrote: I recently moved an index from 3.6 non-distributed to Solr Cloud 4.4 with three shards. My company uses a boosting function with a value assigned to each document. This boosting function no longer works dependably and I believe the cause is that IDF is not distributed. This seems like it should be a high priority for Solr Cloud. It has been relevant for several years, well before SolrCloud. We run a mixed environment (Lucene/Solr/external index) and hacked a kinda-sorta distributed IDF together by boosting the search terms, but it is a poor man's solution. Distributed IDF for Solr is a very old JIRA issue, dating back to 2009: https://issues.apache.org/jira/browse/SOLR-1632 Activity has been on/off and I can see that it was last updated in June, but I have no idea of how close it is to completion. If you want anything out-of-the-box at this time, you'll have to look at Elasticsearch, which has this feature. - Toke Eskildsen, State and University Library, Denmark
Re: Solr Cloud Distributed IDF
Can you say more about the problem? What did you see that led to that problem? How did you distribute docs between shards, and how is that different from your 3.6 setup? It might be a distributed IDF thing, or it could be something simpler. Upayavira On Wed, Oct 23, 2013, at 03:26 AM, dboychuck wrote: I recently moved an index from 3.6 non-distributed to Solr Cloud 4.4 with three shards. My company uses a boosting function with a value assigned to each document. This boosting function no longer works dependably and I believe the cause is that IDF is not distributed. This seems like it should be a high priority for Solr Cloud. Does anybody know the status of this feature? I understand that the elevate component does work for Solr Cloud in version 4.5 but unfortunately it would be a pretty big leap for how we are currently using our index and our boosting function for relevancy scoring. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Cloud-Distributed-IDF-tp4097127.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: DIH - URLDataSource import size
Anyone? On Tue, Oct 22, 2013 at 9:50 PM, Raheel Hasan raheelhasan@gmail.com wrote: Hi, I have an issue that only occurs in the live environment. DIH with URLDataSource is not working when the imported file is large (i.e. above 100kb, which is not so large). If it's large, it returns nothing (as seen in the Debug section of DataImport in the Solr Admin). However, in the local environment, this issue doesn't occur at all. (Note that I am using URLDataSource with PlainTextEntityProcessor in the entity field.) Please help me; I have tried a lot to get this done, but can't! Thanks a lot. -- Regards, Raheel Hasan -- Regards, Raheel Hasan
Re: Question about sharding and overlapping
Can I split shards as with compositeId using this method? On Wednesday, October 23, 2013, Shalin Shekhar Mangar wrote: You can't control that if using the compositeIdRouter, because the routing is dependent on the hash function. What you want is custom sharding, i.e. the ability to control the shard to which updates are routed. You should create a collection using the Collections API with a shards param specifying the names of the shards you want to create. Then, when indexing documents, include a shard=X param to route requests directly to that shard. While querying, you can choose to query the entire collection, or again limit the shards using the same shard parameter. On Wed, Oct 23, 2013 at 4:22 AM, yriveiro yago.rive...@gmail.com wrote: Hi, I created a collection with 12 shards and route.field=month (the month field will have values between 1 and 12). I notice that I have shards with more than one month in them. This could leave some shards empty, and I want the documents of one month in each shard. My question is: how do I configure the sharding method to avoid overlaps? /Yago - Best regards -- View this message in context: http://lucene.472066.n3.nabble.com/Question-about-sharding-and-overlapping-tp4097111.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar. -- /Yago Riveiro
Re: Class name of parsing the fq clause
Thanks Jack for detailing the parser logic. Would it be possible for you to say something more about the filter cache code flow? Sometimes we do not use the fq parameter in the query string and pass the raw query. Regards Sandeep On Mon, Oct 21, 2013 at 7:11 PM, Jack Krupansky j...@basetechnology.com wrote: Start with org.apache.solr.handler.component.QueryComponent#prepare, which fetches the fq parameters and indirectly invokes the query parser(s):

String[] fqs = req.getParams().getParams(CommonParams.FQ);
if (fqs != null && fqs.length != 0) {
  List<Query> filters = rb.getFilters();
  // if filters already exist, make a copy instead of modifying the original
  filters = filters == null ? new ArrayList<Query>(fqs.length) : new ArrayList<Query>(filters);
  for (String fq : fqs) {
    if (fq != null && fq.trim().length() != 0) {
      QParser fqp = QParser.getParser(fq, null, req);
      filters.add(fqp.getQuery());
    }
  }
  // only set the filters if they are not empty, otherwise
  // fq=&someotherParam= will trigger an all-docs filter for every request
  // if the filter cache is disabled
  if (!filters.isEmpty()) {
    rb.setFilters(filters);

Note that this line actually invokes the parser: filters.add(fqp.getQuery()); Then in org.apache.solr.search.QParser#getParser:

QParserPlugin qplug = req.getCore().getQueryPlugin(parserName);
QParser parser = qplug.createParser(qstr, localParams, req.getParams(), req);

And for the common case of the Lucene query parser, org.apache.solr.search.LuceneQParserPlugin#createParser:

public QParser createParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) {
  return new LuceneQParser(qstr, localParams, params, req);
}

And then in org.apache.solr.search.QParser#getQuery:

public Query getQuery() throws SyntaxError {
  if (query == null) {
    query = parse();

And then in org.apache.solr.search.LuceneQParser#parse:

lparser = new SolrQueryParser(this, defaultField);
lparser.setDefaultOperator(QueryParsing.getQueryParserDefaultOperator(getReq().getSchema(), getParam(QueryParsing.OP)));
return lparser.parse(qstr);

And then in org.apache.solr.parser.SolrQueryParserBase#parse:

Query res = TopLevelQuery(null); // pass null so we can tell later if an explicit field was provided or not

And then in org.apache.solr.parser.QueryParser#TopLevelQuery, the parsing begins. org.apache.solr.parser.QueryParser.jj is the grammar for a basic Solr/Lucene query, org.apache.solr.parser.QueryParser.java is generated from it by JavaCC, and a lot of the logic is in the base class of the generated class, org.apache.solr.parser.SolrQueryParserBase.java. Good luck! Happy hunting! -- Jack Krupansky -Original Message- From: YouPeng Yang Sent: Monday, October 21, 2013 2:57 AM To: solr-user@lucene.apache.org Subject: Class name of parsing the fq clause Hi, I search Solr with an fq clause like: fq=BEGINTIME:[2013-08-25T16:00:00Z TO *] AND BUSID:(M3 OR M9) I am curious about the parsing process and want to study it. Which Java file describes the parsing process of the fq clause? Thanks Regards.
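The fq handling Jack quotes can be mimicked in isolation. The sketch below is not Solr code: plain strings stand in for parsed Query objects, and parse() is a stand-in for QParser.getQuery(); it only reproduces the skip-blank-values and only-install-if-non-empty behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Toy re-creation of the fq handling in QueryComponent#prepare.
// Strings stand in for org.apache.lucene.search.Query objects.
class FqPrepareSketch {

    static String parse(String fq) {          // stand-in for QParser.getQuery()
        return "parsed(" + fq + ")";
    }

    static List<String> buildFilters(String[] fqs) {
        List<String> filters = new ArrayList<>();
        if (fqs != null && fqs.length != 0) {
            for (String fq : fqs) {
                if (fq != null && fq.trim().length() != 0) {  // skip blank fq=
                    filters.add(parse(fq));
                }
            }
        }
        // a caller would only install a non-empty list, mirroring rb.setFilters(filters)
        return filters;
    }

    public static void main(String[] args) {
        String[] fqs = {"BEGINTIME:[2013-08-25T16:00:00Z TO *]", "  ", null};
        List<String> filters = buildFilters(fqs);
        System.out.println(filters.size());
        System.out.println(filters.get(0));
    }
}
```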
Re: Adding documents in Solr plugin
I've tried to write the plugin code. Currently I do:

AddUpdateCommand addUpdateCommand = new AddUpdateCommand(solrQueryRequest);
DocIterator iterator = docList.iterator();
SolrIndexSearcher indexReader = solrQueryRequest.getSearcher();
while (iterator.hasNext()) {
  Document document = indexReader.doc(iterator.nextDoc());
  SolrInputDocument solrInputDocument = new SolrInputDocument();
  addUpdateCommand.clear();
  addUpdateCommand.solrDoc = solrInputDocument;
  addUpdateCommand.solrDoc.setField("id", document.get("id"));
  addUpdateCommand.solrDoc.setField("my_updated_field", new_value);
  updateRequestProcessor.processAdd(addUpdateCommand);
}

But this is very expensive, since the update handler will again fetch the document which I already hold at hand. Is there a safe way to update the Lucene document and write it back while taking into account all the Solr-related code such as caches, extra Solr logic, etc.? I was thinking of converting it to a SolrInputDocument and then just adding the document through Solr, but I would first need to convert all fields. Thanks in advance, Avner -- View this message in context: http://lucene.472066.n3.nabble.com/Adding-documents-in-Solr-plugin-tp4071574p4097168.html Sent from the Solr - User mailing list archive at Nabble.com.
fq with { or } in Solr 4.3.1
Hi, If I do a search like /search?q=catid:{123} I get the results I expect. But if I do /search?q=*:*&fq=catid{123} I get an error from Solr like: org.apache.solr.search.SyntaxError: Cannot parse 'catid:{123}': Encountered } } at line 1, column 58. Was expecting one of: TO ... RANGE_QUOTED ... RANGE_GOOP ... Can I not use { or } in an fq? Thanks, Peter
Re: fq with { or } in Solr 4.3.1
Missing a colon before the curly bracket in the fq? On Wed, Oct 23, 2013, at 09:42 AM, Peter Kirk wrote: Hi, If I do a search like /search?q=catid:{123} I get the results I expect. But if I do /search?q=*:*&fq=catid{123} I get an error from Solr like: org.apache.solr.search.SyntaxError: Cannot parse 'catid:{123}': Encountered } } at line 1, column 58. Was expecting one of: TO ... RANGE_QUOTED ... RANGE_GOOP ... Can I not use { or } in an fq? Thanks, Peter
RE: fq with { or } in Solr 4.3.1
Sorry, that was just a typo. /search?q=*:*&fq=catid:{123} gives me the error. I think { and } must be used for ranges in an fq, and that's why I can't use them directly like this. /Peter -Original Message- From: Upayavira [mailto:u...@odoko.co.uk] Sent: 23. oktober 2013 10:52 To: solr-user@lucene.apache.org Subject: Re: fq with { or } in Solr 4.3.1 Missing a colon before the curly bracket in the fq? On Wed, Oct 23, 2013, at 09:42 AM, Peter Kirk wrote: Hi, If I do a search like /search?q=catid:{123} I get the results I expect. But if I do /search?q=*:*&fq=catid{123} I get an error from Solr like: org.apache.solr.search.SyntaxError: Cannot parse 'catid:{123}': Encountered } } at line 1, column 58. Was expecting one of: TO ... RANGE_QUOTED ... RANGE_GOOP ... Can I not use { or } in an fq? Thanks, Peter
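Peter's guess is right: { and } are reserved by the query parser for exclusive range queries ({A TO B}), which is exactly what the "Was expecting one of: TO ..." message hints at. To match a literal brace it must be backslash-escaped, e.g. fq=catid:\{123\}. SolrJ ships an escaping helper (ClientUtils.escapeQueryChars); the standalone sketch below is modeled on that idea, not the actual SolrJ source.

```java
// Hypothetical standalone escaper for Lucene/Solr query-syntax characters,
// modeled on SolrJ's ClientUtils.escapeQueryChars (not the real source).
class QueryEscape {

    // Characters that are syntax in the Lucene query language.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/ ";

    static String escapeQueryChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');      // prefix every syntax character
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // fq=catid:\{123\} matches the literal text {123}
        System.out.println("catid:" + escapeQueryChars("{123}"));
    }
}
```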
RE: Facet performance
On Tue, 2013-10-22 at 17:25 +0200, Lemke, Michael SZ/HZA-ZSW wrote: On Tue, October 22, 2013 11:54 AM Andre Bois-Crettez wrote: This is with Solr 1.4. Really? This sounds really outdated to me. Have you tried a more recent version? 4.5 just came out. Sorry, can't. Too much `grown' stuff. I did not see that; I guess I parsed it as 4.1. Well, that rules out DocValues and fcs (as far as I remember). I am a bit surprised that the limit on #terms with fc is also in 1.4; I thought it was introduced in a later version. We too have been in a position where upgrading was hard due to homegrown addons. We even scrapped some DidYouMean-like functionality when going from 3.x to 4.x, but 4.x was so much better that there was little choice. Last suggestion for using fc: Create 2 or more CONTENT fields and choose between them randomly when indexing. Facet on all the CONTENT fields and merge the results. It will take a bit more RAM though, so it is still out on your (assumedly) 32-bit machine. Regards, Toke Eskildsen, State and University Library, Denmark
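Toke's last suggestion (spreading CONTENT over several fields and faceting on all of them) needs a client-side merge of the per-field facet counts. A minimal sketch of that merge step; the field split and the counts below are made up for illustration, assuming the per-field term counts have already been read from the facet response.

```java
import java.util.*;

class FacetMerge {

    // Merge per-field facet counts (term -> count) into one total,
    // as if CONTENT had been a single field.
    static Map<String, Long> merge(List<Map<String, Long>> perField) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> counts : perField) {
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical counts from two split fields, CONTENT_0 and CONTENT_1.
        Map<String, Long> content0 = new HashMap<>();
        content0.put("solr", 10L);
        content0.put("lucene", 4L);
        Map<String, Long> content1 = new HashMap<>();
        content1.put("solr", 7L);
        content1.put("facet", 2L);
        Map<String, Long> total = merge(Arrays.asList(content0, content1));
        System.out.println(total.get("solr") + " " + total.get("lucene") + " " + total.get("facet"));
    }
}
```

Note that after merging, the top-N cutoff has to be re-applied client-side, since a term near the limit in one field may rank higher in the combined counts.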
Issue with large html indexing
Hi, I have an issue while indexing large HTML. Here is the configuration: 1) Data is imported via URLDataSource / PlainTextEntityProcessor (DIH) 2) The schema has this for the field: type=text_en_splitting indexed=true stored=false required=false 3) text_en_splitting does the following work for indexing: HTMLStripCharFilterFactory, WhitespaceTokenizerFactory (create tokens), StopFilterFactory, WordDelimiterFilterFactory, ICUFoldingFilterFactory, PorterStemFilterFactory, RemoveDuplicatesTokenFilterFactory, LengthFilterFactory. However, the indexed data contains strange numbers (as in the attached image). So what are these numbers? If I put in small HTML it works fine, but as the size of the HTML file increases, this is what happens. -- Regards, Raheel Hasan
RE: fq with { or } in Solr 4.3.1
For filtering categories i'm using something like this : fq=category:(cat1 OR cat2 OR cat3) - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/fq-with-or-in-Solr-4-3-1-tp4097170p4097183.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Stop/Restart Solr
Kill -9 didn't kill it... the process is now listed again, but with PPID=1, which I don't want to kill as many processes have this same id... On Tue, Oct 22, 2013 at 11:59 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote: We use this to start/stop solr: Start: java -Dsolr.clustering.enabled=true -Dsolr.solr.home=multicore -Djetty.class.path=lib/ext/* -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar Stop: java -Dsolr.solr.home=multicore -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar --stop Thanks, -Utkarsh On Tue, Oct 22, 2013 at 10:09 AM, Raheel Hasan raheelhasan@gmail.com wrote: ok fantastic... thanks a lot guys On Tue, Oct 22, 2013 at 10:00 PM, François Schiettecatte fschietteca...@gmail.com wrote: Yago has the right command to search for the process; that will get you the process ID, specifically the first number on the output line. Then do 'kill ###', and if that fails, 'kill -9 ###'. François On Oct 22, 2013, at 12:56 PM, Raheel Hasan raheelhasan@gmail.com wrote: its CentOS... and using jetty with solr here.. On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte fschietteca...@gmail.com wrote: A few more specifics about the environment would help: Windows/Linux/...? Jetty/Tomcat/...? François On Oct 22, 2013, at 12:50 PM, Yago Riveiro yago.rive...@gmail.com wrote: If you are asking whether Solr has a way to restart itself, I think the answer is no. If you lost control of the remote machine, someone will need to go and restart the machine ... You can try using a KVM or other remote control system -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Tuesday, October 22, 2013 at 5:46 PM, François Schiettecatte wrote: If you are on linux/unix, use the kill command. François On Oct 22, 2013, at 12:42 PM, Raheel Hasan raheelhasan@gmail.com (mailto: raheelhasan@gmail.com) wrote: Hi, is there a way to stop/restart java? 
I lost control over it via SSH and connection was closed. But the Solr (start.jar) is still running. thanks. -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Thanks, -Utkarsh -- Regards, Raheel Hasan
Re: Stop/Restart Solr
Also, is this -DSTOP.PORT the same as the port on which Solr is visible in a browser (i.e. 8983, from http://localhost:8983)? On Wed, Oct 23, 2013 at 2:49 PM, Raheel Hasan raheelhasan@gmail.com wrote: Kill -9 didn't kill it... the process is now listed again, but with PPID=1, which I don't want to kill as many processes have this same id... On Tue, Oct 22, 2013 at 11:59 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote: We use this to start/stop solr: Start: java -Dsolr.clustering.enabled=true -Dsolr.solr.home=multicore -Djetty.class.path=lib/ext/* -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar Stop: java -Dsolr.solr.home=multicore -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar --stop Thanks, -Utkarsh On Tue, Oct 22, 2013 at 10:09 AM, Raheel Hasan raheelhasan@gmail.com wrote: ok fantastic... thanks a lot guys On Tue, Oct 22, 2013 at 10:00 PM, François Schiettecatte fschietteca...@gmail.com wrote: Yago has the right command to search for the process; that will get you the process ID, specifically the first number on the output line. Then do 'kill ###', and if that fails, 'kill -9 ###'. François On Oct 22, 2013, at 12:56 PM, Raheel Hasan raheelhasan@gmail.com wrote: its CentOS... and using jetty with solr here.. On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte fschietteca...@gmail.com wrote: A few more specifics about the environment would help: Windows/Linux/...? Jetty/Tomcat/...? François On Oct 22, 2013, at 12:50 PM, Yago Riveiro yago.rive...@gmail.com wrote: If you are asking whether Solr has a way to restart itself, I think the answer is no. If you lost control of the remote machine, someone will need to go and restart the machine ... You can try using a KVM or other remote control system -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Tuesday, October 22, 2013 at 5:46 PM, François Schiettecatte wrote: If you are on linux/unix, use the kill command. 
François On Oct 22, 2013, at 12:42 PM, Raheel Hasan raheelhasan@gmail.com (mailto: raheelhasan@gmail.com) wrote: Hi, is there a way to stop/restart java? I lost control over it via SSH and connection was closed. But the Solr (start.jar) is still running. thanks. -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Thanks, -Utkarsh -- Regards, Raheel Hasan -- Regards, Raheel Hasan
RE: Stop/Restart Solr
Can you please share output of following command? ps -ef | grep 'start.jar' - Jeeva -- Original Message -- From: Raheel Hasan [mailto:raheelhasan@gmail.com] Sent: October 23, 2013 3:19:46 PM GMT+05:30 To: solr-user@lucene.apache.org Subject: Re: Stop/Restart Solr Kill -9 didnt kill it... ... the process is now again listed, but with PPID=1 which I dont want to kill as many processes have this same id... On Tue, Oct 22, 2013 at 11:59 PM, Utkarsh Sengar utkarsh2...@gmail.comwrote: We use this to start/stop solr: Start: java -Dsolr.clustering.enabled=true -Dsolr.solr.home=multicore -Djetty.class.path=lib/ext/* -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar Stop: java -Dsolr.solr.home=multicore -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar --stop Thanks, -Utkarsh On Tue, Oct 22, 2013 at 10:09 AM, Raheel Hasan raheelhasan@gmail.com wrote: ok fantastic... thanks a lot guyz On Tue, Oct 22, 2013 at 10:00 PM, François Schiettecatte fschietteca...@gmail.com wrote: Yago has the right command to search for the process, that will get you the process ID specifically the first number on the output line, then do 'kill ###', if that fails 'kill -9 ###'. François On Oct 22, 2013, at 12:56 PM, Raheel Hasan raheelhasan@gmail.com wrote: its CentOS... and using jetty with solr here.. On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte fschietteca...@gmail.com wrote: A few more specifics about the environment would help, Windows/Linux/...? Jetty/Tomcat/...? François On Oct 22, 2013, at 12:50 PM, Yago Riveiro yago.rive...@gmail.com wrote: If you are asking about if solr has a way to restart himself, I think that the answer is no. If you lost control of the remote machine someone will need to go and restart the machine ... 
You can try use a kvm or other remote control system -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Tuesday, October 22, 2013 at 5:46 PM, François Schiettecatte wrote: If you are on linux/unix, use the kill command. François On Oct 22, 2013, at 12:42 PM, Raheel Hasan raheelhasan@gmail.com (mailto: raheelhasan@gmail.com) wrote: Hi, is there a way to stop/restart java? I lost control over it via SSH and connection was closed. But the Solr (start.jar) is still running. thanks. -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Thanks, -Utkarsh -- Regards, Raheel Hasan
Minor bug with CloudSolrServer and collection-alias.
I found this bug in both 4.4 and 4.5. Using cloudSolrServer.setDefaultCollection(collectionId) does not work as intended for an alias spanning more than 1 collection. The virtual collection-alias collectionId is recognized as an existing collection, but it only queries one of the collections it is mapped to. You can confirm this easily in AliasIntegrationTest. The test class AliasIntegrationTest creates two cores with 2 and 3 different documents, and then creates an alias pointing to both of them. Line 153:

// search with new cloud client
CloudSolrServer cloudSolrServer = new CloudSolrServer(zkServer.getZkAddress(), random().nextBoolean());
cloudSolrServer.setParallelUpdates(random().nextBoolean());
query = new SolrQuery("*:*");
query.set("collection", "testalias");
res = cloudSolrServer.query(query);
cloudSolrServer.shutdown();
assertEquals(5, res.getResults().getNumFound());

No unit-test bug here. However, if you change it to set the collection id not on the query but on the CloudSolrServer instead, it will produce the bug:

// search with new cloud client
CloudSolrServer cloudSolrServer = new CloudSolrServer(zkServer.getZkAddress(), random().nextBoolean());
cloudSolrServer.setDefaultCollection("testalias");
cloudSolrServer.setParallelUpdates(random().nextBoolean());
query = new SolrQuery("*:*");
//query.set("collection", "testalias");
res = cloudSolrServer.query(query);
cloudSolrServer.shutdown();
assertEquals(5, res.getResults().getNumFound()); // -- Assertion failure

Should I create a Jira issue for this? From, Thomas Egense
Re: Stop/Restart Solr
31173 1 0 16:45 ?00:00:08 java -jar start.jar On Wed, Oct 23, 2013 at 2:53 PM, Jeevanandam M. je...@myjeeva.com wrote: Can you please share output of following command? ps -ef | grep 'start.jar' - Jeeva -- Original Message -- From: Raheel Hasan [mailto:raheelhasan@gmail.com] Sent: October 23, 2013 3:19:46 PM GMT+05:30 To: solr-user@lucene.apache.org Subject: Re: Stop/Restart Solr Kill -9 didnt kill it... ... the process is now again listed, but with PPID=1 which I dont want to kill as many processes have this same id... On Tue, Oct 22, 2013 at 11:59 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote: We use this to start/stop solr: Start: java -Dsolr.clustering.enabled=true -Dsolr.solr.home=multicore -Djetty.class.path=lib/ext/* -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar Stop: java -Dsolr.solr.home=multicore -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar --stop Thanks, -Utkarsh On Tue, Oct 22, 2013 at 10:09 AM, Raheel Hasan raheelhasan@gmail.com wrote: ok fantastic... thanks a lot guyz On Tue, Oct 22, 2013 at 10:00 PM, François Schiettecatte fschietteca...@gmail.com wrote: Yago has the right command to search for the process, that will get you the process ID specifically the first number on the output line, then do 'kill ###', if that fails 'kill -9 ###'. François On Oct 22, 2013, at 12:56 PM, Raheel Hasan raheelhasan@gmail.com wrote: its CentOS... and using jetty with solr here.. On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte fschietteca...@gmail.com wrote: A few more specifics about the environment would help, Windows/Linux/...? Jetty/Tomcat/...? François On Oct 22, 2013, at 12:50 PM, Yago Riveiro yago.rive...@gmail.com wrote: If you are asking about if solr has a way to restart himself, I think that the answer is no. If you lost control of the remote machine someone will need to go and restart the machine ... 
You can try use a kvm or other remote control system -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Tuesday, October 22, 2013 at 5:46 PM, François Schiettecatte wrote: If you are on linux/unix, use the kill command. François On Oct 22, 2013, at 12:42 PM, Raheel Hasan raheelhasan@gmail.com (mailto: raheelhasan@gmail.com) wrote: Hi, is there a way to stop/restart java? I lost control over it via SSH and connection was closed. But the Solr (start.jar) is still running. thanks. -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Thanks, -Utkarsh -- Regards, Raheel Hasan -- Regards, Raheel Hasan
RE: Stop/Restart Solr
It seems process started recently. Is there any external cron/process triggering a startup of Solr? Kill again and monitor it. - Jeeva -- Original Message -- From: Raheel Hasan [mailto:raheelhasan@gmail.com] Sent: October 23, 2013 3:29:47 PM GMT+05:30 To: solr-user@lucene.apache.org Subject: Re: Stop/Restart Solr 31173 1 0 16:45 ?00:00:08 java -jar start.jar On Wed, Oct 23, 2013 at 2:53 PM, Jeevanandam M. je...@myjeeva.com wrote: Can you please share output of following command? ps -ef | grep 'start.jar' - Jeeva -- Original Message -- From: Raheel Hasan [mailto:raheelhasan@gmail.com] Sent: October 23, 2013 3:19:46 PM GMT+05:30 To: solr-user@lucene.apache.org Subject: Re: Stop/Restart Solr Kill -9 didnt kill it... ... the process is now again listed, but with PPID=1 which I dont want to kill as many processes have this same id... On Tue, Oct 22, 2013 at 11:59 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote: We use this to start/stop solr: Start: java -Dsolr.clustering.enabled=true -Dsolr.solr.home=multicore -Djetty.class.path=lib/ext/* -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar Stop: java -Dsolr.solr.home=multicore -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar --stop Thanks, -Utkarsh On Tue, Oct 22, 2013 at 10:09 AM, Raheel Hasan raheelhasan@gmail.com wrote: ok fantastic... thanks a lot guyz On Tue, Oct 22, 2013 at 10:00 PM, François Schiettecatte fschietteca...@gmail.com wrote: Yago has the right command to search for the process, that will get you the process ID specifically the first number on the output line, then do 'kill ###', if that fails 'kill -9 ###'. François On Oct 22, 2013, at 12:56 PM, Raheel Hasan raheelhasan@gmail.com wrote: its CentOS... and using jetty with solr here.. On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte fschietteca...@gmail.com wrote: A few more specifics about the environment would help, Windows/Linux/...? Jetty/Tomcat/...? 
François On Oct 22, 2013, at 12:50 PM, Yago Riveiro yago.rive...@gmail.com wrote: If you are asking whether Solr has a way to restart itself, I think the answer is no. If you lost control of the remote machine, someone will need to go and restart the machine ... You can try using a KVM or other remote control system -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Tuesday, October 22, 2013 at 5:46 PM, François Schiettecatte wrote: If you are on Linux/Unix, use the kill command. François On Oct 22, 2013, at 12:42 PM, Raheel Hasan raheelhasan@gmail.com (mailto: raheelhasan@gmail.com) wrote: Hi, is there a way to stop/restart Java? I lost control over it via SSH and the connection was closed. But Solr (start.jar) is still running. thanks. -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Thanks, -Utkarsh -- Regards, Raheel Hasan -- Regards, Raheel Hasan
Why does the analyzer only output part of my string?
Hi All, I have configured a custom (Chinese) analyzer in Solr 4.5.0. When I access http://localhost:8983/solr/#/collection1/analysis, choose my fieldType, and input a character string, why is only part of the string analyzed? The last part of the string is dismissed. Is there any length limitation for the analyzer? How do I configure this? Thanks, -Judy
Re: Question about sharding and overlapping
No, shard splitting does not support collections with implicit router. On Wed, Oct 23, 2013 at 1:21 PM, Yago Riveiro yago.rive...@gmail.comwrote: Can I split shards as with compositeId using this method? On Wednesday, October 23, 2013, Shalin Shekhar Mangar wrote: You can't control that if using the compositeIdRouter because the routing is dependent on the hash function. What you want is custom sharding i.e. the ability to control the shard to which updates are routed. You should create a collection using the Collections API with a shards param specifying the names of the shards you want to create. Then when indexing documents, include a shard=X param to route requests directly to that shard. While querying, you can choose to query the entire collection or again limit the shards using the same shard parameter. On Wed, Oct 23, 2013 at 4:22 AM, yriveiro yago.rive...@gmail.com javascript:; wrote: Hi, I created a collection with 12 shards and route.field=month (month field will have values between 1 .. 12) I notice that I have shards with more that a month into them. This could left empty some shard and I want the documents one month in each shard. My question is, how I configure the sharding method to avoid overlaps? /Yago - Best regards -- View this message in context: http://lucene.472066.n3.nabble.com/Question-about-sharding-and-overlapping-tp4097111.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar. -- /Yago Riveiro -- Regards, Shalin Shekhar Mangar.
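A sketch of those steps as HTTP calls, following the parameter names used in the reply above. The collection and shard names here are made up, and exact parameter names (in particular router.name) varied across 4.x releases, so check the Collections API documentation for your version:

```text
# create a collection with explicitly named shards (custom sharding)
http://localhost:8983/solr/admin/collections?action=CREATE&name=months&router.name=implicit&shards=m01,m02,m03

# route an update directly to one shard
http://localhost:8983/solr/months/update?shard=m02

# query only a subset of the shards (omit the parameter to query them all)
http://localhost:8983/solr/months/select?q=*:*&shards=m01,m02
```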
Re: DIH - URLDataSource import size
Following up within 15 hours is not going to do any good -- it just increases email traffic for everyone. Please understand that a lot of people here are in different time zones and almost all of them are volunteers answering questions in addition to their day jobs. Are there any exceptions in the logs of your production environment? On Wed, Oct 23, 2013 at 1:10 PM, Raheel Hasan raheelhasan@gmail.com wrote: anyone? On Tue, Oct 22, 2013 at 9:50 PM, Raheel Hasan raheelhasan@gmail.com wrote: Hi, I have an issue that is only coming on the live environment. The DIH with URLDataSource is not working when the file size imported is large (i.e. 100kb and above - which is not so large). If it's large, it returns nothing (as seen in the Debug section of DataImport at Solr Admin). However, when working on the local environment, this issue doesn't come up at all. (Note that I am using URLDataSource with PlainTextEntityProcessor in the entity field.) Please help me as I have tried a lot to get it done, but can't! Thanks a lot. -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Regards, Shalin Shekhar Mangar.
New shard leaders or existing shard replicas depends on zookeeper?
Hi solr-users,

I'm seeing some confusing behaviour in Solr/zookeeper and hope you can shed some light on what's happening and how I can correct it.

We have two physical servers running automated builds of RedHat 6.4 and Solr 4.4.0 that host two separate Solr services. The first server (called ld01) has 24 shards and hosts a collection called 'ukdomain'; the second server (ld02) also has 24 shards and hosts a different collection called 'ldwa'. It's evidently important to note that previously both of these physical servers provided the 'ukdomain' collection, but the 'ldwa' server has been rebuilt for the new collection.

When I start the ldwa solr nodes with their zookeeper configuration (defined in /etc/sysconfig/solrnode* and with collection.configName as 'ldwacfg') pointing to the development zookeeper ensemble, all nodes initially become shard leaders and then replicas as I'd expect. But if I change the ldwa solr nodes to point to the zookeeper ensemble also used for the ukdomain collection, all ldwa solr nodes start on the same shard (that is, the first ldwa solr node becomes the shard leader, then every other solr node becomes a replica for this shard). The significant point here is that no other ldwa shards gain leaders (or replicas).

The ukdomain collection uses a zookeeper collection.configName of 'ukdomaincfg', and prior to the creation of this ldwa service the collection.configName of 'ldwacfg' had never been used. So I'm confused why the ldwa service would differ when the only difference is which zookeeper ensemble is used (both zookeeper ensembles are automatically built using version 3.4.5).

If anyone can explain why this is happening and how I can get the ldwa services to start correctly using the non-development zookeeper ensemble, I'd be very grateful! If more information or explanation is needed, just ask.

Thanks, Gil

Gil Hoggarth Web Archiving Technical Services Engineer The British Library, Boston Spa, West Yorkshire, LS23 7BQ
Re: Indexing logs files of thousands of GBs
Prerna, The FileListEntityProcessor has a terribly inefficient recursive method, which will be using up all your heap building a list of files. I would suggest writing a client application and traversing your filesystem with the NIO API available in Java 7: Files.walkFileTree() and a FileVisitor. As you walk, post up to the server with SolrJ. Cheers, Chris

On 22 October 2013 18:58, keshari.prerna keshari.pre...@gmail.com wrote: Hello, I am trying to index log files (all text data) stored in the file system. The data can be as big as 1000 GB or more. I am working on Windows. A sample file can be found at https://www.dropbox.com/s/mslwwnme6om38b5/batkid.glnxa64.66441 I tried using FileListEntityProcessor with TikaEntityProcessor, which ended up in a Java heap exception that I couldn't get rid of no matter how much I increased my RAM size. data-config.xml:

<dataConfig>
  <dataSource name="bin" type="FileDataSource" />
  <document>
    <entity name="f" dataSource="null" rootEntity="true"
            processor="FileListEntityProcessor" transformer="TemplateTransformer"
            baseDir="//mathworks/devel/bat/A/logs/66048/" fileName=".*\.*"
            onError="skip" recursive="true">
      <field column="fileAbsolutePath" name="path" />
      <field column="fileSize" name="size"/>
      <field column="fileLastModified" name="lastmodified" />
      <entity name="file" dataSource="bin" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text" onError="skip"
              transformer="TemplateTransformer" rootEntity="true">
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>

Then I used FileListEntityProcessor with LineEntityProcessor, which never stopped indexing even after 40 hours or so.
data-config.xml:

<dataConfig>
  <dataSource name="bin" type="FileDataSource" />
  <document>
    <entity name="f" dataSource="null" rootEntity="true"
            processor="FileListEntityProcessor" transformer="TemplateTransformer"
            baseDir="//mathworks/devel/bat/A/logs/" fileName=".*\.*"
            onError="skip" recursive="true">
      <field column="fileAbsolutePath" name="path" />
      <field column="fileSize" name="size"/>
      <field column="fileLastModified" name="lastmodified" />
      <entity name="file" dataSource="bin" processor="LineEntityProcessor"
              url="${f.fileAbsolutePath}" format="text" onError="skip"
              rootEntity="true">
        <field column="content" name="rawLine"/>
      </entity>
    </entity>
  </document>
</dataConfig>

Is there any way I can use post.jar to index text files recursively? Or any other way which works without a Java heap exception and doesn't take days to index? I am completely stuck here. Any help would be greatly appreciated. Thanks, Prerna -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-logs-files-of-thousands-of-GBs-tp4097073.html Sent from the Solr - User mailing list archive at Nabble.com.
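Chris's suggestion above — walking the tree with NIO instead of FileListEntityProcessor — might look like the sketch below. The SolrJ posting step is left as a comment because it needs the SolrJ client on the classpath; walkLogs() itself only streams through file paths, so it never builds the giant recursive structure that blows the heap.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class LogWalker {

    /** Visit every regular file under root; collecting paths here stands in
     *  for building and batching SolrInputDocuments. */
    static List<Path> walkLogs(Path root) throws IOException {
        final List<Path> files = new ArrayList<Path>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                files.add(file);
                // In a real indexer, build a SolrInputDocument from the file here,
                // add it to a batch, and flush the batch to Solr via SolrJ every
                // few thousand documents instead of collecting everything in RAM.
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip unreadable files, much like onError="skip" in the DIH config.
                return FileVisitResult.CONTINUE;
            }
        });
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(walkLogs(root).size() + " files found under " + root);
    }
}
```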
Re: fq with { or } in Solr 4.3.1
Are you using the edismax query parser? It traps the syntax error and then escapes or ignores special characters. Curly braces are used for exclusive range queries (square brackets are inclusive ranges). The proper syntax is {term1 TO term2}. So, what were your intentions with catid:{123}? If you are simply trying to pass the braces as literal characters for a string field, either escape them with a backslash or enclose the entire term in quotes: catid:\{123\} or catid:"{123}" -- Jack Krupansky -Original Message- From: Peter Kirk Sent: Wednesday, October 23, 2013 4:57 AM To: solr-user@lucene.apache.org Subject: RE: fq with { or } in Solr 4.3.1 Sorry, that was just a typo. /search?q=*:*&fq=catid:{123} gives me the error. I think that { and } must be used in ranges for fq, and that's why I can't use them directly like this. /Peter -Original Message- From: Upayavira [mailto:u...@odoko.co.uk] Sent: 23. oktober 2013 10:52 To: solr-user@lucene.apache.org Subject: Re: fq with { or } in Solr 4.3.1 Missing a colon before the curly bracket in the fq? On Wed, Oct 23, 2013, at 09:42 AM, Peter Kirk wrote: Hi. If I do a search like /search?q=catid:{123} I get the results I expect. But if I do /search?q=*:*&fq=catid{123} I get an error from Solr like: org.apache.solr.search.SyntaxError: Cannot parse 'catid:{123}': Encountered } } at line 1, column 58. Was expecting one of: TO ... RANGE_QUOTED ... RANGE_GOOP ... Can I not use { or } in an fq? Thanks, Peter
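When the braces are literal data rather than range syntax, they need escaping before the term reaches the query parser. SolrJ ships ClientUtils.escapeQueryChars for exactly this; a self-contained sketch of the same idea (the character list mirrors what that helper escapes):

```java
public class QueryEscape {

    /** Backslash-escape the Lucene/Solr query-syntax characters in a literal term. */
    static String escapeQueryChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // These characters are part of the query syntax and must be escaped
            // when they should be treated as literal text.
            if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '(' || c == ')'
                    || c == ':' || c == '^' || c == '[' || c == ']' || c == '\"'
                    || c == '{' || c == '}' || c == '~' || c == '*' || c == '?'
                    || c == '|' || c == '&' || c == ';' || c == '/'
                    || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // fq=catid:{123} fails because {...} means an exclusive range;
        // escaping turns the braces into literal characters.
        System.out.println("catid:" + escapeQueryChars("{123}")); // prints catid:\{123\}
    }
}
```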
Re: Stop/Restart Solr
Did you check whether it is running as a service? If it runs as a service, it may start again even when you kill the process. 2013/10/23 Jeevanandam M. je...@myjeeva.com It seems the process started recently. Is there any external cron/process triggering a startup of Solr? Kill again and monitor it. - Jeeva
Re: Class name of parsing the fq clause
Not in just a few words. Do you have specific questions? I mean, none of that relates to parsing of fq, the topic of this particular email thread, right? -- Jack Krupansky -Original Message- From: Sandeep Gupta Sent: Wednesday, October 23, 2013 3:58 AM To: solr-user@lucene.apache.org Subject: Re: Class name of parsing the fq clause Thanks Jack for detailing out the parser logic. Would it be possible for you to say something more about the filter cache code flow... sometimes we do not use the fq parameter in the query string and pass the raw query. Regards Sandeep On Mon, Oct 21, 2013 at 7:11 PM, Jack Krupansky j...@basetechnology.com wrote: Start with org.apache.solr.handler.component.QueryComponent#prepare which fetches the fq parameters and indirectly invokes the query parser(s):

String[] fqs = req.getParams().getParams(CommonParams.FQ);
if (fqs!=null && fqs.length!=0) {
  List<Query> filters = rb.getFilters();
  // if filters already exists, make a copy instead of modifying the original
  filters = filters == null ? new ArrayList<Query>(fqs.length) : new ArrayList<Query>(filters);
  for (String fq : fqs) {
    if (fq != null && fq.trim().length()!=0) {
      QParser fqp = QParser.getParser(fq, null, req);
      filters.add(fqp.getQuery());
    }
  }
  // only set the filters if they are not empty otherwise
  // fq=&someotherParam= will trigger all docs filter for every request
  // if filter cache is disabled
  if (!filters.isEmpty()) {
    rb.setFilters( filters );

Note that this line actually invokes the parser: filters.add(fqp.getQuery()); Then in QParser#getParser:

QParserPlugin qplug = req.getCore().getQueryPlugin(parserName);
QParser parser = qplug.createParser(qstr, localParams, req.getParams(), req);

And for the common case of the Lucene query parser, org.apache.solr.search.LuceneQParserPlugin#createParser:

public QParser createParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) {
  return new LuceneQParser(qstr, localParams, params, req);
}

And then in QParser#getQuery:

public Query getQuery() throws SyntaxError {
  if (query==null) {
    query=parse();

And then in LuceneQParser#parse:

lparser = new SolrQueryParser(this, defaultField);
lparser.setDefaultOperator(QueryParsing.getQueryParserDefaultOperator(getReq().getSchema(), getParam(QueryParsing.OP)));
return lparser.parse(qstr);

And then in org.apache.solr.parser.SolrQueryParserBase#parse:

Query res = TopLevelQuery(null); // pass null so we can tell later if an explicit field was provided or not

And then in org.apache.solr.parser.QueryParser#TopLevelQuery, the parsing begins. And org.apache.solr.parser.QueryParser.jj is the grammar for a basic Solr/Lucene query, and org.apache.solr.parser.QueryParser.java is generated from it, and a lot of the logic is in the base class of the generated class, org.apache.solr.parser.SolrQueryParserBase.java. Good luck! Happy hunting! -- Jack Krupansky -Original Message- From: YouPeng Yang Sent: Monday, October 21, 2013 2:57 AM To: solr-user@lucene.apache.org Subject: Class name of parsing the fq clause Hi, I search Solr with an fq clause like: fq=BEGINTIME:[2013-08-25T16:00:00Z TO *] AND BUSID:(M3 OR M9) I am curious about the parsing process and want to study it. Which Java file describes the parsing of the fq clause? Thanks, Regards.
Re: Chinese language search in SOLR 3.6.1
Hi Rajani, The string field type is not analyzed. But that is not the case for the text_chinese field type, for which ChineseTokenizerFactory and ChineseFilterFactory are added for index and query analysis. Below, check the schema and how the fields are defined in my mail above. Thanks, Poornima

On Wednesday, 23 October 2013 7:21 AM, Rajani Maski rajinima...@gmail.com wrote: A string field will work for any case when you do an exact key search. text_chinese also should work if you are simply searching with the exact string 676767667. Well, the best way to find an answer to this query is by using the Solr analysis tool: http://localhost:8983/solr/#/collection1/analysis Enter your field type and the index-time input that you had given, with the query value that you are searching for. You should be able to find your answers.

On Tue, Oct 22, 2013 at 8:06 PM, Poornima Jay poornima...@rocketmail.com wrote: Hi Rajani, Below is what is configured in my schema.

<fieldType name="text_chinese" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.ChineseTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.ChineseFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.ChineseTokenizerFactory"/>
    <filter class="solr.ChineseFilterFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="product_code" type="string" indexed="true" stored="false" multiValued="true" />
<field name="author_name" type="text_chinese" indexed="true" stored="false" multiValued="true"/>
<field name="author_name_string" type="string" indexed="true" stored="false" multiValued="true" />
<field name="simple" type="text_chinese" indexed="true" stored="false" multiValued="true" />

<copyField source="product_code" dest="simple" />
<copyField source="author_name" dest="author_name_string" />

If I search with the
query q=simple:总评价 it works, but it doesn't work if I search with q=simple:676767667. If the field is defined as string, the Chinese characters work, but it doesn't work if the field is defined as text_chinese. Regards, Poornima

On Tuesday, 22 October 2013 7:52 PM, Rajani Maski rajinima...@gmail.com wrote: Hi Poornima, Your statement "It works fine with the Chinese strings but not working with product code or ISBN even though the fields are defined as string" is confusing. Did you mean that the product code and ISBN fields are of type text_chinese? Is it the first or the second:

<field name="product_code" type="string" indexed="true" stored="false"/>

or

<field name="product_code" type="text_chinese" indexed="true" stored="false"/>

What do you refer to when you say that it's not working? Unable to search?

On Tue, Oct 22, 2013 at 6:09 PM, Poornima Jay poornima...@rocketmail.com wrote: Hi, Did anyone face a problem with the Chinese language in Solr 3.6.1? Below is the analyzer in the schema.xml file.

<fieldType name="text_chinese" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.CJKTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.ChineseFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.CJKTokenizerFactory"/>
    <filter class="solr.ChineseFilterFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

It works fine with the Chinese strings but not with product code or ISBN even though the fields are defined as string. Please let me know how the Chinese schema should be configured. Thanks. Poornima
Query cache and group by queries
Hi, It seems that the query cache is not used at all for group queries. Can someone explain why this is?
Having two document sets in one index, separated by filter query.
Hi, I have two document sets, both having the same schema. One set is the larger reference set (let's say a few hundred thousand documents) and the smaller set is some user-generated content (a few hundred or thousand documents). In most cases I just want to search the larger reference set, but some functionality also works on both sets. Is the following assumption correct? I could add a field to the schema which defines what type a document is (large set or small set). Now I configure two search handlers. For one handler I add a filter query which filters on the just-defined type field. If I use this handler in my application, I should only see content from the large set, and it should be impossible to get results from the small set back. cheers, Achim
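Yes, that is the usual approach. A sketch of the two handlers in solrconfig.xml, assuming a made-up "doctype" field with values like "reference" (both the field name and handler names are illustrative); putting the fq under invariants rather than defaults means a client cannot override or drop it:

```xml
<!-- searches only the large reference set; the filter cannot be overridden -->
<requestHandler name="/reference" class="solr.SearchHandler">
  <lst name="invariants">
    <str name="fq">doctype:reference</str>
  </lst>
</requestHandler>

<!-- searches both sets: no doctype filter applied -->
<requestHandler name="/all" class="solr.SearchHandler"/>
```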
Re: Solr cloud weird behaviour
When you say missing files, do you mean the index segments are missing, or what? Are your document counts the same the night before and after? Is there any indexing going on? We need some more specifics if we're to help you. If you do have indexing going on, then you might be getting segment merges. The key is whether documents are disappearing from your index when you search. Best, Erick On Tue, Oct 22, 2013 at 11:11 AM, Andreas Weichhart a.weichh...@gmail.com wrote: Hi guys, I have had a bit of a problem for some time now and can't find any solution, nor have I any idea why this is happening :) I'm running a Solr cloud for my project at work: 3 ZKs, 4 Solrs, and I'm indexing through SolrJ because I'm indexing XML files -- parsing them first, creating Solr documents, and then putting them on the Solr. My problem scenario is as follows: I do a full import; I have everything indexed, committed and optimized -- at least that's what I do in the Java code, and the webapp shows me that every collection is committed/optimized. When I check the next day (I have a DB too where all my documents are referenced), I'm suddenly missing files on the Solr and I have no idea how they got lost :) Any idea how this could happen? cheers andi
Re: SolrCloud performance in VM environment
Be a bit careful here. 128G is lots of memory, you may encounter very long garbage collection pauses. Just be aware that this may be happening later. Best, Erick On Tue, Oct 22, 2013 at 5:04 PM, Tom Mortimer tom.m.f...@gmail.com wrote: Just tried it with no other changes than upping the RAM to 128GB total, and it's flying. I think that proves that RAM is good. =) Will implement suggested changes later, though. cheers, Tom On 22 October 2013 09:04, Tom Mortimer tom.m.f...@gmail.com wrote: Boogie, Shawn, Thanks for the replies. I'm going to try out some of your suggestions today. Although, without more RAM I'm not that optimistic.. Tom On 21 October 2013 18:40, Shawn Heisey s...@elyograg.org wrote: On 10/21/2013 9:48 AM, Tom Mortimer wrote: Hi everyone, I've been working on an installation recently which uses SolrCloud to index 45M documents into 8 shards on 2 VMs running 64-bit Ubuntu (with another 2 identical VMs set up for replicas). The reason we're using so many shards for a relatively small index is that there are complex filtering requirements at search time, to restrict users to items they are licensed to view. Initial tests demonstrated that multiple shards would be required. The total size of the index is about 140GB, and each VM has 16GB RAM (32GB total) and 4 CPU units. I know this is far under what would normally be recommended for an index of this size, and I'm working on persuading the customer to increase the RAM (basically, telling them it won't work otherwise.) Performance is currently pretty poor and I would expect more RAM to improve things. However, there are a couple of other oddities which concern me, Running multiple shards like you are, where each operating system is handling more than one shard, is only going to perform better if your query volume is low and you have lots of CPU cores. If your query volume is high or you only have 2-4 CPU cores on each VM, you might be better off with fewer shards or not sharded at all. 
The way that I read this is that you've got two physical machines with 32GB RAM, each running two VMs that have 16GB. Each VM houses 4 shards, or 70GB of index. There's a scenario that might be better if all of the following are true: 1) I'm right about how your hardware is provisioned. 2) You or the client owns the hardware. 3) You have an extremely low-end third machine available - a single CPU with 1GB of RAM would probably be enough. In this scenario, you run one Solr instance and one zookeeper instance on each of your two big machines, and use the third wimpy machine as a third zookeeper node. No virtualization. For the rest of my reply, I'm assuming that you haven't taken this step, but it will probably apply either way. The first is that I've been reindexing a fixed set of 500 docs to test indexing and commit performance (with soft commits within 60s). The time taken to complete a hard commit after this is longer than I'd expect, and highly variable - from 10s to 70s. This makes me wonder whether the SAN (which provides all the storage for these VMs and the customer's several other VMs) is being saturated periodically. I grabbed some iostat output on different occasions to (possibly) show the variability:

Device:    tps    Blk_read/s    Blk_wrtn/s    Blk_read    Blk_wrtn
sdb      64.50          0.00       2476.00           0        4952
...
sdb       8.90          0.00        348.00           0        6960
...
sdb       1.15          0.00         43.20           0         864

There are two likely possibilities for this. One or both of them might be in play. 1) Because the OS disk cache is small, not much of the index can be cached. This can result in a lot of disk I/O for a commit, slowing things way down. Increasing the size of the OS disk cache is really the only solution for that. 2) Cache autowarming, particularly the filter cache. In the cache statistics, you can see how long each cache took to warm up after the last searcher was opened. The solution for that is to reduce the autowarmCount values.
The other thing that confuses me is that after a Solr restart or hard commit, search times average about 1.2s under light load. After searching the same set of queries for 5-6 iterations this improves to 0.1s. However, in either case - cold or warm - iostat reports no device reads at all:

Device:    tps    Blk_read/s    Blk_wrtn/s    Blk_read    Blk_wrtn
sdb       0.40          0.00          8.00           0         160
...
sdb       0.30          0.00         10.40           0         104

(the writes are due to logging). This implies to me that the 'hot' blocks are being completely cached in RAM - so why the variation in search time and the number of iterations required to speed it up?
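The autowarmCount mentioned earlier in the thread lives on the cache definitions in solrconfig.xml; lowering it shortens the warm-up that runs every time a new searcher is opened after a commit. The values below are illustrative, not a recommendation:

```xml
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="16"/>
```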
Re: External Zookeeper and JBOSS
When you create the collection, you specify the number of shards you want. From there on, the data is stored in ZK; I don't think it shows up in your solr.xml file. Best, Erick On Tue, Oct 22, 2013 at 7:08 PM, Branham, Jeremy [HR] jeremy.d.bran...@sprint.com wrote: [collections] was empty until I used the correct zkcli script from the solr distribution. I uploaded the config - java -classpath .:/production/v8p/deploy/svc.war/WEB-INF/lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost localhost:2181 -confdir /data/v8p/solr/root/conf -confname defaultconfig Then ran the bootstrap - java -classpath .:/production/v8p/deploy/svc.war/WEB-INF/lib/* org.apache.solr.cloud.ZkCLI -cmd bootstrap -zkhost 127.0.0.1:2181 -solrhome /data/v8p/solr If I'm not mistaken, I don't need to link anything if the collection names are defined in the core element [solr.xml]. The cloud admin page shows each core now, but I'm curious how it knows how many shards I want to use... I think I missed that somewhere. Jeremy D. Branham Performance Technologist II Sprint University Performance Support Fort Worth, TX | Tel: **DOTNET http://JeremyBranham.Wordpress.com http://www.linkedin.com/in/jeremybranham -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Tuesday, October 22, 2013 3:57 AM To: solr-user@lucene.apache.org Subject: Re: External Zookeeper and JBOSS What happens if you look in collections? Best, Erick On Mon, Oct 21, 2013 at 9:55 PM, Shawn Heisey s...@elyograg.org wrote: On 10/21/2013 1:19 PM, Branham, Jeremy [HR] wrote: solr.xml [simplified by removing additional cores] <?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" sharedLib="lib" zkHost="192.168.1.101:2181">
  <cores adminPath="/admin/cores">
    <core schema="/data/v8p/solr/root/schema/schema.xml" instanceDir="/data/v8p/solr/root/" name="wdsp" dataDir="/data/v8p/solr/wdsp2/data/" />
    <core schema="/data/v8p/solr/root/schema/schema.xml" instanceDir="/data/v8p/solr/root/" name="wdsp2" dataDir="/data/v8p/solr/wdsp/data/" />
  </cores>
</solr>

These cores that you have listed here do not look like SolrCloud-related cores, because they do not reference a collection or a shard. Here's what I've got on a 4.2.1 box where all cores were automatically created by the CREATE action on the collections API:

<core schema="schema.xml" loadOnStartup="true" shard="shard1" instanceDir="eatatjoes_shard1_replica2/" transient="false" name="eatatjoes_shard1_replica2" config="solrconfig.xml" collection="eatatjoes"/>
<core schema="schema.xml" loadOnStartup="true" shard="shard1" instanceDir="test3_shard1_replica1/" transient="false" name="test3_shard1_replica1" config="solrconfig.xml" collection="test3"/>
<core schema="schema.xml" loadOnStartup="true" shard="shard1" instanceDir="smb2_shard1_replica1/" transient="false" name="smb2_shard1_replica1" config="solrconfig.xml" collection="smb2"/>

On the commandline script -- the zkCli.sh script comes with zookeeper, but it is not aware of anything having to do with SolrCloud. There is another script named zkcli.sh (note the lowercase C) that comes with the solr example (in example/cloud-scripts) - it's a very different script and will accept the options that you tried to give. I do wonder how much pain would be caused by renaming the Solr zkcli script so it's not so similar to the one that comes with Zookeeper. Thanks, Shawn

This e-mail may contain Sprint proprietary information intended for the sole use of the recipient(s). Any use by others is prohibited. If you are not the intended recipient, please contact the sender and delete all copies of the message.
Changing indexed property on a field from false to true
Being given

<field name="title" type="string" indexed="false" stored="true" multiValued="false" />

changed to

<field name="title" type="string" indexed="true" stored="true" multiValued="false" />

Once the above is done and the collection reloaded, is there a way I can build the index on that field without reindexing everything? Thank you! - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Changing-indexed-property-on-a-field-from-false-to-true-tp4097213.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: deleteByQuery does not work with SolrCloud
The first thing I'd do is go into the browser UI and make sure you can get hits on documents, something like blah/collection/select?q=indexname:shardTv_20131010 Best, Erick On Wed, Oct 23, 2013 at 8:20 AM, YouPeng Yang yypvsxf19870...@gmail.com wrote: Hi, I am using SolrCloud with Solr 4.4, and I tried the SolrJ deleteByQuery API to delete the index as:

CloudSolrServer cloudServer = new CloudSolrServer(myZKhost);
cloudServer.connect();
cloudServer.setDefaultCollection
cloudServer.deleteByQuery("indexname:shardTv_20131010");
cloudServer.commit();

It seems not to work. I have also done some googling; unfortunately there is no help. Did I miss anything? Thanks, Regards
Multiple facet fields in defaults section of a Request Handler
I define 2 facets - brand and category. Both have been configured in a request handler inside defaults. Now a client wants to use multi-select faceting. He calls the following API: http://localhost:8983/solr/collection1/search?q=*:*&facet.field={!ex=foo}category&fq={!tag=foo}category:cat What happens in DefaultSolrParams#getParams is that it picks up the facet field from the API and discards all the other facet fields defined in defaults. Thus the response does not facet on brand. If I put the facet definitions in invariants, then whatever is provided by the client will be discarded. Putting the facet definitions in appends causes it to facet on category twice. Is there a way where he does not have to provide all the facet.field parameters in the API call? -- Regards, Varun Thacker http://www.vthacker.in/
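For context, the setup Varun describes would look roughly like this in solrconfig.xml (the handler name is an assumption; the facet fields come from the message):

```xml
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="facet">true</str>
    <!-- both facet.field entries live in defaults -->
    <str name="facet.field">brand</str>
    <str name="facet.field">category</str>
  </lst>
</requestHandler>
```

Because defaults are consulted only for parameters entirely absent from the request, a single facet.field on the URL masks both configured entries at once, which is exactly the behavior described.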
RE: SOLR Cloud node link is wrong in the admin panel
It seems the parameters in solr.xml are being ignored. <?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" sharedLib="lib" zkHost="192.168.1.102:2181" host="localhost" hostPort="8080" hostContext="/svc/solr">
  <cores adminPath="/admin/cores">
    <core schema="schema.xml" instanceDir="root/" name="test" shard="shard1" collection="test" dataDir="/data/v8p/solr/test/data/"/>
  </cores>
</solr>
Jeremy D. Branham Performance Technologist II Sprint University Performance Support Fort Worth, TX http://JeremyBranham.Wordpress.com http://www.linkedin.com/in/jeremybranham -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Tuesday, October 22, 2013 3:14 PM To: solr-user@lucene.apache.org Subject: Re: SOLR Cloud node link is wrong in the admin panel On 10/22/2013 2:01 PM, Branham, Jeremy [HR] wrote: I'm thinking I might have a configuration problem... The SOLR Cloud node link is wrong in the admin panel. I am running Solr on port 8080 in JBoss, but the SOLR Cloud admin panel has links to http://192.168.1.123:8983/solr for example. Also the context should be svc instead of solr. Is this a configuration problem, or are there some hardcoded values? You're going to need to define the hostPort value in your solr.xml file. In the example solr.xml, this is set to the following string: ${jetty.port:8983} This means that it will use the java property jetty.port unless that's not defined, in which case it will use 8983. Just remove this from hostPort and put 8080 in there. http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params You might ask why Solr doesn't just figure out what port it is running on and store what it finds in the cloud state. The reason it doesn't do that is because it can't - a java webapp/servlet has no idea what port it's on until it actually receives a request, but it's not going to receive any requests until it's initialized, and by then it's too late to do anything useful with the information ... plus you need to send it a request.
This is one of the prime motivating factors behind the project's decision that Solr will no longer be a war in a future major version. Thanks, Shawn
Is Solr can create temporary sub-index ?
Dear Solr users, We have a new web project: connect our Solr database to a web platform. This web platform will be used by several users at the same time. They run requests against our Solr and can apply filters to the result. i.e.: our Solr contains 87M docs; a user runs a request, and the result is around a few hundred to several thousand docs. On the web platform, the user will see the first 20 results (or more by using a Next Page button), but he will also need to filter the whole result by additional terms (terms that our platform will propose to him). Can Solr create a temporary index (managed by Solr itself during a web session)? My goal is to avoid downloading the whole result to the local computer to provide filtering, and to avoid re-sending the same request several times with new criteria added. Many thanks for your comments, Regards, Bruno
Re: Changing indexed property on a field from false to true
The content needs to be re-indexed; the question is whether you can use the info in the index to do it rather than pushing fresh copies of the documents to the index. I've often wondered whether atomic updates could be used to handle this sort of thing. If all fields are stored, push a nominal update to cause the document to be re-indexed. I've never tried it though. I'd be curious to know if it works. Upayavira On Wed, Oct 23, 2013, at 02:25 PM, michael.boom wrote: Given <field name="title" type="string" indexed="false" stored="true" multiValued="false"/> changed to <field name="title" type="string" indexed="true" stored="true" multiValued="false"/> - once the above is done and the collection reloaded, is there a way I can build the index on that field without reindexing everything? Thank you! - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Changing-indexed-property-on-a-field-from-false-to-true-tp4097213.html Sent from the Solr - User mailing list archive at Nabble.com.
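Upayavira's idea - a nominal atomic update that rebuilds a document from its stored fields - would be sent as an update message along these lines (the id and title values are assumptions for illustration; every field must be stored for this to be lossless):

```xml
<add>
  <doc>
    <field name="id">1</field>
    <!-- update="set" makes this an atomic update: Solr re-reads the
         stored fields, applies the change, and re-indexes the doc -->
    <field name="title" update="set">BigApple</field>
  </doc>
</add>
```

Michael's later reply in this thread reports that exactly this approach worked in a 4.5.0 test.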
Solr document with high number of fields
Hi, I have done some research on Solr documents with a very high number of fields. In the mailing list archive there's a thread on this subject which answers my question: http://lucene.472066.n3.nabble.com/Dynamic-fields-performance-question-td476337.html. That post is a little old, though, and I would like to know whether it still applies to a recent Solr version (4.5)? Thanks
Reclaiming disk space from (large, optimized) segments
*Background:*
- Our use case is to use Solr as a massive FIFO queue.
- Document additions and updates happen continuously, at a sustained rate of 50 - 100 documents per second.
- About 50% of these documents are updates to existing docs, indexed using atomic updates: the original doc is thus deleted and re-added.
- There is a separate purge operation running every four hours that deletes the oldest docs, if required, based on a number of unrelated configuration parameters.
- At some time in the past, a manual force merge / optimize with maxSegments=2 was run to troubleshoot high disk I/O and remove "too many segments" as a potential variable. Currently, the largest fdt files are 74G and 43G. There are 47 total segments; the largest other sizes are all around 2G.
- Merge policies are all at Solr 4 defaults. Index size is currently ~50M maxDocs, ~35M numDocs, 276GB.
*Issue:* The background purge operation is deleting docs on schedule, but the disk space is not being recovered.
*Presumptions:* I presume, but have not confirmed (how?), that the 15M deleted documents are predominantly in the two large segments. Because they are largely in the two large segments, and those large segments still have (some/many) live documents, the segment backing files are not deleted.
*Questions:*
- When will those segments get merged and the space recovered? Does it happen when _all_ the documents in those segments are deleted? When some percentage of the segment is deleted documents?
- Is there a way to do it right now vs. just waiting?
- In some cases, the purge delete conditional is _just_ free disk space: when index > free space, delete oldest. Those setups are now in scenarios where index > free space, and getting worse. How does low disk space affect the above two questions?
- Is there a way for me to determine stats on a per-segment basis - for example, how many deleted documents are in a particular segment?
- On the flip side, can I determine in what segment a particular document is located?
Thank you, Scott -- Scott Lundgren Director of Engineering Carbon Black, Inc. (210) 204-0483 | scott.lundg...@carbonblack.com
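One existing knob worth noting here (not an answer given in the thread, and whether it touches two outsized segments depends on the merge policy's size limits, so treat it as a sketch): Solr's XML update syntax supports an explicit commit that asks the merge policy to merge away deletes immediately rather than waiting for natural merges:

```xml
<!-- POSTed to the /update handler; forces merging of segments
     containing deleted documents, without a full optimize -->
<commit expungeDeletes="true"/>
```

This is cheaper than optimize but can still trigger large rewrites, so it is best run during a quiet period.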
Re: Stop/Restart Solr
PPID is the parent process ID. You want to kill the PID, not the PPID. wunder On Oct 23, 2013, at 3:09 AM, Jeevanandam M. wrote: It seems process started recently. Is there any external cron/process triggering a startup of Solr? Kill again and monitor it. - Jeeva -- Original Message -- From: Raheel Hasan [mailto:raheelhasan@gmail.com] Sent: October 23, 2013 3:29:47 PM GMT+05:30 To: solr-user@lucene.apache.org Subject: Re: Stop/Restart Solr 31173 1 0 16:45 ?00:00:08 java -jar start.jar On Wed, Oct 23, 2013 at 2:53 PM, Jeevanandam M. je...@myjeeva.com wrote: Can you please share output of following command? ps -ef | grep 'start.jar' - Jeeva -- Original Message -- From: Raheel Hasan [mailto:raheelhasan@gmail.com] Sent: October 23, 2013 3:19:46 PM GMT+05:30 To: solr-user@lucene.apache.org Subject: Re: Stop/Restart Solr Kill -9 didnt kill it... ... the process is now again listed, but with PPID=1 which I dont want to kill as many processes have this same id... On Tue, Oct 22, 2013 at 11:59 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote: We use this to start/stop solr: Start: java -Dsolr.clustering.enabled=true -Dsolr.solr.home=multicore -Djetty.class.path=lib/ext/* -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar Stop: java -Dsolr.solr.home=multicore -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar --stop Thanks, -Utkarsh On Tue, Oct 22, 2013 at 10:09 AM, Raheel Hasan raheelhasan@gmail.com wrote: ok fantastic... thanks a lot guyz On Tue, Oct 22, 2013 at 10:00 PM, François Schiettecatte fschietteca...@gmail.com wrote: Yago has the right command to search for the process, that will get you the process ID specifically the first number on the output line, then do 'kill ###', if that fails 'kill -9 ###'. François On Oct 22, 2013, at 12:56 PM, Raheel Hasan raheelhasan@gmail.com wrote: its CentOS... and using jetty with solr here.. 
On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte fschietteca...@gmail.com wrote: A few more specifics about the environment would help, Windows/Linux/...? Jetty/Tomcat/...? François On Oct 22, 2013, at 12:50 PM, Yago Riveiro yago.rive...@gmail.com wrote: If you are asking about if solr has a way to restart himself, I think that the answer is no. If you lost control of the remote machine someone will need to go and restart the machine ... You can try use a kvm or other remote control system -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Tuesday, October 22, 2013 at 5:46 PM, François Schiettecatte wrote: If you are on linux/unix, use the kill command. François On Oct 22, 2013, at 12:42 PM, Raheel Hasan raheelhasan@gmail.com (mailto: raheelhasan@gmail.com) wrote: Hi, is there a way to stop/restart java? I lost control over it via SSH and connection was closed. But the Solr (start.jar) is still running. thanks. -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Thanks, -Utkarsh -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Walter Underwood wun...@wunderwood.org
Stemming and Synonyms in Apache Solr
We have written a blog post with our understanding and experiments on stemming and synonyms in Apache Solr: http://theunstructuredworld.blogspot.in/ We would appreciate it if users read it and post their valuable suggestions/comments. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Stemming-and-Synonyms-in-Apache-Solr-tp4097227.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Stop/Restart Solr
ok got it thanks :) On Wed, Oct 23, 2013 at 7:33 PM, Walter Underwood wun...@wunderwood.orgwrote: PPID is the parent process ID. You want to kill the PID, not the PPID. wunder On Oct 23, 2013, at 3:09 AM, Jeevanandam M. wrote: It seems process started recently. Is there any external cron/process triggering a startup of Solr? Kill again and monitor it. - Jeeva -- Original Message -- From: Raheel Hasan [mailto:raheelhasan@gmail.com] Sent: October 23, 2013 3:29:47 PM GMT+05:30 To: solr-user@lucene.apache.org Subject: Re: Stop/Restart Solr 31173 1 0 16:45 ?00:00:08 java -jar start.jar On Wed, Oct 23, 2013 at 2:53 PM, Jeevanandam M. je...@myjeeva.com wrote: Can you please share output of following command? ps -ef | grep 'start.jar' - Jeeva -- Original Message -- From: Raheel Hasan [mailto:raheelhasan@gmail.com] Sent: October 23, 2013 3:19:46 PM GMT+05:30 To: solr-user@lucene.apache.org Subject: Re: Stop/Restart Solr Kill -9 didnt kill it... ... the process is now again listed, but with PPID=1 which I dont want to kill as many processes have this same id... On Tue, Oct 22, 2013 at 11:59 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote: We use this to start/stop solr: Start: java -Dsolr.clustering.enabled=true -Dsolr.solr.home=multicore -Djetty.class.path=lib/ext/* -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar Stop: java -Dsolr.solr.home=multicore -Dbootstrap_conf=true -DnumShards=3 -DSTOP.PORT=8079 -DSTOP.KEY=some_value -jar start.jar --stop Thanks, -Utkarsh On Tue, Oct 22, 2013 at 10:09 AM, Raheel Hasan raheelhasan@gmail.com wrote: ok fantastic... thanks a lot guyz On Tue, Oct 22, 2013 at 10:00 PM, François Schiettecatte fschietteca...@gmail.com wrote: Yago has the right command to search for the process, that will get you the process ID specifically the first number on the output line, then do 'kill ###', if that fails 'kill -9 ###'. François On Oct 22, 2013, at 12:56 PM, Raheel Hasan raheelhasan@gmail.com wrote: its CentOS... 
and using jetty with solr here.. On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte fschietteca...@gmail.com wrote: A few more specifics about the environment would help, Windows/Linux/...? Jetty/Tomcat/...? François On Oct 22, 2013, at 12:50 PM, Yago Riveiro yago.rive...@gmail.com wrote: If you are asking about if solr has a way to restart himself, I think that the answer is no. If you lost control of the remote machine someone will need to go and restart the machine ... You can try use a kvm or other remote control system -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Tuesday, October 22, 2013 at 5:46 PM, François Schiettecatte wrote: If you are on linux/unix, use the kill command. François On Oct 22, 2013, at 12:42 PM, Raheel Hasan raheelhasan@gmail.com (mailto: raheelhasan@gmail.com) wrote: Hi, is there a way to stop/restart java? I lost control over it via SSH and connection was closed. But the Solr (start.jar) is still running. thanks. -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Thanks, -Utkarsh -- Regards, Raheel Hasan -- Regards, Raheel Hasan -- Walter Underwood wun...@wunderwood.org -- Regards, Raheel Hasan
Re: Is Solr can create temporary sub-index ?
Hi Bruno, Have you looked into Solr's facet support? If I'm reading your post correctly, this sounds like the classic case for facets. Each time the user selects a facet, you add a filter query (fq clause) to the original query. http://wiki.apache.org/solr/SolrFacetingOverview Tim On Wed, Oct 23, 2013 at 8:16 AM, Bruno Mannina bmann...@free.fr wrote: Dear Solr User, We have to do a new web project which is : Connect our SOLR database to a web plateform. This Web Plateform will be used by several users at the same time. They do requests on our SOLR and they can apply filter on the result. i.e.: Our SOLR contains 87M docs An user do requests, result is around few hundreds to several thousands. On the Web Plateform, user will see first 20 results (or more by using Next Page button) But he will need also to filter the whole result by additional terms. (Terms that our plateform will propose him) Is SOLR can create temporary index (manage by SOLR himself during a web session) ? My goal is to not download the whole result on local computer to provide filter, or to re-send the same request several times added to the new criterias. Many thanks for your comment, Regards, Bruno
Re: Changing indexed property on a field from false to true
I've made a test based on your suggestion. Using the example in 4.5.0 I set the title field as indexed="false" and indexed a couple of docs:
<add>
  <doc>
    <field name="id">1</field>
    <field name="title" update="set">BigApple</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="title" update="set">SmallApple</field>
  </doc>
</add>
and queried fq=title:BigApple. No docs were returned, of course. Then I modified the schema, setting indexed="true" for the title field, and restarted Solr. Following that I posted a document update:
<add>
  <doc>
    <field name="id">1</field>
    <field name="title" update="set">BigApple</field>
  </doc>
</add>
Afterwards I ran the same query fq=title:BigApple and the document was returned. So at first look an atomic update can do the trick, unless I was doing something wrong. - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Changing-indexed-property-on-a-field-from-false-to-true-tp4097213p4097233.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Having two document sets in one index, separated by filter query.
Sounds correct - you probably want to use an invariant parameter in solrconfig.xml, something along the lines of:
<lst name="invariants">
  <str name="fq">docset:0</str>
</lst>
where docset is the new field you add to the schema to determine which set a document belongs to. You might also consider adding a newSearcher warming query that includes this fq so that the filter gets cached every time you open a new searcher. On Wed, Oct 23, 2013 at 7:09 AM, Achim Domma do...@procoders.net wrote: Hi, I have two document sets, both having the same schema. One set is the larger reference set (let's say a few hundred thousand documents) and the smaller set is some user-generated content (a few hundred or thousand). In most cases, I just want to search the larger reference set, but some functionality also works on both sets. Is the following assumption correct? I could add a field to the schema which defines what type a document is (large set or small set). Now I configure two search handlers. For one handler I add a filter query, which filters on the just-defined type field. If I use this handler in my application, I should only see content from the large set and it should be impossible to get results from the small set back. cheers, Achim
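Putting the two pieces of advice together, a sketch of the reference-set handler plus the warming hook (the handler name, the docset field, and its value are assumptions drawn from the thread, not settled names):

```xml
<!-- handler locked to the large reference set -->
<requestHandler name="/reference" class="solr.SearchHandler">
  <lst name="invariants">
    <!-- invariants cannot be overridden by request parameters -->
    <str name="fq">docset:0</str>
  </lst>
</requestHandler>

<!-- pre-populate the filter cache whenever a new searcher opens -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">*:*</str><str name="fq">docset:0</str></lst>
  </arr>
</listener>
```

The warming query matters because the invariant fq is hit by every request to this handler; a cold filter cache after each commit would otherwise make the first query pay for a full-index bitset build.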
Re: shards.tolerant throwing null pointer exception when spellcheck is on
Thanks for the information. I think it would be good to have this issue fixed, especially for cases where the spellcheck feature is on. I'll check out the source code and take a look; even quickly suppressing the null pointer exception might make a difference. -- View this message in context: http://lucene.472066.n3.nabble.com/shards-tolerant-throwing-null-pointer-exception-when-spellcheck-is-on-tp4097133p4097234.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Minor bug with CloudSolrServer and collection-alias.
On 10/23/2013 3:59 AM, Thomas Egense wrote: Using cloudSolrServer.setDefaultCollection(collectionId) does not work as intended for an alias spanning more than 1 collection. The virtual collection-alias collectionId is recognized as an existing collection, but it only queries one of the collections it is mapped to. You can confirm this easily in AliasIntegrationTest. The test class AliasIntegrationTest creates two collections with 2 and 3 different documents, and then creates an alias pointing to both of them. Line 153: // search with new cloud client CloudSolrServer cloudSolrServer = new CloudSolrServer(zkServer.getZkAddress(), random().nextBoolean()); cloudSolrServer.setParallelUpdates(random().nextBoolean()); query = new SolrQuery("*:*"); query.set("collection", "testalias"); res = cloudSolrServer.query(query); cloudSolrServer.shutdown(); assertEquals(5, res.getResults().getNumFound()); No unit-test bug here; however, if you change it to set the collection id not on the query but on CloudSolrServer instead, it will produce the bug: // search with new cloud client CloudSolrServer cloudSolrServer = new CloudSolrServer(zkServer.getZkAddress(), random().nextBoolean()); cloudSolrServer.setDefaultCollection("testalias"); cloudSolrServer.setParallelUpdates(random().nextBoolean()); query = new SolrQuery("*:*"); //query.set("collection", "testalias"); res = cloudSolrServer.query(query); cloudSolrServer.shutdown(); assertEquals(5, res.getResults().getNumFound()); -- Assertion failure. Should I create a Jira issue for this? Thomas, I have confirmed this with the following test patch, which adds to the test rather than changing what's already there: http://apaste.info/9ke5 I'm about to head off to the train station to start my commute, so I will be unavailable for a little while. If you haven't gotten the jira filed by the time I get to another computer, I will create it. Thanks, Shawn
Re: SOLR Cloud node link is wrong in the admin panel
On 10/23/2013 7:50 AM, Branham, Jeremy [HR] wrote: It seems the parameters in solr.xml are being ignored. <?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" sharedLib="lib" zkHost="192.168.1.102:2181" host="localhost" hostPort="8080" hostContext="/svc/solr">
  <cores adminPath="/admin/cores">
    <core schema="schema.xml" instanceDir="root/" name="test" shard="shard1" collection="test" dataDir="/data/v8p/solr/test/data/"/>
  </cores>
</solr>
Did you restart Solr (actually your container - jetty, tomcat, etc.) after making that change? You'll need to make the change on all your Solr instances and restart them all. Thanks, Shawn
Re: Is Solr can create temporary sub-index ?
Hello Tim, Yes, Solr's facets could be a solution, but I need to re-send the q= each time. I was just wondering whether another solution exists. Facets seem to be the right solution. Bruno Le 23/10/2013 17:03, Timothy Potter a écrit : Hi Bruno, Have you looked into Solr's facet support? If I'm reading your post correctly, this sounds like the classic case for facets. Each time the user selects a facet, you add a filter query (fq clause) to the original query. http://wiki.apache.org/solr/SolrFacetingOverview Tim On Wed, Oct 23, 2013 at 8:16 AM, Bruno Mannina bmann...@free.fr wrote: Dear Solr User, We have to do a new web project which is : Connect our SOLR database to a web plateform. This Web Plateform will be used by several users at the same time. They do requests on our SOLR and they can apply filter on the result. i.e.: Our SOLR contains 87M docs An user do requests, result is around few hundreds to several thousands. On the Web Plateform, user will see first 20 results (or more by using Next Page button) But he will need also to filter the whole result by additional terms. (Terms that our plateform will propose him) Is SOLR can create temporary index (manage by SOLR himself during a web session) ? My goal is to not download the whole result on local computer to provide filter, or to re-send the same request several times added to the new criterias. Many thanks for your comment, Regards, Bruno
SV: fq with { or } in Solr 4.3.1
Thanks. The data for the catid comes from another system, and is actually a string with a leading { and a trailing }. I was confused that it works in a q parameter but not fq. I think the easiest thing for me is simply to strip the leading and trailing characters when I feed the index. Thanks Fra: Jack Krupansky j...@basetechnology.com Sendt: 23. oktober 2013 12:59 Til: solr-user@lucene.apache.org Emne: Re: fq with { or } in Solr 4.3.1 Are you using the edismax query parser? It traps the syntax error and then escapes or ignores special characters. Curly braces are used for exclusive range queries (square brackets are inclusive ranges). The proper syntax is {term1 TO term2}. So, what were your intentions with catid:{123}? If you are simply trying to pass the braces as literal characters for a string field, either escape them with backslash or enclose the entire term in quotes: catid:\{123\} catid:"{123}" -- Jack Krupansky -Original Message- From: Peter Kirk Sent: Wednesday, October 23, 2013 4:57 AM To: solr-user@lucene.apache.org Subject: RE: fq with { or } in Solr 4.3.1 Sorry, that was just a typo. /search?q=*:*&fq=catid:{123} Gives me the error. I think that { and } must be used in ranges for fq, and that's why I can't use them directly like this. /Peter -Original Message- From: Upayavira [mailto:u...@odoko.co.uk] Sent: 23. oktober 2013 10:52 To: solr-user@lucene.apache.org Subject: Re: fq with { or } in Solr 4.3.1 Missing a colon before the curly bracket in the fq? On Wed, Oct 23, 2013, at 09:42 AM, Peter Kirk wrote: Hi If I do a search like /search?q=catid:{123} I get the results I expect. But if I do /search?q=*:*&fq=catid{123} I get an error from Solr like: org.apache.solr.search.SyntaxError: Cannot parse 'catid:{123}': Encountered } } at line 1, column 58. Was expecting one of: TO ... RANGE_QUOTED ... RANGE_GOOP ... Can I not use { or } in an fq? Thanks, Peter
Re: Is Solr can create temporary sub-index ?
Yes, absolutely you resend the q= each time, optionally with any facets selected by the user using fq= On Wed, Oct 23, 2013 at 10:00 AM, Bruno Mannina bmann...@free.fr wrote: Hello Tim, Yes solr's facet could be a solution, but I need to re-send the q= each time. I'm asking me just if an another solution exists. Facet seems to be the good solution. Bruno Le 23/10/2013 17:03, Timothy Potter a écrit : Hi Bruno, Have you looked into Solr's facet support? If I'm reading your post correctly, this sounds like the classic case for facets. Each time the user selects a facet, you add a filter query (fq clause) to the original query. http://wiki.apache.org/solr/SolrFacetingOverview Tim On Wed, Oct 23, 2013 at 8:16 AM, Bruno Mannina bmann...@free.fr wrote: Dear Solr User, We have to do a new web project which is : Connect our SOLR database to a web plateform. This Web Plateform will be used by several users at the same time. They do requests on our SOLR and they can apply filter on the result. i.e.: Our SOLR contains 87M docs An user do requests, result is around few hundreds to several thousands. On the Web Plateform, user will see first 20 results (or more by using Next Page button) But he will need also to filter the whole result by additional terms. (Terms that our plateform will propose him) Is SOLR can create temporary index (manage by SOLR himself during a web session) ? My goal is to not download the whole result on local computer to provide filter, or to re-send the same request several times added to the new criterias. Many thanks for your comment, Regards, Bruno
Re: Is Solr can create temporary sub-index ?
I have a little question concerning statistics on a request. I have a field defined like this:
<field name="ic" type="text_classification" indexed="true" stored="true" multiValued="true"/>
<fieldType name="text_classification" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Data sample for this field:
<arr name="ic">
  <str>A23L1/22066</str>
  <str>A23L1/227</str>
  <str>A23L1/231</str>
  <str>A23L1/2375</str>
</arr>
My question is: is it possible to get the frequency of terms over the whole result of the initial user's request? Thanks a lot, Bruno Le 23/10/2013 18:12, Timothy Potter a écrit : Yes, absolutely you resend the q= each time, optionally with any facets selected by the user using fq= On Wed, Oct 23, 2013 at 10:00 AM, Bruno Mannina bmann...@free.fr wrote: Hello Tim, Yes solr's facet could be a solution, but I need to re-send the q= each time. I'm asking me just if an another solution exists. Facet seems to be the good solution. Bruno Le 23/10/2013 17:03, Timothy Potter a écrit : Hi Bruno, Have you looked into Solr's facet support? If I'm reading your post correctly, this sounds like the classic case for facets. Each time the user selects a facet, you add a filter query (fq clause) to the original query.
http://wiki.apache.org/solr/SolrFacetingOverview Tim On Wed, Oct 23, 2013 at 8:16 AM, Bruno Mannina bmann...@free.fr wrote: Dear Solr User, We have to do a new web project which is : Connect our SOLR database to a web plateform. This Web Plateform will be used by several users at the same time. They do requests on our SOLR and they can apply filter on the result. i.e.: Our SOLR contains 87M docs An user do requests, result is around few hundreds to several thousands. On the Web Plateform, user will see first 20 results (or more by using Next Page button) But he will need also to filter the whole result by additional terms. (Terms that our plateform will propose him) Is SOLR can create temporary index (manage by SOLR himself during a web session) ? My goal is to not download the whole result on local computer to provide filter, or to re-send the same request several times added to the new criterias. Many thanks for your comment, Regards, Bruno
Re: Is Solr can create temporary sub-index ?
Hum, I think my fieldType text_classification is not appropriate for this kind of data... I don't need stopwords, synonyms, etc. IC is a field that contains codes, and the codes often contain the char '/', and if I use the Terms component, I get: lst name=ic ... int name=004563254/int int name=003763554/int int name=002263254/int ... Le 23/10/2013 18:51, Bruno Mannina a écrit :
<fieldType name="text_classification" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
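Following Bruno's own observation that the analysis chain is splitting his codes, a sketch of an alternative schema line (an assumption, not something settled in the thread): classification codes like A23L1/22066 are usually indexed unanalyzed, so the Terms component would return whole codes instead of numeric fragments:

```xml
<!-- string type: no tokenization, the full code is one term -->
<field name="ic" type="string" indexed="true" stored="true" multiValued="true"/>
```

The trade-off is that matching then becomes exact (or prefix-based), since lowercasing and tokenized search on parts of the code no longer apply.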
Re: Minor bug with CloudSolrServer and collection-alias.
I filed https://issues.apache.org/jira/browse/SOLR-5380 and just committed a fix. - Mark On Oct 23, 2013, at 11:15 AM, Shawn Heisey s...@elyograg.org wrote: On 10/23/2013 3:59 AM, Thomas Egense wrote: Using cloudSolrServer.setDefaultCollection(collectionId) does not work as intended for an alias spanning more than 1 collection. The virtual collection-alias collectionId is recognized as an existing collection, but it only queries one of the collections it is mapped to. You can confirm this easily in AliasIntegrationTest. The test class AliasIntegrationTest creates two collections with 2 and 3 different documents, and then creates an alias pointing to both of them. Line 153: // search with new cloud client CloudSolrServer cloudSolrServer = new CloudSolrServer(zkServer.getZkAddress(), random().nextBoolean()); cloudSolrServer.setParallelUpdates(random().nextBoolean()); query = new SolrQuery("*:*"); query.set("collection", "testalias"); res = cloudSolrServer.query(query); cloudSolrServer.shutdown(); assertEquals(5, res.getResults().getNumFound()); No unit-test bug here; however, if you change it to set the collection id not on the query but on CloudSolrServer instead, it will produce the bug: // search with new cloud client CloudSolrServer cloudSolrServer = new CloudSolrServer(zkServer.getZkAddress(), random().nextBoolean()); cloudSolrServer.setDefaultCollection("testalias"); cloudSolrServer.setParallelUpdates(random().nextBoolean()); query = new SolrQuery("*:*"); //query.set("collection", "testalias"); res = cloudSolrServer.query(query); cloudSolrServer.shutdown(); assertEquals(5, res.getResults().getNumFound()); -- Assertion failure. Should I create a Jira issue for this? Thomas, I have confirmed this with the following test patch, which adds to the test rather than changing what's already there: http://apaste.info/9ke5 I'm about to head off to the train station to start my commute, so I will be unavailable for a little while. If you haven't gotten the jira filed by the time I get to another computer, I will create it. Thanks, Shawn
Spellcheck with Distributed Search (sharding).
Hello! I've been trying to enable spellchecking using sharding following the steps from the wiki, but I failed, :-( What I do is:

*Solrconfig.xml*

<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">suggestion</str>
    <str name="buildOnOptimize">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">suggestion</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="last-components">
    <str>suggest</str>
  </arr>
</requestHandler>

*Note:* I have two shards (solr1 and solr2) and both have the same solrconfig.xml. Also, both indexes were optimized to create the spellchecker indexes.

*Query*

solr1:8080/events/data/select?q=m&qt=/suggestion&shards.qt=/suggestion&wt=json&shards=solr1:8080/events/data,solr2:8080/events/data

*Response*

{
  "responseHeader": {
    "status": 404,
    "QTime": 12,
    "params": {
      "shards": "solr1:8080/events/data,solr2:8080/events/data",
      "shards.qt": "/suggestion",
      "q": "m",
      "wt": "json",
      "qt": "/suggestion"
    }
  },
  "error": {
    "msg": "Server at http://solr1:8080/events/data returned non ok status:404, message:Not Found",
    "code": 404
  }
}

More query syntaxes that I used and that don't work:

http://solr1:8080/events/data/select?q=m&qt=suggestion&shards.qt=/suggestion&wt=json&shards=solr1:8080/events/data,solr2:8080/events/data

http://solr1:8080/events/data/select?q=*:*&spellcheck.q=m&qt=suggestion&shards.qt=/suggestion&wt=json&shards=solr1:8080/events/data,solr2:8080/events/data

Any idea of what I'm doing wrong? Thank you very much in advance! Best regards, -- - Luis Cappa
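Judging only from the configuration quoted above, the handler is registered as /suggest, while the failing requests address /suggestion. A request using the registered handler name would presumably look like this (an observation from the quoted config, not a confirmed fix):

```
http://solr1:8080/events/data/select?q=m&qt=/suggest&shards.qt=/suggest&wt=json&shards=solr1:8080/events/data,solr2:8080/events/data
```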
Re: Solr Cloud Distributed IDF
I am indexing documents using the domin:id format ex id = k-690kohler!670614 This ensures that all k-690kohler documents are indexed to the same shard. This does cause numDocs that are not perfectly distributed across shards probably even worse than the default sharding algorithm. Here is the search on Solr Cloud http://solrsolr/productindex/productQuery?q=categories_82_is:108996bf=linear(popularity_82_i,1,2)^3debugQuery=true And on Solr 3.6 http://solr-2-build.sys.id.build.com:8080/solr-build/select?q.alt=categoryId:108996qt=dismaxbf=linear(popularity,1,2)^3debugQuery=truefl=id,productID,manufacturer Here is the debug output from Solr Cloud lst name=explain str name=921rusticware!1210842 48481.992 = (MATCH) sum of: 4.7323933 = (MATCH) weight(categories_82_is:`#8;#0;#6;SD in 248779) [DefaultSimilarity], result of: 4.7323933 = score(doc=248779,freq=1.0 = termFreq=1.0 ), product of: 0.8745785 = queryWeight, product of: 5.411056 = idf(docFreq=3181, maxDocs=262058) 0.16162805 = queryNorm 5.411056 = fieldWeight in 248779, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.411056 = idf(docFreq=3181, maxDocs=262058) 1.0 = fieldNorm(doc=248779) 48477.26 = (MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of: 99977.0 = 1.0*float(int(popularity_82_i)=99975)+2.0 3.0 = boost 0.16162805 = queryNorm /str str name=4706baldwin!1223898 48380.168 = (MATCH) sum of: 4.7323933 = (MATCH) weight(categories_82_is:`#8;#0;#6;SD in 67238) [DefaultSimilarity], result of: 4.7323933 = score(doc=67238,freq=1.0 = termFreq=1.0 ), product of: 0.8745785 = queryWeight, product of: 5.411056 = idf(docFreq=3181, maxDocs=262058) 0.16162805 = queryNorm 5.411056 = fieldWeight in 67238, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.411056 = idf(docFreq=3181, maxDocs=262058) 1.0 = fieldNorm(doc=67238) 48375.438 = (MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of: 99767.0 = 1.0*float(int(popularity_82_i)=99765)+2.0 3.0 = boost 
0.16162805 = queryNorm /str str name=yb5405moen!1748274 48278.34 = (MATCH) sum of: 4.7323933 = (MATCH) weight(categories_82_is:`#8;#0;#6;SD in 123982) [DefaultSimilarity], result of: 4.7323933 = score(doc=123982,freq=1.0 = termFreq=1.0 ), product of: 0.8745785 = queryWeight, product of: 5.411056 = idf(docFreq=3181, maxDocs=262058) 0.16162805 = queryNorm 5.411056 = fieldWeight in 123982, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.411056 = idf(docFreq=3181, maxDocs=262058) 1.0 = fieldNorm(doc=123982) 48273.61 = (MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of: 99557.0 = 1.0*float(int(popularity_82_i)=99555)+2.0 3.0 = boost 0.16162805 = queryNorm /str str name=bp53005amerock!1721790 48262.008 = (MATCH) sum of: 4.7675867 = (MATCH) weight(categories_82_is:`#8;#0;#6;SD in 108146) [DefaultSimilarity], result of: 4.7675867 = score(doc=108146,freq=1.0 = termFreq=1.0 ), product of: 0.8758082 = queryWeight, product of: 5.4436426 = idf(docFreq=3131, maxDocs=266484) 0.16088642 = queryNorm 5.4436426 = fieldWeight in 108146, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.4436426 = idf(docFreq=3131, maxDocs=266484) 1.0 = fieldNorm(doc=108146) 48257.24 = (MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of: 99982.0 = 1.0*float(int(popularity_82_i)=99980)+2.0 3.0 = boost 0.16088642 = queryNorm /str str name=bp29340amerock!1721865 48208.918 = (MATCH) sum of: 4.7675867 = (MATCH) weight(categories_82_is:`#8;#0;#6;SD in 108031) [DefaultSimilarity], result of: 4.7675867 = score(doc=108031,freq=1.0 = termFreq=1.0 ), product of: 0.8758082 = queryWeight, product of: 5.4436426 = idf(docFreq=3131, maxDocs=266484) 0.16088642 = queryNorm 5.4436426 = fieldWeight in 108031, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.4436426 = idf(docFreq=3131, maxDocs=266484) 1.0 = fieldNorm(doc=108031) 48204.15 = (MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of: 99872.0 = 
1.0*float(int(popularity_82_i)=99870)+2.0 3.0 = boost 0.16088642 = queryNorm /str str name=bp53001amerock!1314101 48176.516 = (MATCH) sum of: 4.7323933 = (MATCH) weight(categories_82_is:`#8;#0;#6;SD in 47622) [DefaultSimilarity], result of: 4.7323933 = score(doc=47622,freq=1.0 = termFreq=1.0 ), product of: 0.8745785 = queryWeight, product of: 5.411056 = idf(docFreq=3181, maxDocs=262058) 0.16162805 = queryNorm 5.411056 = fieldWeight in 47622, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 5.411056 = idf(docFreq=3181, maxDocs=262058) 1.0 = fieldNorm(doc=47622) 48171.785 = (MATCH) FunctionQuery(1.0*float(int(popularity_82_i))+2.0), product of: 99347.0 = 1.0*float(int(popularity_82_i)=99345)+2.0 3.0 = boost 0.16162805 = queryNorm /str And here is the debug output from Solr 3.6 lst name=explain str name=bp53005amerock 15421.395 = (MATCH) sum of: 1.6594616 = (MATCH) weight(categoryId:`#8;#0;#6;SD in 45538), product of: 0.29207912 = queryWeight(categoryId:`#8;#0;#6;SD), product of:
RE: Facet performance
On Tue, October 22, 2013 5:23 PM Michael Lemke wrote: On Tue, October 22, 2013 9:23 AM Toke Eskildsen wrote: On Mon, 2013-10-21 at 16:57 +0200, Lemke, Michael SZ/HZA-ZSW wrote: QTime fc: never returns, webserver restarts itself after 30 min with 100% CPU load

It might be because it dies due to garbage collection. But since more memory (as your test server presumably has) just leads to the too-many-values error, there isn't much to do. Essentially, fc is out then.

QTime=41205 facet.prefix=q=frequent_word numFound=44532

Same query repeated:
QTime=225810 facet.prefix=q=ottomotor numFound=909
QTime=199839 facet.prefix=q=ottomotor numFound=909

I am stumped on this, sorry. I do not understand why the 'ottomotor' query can take 5 times as long as the 'frequent_word' one.

I looked into this some more this morning. I noticed the java process was doing a lot of I/O as shown in Process Explorer. For the frequent_word it read about 180 MB; for ottomotor it was about seven times as much, ~1,200 MB.

Got another observation today. The response time for q=ottomotor depends on facet.limit:

QTime=59300 facet.limit=2
QTime=69395 facet.limit=4
QTime=85208 facet.limit=6
QTime=158150 facet.limit=8
QTime=186276 facet.limit=10
QTime=231763 facet.limit=15
QTime=260437 facet.limit=20
QTime=312268 facet.limit=30

For q=frequent_word the effect is much less pronounced and shows only for facet.limit >= 15:

QTime=0 facet.limit=0
QTime=20535 facet.limit=1
QTime=13456 facet.limit=2
QTime=13925 facet.limit=4
QTime=13705 facet.limit=6
QTime=13924 facet.limit=8
QTime=13799 facet.limit=10
QTime=14361 facet.limit=15
QTime=14704 facet.limit=20
QTime=15189 facet.limit=30
QTime=16783 facet.limit=50
QTime=57128 facet.limit=500

It looks to me like, to collect enough facets to fulfill the limit constraint, Solr has to read much more of the index in the case of the infrequent word. jconsole didn't show anything unusual according to our more experienced Java experts here. Nor was the machine swapping.
Is it possible to screw up an index such that this sort of faceting leads to constant reading of the index? Something like full table scans in a db? Michael
Re: Is Solr can create temporary sub-index ?
I need your help to define the right fieldType, please. This field must be indexed and stored, and each value must be considered as one term. The char / must not be treated as a separator. Could String be a good fieldType? Thanks.

On 23/10/2013 18:51, Bruno Mannina wrote:

<arr name="ic">
  <str>A23L1/22066</str>
  <str>A23L1/227</str>
  <str>A23L1/231</str>
  <str>A23L1/2375</str>
</arr>
New query-time multi-word synonym expander
Hi, Heads up that there is new query-time multi-word synonym expander patch in https://issues.apache.org/jira/browse/SOLR-5379 This worked for our customer and we hope it works for others. Any feedback would be greatly appreciated. Thanks, Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com
Re: Spellcheck with Distributed Search (sharding).
More info: When executing the query against a single Solr server it works:

http://solr1:8080/events/data/suggest?q=m&wt=json

{
  "responseHeader": {
    "status": 0,
    "QTime": 1
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "docs": []
  },
  "spellcheck": {
    "suggestions": [
      "m",
      {
        "numFound": 4,
        "startOffset": 0,
        "endOffset": 1,
        "suggestion": [
          "marca",
          "marcacom",
          "mis",
          "mispelotas"
        ]
      }
    ]
  }
}

But when choosing the request handler this way it doesn't:

http://solr1:8080/events/data/select?qt=/suggest&wt=json&q=*:*

2013/10/23 Luis Cappa Banda luisca...@gmail.com Hello! I've been trying to enable spellchecking using sharding following the steps from the wiki, but I failed, :-( What I do is:

*Solrconfig.xml*

<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">suggestion</str>
    <str name="buildOnOptimize">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">suggestion</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="last-components">
    <str>suggest</str>
  </arr>
</requestHandler>

*Note:* I have two shards (solr1 and solr2) and both have the same solrconfig.xml. Also, both indexes were optimized to create the spellchecker indexes.
*Query*

solr1:8080/events/data/select?q=m&qt=/suggestion&shards.qt=/suggestion&wt=json&shards=solr1:8080/events/data,solr2:8080/events/data

*Response*

{
  "responseHeader": {
    "status": 404,
    "QTime": 12,
    "params": {
      "shards": "solr1:8080/events/data,solr2:8080/events/data",
      "shards.qt": "/suggestion",
      "q": "m",
      "wt": "json",
      "qt": "/suggestion"
    }
  },
  "error": {
    "msg": "Server at http://solr1:8080/events/data returned non ok status:404, message:Not Found",
    "code": 404
  }
}

More query syntaxes that I used and that don't work:

http://solr1:8080/events/data/select?q=m&qt=suggestion&shards.qt=/suggestion&wt=json&shards=solr1:8080/events/data,solr2:8080/events/data

http://solr1:8080/events/data/select?q=*:*&spellcheck.q=m&qt=suggestion&shards.qt=/suggestion&wt=json&shards=solr1:8080/events/data,solr2:8080/events/data

Any idea of what I'm doing wrong? Thank you very much in advance! Best regards, -- - Luis Cappa -- - Luis Cappa
Re: Issue with large html indexing
Attachments and images are often eaten by the mail server; your image is not visible, at least to me. Can you describe what you're seeing? Or post the image somewhere and provide a link? Best, Erick

On Wed, Oct 23, 2013 at 11:07 AM, Raheel Hasan raheelhasan@gmail.com wrote: Hi, I have an issue here while indexing large HTML. Here is the configuration for that:

1) Data is imported via URLDataSource / PlainTextEntityProcessor (DIH)
2) Schema has this for the field: type=text_en_splitting indexed=true stored=false required=false
3) text_en_splitting does the following work for indexing:

HTMLStripCharFilterFactory
WhitespaceTokenizerFactory (create tokens)
StopFilterFactory
WordDelimiterFilterFactory
ICUFoldingFilterFactory
PorterStemFilterFactory
RemoveDuplicatesTokenFilterFactory
LengthFilterFactory

However, the indexed data is like this (as in the attached image): [image: Inline image 1] so what are these numbers? If I put small HTML, it works fine, but as the size of the HTML file increases, this is what happens. -- Regards, Raheel Hasan
Re: New query-time multi-word synonym expander
Otis, could you provide a little (well, maybe a lot!) of discussion and detailed examples that illustrate what the patch can and can't handle? I mean, I read the Jira and it is simultaneously promising and a bit vague. Does it fully solve the issue, or is it yet another partial solution? Either way, it may be reasonably satisfactory, but some clarity would help. Thanks! -- Jack Krupansky -Original Message- From: Otis Gospodnetic Sent: Wednesday, October 23, 2013 1:28 PM To: solr-user@lucene.apache.org Subject: New query-time multi-word synonym expander Hi, Heads up that there is a new query-time multi-word synonym expander patch in https://issues.apache.org/jira/browse/SOLR-5379 This worked for our customer and we hope it works for others. Any feedback would be greatly appreciated. Thanks, Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com
RE: New query-time multi-word synonym expander
Nice, but now we got three multi-word synonym parsers? Didn't the LUCENE-4499 or SOLR-4381 patches work? I know the latter has had a reasonable amount of users and committers on github, but it was never brought back to ASF it seems. -Original message- From:Otis Gospodnetic otis.gospodne...@gmail.com Sent: Wednesday 23rd October 2013 18:54 To: solr-user@lucene.apache.org Subject: New query-time multi-word synonym expander Hi, Heads up that there is new query-time multi-word synonym expander patch in https://issues.apache.org/jira/browse/SOLR-5379 This worked for our customer and we hope it works for others. Any feedback would be greatly appreciated. Thanks, Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com
Re: New shard leaders or existing shard replicas depends on zookeeper?
My first impulse would be to ask how you created the collection. It sure _sounds_ like you didn't specify 24 shards and thus have only a single shard, one leader and 23 replicas bq: ...to point to the zookeeper ensemble also used for the ukdomain collection... so my guess is that this ZK ensemble has the ldwa collection defined as having only one shard I admit I pretty much skimmed your post though... Best, Erick On Wed, Oct 23, 2013 at 12:54 PM, Hoggarth, Gil gil.hogga...@bl.uk wrote: Hi solr-users, I'm seeing some confusing behaviour in Solr/zookeeper and hope you can shed some light on what's happening/how I can correct it. We have two physical servers running automated builds of RedHat 6.4 and Solr 4.4.0 that host two separate Solr services. The first server (called ld01) has 24 shards and hosts a collection called 'ukdomain'; the second server (ld02) also has 24 shards and hosts a different collection called 'ldwa'. It's evidently important to note that previously both of these physical servers provided the 'ukdomain' collection, but the 'ldwa' server has been rebuilt for the new collection. When I start the ldwa solr nodes with their zookeeper configuration (defined in /etc/sysconfig/solrnode* and with collection.configName as 'ldwacfg') pointing to the development zookeeper ensemble, all nodes initially become shard leaders and then replicas as I'd expect. But if I change the ldwa solr nodes to point to the zookeeper ensemble also used for the ukdomain collection, all ldwa solr nodes start on the same shard (that is, the first ldwa solr node becomes the shard leader, then every other solr node becomes a replica for this shard). The significant point here is no other ldwa shards gain leaders (or replicas). The ukdomain collection uses a zookeeper collection.configName of 'ukdomaincfg', and prior to the creation of this ldwa service the collection.configName of 'ldwacfg' has never previously been used. 
So I'm confused why the ldwa service would differ when the only difference is which zookeeper ensemble is used (both zookeeper ensembles are built automatically using version 3.4.5). If anyone can explain why this is happening and how I can get the ldwa services to start correctly using the non-development zookeeper ensemble, I'd be very grateful! If more information or explanation is needed, just ask. Thanks, Gil Gil Hoggarth Web Archiving Technical Services Engineer The British Library, Boston Spa, West Yorkshire, LS23 7BQ
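For reference, the shard count is fixed at collection-creation time, which is why Erick asks how the collection was created. A Collections API CREATE call requesting 24 shards might look roughly like this (host and port are illustrative assumptions, not taken from the posts):

```
http://localhost:8983/solr/admin/collections?action=CREATE&name=ldwa&numShards=24&collection.configName=ldwacfg
```

If the collection was instead created with numShards=1 (or left to default), every node started afterwards joins that single shard as a replica, which matches the behaviour described above.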
Re: Indexing logs files of thousands of GBs
As a supplement to what Chris said, if you can partition the walking amongst a number of clients you can also parallelize the indexing. If you're using SolrCloud 4.5+, there are also some nice optimizations in SolrCloud to keep intra-shard routing to a minimum. FWIW, Erick

On Wed, Oct 23, 2013 at 12:59 PM, Chris Geeringh geeri...@gmail.com wrote: Prerna, The FileListEntityProcessor has a terribly inefficient recursive method, which will be using up all your heap building a list of files. I would suggest writing a client application and traversing your filesystem with NIO, available in Java 7: Files.walkFileTree() and a FileVisitor. As you walk, post up to the server with SolrJ. Cheers, Chris

On 22 October 2013 18:58, keshari.prerna keshari.pre...@gmail.com wrote: Hello, I tried to index log files (all text data) stored in the file system. Data can be as big as 1000 GB or more. I am working on Windows. A sample file can be found at https://www.dropbox.com/s/mslwwnme6om38b5/batkid.glnxa64.66441

I tried using FileListEntityProcessor with TikaEntityProcessor, which ended up in a Java heap exception that I couldn't get rid of no matter how much I increased my RAM size.

data-config.xml:

<dataConfig>
  <dataSource name="bin" type="FileDataSource" />
  <document>
    <entity name="f" dataSource="null" rootEntity="true"
            processor="FileListEntityProcessor"
            transformer="TemplateTransformer"
            baseDir="//mathworks/devel/bat/A/logs/66048/"
            fileName=".*\.*" onError="skip" recursive="true">
      <field column="fileAbsolutePath" name="path" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastmodified" />
      <entity name="file" dataSource="bin" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text" onError="skip"
              transformer="TemplateTransformer" rootEntity="true">
        <field column="text" name="text" />
      </entity>
    </entity>
  </document>
</dataConfig>

Then I used FileListEntityProcessor with LineEntityProcessor, which never stopped indexing even after 40 hours or so.
data-config.xml:

<dataConfig>
  <dataSource name="bin" type="FileDataSource" />
  <document>
    <entity name="f" dataSource="null" rootEntity="true"
            processor="FileListEntityProcessor"
            transformer="TemplateTransformer"
            baseDir="//mathworks/devel/bat/A/logs/"
            fileName=".*\.*" onError="skip" recursive="true">
      <field column="fileAbsolutePath" name="path" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastmodified" />
      <entity name="file" dataSource="bin" processor="LineEntityProcessor"
              url="${f.fileAbsolutePath}" format="text" onError="skip"
              rootEntity="true">
        <field column="content" name="rawLine" />
      </entity>
    </entity>
  </document>
</dataConfig>

Is there any way I can use post.jar to index text files recursively? Or any other way that works without a Java heap exception and doesn't take days to index? I am completely stuck here. Any help would be greatly appreciated. Thanks, Prerna -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-logs-files-of-thousands-of-GBs-tp4097073.html Sent from the Solr - User mailing list archive at Nabble.com.
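Chris's suggestion above, walking the tree with Java 7 NIO instead of FileListEntityProcessor, can be sketched as follows. This is a minimal illustration of Files.walkFileTree() with a FileVisitor; the class name is made up for the example, and the SolrJ posting step is only indicated in a comment since it depends on your server setup:

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class LogWalker {

    // Collect every regular file under root. In a real indexer you would
    // batch documents up and post each batch to Solr with SolrJ instead of
    // accumulating the whole list in memory.
    static List<Path> collectFiles(Path root) throws IOException {
        final List<Path> files = new ArrayList<Path>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                files.add(file); // e.g. read the file and add a SolrInputDocument to a batch
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip unreadable files, analogous to onError="skip" in the DIH config.
                return FileVisitResult.CONTINUE;
            }
        });
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Small self-contained demo tree instead of a real log directory.
        Path root = Files.createTempDirectory("logs");
        Files.createDirectories(root.resolve("a/b"));
        Files.write(root.resolve("a/one.log"), "line\n".getBytes("UTF-8"));
        Files.write(root.resolve("a/b/two.log"), "line\n".getBytes("UTF-8"));
        System.out.println(collectFiles(root).size()); // prints 2
    }
}
```

Partitioning the directory tree amongst several such walkers, as Erick notes, also lets the indexing run in parallel.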
Re: Query cache and group by queries
query cache? queryResultCache? filterCache? Some more details please: what are you seeing and what do you expect to see? Best, Erick On Wed, Oct 23, 2013 at 1:22 PM, Kalle Aaltonen kalle.aalto...@zemanta.com wrote: Hi, It seems that the query cache is not used at all for grouped queries? Can someone explain why this is?
What is the right fieldType for this kind of field?
Dear, the data looks like:

<str>A23L1/22066</str>
<str>A23L1/227</str>
<str>A23L1/231</str>
<str>A23L1/2375</str>

I tried:
- String, but I can't search with truncation (i.e. A23*)
- Text_General, but as my code contains /, the data are split...

What kind of field must I choose to use truncation and treat a code with / as one term? Thanks a lot for your help, Bruno
Re: What is the right fieldType for this kind of field?
Trailing wildcard should work fine for strings, but a23* will not match A23* due to case. You could use the keyword tokenizer plus the lower case filter. -- Jack Krupansky -Original Message- From: Bruno Mannina Sent: Wednesday, October 23, 2013 1:54 PM To: solr-user@lucene.apache.org Subject: What is the right fieldType for this kind of field? Dear, the data looks like: <str>A23L1/22066</str> <str>A23L1/227</str> <str>A23L1/231</str> <str>A23L1/2375</str> I tried: - String, but I can't search with truncation (i.e. A23*) - Text_General, but as my code contains /, the data are split... What kind of field must I choose to use truncation and treat a code with / as one term? Thanks a lot for your help, Bruno
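A sketch of what Jack describes, a keyword tokenizer plus lower-case filter, might look like this in schema.xml (the type name here is illustrative):

```xml
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizerFactory keeps the whole value, including "/", as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- LowerCaseFilterFactory makes a23* match A23L1/227 -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With such a type, a value like A23L1/227 is indexed as a single lower-cased term, so truncation queries work and the / is never treated as a separator.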
Re: DIH - delta query and delta import query executes transformer twice
Hello Lee. In case you haven't solved this, would you mind posting your DIH config? Arcadius. On 27 September 2013 15:06, Lee Carroll lee.a.carr...@googlemail.com wrote: Hi, It looks like when a DIH entity has a delta query and a delta import query plus a transformer defined, the execution of both queries calls the transformer. I was expecting it to only be called on the import query. Sure, we can check for a null value or something and just return the row during the delta query execution, but is there a better way of doing this? That is, not calling the transformer in the first place? Cheers Lee C -- Arcadius Ahouansou Menelic Ltd | Information is Power M: 07908761999 W: www.menelic.com ---
Re: What is the right fieldType for this kind of field?
Hi Jack, Yes, String works fine. I forgot to restart my Solr server after changing my schema.xml... arrf. I'm so stupid, sorry! On 23/10/2013 20:09, Jack Krupansky wrote: Trailing wildcard should work fine for strings, but a23* will not match A23* due to case. You could use the keyword tokenizer plus the lower case filter. -- Jack Krupansky -Original Message- From: Bruno Mannina Sent: Wednesday, October 23, 2013 1:54 PM To: solr-user@lucene.apache.org Subject: What is the right fieldType for this kind of field? Dear, the data looks like: <str>A23L1/22066</str> <str>A23L1/227</str> <str>A23L1/231</str> <str>A23L1/2375</str> I tried: - String, but I can't search with truncation (i.e. A23*) - Text_General, but as my code contains /, the data are split... What kind of field must I choose to use truncation and treat a code with / as one term? Thanks a lot for your help, Bruno
Solr not indexing everything from MongoDB
Hi, I have a Mongo database with about 50 entries inside. I use a mongo-solr connector. When I do a Solr *:* query, I only get about 10 or 13 responses. Even if I increase the max rows. I have updated my schema.xml accordingly. I have deleted my solr index, restarted solr, restarted the connector, everything. Any ideas? Thanks! Zach -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-not-indexing-everything-from-MongoDB-tp4097302.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr not indexing everything from MongoDB
On 10/23/2013 1:14 PM, gohome190 wrote: I have a Mongo database with about 50 entries inside. I use a mongo-solr connector. When I do a Solr *:* query, I only get about 10 or 13 responses. Even if I increase the max rows. I have updated my schema.xml accordingly. I have deleted my solr index, restarted solr, restarted the connector, everything. Any ideas? What is the numFound value in the query response? If you go to the admin UI and select your core from the dropdown, what does it say for Num Docs and Max Doc? I'm assuming Solr 4.x here; 1.x and 3.x are very different. Thanks, Shawn
Solr facet field counts not correct
I am running a simple query in a non-distributed search using grouping. I am getting incorrect facet field counts and I cannot figure out why. Here is the query you will notice that the facet field and facet query counts are not the same. The facet query counts are correct. Any help is appreciated. This XML file does not appear to have any style information associated with it. The document tree is shown below. response lst name=responseHeader int name=status0/int int name=QTime18/int /lst lst name=grouped lst name=groupid int name=matches89/int int name=ngroups74/int arr name=groups lst str name=groupValueuc101cadet/str result name=doclist numFound=2 start=0 doc str name=productiduc101/str arr name=finish strWhite/str strBlack/str /arr str name=uniqueFinishBlack/str str name=manufacturercadet/str int name=productCompositeid736388/int int name=uniqueid1545116/int /doc /result /lst lst str name=groupValuerm162cadet/str result name=doclist numFound=1 start=0 doc str name=productidrm162/str arr name=finish strN/A/str /arr str name=uniqueFinishN/A/str str name=manufacturercadet/str int name=productCompositeid667690/int int name=uniqueid1545089/int /doc /result /lst lst str name=groupValuecs202cadet/str result name=doclist numFound=1 start=0 doc str name=productidcs202/str arr name=finish strN/A/str /arr str name=uniqueFinishN/A/str str name=manufacturercadet/str int name=productCompositeid460865/int int name=uniqueid1545142/int /doc /result /lst lst str name=groupValuecs152cadet/str result name=doclist numFound=1 start=0 doc str name=productidcs152/str arr name=finish strN/A/str /arr str name=uniqueFinishN/A/str str name=manufacturercadet/str int name=productCompositeid458740/int int name=uniqueid1545141/int /doc /result /lst lst str name=groupValue65201cadet/str result name=doclist numFound=1 start=0 doc str name=productid65201/str arr name=finish strWhite/str /arr str name=uniqueFinishWhite/str str name=manufacturercadet/str int name=productCompositeid773769/int int 
name=uniqueid1999873/int /doc /result /lst lst str name=groupValuermc202cadet/str result name=doclist numFound=1 start=0 doc str name=productidrmc202/str arr name=finish strWhite/str /arr str name=uniqueFinishWhite/str str name=manufacturercadet/str int name=productCompositeid667929/int int name=uniqueid1545122/int /doc /result /lst lst str name=groupValuerbf101cadet/str result name=doclist numFound=1 start=0 doc str name=productidrbf101/str arr name=finish strChrome/str /arr str name=uniqueFinishChrome/str str name=manufacturercadet/str int name=productCompositeid663553/int int name=uniqueid1820328/int /doc /result /lst lst str name=groupValuerm202cadet/str result name=doclist numFound=1 start=0 doc str name=productidrm202/str arr name=finish strN/A/str /arr str name=uniqueFinishN/A/str str name=manufacturercadet/str int name=productCompositeid667551/int int name=uniqueid1545088/int /doc /result /lst lst str name=groupValuesl151tcadet/str result name=doclist numFound=1 start=0 doc str name=productidsl151t/str arr name=finish strWhite/str /arr str name=uniqueFinishWhite/str str name=manufacturercadet/str int name=productCompositeid710375/int int name=uniqueid1545153/int /doc /result /lst lst str name=groupValueuc102cadet/str result name=doclist numFound=2 start=0 doc str name=productiduc102/str arr name=finish strWhite/str strBlack/str /arr str name=uniqueFinishWhite/str str name=manufacturercadet/str int name=productCompositeid736389/int int name=uniqueid1820349/int /doc /result /lst /arr /lst /lst lst name=facet_counts lst name=facet_queries int name=HeatingArea_numeric:[0 TO *]23/int /lst lst name=facet_fields lst name=HeatingArea_numeric int name=128.020/int int name=250.06/int int name=500.06/int int name=250.06/int int name=500.06/int int name=250.06/int int name=500.06/int int name=375.03/int int name=375.03/int int name=374.03/int int name=125.02/int int name=200.02/int int name=125.02/int int name=200.02/int int name=125.02/int int name=200.02/int int 
name=32.02/int int name=175.01/int int name=300.01/int int name=400.01/int int name=550.01/int int name=175.01/int int name=300.01/int int name=400.01/int int name=550.01/int int name=175.01/int int name=300.01/int int name=400.01/int int name=548.01/int int name=512.01/int int name=100.00/int int name=220.00/int int name=420.00/int int name=610.00/int int name=640.00/int int name=710.00/int int name=720.00/int int name=750.00/int int name=770.00/int int name=835.00/int int name=850.00/int int name=860.00/int int name=870.00/int int name=900.00/int int name=910.00/int int name=920.00/int int name=930.00/int int name=940.00/int int name=950.00/int int name=1000.00/int int name=1010.00/int int name=1015.00/int int name=1020.00/int int name=1040.00/int int name=1050.00/int int name=1070.00/int int name=1090.00/int int name=1100.00/int int name=1150.00/int int name=1175.00/int int name=1200.00/int int name=1250.00/int int name=1300.00/int int name=1330.00/int int name=1360.00/int int
Re: Solr facet field counts not correct
Here is my query string:

/solr/singleproductindex/productQuery?fq=siteid:82&q=categories_82_is:109124&facet=true&facet.query=HeatingArea_numeric:[0%20TO%20*]&facet.field=HeatingArea_numeric&debugQuery=true

Here is my schema for that field:

<dynamicField name="*_numeric" type="tfloat" indexed="true" stored="false" multiValued="true"/>

Here is my request handler definition:

<requestHandler name="/productQuery" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">text</str>
    <str name="defType">edismax</str>
    <float name="tie">0.01</float>
    <str name="qf">sku^9.0 upc^9.1 keywords_82_txtws^1.9 series^2.8 productTitle^1.2 productid^9.0 manufacturer^4.0 masterFinish^1.5 theme^1.1 categoryNames_82_txt^0.2 finish^1.4 uniqueFinish^1</str>
    <str name="pf">keywords_82_txtws^2.1 productTitle^1.5 manufacturer^4.0 finish^1.9</str>
    <str name="bf">linear(popularity_82_i,1,2)^3.0</str>
    <str name="fl">uniqueid,productCompositeid,productid,manufacturer,uniqueFinish,finish</str>
    <str name="mm">3&lt;-1 5&lt;-2 6&lt;90%</str>
    <bool name="group">true</bool>
    <str name="group.field">groupid</str>
    <bool name="group.ngroups">true</bool>
    <bool name="group.facet">true</bool>
    <int name="ps">100</int>
    <int name="qs">3</int>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.alternativeTermCount">5</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollationTries">10</str>
    <str name="spellcheck.maxCollations">5</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-facet-field-counts-not-correct-tp4097305p4097306.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr facet field counts not correct
if I do group=false&group.facet=false the counts are what they should be for the ungrouped counts... seems like group.facet isn't working correctly -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-facet-field-counts-not-correct-tp4097305p4097314.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: What is the right fieldType for this kind of field?
On 23/10/2013 20:09, Jack Krupansky wrote: You could use the keyword tokenizer plus the lower case filter. Jack, could you help me write the right fieldType please? (index and query) Another thing: I don't know if I must use the keyword tokenizer, because the codes contain the / character, and the tokenizer seems to split the code, no? Many thanks, Bruno
Re: What is the right fieldType for this kind of field?
On 23/10/2013 22:44, Bruno Mannina wrote: On 23/10/2013 20:09, Jack Krupansky wrote: You could use the keyword tokenizer plus the lower case filter. Jack, could you help me write the right fieldType please? (index and query) Another thing: I don't know if I must use the keyword tokenizer, because the codes contain the / character, and the tokenizer seems to split the code, no? Many thanks, Bruno Maybe an answer (I haven't tested it yet): http://pietervogelaar.nl/solr-3-5-search-case-insensitive-on-a-string-field-for-exact-match/
Re: What is the right fieldType for this kind of field?
On 23/10/2013 22:49, Bruno Mannina wrote: On 23/10/2013 22:44, Bruno Mannina wrote: On 23/10/2013 20:09, Jack Krupansky wrote: You could use the keyword tokenizer plus the lower case filter. Jack, could you help me write the right fieldType please? (index and query) Another thing: I don't know if I must use the keyword tokenizer, because the codes contain the / character, and the tokenizer seems to split the code, no? Many thanks, Bruno Maybe an answer (I haven't tested it yet): http://pietervogelaar.nl/solr-3-5-search-case-insensitive-on-a-string-field-for-exact-match/ OK, it works fine!
Re: What is the right fieldType for this kind of field?
Yes, that blog post appears to use the proper technique for case-insensitive string fields. The so-called keyword tokenizer merely treats the whole string value as a single token (AKA keyword) and does NOT do any further tokenization. -- Jack Krupansky -Original Message- From: Bruno Mannina Sent: Wednesday, October 23, 2013 4:57 PM To: solr-user@lucene.apache.org Subject: Re: What is the right fieldType for this kind of field? On 23/10/2013 22:49, Bruno Mannina wrote: On 23/10/2013 22:44, Bruno Mannina wrote: On 23/10/2013 20:09, Jack Krupansky wrote: You could use the keyword tokenizer plus the lower case filter. Jack, could you help me write the right fieldType please? (index and query) Another thing: I don't know if I must use the keyword tokenizer, because the codes contain the / character, and the tokenizer seems to split the code, no? Many thanks, Bruno Maybe an answer (I haven't tested it yet): http://pietervogelaar.nl/solr-3-5-search-case-insensitive-on-a-string-field-for-exact-match/ OK, it works fine!
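The technique Jack describes maps onto a small schema fragment. A sketch — the type name string_ci is illustrative, not from the thread:

```xml
<!-- Case-insensitive "string": the keyword tokenizer emits the whole value
     (slashes included) as a single token, and the lowercase filter normalizes it. -->
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because no word splitting happens at index or query time, a query must supply the whole code; only the letter case is normalized away.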
Terms function join with a Select function ?
Dear Solr users, I use the Terms function to see the frequency data in a field but it's for the whole database. I have 2 questions: - Is it possible to increase the number of statistic ? actually I have the 10 first frequency term. - Is it possible to limit this statistic to the result of a request ? PS: the second question is very important for me. Many thanks
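For reference, both asks map onto standard request parameters. A sketch, with the field name and limits invented for illustration: terms.limit raises the Terms component's cutoff above the default 10, but /terms always reads whole-index statistics; restricting frequencies to the result set of a request is what field faceting does.

```
# More than 10 top terms from the Terms component (still whole-index):
/terms?terms.fl=myfield&terms.limit=50

# Term frequencies restricted to the documents matching a query:
/select?q=some+request&rows=0&facet=true&facet.field=myfield&facet.limit=50
```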
Re: Solr not indexing everything from MongoDB
numFound is 10. numDocs is 10, maxDoc is 23. Yeah, Solr 4.x! Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-not-indexing-everything-from-MongoDB-tp4097302p4097340.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr facet field counts not correct
: if I do group=false&group.facet=false the counts are what they should be for : the ungrouped counts... seems like group.facet isn't working correctly yeah ... thanks for digging in -- definitely seems like a problem with group.facet and Trie fields that use precisionStep. I've opened a Jira: https://issues.apache.org/jira/browse/SOLR-5383 -Hoss
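Until SOLR-5383 is resolved, one workaround sometimes suggested for Trie fields is to facet on a copy of the field indexed with precisionStep="0", so only a single term per value is indexed and no extra precision terms exist for group.facet to trip on. A hypothetical schema sketch — names invented, and untested against this particular bug:

```xml
<fieldType name="tfloat_p0" class="solr.TrieFloatField" precisionStep="0"/>
<dynamicField name="*_numeric_facet" type="tfloat_p0" indexed="true" stored="false" multiValued="true"/>
<copyField source="*_numeric" dest="*_numeric_facet"/>
```

Range queries on the precisionStep="0" copy get slower, so the original field would still be used for the [0 TO *] style facet.query shown earlier in the thread.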
Re: Solr facet field counts not correct
Hoss created: https://issues.apache.org/jira/browse/SOLR-5383 -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-facet-field-counts-not-correct-tp4097305p4097346.html Sent from the Solr - User mailing list archive at Nabble.com.
single core for extracted text from pdf/other doc types and metadata fields about that doc from the database
Can I create a core where one subset of fields comes from the database source using the DataImportHandler, and another subset of fields using the Apache Tika data import handler? For example, in the indexed doc I want the following fields to come from the database source: 1 Id 2 DocFilePath (nullable) 3 Subject 4 KeyWords 5 Description 6 Text and another set of field(s) to come from documents on the filesystem, with text extracted using the Apache Tika processor: 7 DocText so that the final doc fields are as follows, where DocText is the text of the document whose path is mentioned in the DocFilePath column: 1 Id 2 DocFilePath (nullable) 3 Subject 4 KeyWords 5 Description 6 Text 7 DocText Thanks, Vikas Vikas Sharma | Senior Software Engineer | MedAssets 14405 SE 36th Street, Suite 206 | Bellevue, WA, 98006 | Work: 425.519.1305 vsha...@medassets.com Visit us at www.medassets.com Follow us on LinkedIn (http://www.linkedin.com/company/medassets), YouTube (https://www.youtube.com/user/MedAssetsInc), Twitter (https://twitter.com/MedAssets), and Facebook (https://www.facebook.com/MedAssets) *Attention* This electronic transmission may contain confidential, sensitive, proprietary and/or privileged information belonging to the sender. This information, including any attached files, is intended only for the persons or entities to which it is addressed. Authorized recipients of this information are prohibited from disclosing the information to any unauthorized party and are required to properly dispose of the information upon fulfillment of its need/use, unless otherwise required by law. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information by any person or entity other than the intended recipient is prohibited. If you have received this electronic transmission in error, please notify the sender and properly dispose of the information immediately.
Re: single core for extracted text from pdf/other doc types and metadata fields about that doc from the database
You can accomplish your end goal easily if you just write your own indexer, which is easy and gives you power and flexibility. Otis Solr ElasticSearch Support http://sematext.com/ On Oct 23, 2013 6:39 PM, Sharma, Vikas vsha...@medassets.com wrote: Can I create a core where one subset of fields comes from the Database source using the DataImport handler for database and another subset of fields using the Apache Tika dataimport handler For example if in the indexed doc I want following fields to come from the database source 1 Id 2 DocFilePath (nullable) 3 Subject 4 KeyWords 5 Description 6 Text and another set of field(s) to come from documents on the filesystem with text extracted using Apache Tika processor 7 DocText so that Final Doc fields are as follows where DocText is the text of the document whose path is mentioned in the DocFilePath column 1 Id 2 DocFilePath (nullable) 3 Subject 4 KeyWords 5 Description 6 Text 7 DocText Thanks, Vikas
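For what it's worth, the two-source layout asked about can also be expressed inside the DataImportHandler itself, by nesting a TikaEntityProcessor entity under the SQL entity. A hypothetical data-config.xml sketch — the entity names, table name, and driver/url placeholders are all invented for illustration:

```xml
<dataConfig>
  <dataSource name="db" type="JdbcDataSource" driver="..." url="..."/>
  <dataSource name="bin" type="BinFileDataSource"/>
  <document>
    <!-- Parent row supplies the metadata fields from the database. -->
    <entity name="doc" dataSource="db"
            query="SELECT Id, DocFilePath, Subject, KeyWords, Description, Text FROM docs">
      <!-- Nested Tika entity extracts DocText from the file named by the parent row. -->
      <entity name="tika" processor="TikaEntityProcessor" dataSource="bin"
              url="${doc.DocFilePath}" format="text">
        <field column="text" name="DocText"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

A nullable DocFilePath would need handling (for example, skipping the nested entity when the path is empty), which is one reason a hand-written indexer, as Otis suggests, can be the simpler route.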
Solr operation problem
Dear user, Could you please help me to solve the following problem: I have installed Java and Tomcat and arranged all the files for Solr 4.5 according to the instructions from the Solr wiki and various web pages. My Tomcat is running well, but I get a problem when I try to open Solr using http://localhost:8983/solr. It shows: type Status report, message /solr, description The requested resource is not available. Apache Tomcat/7.0.42 I am seeking help to continue with my Solr setup. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-operation-problem-tp4097390.html Sent from the Solr - User mailing list archive at Nabble.com.
Global IDF vs. Routing
Hi, Seeing so much work being put in routing and seeing the recent questions about the status of global IDF support made me realize, for the first time really, that with people using routing more and more we should be seeing more and more issues caused by the lack of global IDF because routing by definition doesn't randomly and evenly spread data across shards. Is this correct or am I missing something and this is in fact not (such a big) problem? Thanks, Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/
Carrot2 Clustering with Field Collapsing
When I try to use Carrot2 clustering in Solr with grouping based on a field, I get a null pointer exception. However, the clustering query works fine without field grouping. For example, the query below works fine: /clustering?q=text:apple&rows=500&carrot.title=title but this query throws an error: /clustering?q=text:tiger&rows=500&carrot.title=title&group=true&group.field=id&group.main=true I would like to know whether clustering is supported with field collapsing, or if I'm doing something incorrect. The stack trace for the error is given below: java.lang.NullPointerException at org.apache.solr.handler.clustering.ClusteringComponent.process(ClusteringComponent.java:161) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.jav Thanigai Vellore Sr. Software Architect Art.com Phone: (510) 879-4791 [Art.com Inc.] If you have received this e-mail in error, please immediately notify the sender by reply e-mail and destroy the original e-mail and its attachments without reading or saving them. This e-mail and any documents, files or previous e-mail messages attached to it, may contain confidential or privileged information that is prohibited from disclosure under confidentiality agreement or applicable law. 
If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, copying, distribution or use of this e-mail or any of the information contained in or attached to this e-mail is STRICTLY PROHIBITED. Thank you.
Re: Global IDF vs. Routing
On Wed, Oct 23, 2013 at 9:03 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Seeing so much work being put in routing and seeing the recent questions about the status of global IDF support made me realize, for the first time really, that with people using routing more and more we should be seeing more and more issues caused by the lack of global IDF because routing by definition doesn't randomly and evenly spread data across shards. Many people are using routing to partition users data - in this case, global IDF would normally not be what you want anyway. -Yonik
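The uneven spread Otis describes comes straight from hash routing: with compositeIdRouter the hash of the route key, not its value, picks the shard, so distinct keys routinely land on the same shard (which is also why the month-per-shard layout asked about earlier needs custom sharding). A toy sketch — MD5 stands in for the MurmurHash3 that Solr actually uses, and the 12-shard setup mirrors the earlier thread; the exact assignments are illustrative only:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.TreeSet;

public class RoutingSketch {
    // Stand-in for Solr's compositeIdRouter: any uniform hash of the route key
    // shows the same collision behaviour as the real MurmurHash3 mapping.
    static int shardFor(String routeKey, int numShards) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(routeKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest).mod(BigInteger.valueOf(numShards)).intValue();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        TreeSet<Integer> used = new TreeSet<>();
        // Route 12 month values onto 12 shards, as in the sharding thread.
        for (int month = 1; month <= 12; month++) {
            int shard = shardFor(String.valueOf(month), 12);
            used.add(shard);
            System.out.println("month " + month + " -> shard " + shard);
        }
        // With 12 keys hashed into 12 buckets, collisions are overwhelmingly
        // likely, so some shards hold several months and others hold none.
        System.out.println("distinct shards used: " + used.size());
    }
}
```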
Re: Solr operation problem
Have you already used Solr with the default setup (Jetty)? If not, I recommend you do the Jetty setup first and the online tutorial, just so you understand what the files are, where they are, and so on. Then add Tomcat into the mix. If you still have a problem, let us know which operating system you are on and what exceptions you are getting in the log files. Currently, the information you provided is insufficient, exactly because Tomcat is not the primary out-of-the-box solution. Regards, Alex. P.s. The latest version of Solr requires additional logging libraries for usage with Tomcat. It's documented on the wiki. But I am not sure you are even at this point yet. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Thu, Oct 24, 2013 at 7:58 AM, masum.uia masum@gmail.com wrote: Dear user, Could you please help me to solve the following problem: I have installed Java and Tomcat and arranged all the files for Solr 4.5 according to the instructions from the Solr wiki and various web pages. My Tomcat is running well, but I get a problem when I try to open Solr using http://localhost:8983/solr. It shows: type Status report, message /solr, description The requested resource is not available. Apache Tomcat/7.0.42 I am seeking help to continue with my Solr setup. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-operation-problem-tp4097390.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: deleteByQuery does not work with SolrCloud
Hi Erick It can get hits on these documents. And I tried this: myhost/solr/mycore/update?stream.body=<delete><query>name:shardTv_20131010</query></delete>&commit=true and the document could be deleted. Regards 2013/10/23 Erick Erickson erickerick...@gmail.com The first thing I'd do is go into the browser UI and make sure you can get hits on documents, something like blah/collection/q=indexname:shardTv_20131010 Best, Erick On Wed, Oct 23, 2013 at 8:20 AM, YouPeng Yang yypvsxf19870...@gmail.com wrote: Hi I am using SolrCloud within Solr 4.4, and I try the SolrJ API deleteByQuery to delete the index as: CloudSolrServer cloudServer = new CloudSolrServer(myZKhost) cloudServer.connect() cloudServer.setDefaultCollection cloudServer.deleteByQuery(indexname:shardTv_20131010); cloudServer.commit(); It seems not to work. I have also done some googling; unfortunately there is no help. Do I miss anything? Thanks Regard
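For comparison with the URL form that worked, a hedged SolrJ sketch of the same delete (the ZooKeeper host and collection name are placeholders, and running it needs a live SolrCloud plus the solr-solrj dependency, so it is a sketch rather than something verified here). Note that the snippet in the original post called setDefaultCollection with no argument; without a default or explicit collection the delete has no target, which would explain the silent failure:

```java
import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class DeleteByQueryExample {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper ensemble address.
        CloudSolrServer cloudServer = new CloudSolrServer("zkhost1:2181");
        cloudServer.connect();
        // Must name the collection; the original snippet omitted the argument.
        cloudServer.setDefaultCollection("mycollection");
        cloudServer.deleteByQuery("indexname:shardTv_20131010");
        cloudServer.commit();
        cloudServer.shutdown();
    }
}
```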
Re: Global IDF vs. Routing
Duh, right, right, sorry for the noise. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Wed, Oct 23, 2013 at 9:13 PM, Yonik Seeley ysee...@gmail.com wrote: On Wed, Oct 23, 2013 at 9:03 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Seeing so much work being put in routing and seeing the recent questions about the status of global IDF support made me realize, for the first time really, that with people using routing more and more we should be seeing more and more issues caused by the lack of global IDF because routing by definition doesn't randomly and evenly spread data across shards. Many people are using routing to partition users data - in this case, global IDF would normally not be what you want anyway. -Yonik
why Analyzer in solr always hang ?
Hi All, My custom analyser always hangs when I click the Analysis Values button on the analysis page. The thread dump is the following: http-bio-8080-exec-7 daemon prio=5 tid=7ffc7e0a9800 nid=0x1152d6000 runnable [1152d3000] java.lang.Thread.State: RUNNABLE at gnu.trove.impl.hash.TObjectHash.insertKeyRehash(TObjectHash.java:348) at gnu.trove.impl.hash.TObjectHash.insertKey(TObjectHash.java:294) at gnu.trove.map.custom_hash.TObjectIntCustomHashMap.put(TObjectIntCustomHashMap.java:252) at gnu.trove.map.custom_hash.TObjectIntCustomHashMap.readExternal(TObjectIntCustomHashMap.java:1141) at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1795) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1754) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348) at java.util.HashMap.readObject(HashMap.java:1030) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1852) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348) …….. at org.apache.solr.handler.FieldAnalysisRequestHandler.doAnalysis(FieldAnalysisRequestHandler.java:101) at org.apache.solr.handler.AnalysisRequestHandlerBase.handleRequestBody(AnalysisRequestHandlerBase.java:59) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312) - locked 7810afd50 (a org.apache.tomcat.util.net.SocketWrapper) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:680) Locked ownable synchronizers: - 780f81530 (a java.util.concurrent.locks.ReentrantLock$NonfairSync) Can anybody give some hints and suggestions for this kind of issue ? Thanks, -Mingz
Re: Major GC does not reduce the old gen size
help please -- View this message in context: http://lucene.472066.n3.nabble.com/Major-GC-does-not-reduce-the-old-gen-size-tp4096880p4097429.html Sent from the Solr - User mailing list archive at Nabble.com.