How to change the JVM Threads of SolrCloud
Hello All,

I'm running SolrCloud (1 shard, 9 replicas) on Amazon EKS. The other day, when I accidentally stopped CoreDNS on EKS, the entire Solr cluster went down because the nodes could no longer resolve each other's names. I restarted CoreDNS shortly afterwards, but the Solr nodes just cycled between down and recovering and did not return to a normal state automatically. Since Solr kept accepting search requests during this time, I stopped the search requests completely. After that, I executed DELETEREPLICA to reduce the number of Solr nodes to one, then increased the number of replicas little by little. After the cluster had returned completely to its original state, I resumed the search requests, and after that no particular problem occurred.

At the time of this failure, the JVM Threads metric on each node was stuck at 1. Since the load was very high, it is probable that each node kept going down and recovering. If I reduced (or increased) these JVM threads, would the Solr cluster return to a normal state automatically? If so, which setting in solrconfig.xml should I change to reduce (or increase) them? I think "maxConnectionsPerHost" and "maximumPoolSize" are related to this issue, but I'm not sure about the difference between the two.

Any help would be appreciated.

Thanks,
Issei
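For reference, the two parameters mentioned above both belong to the shard handler used for inter-node search requests. A minimal sketch of where they live in solrconfig.xml, with purely illustrative values (not recommendations), might look like this:

```xml
<!-- Inside the /select request handler in solrconfig.xml -->
<requestHandler name="/select" class="solr.SearchHandler">
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <!-- maximum concurrent connections to any single remote node -->
    <int name="maxConnectionsPerHost">20</int>
    <!-- upper bound on the thread pool servicing inter-node requests -->
    <int name="maximumPoolSize">100</int>
  </shardHandlerFactory>
</requestHandler>
```

Roughly speaking, maxConnectionsPerHost caps connections per target node, whereas maximumPoolSize caps the size of the thread pool that issues those requests.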
Re: DIH on SolrCloud
Thank you for your quick reply.

Can I confirm that the indexing isn't performed on the node where DIH is executed, but on the Leader node? As far as I have seen in the logs, there are errors: connection failures from Node2 (a Replica, running DIH) to Node9 (also a Replica). My understanding was therefore that the errors occurred when DIH ran on Node2 and it tried to forward a tlog to Node9. Even if Node9 does not receive the tlog, as long as Node1 (the Leader) receives it, I believe there is nothing to worry about, because Node9 is synchronised with Node1. But if Node1 as Leader cannot receive the tlog, the Replicas would soon be synchronised to the Leader's state, and that would be a real problem for me.

I will try to find the cause by checking the log files of all servers, but could you give me your comments on my understanding of the indexing architecture on SolrCloud, please?

Thanks,
Issei

On Fri, Aug 14, 2020 at 0:33, Jörn Franke wrote:
> DIH is deprecated in current Solr versions. The general recommendation is
> to do the processing outside the Solr server and use the update handler (the
> normal one, not Cell) to add documents to the index. So you should avoid
> using it, as it is not future-proof.
>
> If you need more time to migrate to a non-DIH solution:
> I recommend looking at the log files of all servers to find the real error
> behind the issue. If you trigger DIH in SolrCloud mode from Node2, that
> does not mean it is executed there!
>
> What could go wrong:
> other nodes do not have access to the files/database, or there is a parsing
> error or a script error.
>
> On 13.08.2020 at 17:21, Issei Nishigata wrote:
> >
> > Hi, All
> >
> > I'm using Solr 4.10 in SolrCloud mode.
> > I have 10 nodes; one node is the Leader, the others are Replicas. (I will
> > call them Node1 to Node10 for convenience)
> > -> 1 shard, 1 Leader (Node1), 9 Replicas (Node2-10)
> > Indexing always uses the DIH of Node2. Therefore, DIH may be executed when
> > Node2 is Leader or Replica.
> > Node2 is not forcibly set to Leader when DIH is executed.
> >
> > At one point, when Node2 executed DIH in the Replica state, the following
> > error occurred on Node9:
> >
> > [updateExecutor-1-thread-9737][ERROR][org.apache.solr.common.SolrException]
> > - org.apache.solr.client.solrj.SolrServerException: IOException occured
> > when talking to server at: http://samplehost:8983/solr/test_shard1_replica9
> >
> > I think this is an error while sending data from Node2 to Node9, and Node9
> > couldn't respond for some reason.
> >
> > The error occurs occasionally but is not reproducible, which makes the
> > investigation difficult.
> > Is there any possible cause for this problem? I am worried that this might
> > be a Solr anti-pattern.
> > The thing is, when DIH runs on Node2 as a Replica, the above error occurs
> > towards Node1 as Leader, and soon after, all the nodes might be returning
> > to the index of Node1.
> > Do you think my understanding makes sense?
> >
> > If using DIH on SolrCloud is not recommended, please let me know.
> >
> > Thanks,
> > Issei
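Following Jörn's suggestion to do the processing outside Solr and push documents through the normal update handler, a minimal sketch of such an indexing job could look like the following. The host, collection, and field names are assumptions for illustration, not taken from the thread:

```python
import json
import urllib.request

# Illustrative endpoint: any node of the cluster will forward the
# documents to the current leader, which distributes them to replicas.
SOLR_UPDATE_URL = "http://samplehost:8983/solr/test/update?commit=true"

def rows_to_solr_docs(rows):
    """Map source rows (e.g. database records) to Solr update documents."""
    return [{"id": str(r["pk"]), "title": r["title"]} for r in rows]

def post_docs(docs, url=SOLR_UPDATE_URL):
    """POST a batch of documents to Solr's JSON update handler."""
    body = json.dumps(docs).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:  # raises on HTTP errors
        return resp.read()

docs = rows_to_solr_docs([{"pk": 1, "title": "hello"}, {"pk": 2, "title": "world"}])
# post_docs(docs)  # run against a live cluster
```

Because the documents go through the regular update chain, it no longer matters which node receives them; the leader/replica forwarding is handled by Solr itself.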
DIH on SolrCloud
Hi, All

I'm using Solr 4.10 in SolrCloud mode. I have 10 nodes; one node is the Leader, the others are Replicas (I will call them Node1 to Node10 for convenience) -> 1 shard, 1 Leader (Node1), 9 Replicas (Node2-10).

Indexing always uses the DIH of Node2. Therefore, DIH may be executed while Node2 is either Leader or Replica; Node2 is not forcibly made Leader when DIH is executed.

At one point, when Node2 executed DIH in the Replica state, the following error occurred on Node9:

[updateExecutor-1-thread-9737][ERROR][org.apache.solr.common.SolrException] - org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://samplehost:8983/solr/test_shard1_replica9

I think this is an error while sending data from Node2 to Node9, and Node9 couldn't respond for some reason. The error occurs occasionally but is not reproducible, which makes the investigation difficult. Is there any possible cause for this problem? I am worried that this might be a Solr anti-pattern.

The thing is, when DIH runs on Node2 as a Replica and the above error occurs towards Node1 as Leader, then soon after, all the nodes might be returning to the index of Node1. Do you think my understanding makes sense?

If using DIH on SolrCloud is not recommended, please let me know.

Thanks,
Issei
Re: AtomicUpdate on SolrCloud is not working
I have the same problem on my Solr 8. I think it's because, in the first configuration, TrimFieldUpdateProcessorFactory and RemoveBlankFieldUpdateProcessorFactory do not take effect. On SolrCloud, TrimFieldUpdateProcessorFactory, RemoveBlankFieldUpdateProcessorFactory and other processors placed ahead of the distributed update step only run on the first node that receives the update request. Consequently, it's necessary to run TrimFieldUpdateProcessorFactory and RemoveBlankFieldUpdateProcessorFactory after the document has been handed to the replica nodes by the DistributedUpdateProcessor, so we need to use the second configuration he described; otherwise it won't operate properly. But even with this configuration, both he and I are worried about whether it will trigger SOLR-8030. Does anyone have any comment on this?

Best,
Issei

On Fri, Jul 17, 2020 at 18:34, Jörn Franke wrote:
> What does "not work correctly" mean?
>
> Have you checked that all fields are stored or docValues?
>
> Am 17.07.2020 um 11:26 schrieb yo tomi:
> >
> > Hi All
> >
> > Sorry, the above settings contradict each other.
> > Actually, the following setting does not work properly.
> > ---
> >
> > ---
> > And the following works as expected.
> > ---
> >
> > ---
> >
> > Thanks,
> > Yoshiaki
> >
> > On Fri, Jul 17, 2020 at 16:32, yo tomi wrote:
> >
> >> Hi, All
> >> When I did AtomicUpdate on SolrCloud with the following setting, it did
> >> not work properly.
> >> ---
> >>
> >> ---
> >> When I changed it as follows and made it work, it behaved as expected.
> >> ---
> >>
> >> ---
> >> The latter setting and the way of using a post-processor should produce
> >> the same result, I thought, but with a post-processor, the bug SOLR-8030
> >> makes me not feel like using it.
> >> Even with the latter setting, is there any possibility of SOLR-8030
> >> occurring?
> >> Looking at the source code, the tlog that comes from the leader to a
> >> Replica seems to be processed correctly with UpdateRequestProcessor,
> >> so the latter setting would not be subject to the bug, I thought.
> >> Does anyone know the most appropriate way to configure AtomicUpdate on
> >> SolrCloud?
> >>
> >> Thanks,
> >> Yoshiaki
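The XML snippets in this thread were stripped by the mail archive, so as a hedged sketch, here is the general shape of a chain matching the "second way" described above: processors placed after DistributedUpdateProcessorFactory run on every replica after distribution, not only on the node that first receives the request. The chain name is illustrative:

```xml
<updateRequestProcessorChain name="atomic-cleanup">
  <!-- runs only on the node that receives the request -->
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- distribution point: everything below runs on each replica -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.TrimFieldUpdateProcessorFactory"/>
  <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```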
facet.threads on JSON Facet
Hi, All

Is facet.threads available with JSON Facet? If it is, how do I specify it in the request parameters? I'm using facet.threads with JSON Facet as below, but I can't confirm any performance difference before and after specifying facet.threads=-1:

localhost:8983/solr/collection1/select?q=test&facet.threads=-1&json.facet={category:{type:terms, field:category, limit:-1},content_type:{type:terms, field:content_type, limit:-1}}

I'm using Solr 8.4.1, and the machine running Solr has 8 cores.

Any clue will be very appreciated.

Sincerely,
Issei Nishigata
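For readability, the same JSON Facet request can also be sent through the JSON Request API as a POST body (collection and field names as in the question):

```json
{
  "query": "test",
  "facet": {
    "category":     {"type": "terms", "field": "category",     "limit": -1},
    "content_type": {"type": "terms", "field": "content_type", "limit": -1}
  }
}
```

This would be POSTed to localhost:8983/solr/collection1/select with Content-Type: application/json.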
Failed to create collection
Hello, all.

I have 1 collection running, and when I tried to create a new collection with the following command:

$ solr-6.2.0/bin/solr create -c collection2 -d data_driven_schema_configs

I got the following error:

Connecting to ZooKeeper at sample1:2181,sample2:2182,sample3:2183 ...
Uploading /tmp/solr-6.2.0/server/solr/configsets/data_driven_schema_configs/conf for config collection2 to ZooKeeper at sample1:2181,sample2:2182,sample3:2183

Creating new collection 'collection2' using command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=collection2&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=collection2

ERROR: Failed to create collection 'collection2' due to: Could not fully create collection: portal2

I can see collection2 in the collections list of the Solr Admin UI, but I cannot see collection2 in the graph view of the Solr Admin UI, or in the collection selector. Does anyone know the cause of this error? Could you please help me resolve it?

Regards,
Issei
Performance if there is a large number of field
Hi, all

I am designing a schema. As a trial, I calculated the number of necessary fields and found that I need more than 35,000. I do not use all of these fields in one document: each document uses at most 300 fields, and the remaining 34,700 are unused. Does this way of using fields affect performance, such as searching and sorting? If it does, what kind of alternatives are there?

Thanks,
Issei

--
Issei Nishigata
How to replace values on a multiValued field all together using 1 query
Hi, all

I have a field called employee_name, which I use as multiValued. If "Mr.Smith", which is part of the value of the field, is changed to "Mr.Brown", do I have to create 1 million deletion queries and update queries in the case where "Mr.Smith" appears in 1 million documents? Is there a simple way to update them using only 1 query?

Thanks,
Issei

--
Issei Nishigata
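As a hedged sketch of one direction to explore (not necessarily the answer the list gave): Solr's atomic update syntax can replace a single value of a multiValued field without resending whole documents. Each matching document still needs its own update command, but the commands can be generated and sent in batches rather than as delete/re-add pairs. The field and id values are illustrative:

```python
import json

def rename_command(doc_id, field, old_value, new_value):
    """One atomic-update command: remove old_value, add new_value."""
    return {"id": doc_id, field: {"remove": old_value, "add": new_value}}

# One command per matching document, collected into a single batched
# request body for /solr/<collection>/update, not one query per document.
batch = [
    rename_command(str(i), "employee_name", "Mr.Smith", "Mr.Brown")
    for i in range(1, 4)
]
body = json.dumps(batch)
```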
Re: About editing managed-schema by hand
Thank you for the information, but I am still confused about the specification of managed-schema. I understand that I cannot modify the unique key or the Similarity via the Schema API at present.

* https://issues.apache.org/jira/browse/SOLR-7242

Is there any way other than hand-editing in this particular case? Is my understanding correct that managed-schema is not limited to being modified only via the Schema API: we usually modify it via the Schema API, but we can also hand-edit it for things the Schema API cannot do? Needless to say, I understand the assumption that we do not use the Schema API and hand-editing at the same time.

Thanks,
Issei

2017-03-02 10:15 GMT+09:00 Shawn Heisey <apa...@elyograg.org>:
> On 2/27/2017 4:46 AM, Issei Nishigata wrote:
> > Thank you for your reply. If I had to say which one, I'd maybe be
> > talking about the concept for Solr. I understand we should use
> > "ClassicSchemaFactory" when we want to hand-edit, but why are there
> > two files, schema.xml and managed-schema, given that we can
> > hand-edit managed-schema? If we can modify schema.xml through the
> > Schema API, I think we won't need managed-schema, but is there any
> > reason why that can't be done? Could you please let me know if there
> > is any information that can clear things up around those details?
>
> The default filename with the managed schema factory is managed-schema
> -- no extension. I'm pretty sure that the reason the extension was
> removed was to discourage hand-editing. If you use both hand-editing
> and API modification, you can lose some (or maybe all) of your hand edits.
>
> The default filename for the schema with the classic factory is
> schema.xml. With this factory, API modification is not possible.
>
> If the managed factory is in use, and a schema.xml file is found during
> startup, the system will rename managed-schema (or whatever the config
> says to use) to something else, then rename schema.xml to managed-schema
> -- basically this is a startup-only way to support a legacy config.
>
> I personally don't ever plan to use the managed schema API, but I will
> leave the default factory in place and hand-edit managed-schema, just
> like I did in previous versions with schema.xml.
>
> Thanks,
> Shawn
Re: About editing managed-schema by hand
Thank you for your reply. If I had to say which one, I'd maybe be talking about the concept for Solr. I understand we should use "ClassicSchemaFactory" when we want to hand-edit, but why are there two files, schema.xml and managed-schema, given that we can hand-edit managed-schema? If we can modify schema.xml through the Schema API, I think we won't need managed-schema, but is there any reason why that can't be done? Could you please let me know if there is any information that can clear things up around those details?

Thanks,
Issei

2017-02-27 1:51 GMT+09:00 Erick Erickson <erickerick...@gmail.com>:
> This is the sequence that gets you in trouble:
>
> start Solr
>
> hand-edit the schema _without_ reloading your collection or restarting
> all your Solr instances.
>
> use the managed-schema API to make modifications.
>
> In this scenario your hand edits can be lost, since the in-memory version
> of the schema is written out without re-fetching it from ZooKeeper.
>
> If you only ever hand-edit your schema, you'll be fine.
>
> If you conscientiously reload your collection (or restart all your Solrs)
> after you hand-edit your schema, you'll be fine even if you use the
> managed schema API calls.
>
> But really, if you want to hand-edit your schema, why not go back to using
> the ClassicSchemaFactory? See:
> https://cwiki.apache.org/confluence/display/solr/Schema+Factory+Definition+in+SolrConfig#SchemaFactoryDefinitioninSolrConfig-SwitchingfromManagedSchematoManuallyEditedschema.xml
>
> Best,
> Erick
>
> On Sun, Feb 26, 2017 at 8:22 AM, Issei Nishigata <duo.2...@gmail.com> wrote:
> > Hi, All
> >
> > Similar questions may have been asked already, but please let me ask just
> > in case.
> > According to the URL below, it says "Schema modifications via the Schema
> > API will now be enabled by default.",
> > but would there be any issues if I edited the schema with a text editor
> > instead of the Schema API?
> >
> > https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+5+to+Solr+6
> >
> > In the answer to a past question, it seemed okay:
> >
> > http://lucene.472066.n3.nabble.com/Solr-6-managed-schema-amp-version-control-td4289243.html
> >
> > I was worried because managed-schema said "" when it was
> > automatically generated from schema.xml.
> > If I need to use the Schema API, and I want to do something that cannot
> > be done with the Schema API (modifying the unique key, etc.), what
> > should I do?
> >
> > Thanks,
> > Issei
About editing managed-schema by hand
Hi, All

Similar questions may have been asked already, but please let me ask just in case. According to the URL below, it says "Schema modifications via the Schema API will now be enabled by default.", but would there be any issues if I edited the schema with a text editor instead of the Schema API?

https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+5+to+Solr+6

In the answer to a past question, it seemed okay:

http://lucene.472066.n3.nabble.com/Solr-6-managed-schema-amp-version-control-td4289243.html

I was worried because managed-schema said "" when it was automatically generated from schema.xml. If I need to use the Schema API, and I want to do something that cannot be done with the Schema API (modifying the unique key, etc.), what should I do?

Thanks,
Issei
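For context, a Schema API modification is an HTTP POST of a JSON command to /solr/<collection>/schema; the field shown here is purely illustrative:

```json
{
  "add-field": {
    "name": "title",
    "type": "text_general",
    "stored": true
  }
}
```

Operations such as add-field, replace-field, and delete-field are supported this way, while changes such as the uniqueKey are not, which is the limitation this thread is about.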
About reasons of "enablePositionIncrements" deprecation
Hi, all.

I am using Solr 5.5. I am planning to re-enable "enablePositionIncrements" by customizing "XXXFilterFactory" in order to make autoGeneratePhraseQueries work properly. Can anyone tell me about the impact of such a customization? I am planning two kinds of customization, mentioned below.

- Allow schema.xml to specify "enablePositionIncrements".
- Make "lucene43XXXFilter" be called in the case of "enablePositionIncrements=false".

I would appreciate it if you could tell me why "enablePositionIncrements" was deprecated after Solr 4.4 in the first place. I would also appreciate knowing why "lucene43XXXFilter" was removed completely in Solr 6.

Thanks,
Issei
After Solr 5.5, mm parameter doesn't work properly
Hi,

The "mm" parameter does not work properly when I set "q.op=AND" after Solr 5.5. In Solr 5.4, the mm parameter works as expected with the following settings.

---
[schema]

[request]
http://localhost:8983/solr/collection1/select?defType=edismax&q.op=AND&mm=2&q=solar
---

After Solr 5.5, the result is not the same as in Solr 5.4. Has the specification of the mm parameter, or the description of the file settings, changed?

[Solr 5.4]
...
mm: 2
q: solar
defType: edismax
q.op: AND
...
parsedquery: (+DisjunctionMaxQuery(((text:so text:ol text:la text:ar)~2)))/no_coord
parsedquery_toString: +(((text:so text:ol text:la text:ar)~2))
...

[Solr 6.0.1]
...
parsedquery: (+DisjunctionMaxQuery((+text:so +text:ol +text:la +text:ar)))/no_coord
parsedquery_toString: +((+text:so +text:ol +text:la +text:ar))
...

As shown above, the parsedquery also differs between Solr 5.4 and Solr 6.0.1 (after Solr 5.5).
---

Thanks,
Issei Nishigata