How to change the JVM Threads of SolrCloud

2021-01-21 Thread Issei Nishigata
Hello All,

I'm running SolrCloud (1 shard, 9 replicas) on Amazon EKS.

The other day I accidentally stopped CoreDNS on EKS, and the entire Solr
cluster went down because the nodes could no longer resolve each other's
names.
I restarted CoreDNS shortly afterwards, but the Solr nodes just kept cycling
between down and recovering, and they did not return to a normal state on
their own.

During this time Solr kept accepting search requests, so I stopped all
search traffic completely.
After that, I executed DELETEREPLICA to reduce the cluster to a single Solr
node.
I then added replicas back little by little, and once the cluster had fully
returned to its original state I resumed the search traffic. After that, no
particular problem occurred.

At the time of this failure, the JVM Threads value on each node was stuck at
1.
Since the load was very high, that is probably why each node kept going down
and recovering.
If I reduced (or increased) the number of JVM threads, would the Solr
cluster return to a normal state automatically?
If so, which setting in solrconfig.xml should I change to reduce (or
increase) the number of JVM threads?
I think "maxConnectionsPerHost" and "maximumPoolSize" are related to this
issue, but I'm not sure about the difference between the two.
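
For context, a sketch of where I understand those two settings live in
solrconfig.xml, assuming the stock HttpShardHandlerFactory; the values below
are only placeholders, not a recommendation:
---
<requestHandler name="/select" class="solr.SearchHandler">
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <!-- max concurrent HTTP connections to any one remote node -->
    <int name="maxConnectionsPerHost">20</int>
    <!-- thread pool that services distributed (shard) requests -->
    <int name="corePoolSize">0</int>
    <int name="maximumPoolSize">2147483647</int>
    <!-- seconds an idle pool thread may wait before being retired -->
    <int name="maxThreadIdleTime">5</int>
  </shardHandlerFactory>
</requestHandler>
---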

Any help would be appreciated.

Thanks, Issei


Re: DIH on SolrCloud

2020-08-13 Thread Issei Nishigata
Thank you for your quick reply.
Just to confirm: the indexing is not performed on the node where DIH is
executed, but on the leader node, right?

As far as I can see in the logs, there are errors showing that connection
attempts from Node2 (a replica, where DIH was running) to Node9 (also a
replica) failed.
My understanding was therefore that the errors occur when DIH runs on Node2
and tries to forward a tlog to Node9.

Even if Node9 does not receive the tlog, as long as Node1 (the leader)
receives it, I believe there is nothing to worry about, because Node9 will
be synchronised with Node1 again.
But if Node1 as the leader cannot receive the tlog, the replicas would soon
be synchronised to that leader, and that would be a problem for me.
I will go through the log files of all the servers to try to find the cause,
but could you comment on whether my understanding of the indexing
architecture in SolrCloud is correct?
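
In the meantime, this is the kind of non-DIH indexing I understand is being
recommended: posting documents straight to the normal update handler. The
collection name and fields below are only placeholders:
---
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/collection1/update?commit=true' \
  --data-binary '[{"id":"1","title_s":"sample document"}]'
---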


Thanks,
Issei

2020-08-14 (Fri) 0:33 Jörn Franke :

> DIH is deprecated in current Solr versions. The general recommendation is
> to do the processing outside the Solr server and use the update handler
> (the normal one, not Cell) to add documents to the index. So you should
> avoid using DIH, as it is not future proof.
>
> If you need more time to migrate to a non-DIH solution:
> I recommend looking at the log files of all servers to find the real error
> behind the issue. If you trigger DIH in SolrCloud mode from node 2, that
> does not mean it is executed there!
>
> What could go wrong:
> other nodes may not have access to the files/database, or there may be a
> parsing error or a script error.
>
> > On 13.08.2020 at 17:21, Issei Nishigata wrote:
> >
> > Hi, All
> >
> > I'm using Solr 4.10 in SolrCloud mode.
> > I have 10 nodes: one of them is the leader and the others are replicas
> > (I will call them Node1 to Node10 for convenience).
> > -> 1 shard, 1 leader (Node1), 9 replicas (Node2-10)
> > Indexing always uses the DIH on Node2, so DIH may run while Node2 is
> > either the leader or a replica. Node2 is not forced to become the leader
> > when DIH is executed.
> >
> > At one point, when Node2 ran DIH while in the replica state, the
> > following error occurred on Node9.
> >
> > [updateExecutor-1-thread-9737][ERROR][org.apache.solr.common.SolrException]
> > - org.apache.solr.client.solrj.SolrServerException: IOException occured
> > when talking to server at: http://samplehost:8983/solr/test_shard1_replica9
> >
> > I think this error happened while sending data from Node2 to Node9, and
> > Node9 could not respond for some reason.
> >
> > The error occurs only occasionally and is not reproducible, which makes
> > the investigation difficult.
> > Is there any likely cause for this problem? I am worried that this setup
> > may be a Solr anti-pattern.
> > My concern is that when DIH runs on Node2 as a replica and the above
> > error occurs towards Node1 as the leader, all the nodes might soon
> > afterwards fall back to the index of Node1.
> > Does my understanding make sense?
> >
> > If using DIH on SolrCloud is not recommended, please let me know.
> >
> > Thanks,
> > Issei
>


DIH on SolrCloud

2020-08-13 Thread Issei Nishigata
Hi, All

I'm using Solr 4.10 in SolrCloud mode.
I have 10 nodes: one of them is the leader and the others are replicas
(I will call them Node1 to Node10 for convenience).
-> 1 shard, 1 leader (Node1), 9 replicas (Node2-10)
Indexing always uses the DIH on Node2, so DIH may run while Node2 is either
the leader or a replica. Node2 is not forced to become the leader when DIH
is executed.

At one point, when Node2 ran DIH while in the replica state, the following
error occurred on Node9.

[updateExecutor-1-thread-9737][ERROR][org.apache.solr.common.SolrException]
- org.apache.solr.client.solrj.SolrServerException: IOException occured
when talking to server at: http://samplehost:8983/solr/test_shard1_replica9

I think this error happened while sending data from Node2 to Node9, and
Node9 could not respond for some reason.

The error occurs only occasionally and is not reproducible, which makes the
investigation difficult.
Is there any likely cause for this problem? I am worried that this setup may
be a Solr anti-pattern.
My concern is that when DIH runs on Node2 as a replica and the above error
occurs towards Node1 as the leader, all the nodes might soon afterwards fall
back to the index of Node1.
Does my understanding make sense?

If using DIH on SolrCloud is not recommended, please let me know.

Thanks,
Issei


Re: AtomicUpdate on SolrCloud is not working

2020-07-17 Thread Issei Nishigata
I have the same problem with Solr 8.
I think it happens because, with the first configuration,
TrimFieldUpdateProcessorFactory and RemoveBlankFieldUpdateProcessorFactory
do not take effect.

In SolrCloud, TrimFieldUpdateProcessorFactory,
RemoveBlankFieldUpdateProcessorFactory and the other processors placed
before DistributedUpdateProcessor only run on the first node that receives
an update request.
Consequently, TrimFieldUpdateProcessorFactory and
RemoveBlankFieldUpdateProcessorFactory need to run after the document has
been handed to the replica nodes by the DistributedUpdateProcessor,
so we need to use the second configuration he described; otherwise it will
not work properly.
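
To make that ordering concrete, a minimal sketch of the kind of chain I
mean; the chain name and the Log processor are only examples, and what
matters here is just the position of the two factories relative to
DistributedUpdateProcessorFactory:
---
<updateRequestProcessorChain name="sample-chain" default="true">
  <!-- runs only on the node that first receives the update request -->
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- forwards the document to the leader and replicas -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <!-- these run on every node, after distribution -->
  <processor class="solr.TrimFieldUpdateProcessorFactory"/>
  <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
---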

But even with that configuration, both he and I are worried that it may
trigger SOLR-8030.
I would also like to know about this; does anyone have any comment?


Best,
Issei

2020-07-17 (Fri) 18:34 Jörn Franke :

> What does "not work correctly" mean?
>
> Have you checked that all fields are stored or doc values?
>
> > On 17.07.2020 at 11:26, yo tomi wrote:
> >
> > Hi All
> >
> > Sorry, the two settings above are swapped.
> > Actually, the following setting does not work properly.
> > ---
> > (updateRequestProcessorChain configuration stripped by the mailing-list archive)
> > ---
> > And the following works as expected.
> > ---
> > (updateRequestProcessorChain configuration stripped by the mailing-list archive)
> > ---
> >
> > Thanks,
> > Yoshiaki
> >
> >
2020-07-17 (Fri) 16:32 yo tomi :
> >
> >> Hi, All
> >> When I do an atomic update on SolrCloud with the following setting, it
> >> does not work properly.
> >>
> >> ---
> >> (updateRequestProcessorChain configuration stripped by the mailing-list archive)
> >> ---
> >> When I changed it as follows, it worked as expected.
> >> ---
> >> (updateRequestProcessorChain configuration stripped by the mailing-list archive)
> >> ---
> >> The latter setting and the approach of using a post-processor should
> >> produce the same result, I thought, but the SOLR-8030 bug makes me
> >> reluctant to use a post-processor.
> >> Even with the latter setting, is there any possibility of hitting
> >> SOLR-8030? Looking at the source code, the tlog that comes from the
> >> leader to a replica seems to be processed correctly by the
> >> UpdateRequestProcessor, so I thought the latter setting should not be
> >> affected by that bug.
> >> Does anyone know the most appropriate way to configure atomic updates
> >> on SolrCloud?
> >>
> >> Thanks,
> >> Yoshiaki
> >>
> >>
>


facet.threads on JSON Facet

2020-03-01 Thread Issei Nishigata

Hi, All


Is facet.threads available with JSON Facets? If it is, how do I specify it
in the request parameters?
I'm using facet.threads with a JSON Facet request like the one below, but I
can't see any performance difference before and after specifying
facet.threads=-1.

localhost:8983/solr/collection1/select?q=test&facet.threads=-1&json.facet={category:{type:terms,field:category,limit:-1},content_type:{type:terms,field:content_type,limit:-1}}
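
For completeness, the same request issued with curl so that the JSON is
properly URL-encoded; the collection name is as above:
---
curl "http://localhost:8983/solr/collection1/select" \
  --data-urlencode "q=test" \
  --data-urlencode "facet.threads=-1" \
  --data-urlencode "json.facet={category:{type:terms,field:category,limit:-1},content_type:{type:terms,field:content_type,limit:-1}}"
---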



I'm using Solr 8.4.1.
The machine running Solr has 8 cores.

Any clue would be much appreciated.


Sincerely,
Issei Nishigata


Failed to create collection

2019-02-10 Thread Issei Nishigata

Hello, all.


I have 1 collection running, and when I tried to create a new collection with 
the following command,
-
$ solr-6.2.0/bin/solr create -c collection2 -d data_driven_schema_configs
-

I got the following error.
-
Connecting to ZooKeeper at sample1:2181,sample2:2182,sample3:2183 ...
Uploading /tmp/solr-6.2.0/server/solr/configsets/data_driven_schema_configs/conf for config collection2 to ZooKeeper at 
sample1:2181,sample2:2182,sample3:2183


Creating new collection 'collection2' using command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=collection2&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=collection2


ERROR: Failed to create collection 'collection2' due to: Could not fully create 
collection: portal2
-

I can see collection2 in the collections list of the Solr Admin UI, but it
does not appear in the Cloud graph view or in the collection selector.
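
In case it matters, my plan for cleaning up the half-created collection
before retrying is simply the Collections API DELETE call below; please tell
me if that is the wrong approach:
---
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=collection2"
---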


Does anyone know the cause of this error?
Could you please help me with how to resolve it?


Regards,
Issei


Performance if there is a large number of fields

2018-05-10 Thread Issei Nishigata
Hi, all


I am designing a schema.

As a trial I calculated the number of fields I need, and found that it is at
least 35,000.
I do not use all of these fields in a single document.
Each document uses at most 300 fields, and the remaining 34,700 fields are
not used.

Does this way of using fields affect performance, for example for retrieval
and sorting?
If it does, what kind of alternative approach is there?
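
One alternative I am considering, sketched below with made-up field names
and types, is to declare a small number of dynamicField patterns instead of
35,000 explicit fields, so that each document only carries the fields it
actually uses:
---
<dynamicField name="*_s"  type="string" indexed="true" stored="true"/>
<dynamicField name="*_i"  type="int"    indexed="true" stored="true"/>
<dynamicField name="*_dt" type="date"   indexed="true" stored="true"/>
---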


Thanks,
Issei

-- 
Issei Nishigata


How to replace values in a multiValued field all at once with one query

2018-05-10 Thread Issei Nishigata
Hi, all


I have a field called employee_name, which I use as a multiValued field.
If "Mr.Smith", one of the values of this field, changes to "Mr.Brown", and
"Mr.Smith" appears in 1 million documents, do I have to issue 1 million
delete and update requests?

Is there a simple way to do the update with only one query?
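
For reference, the closest thing I know of is a per-document atomic update
with the "remove" and "add" modifiers, but it still has to be sent once per
affected document, which is exactly what I would like to avoid; the id below
is a placeholder:
---
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/collection1/update?commit=true' \
  --data-binary '[{"id":"doc-1","employee_name":{"remove":"Mr.Smith","add":"Mr.Brown"}}]'
---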


Thanks,
Issei

-- 
Issei Nishigata


Re: About editing managed-schema by hand

2017-03-20 Thread Issei Nishigata
Thank you for this information.

I am still a little confused about the managed-schema specification, though.

I understand that the unique key and the Similarity cannot currently be
modified via the Schema API.
* https://issues.apache.org/jira/browse/SOLR-7242

Is there any way other than hand-editing in this particular case?


Is my understanding correct that managed-schema is not restricted to being
modified only via the Schema API, but rather that we usually modify it via
the Schema API and can also hand-edit it for the things the Schema API
cannot do?

Needless to say, I understand the assumption that we should not mix Schema
API calls and hand-editing at the same time.
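
(For completeness, my understanding is that switching to hand-edit-only mode
is a one-line change in solrconfig.xml, roughly like this:)
---
<schemaFactory class="ClassicIndexSchemaFactory"/>
---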



Thanks,
Issei

2017-03-02 10:15 GMT+09:00 Shawn Heisey <apa...@elyograg.org>:

> On 2/27/2017 4:46 AM, Issei Nishigata wrote:
> > Thank you for your reply. If I had to say which, I am probably asking
> > about the design concept in Solr. I understand we should use
> > "ClassicSchemaFactory" when we want to hand-edit, but why are there two
> > files, schema.xml and managed-schema, given that we can also hand-edit
> > managed-schema? If we could modify schema.xml through the Schema API, I
> > think we would not need managed-schema, but is there any reason why that
> > can't be done? Could you please let me know if there is any information
> > that clears up those details?
>
> The default filename with the Managed Schema factory is managed-schema
> -- no extension.  I'm pretty sure that the reason the extension was
> removed was to discourage hand-editing.  If you use both hand-editing
> and API modification, you can lose some (or maybe all) of your hand edits.
>
> The default filename for the schema with the classic factory is
> schema.xml.  With this factory, API modification is not possible.
>
> If the managed factory is in use, and a schema.xml file is found during
> startup, the system will rename managed-schema (or whatever the config
> says to use) to something else, then rename schema.xml to managed-schema
> -- basically this is a startup-only way to support a legacy config.
>
> I personally don't ever plan to use the managed schema API, but I will
> leave the default factory in place, and hand-edit managed-schema, just
> like I did in previous versions with schema.xml.
>
> Thanks,
> Shawn
>
>


Re: About editing managed-schema by hand

2017-02-27 Thread Issei Nishigata
Thank you for your reply.

If I had to say which, I am probably asking about the design concept in
Solr. I understand we should use "ClassicSchemaFactory" when we want to
hand-edit, but why are there two files, schema.xml and managed-schema, given
that we can also hand-edit managed-schema? If we could modify schema.xml
through the Schema API, I think we would not need managed-schema, but is
there any reason why that can't be done?
Could you please let me know if there is any information that clears up
those details?

Thanks,
Issei

2017-02-27 1:51 GMT+09:00 Erick Erickson <erickerick...@gmail.com>:

> This is the sequence that gets you in trouble:
> > start solr
> > hand edit the schema _without_ reloading your collection or restarting
> all your Solr instances.
> > use the managed-schema API to make modifications.
>
> In this scenario your hand-edits can be lost since the in-memory version
> of the
> schema is written out without re-fetching it from Zookeeper.
>
> If you only ever hand-edit your schema, you'll be fine.
>
> If you conscientiously reload your collection (or restart all your Solr's)
> after
> you hand-edit your schema, you'll be fine even if you use the managed
> schema
> API calls.
>
> But really, if you want to hand-edit your schema why not go back to using
> the ClassicSchemaFactory? See:
> https://cwiki.apache.org/confluence/display/solr/Schema+Factory+Definition+in+SolrConfig#SchemaFactoryDefinitioninSolrConfig-SwitchingfromManagedSchematoManuallyEditedschema.xml
>
> Best,
> Erick
>
> On Sun, Feb 26, 2017 at 8:22 AM, Issei Nishigata <duo.2...@gmail.com>
> wrote:
> > Hi, All
> >
> > Similar questions may have been asked already, but please let me ask
> > just in case.
> > The URL below says that "Schema modifications via the Schema API will
> > now be enabled by default", but would there be any issues if I edited
> > the schema with a text editor instead of using the Schema API?
> >
> > https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+5+to+Solr+6
> >
> >
> > In the answer to a past question, it seemed to be okay.
> >
> > http://lucene.472066.n3.nabble.com/Solr-6-managed-schema-amp-version-control-td4289243.html
> >
> >
> > I was worried because the managed-schema file that is automatically
> > generated from schema.xml contains a comment saying "DO NOT EDIT".
> > If I need to use the Schema API but also want to do something that
> > cannot be done with the Schema API (modifying the unique key, etc.),
> > what should I do?
> >
> >
> > Thanks,
> > Issei
>


About editing managed-schema by hand

2017-02-26 Thread Issei Nishigata
Hi, All

Similar questions may have been asked already, but please let me ask just in
case.
The URL below says that "Schema modifications via the Schema API will now be
enabled by default", but would there be any issues if I edited the schema
with a text editor instead of using the Schema API?

https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+5+to+Solr+6


In the answer to a past question, it seemed to be okay.

http://lucene.472066.n3.nabble.com/Solr-6-managed-schema-amp-version-control-td4289243.html


I was worried because the managed-schema file that is automatically
generated from schema.xml contains a comment saying "DO NOT EDIT".
If I need to use the Schema API but also want to do something that cannot be
done with the Schema API (modifying the unique key, etc.), what should I do?
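
For reference, by "modifications via the Schema API" I mean calls like the
one below; the field name and type are only an example:
---
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/collection1/schema' \
  --data-binary '{"add-field":{"name":"sample_field","type":"string","stored":true}}'
---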


Thanks,
Issei


About reasons of "enablePositionIncrements" deprecation

2016-06-19 Thread Issei Nishigata
Hi, all.

I am using Solr 5.5.

I am planning to make "enablePositionIncrements" configurable again by
customizing "XXXFilterFactory", in order to make autoGeneratePhraseQueries
work properly.
Can anyone tell me about the impact of such a customization?

I am planning the two customizations below (a rough sketch follows the
list).
- Allow schema.xml to specify "enablePositionIncrements" again.
- Call the "lucene43XXXFilter" variant when "enablePositionIncrements=false"
  is specified.

I would appreciate it if you could tell me why "enablePositionIncrements"
was deprecated after Solr 4.4 in the first place.
I would also appreciate knowing why the "lucene43XXXFilter" classes were
completely removed in Solr 6.


Thanks,
Issei


After Solr 5.5, mm parameter doesn't work properly

2016-05-29 Thread Issei Nishigata
Hi,

“mm" parameter does not work properly, when I set "q.op=AND” after Solr 5.5.
In Solr 5.4, mm parameter works expectedly with the following setting.

---
[schema]

(schema snippet stripped by the mailing-list archive)

[request]
http://localhost:8983/solr/collection1/select?defType=edismax&q.op=AND&mm=2&q=solar
---

From Solr 5.5 onwards, the result is not the same as in Solr 5.4.
Has the specification of the mm parameter, or the way it should be
configured, changed?
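
For completeness, the exact request I use to produce the debug output below,
spelled out with explicit parameter encoding and debugQuery added; the
collection name is as above:
---
curl "http://localhost:8983/solr/collection1/select" \
  --data-urlencode "q=solar" \
  --data-urlencode "defType=edismax" \
  --data-urlencode "q.op=AND" \
  --data-urlencode "mm=2" \
  --data-urlencode "debugQuery=true"
---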


[Solr 5.4]

(the XML tags of the response were stripped by the mailing-list archive;
the recovered values are shown as name: value)

params: mm=2, q=solar, defType=edismax, q.op=AND
result: (garbled; it appears to contain a matching document with the value
"solr")

debug:
  rawquerystring: solar
  querystring: solar
  parsedquery: (+DisjunctionMaxQuery(((text:so text:ol text:la text:ar)~2)))/no_coord
  parsedquery_toString: +(((text:so text:ol text:la text:ar)~2))

[Solr 6.0.1]

params: mm=2, q=solar, defType=edismax, q.op=AND

debug:
  rawquerystring: solar
  querystring: solar
  parsedquery: (+DisjunctionMaxQuery(((+text:so +text:ol +text:la +text:ar))))/no_coord
  parsedquery_toString: +((+text:so +text:ol +text:la +text:ar))


As shown above, the parsedquery also differs between Solr 5.4 and Solr 6.0.1
(i.e., after Solr 5.5).


--
Thanks,
Issei Nishigata