Re: Use case of UTILIZENODE API

2020-06-19 Thread ChienHuaWang
For anyone interested in this API:

It appears to be a known issue that UTILIZENODE does not enforce policy rules,
as reported in Jira:
https://issues.apache.org/jira/browse/SOLR-12050

Please feel free to comment.



Regards,
Chien





Vector Scoring Plugin for Solr : Dot Product and Cosine Similarity

2020-06-19 Thread Vincenzo D'Amore
Hi all,

I've started to look at image similarity. For each image in my documents I
have a couple of vectors that represent its color and shape for similarity
purposes.

As a starting point, a colleague suggested that I look at this project:

https://github.com/moshebla/solr-vector-scoring

but the project seems old and unmaintained. On the other hand, looking
at the Solr documentation I see there is support for dot product and cosine
similarity:

https://lucene.apache.org/solr/guide/7_5/vector-math.html#dot-product-and-cosine-similarity

So Solr is able to calculate the similarity between two vectors,
but it is not clear how to use this native feature when searching. Am I
missing something? Any help or suggestions would be appreciated.

Best regards,
Vincenzo


-- 
Vincenzo D'Amore
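
For reference, the two measures asked about in the message above are simple
to compute outside Solr. The following is only a minimal Python sketch of the
math, not of any Solr API; the function names and the sample vectors are made
up for illustration.

```python
import math

def dot_product(a, b):
    """Sum of the element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector norms."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)

# Hypothetical color vectors for a query image and a stored image.
query_vec = [1.0, 2.0, 3.0]
doc_vec = [2.0, 4.0, 6.0]
print(dot_product(query_vec, doc_vec))
print(cosine_similarity(query_vec, doc_vec))
```

The dot product grows with vector length, while cosine similarity only
depends on direction, so which one ranks images better depends on how the
vectors were produced.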


Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Walter Underwood
> On Jun 19, 2020, at 7:48 AM, Phill Campbell  
> wrote:
> 
> Delegator - Handler
> 
> A common pattern we are all aware of. Pretty simple.

The Solr master does not delegate and the slave does not handle.
The master is a server that handles replication requests from the
slave.

Delegator/handler is a common pattern, but it is not the pattern
that describes traditional Solr replication.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)



Re: Getting rid of Overseer nomenclature in Solr

2020-06-19 Thread Ilan Ginzburg
The thing is, we can get a cluster-wide lock through ZooKeeper when needed.
BTW, most of the time it's a collection-wide lock (or locks on shards
or replicas of a collection).
Already today (in master) we could have multiple Overseers and distribute
(partition) the collections among them...

In 8x we still need cluster wide locks for /clusterstate.json based
operations.

Ilan

On Fri, 19 Jun 2020 at 20:52, Jan Høydahl wrote:

> supervisor
>
> The name change should probably happen as part of the major overseer
> overhaul that is planned. Perhaps there won’t be much left of the overseer
> if more operations can be done without a cluster-wide lock?
>
> Jan
>
> > On 19 Jun 2020 at 20:03, Walter Underwood wrote:
> >
> > I just split this off with a different subject line for the “overseer”
> > discussion.
> > That seems independent of the other choices.
> >
> > I’ve heard these suggestions:
> >
> > * orchestrator
> > * director
> > * coordinator
> > * cluster manager
> > * manager
> >
> > There is a thing called “process orchestration” which is at a higher level
> > than what the overseer does. It might be something like all the customer
> > interactions in a billing process. That usage might be confusing for the
> > term “orchestrator”.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >> On Jun 18, 2020, at 10:44 PM, Thomas Corthals wrote:
> >>
> >> Since "overseer" is also problematic, I'd like to propose "orchestrator"
> >> as an alternative.
> >>
> >> Thomas
> >
>
>


Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread David Cumings
Time to decouple from the weighty semantics of human experience and look to
nature?

queens/workers/drones/swarms?




On Wed, 17 Jun 2020 at 20:38, Anshum Gupta  wrote:

> Hi everyone,
>
> Moving a conversation that was happening on the PMC list to the public
> forum. Most of the following is just me recapping the conversation that has
> happened so far.
>
> Some members of the community have been discussing getting rid of the
> master/slave nomenclature from Solr.
>
> While this may require a non-trivial effort, a general consensus so far
> seems to be to start this process and switch over incrementally, if a
> single change ends up being too big.
>
> There have been a lot of suggestions around what the new nomenclature might
> look like, a few people don’t want to overlap the naming here with what
> already exists in SolrCloud i.e. leader/follower.
>
> Primary/Replica was an option that was suggested based on what other
> vendors are moving towards based on Wikipedia:
> https://en.wikipedia.org/wiki/Master/slave_(technology)
> , however there were concerns around the use of “replica” as that denotes a
> very specific concept in SolrCloud. Current terminology clearly
> differentiates the use of the traditional replication model from SolrCloud
> and reusing the names would make it difficult for that to happen.
>
> There were similar concerns around using Leader/follower.
>
> Let’s continue this conversation here while making sure that we converge
> without much bike-shedding.
>
> -Anshum
>


-- 
David Cumings
AU: +61 498 137 841
US: +1 (929) 291-0801
UK: +44 7725 057 500 <-- Currently in the UK
IN: +91 82771 96058
d...@cumings.com


Re: Getting rid of Overseer nomenclature in Solr

2020-06-19 Thread Jan Høydahl
supervisor

The name change should probably happen as part of the major overseer overhaul 
that is planned. Perhaps there won’t be much left of the overseer if more 
operations can be done without a cluster-wide lock?

Jan

> On 19 Jun 2020 at 20:03, Walter Underwood wrote:
> 
> I just split this off with a different subject line for the “overseer” 
> discussion.
> That seems independent of the other choices.
> 
> I’ve heard these suggestions:
> 
> * orchestrator
> * director
> * coordinator
> * cluster manager
> * manager
> 
> There is a thing called “process orchestration” which is at a higher level
> than what the overseer does. It might be something like all the customer
> interactions in a billing process. That usage might be confusing for the
> term “orchestrator”.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Jun 18, 2020, at 10:44 PM, Thomas Corthals  wrote:
>> 
>> Since "overseer" is also problematic, I'd like to propose "orchestrator" as
>> an alternative.
>> 
>> Thomas
> 



Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Jörn Franke
Might be confusing with the nested doc terminology 

> On 19 Jun 2020 at 20:14, Atita Arora wrote:
> 
> I see so many topics being discussed in this thread that I literally got lost
> somewhere, but I was just thinking: can we call it a Parent-Child
> architecture? I'm sure no one will raise an objection there.
> 
> Although, looking at comments above I still feel it would be a bigger
> effort to convince everyone than making a change. ;)
> 
>> On Fri, 19 Jun 2020, 17:21 Mark H. Wood,  wrote:
>> 
>>> On Fri, Jun 19, 2020 at 09:22:49AM -0400, j.s. wrote:
>>> On 6/18/20 9:50 PM, Rahul Goswami wrote:
>>>> So +1 on "slave" being the problematic term IMO, not "master".
>>> 
>>> but you cannot have a master without a slave, n'est-ce pas?
>> 
>> Well, yes.  In education:  Master of Science, Arts, etc.  In law:
>> Special Master (basically a judge's delegate).  See also "magistrate."
>> None of these has any connotation of the ownership of one person by
>> another.
>> 
>> (It's a one-way relationship:  there is no slavery without mastery,
>> but there are other kinds of mastery.)
>> 
>> But this is an emotional issue, not a logical one.  If doing X makes
>> people angry, and we don't want to make those people angry, then
>> perhaps we should not do X.
>> 
>>> i think it is better to use the metaphor of copying rather than one of
>>> hierarchy. language has so many (unintended) consequences ...
>> 
>> Sensible.
>> 
>> --
>> Mark H. Wood
>> Lead Technology Analyst
>> 
>> University Library
>> Indiana University - Purdue University Indianapolis
>> 755 W. Michigan Street
>> Indianapolis, IN 46202
>> 317-274-0749
>> www.ulib.iupui.edu
>> 


Re: unified highlighter performance in solr 8.5.1

2020-06-19 Thread Nándor Mátravölgyi
Hi!

With the provided test I've profiled the preceding() and following()
calls on the base Java iterators in the different options.

=== default highlighter arguments ===
Calling the test query with SENTENCE base iterator:
- from LengthGoalBreakIterator.following(): 1130 calls of
baseIter.preceding() took 1.039629 seconds in total
- from LengthGoalBreakIterator.following(): 1140 calls of
baseIter.following() took 0.340679 seconds in total
- from LengthGoalBreakIterator.preceding(): 1150 calls of
baseIter.preceding() took 0.099344 seconds in total
- from LengthGoalBreakIterator.preceding(): 1100 calls of
baseIter.following() took 0.015156 seconds in total

Calling the test query with WORD base iterator:
- from LengthGoalBreakIterator.following(): 1200 calls of
baseIter.preceding() took 0.001006 seconds in total
- from LengthGoalBreakIterator.following(): 1700 calls of
baseIter.following() took 0.006278 seconds in total
- from LengthGoalBreakIterator.preceding(): 1710 calls of
baseIter.preceding() took 0.016320 seconds in total
- from LengthGoalBreakIterator.preceding(): 1090 calls of
baseIter.following() took 0.000527 seconds in total

=== hl.fragsizeIsMinimum=true&hl.fragAlignRatio=0 ===
Calling the test query with SENTENCE base iterator:
- from LengthGoalBreakIterator.following(): 860 calls of
baseIter.following() took 0.012593 seconds in total
- from LengthGoalBreakIterator.preceding(): 870 calls of
baseIter.preceding() took 0.022208 seconds in total

Calling the test query with WORD base iterator:
- from LengthGoalBreakIterator.following(): 1360 calls of
baseIter.following() took 0.004789 seconds in total
- from LengthGoalBreakIterator.preceding(): 1370 calls of
baseIter.preceding() took 0.015983 seconds in total

=== hl.fragsizeIsMinimum=true ===
Calling the test query with SENTENCE base iterator:
- from LengthGoalBreakIterator.following(): 980 calls of
baseIter.following() took 0.010253 seconds in total
- from LengthGoalBreakIterator.preceding(): 980 calls of
baseIter.preceding() took 0.341997 seconds in total

Calling the test query with WORD base iterator:
- from LengthGoalBreakIterator.following(): 1670 calls of
baseIter.following() took 0.005150 seconds in total
- from LengthGoalBreakIterator.preceding(): 1680 calls of
baseIter.preceding() took 0.013657 seconds in total

=== hl.fragAlignRatio=0 ===
Calling the test query with SENTENCE base iterator:
- from LengthGoalBreakIterator.following(): 1070 calls of
baseIter.preceding() took 1.312056 seconds in total
- from LengthGoalBreakIterator.following(): 1080 calls of
baseIter.following() took 0.678575 seconds in total
- from LengthGoalBreakIterator.preceding(): 1080 calls of
baseIter.preceding() took 0.020507 seconds in total
- from LengthGoalBreakIterator.preceding(): 1080 calls of
baseIter.following() took 0.006977 seconds in total

Calling the test query with WORD base iterator:
- from LengthGoalBreakIterator.following(): 880 calls of
baseIter.preceding() took 0.000706 seconds in total
- from LengthGoalBreakIterator.following(): 1370 calls of
baseIter.following() took 0.004110 seconds in total
- from LengthGoalBreakIterator.preceding(): 1380 calls of
baseIter.preceding() took 0.014752 seconds in total
- from LengthGoalBreakIterator.preceding(): 1380 calls of
baseIter.following() took 0.000106 seconds in total

There is definitely a big difference between SENTENCE and WORD. I'm
not sure how we can improve the logic on our side while keeping the
features as is. Since the number of calls is roughly the same for when
the performance is good and bad, it seems to depend on what the text
is that the iterator is traversing.
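
To make the SENTENCE vs WORD gap easier to see, the totals can be turned into
a mean time per call. This is only a small Python sketch over four of the
figures reported above for the default highlighter arguments; the dictionary
layout and the helper name are made up for illustration.

```python
# (calls, total seconds) copied from the default-arguments measurements above.
measurements = {
    ("SENTENCE", "following -> baseIter.preceding"): (1130, 1.039629),
    ("SENTENCE", "following -> baseIter.following"): (1140, 0.340679),
    ("WORD", "following -> baseIter.preceding"): (1200, 0.001006),
    ("WORD", "following -> baseIter.following"): (1700, 0.006278),
}

def mean_call_ms(calls, total_seconds):
    """Average duration of a single call, in milliseconds."""
    return total_seconds / calls * 1000.0

for (iterator, call), (calls, seconds) in measurements.items():
    print(f"{iterator:8} {call:32} {mean_call_ms(calls, seconds):.4f} ms/call")
```

With similar call counts on both sides, the per-call figure shows that the
difference comes from the cost of each call rather than from how many calls
are made.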


Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Atita Arora
I see so many topics being discussed in this thread that I literally got lost
somewhere, but I was just thinking: can we call it a Parent-Child
architecture? I'm sure no one will raise an objection there.

Although, looking at comments above I still feel it would be a bigger
effort to convince everyone than making a change. ;)

On Fri, 19 Jun 2020, 17:21 Mark H. Wood,  wrote:

> On Fri, Jun 19, 2020 at 09:22:49AM -0400, j.s. wrote:
> > On 6/18/20 9:50 PM, Rahul Goswami wrote:
> > > So +1 on "slave" being the problematic term IMO, not "master".
> >
> > but you cannot have a master without a slave, n'est-ce pas?
>
> Well, yes.  In education:  Master of Science, Arts, etc.  In law:
> Special Master (basically a judge's delegate).  See also "magistrate."
> None of these has any connotation of the ownership of one person by
> another.
>
> (It's a one-way relationship:  there is no slavery without mastery,
> but there are other kinds of mastery.)
>
> But this is an emotional issue, not a logical one.  If doing X makes
> people angry, and we don't want to make those people angry, then
> perhaps we should not do X.
>
> > i think it is better to use the metaphor of copying rather than one of
> > hierarchy. language has so many (unintended) consequences ...
>
> Sensible.
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>


Getting rid of Overseer nomenclature in Solr

2020-06-19 Thread Walter Underwood
I just split this off with a different subject line for the “overseer” 
discussion.
That seems independent of the other choices.

I’ve heard these suggestions:

* orchestrator
* director
* coordinator
* cluster manager
* manager

There is a thing called “process orchestration” which is at a higher level than
what the overseer does. It might be something like all the customer interactions
in a billing process. That usage might be confusing for the term “orchestrator”.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jun 18, 2020, at 10:44 PM, Thomas Corthals  wrote:
> 
> Since "overseer" is also problematic, I'd like to propose "orchestrator" as
> an alternative.
> 
> Thomas



Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Mark H. Wood
On Fri, Jun 19, 2020 at 09:22:49AM -0400, j.s. wrote:
> On 6/18/20 9:50 PM, Rahul Goswami wrote:
> > So +1 on "slave" being the problematic term IMO, not "master".
> 
> but you cannot have a master without a slave, n'est-ce pas?

Well, yes.  In education:  Master of Science, Arts, etc.  In law:
Special Master (basically a judge's delegate).  See also "magistrate."
None of these has any connotation of the ownership of one person by
another.

(It's a one-way relationship:  there is no slavery without mastery,
but there are other kinds of mastery.)

But this is an emotional issue, not a logical one.  If doing X makes
people angry, and we don't want to make those people angry, then
perhaps we should not do X.

> i think it is better to use the metaphor of copying rather than one of 
> hierarchy. language has so many (unintended) consequences ...

Sensible.

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Phill Campbell
The entire idea of removing a word from our language is problematic.
A lot of history books that detail the terrible conditions of peoples
over recorded history would have to be changed or removed.

I find the “F” word extremely offensive. I find references to Deity while 
cursing extremely offensive. It is my privilege to deal with offense for the 
sake of liberty.

The use of the word is not promoting the practice nor is it denigrating those 
that have that in their history.

The “world” has decided, whatever.

Delegator - Handler

A common pattern we are all aware of. Pretty simple.



> On Jun 19, 2020, at 8:21 AM, Ilan Ginzburg  wrote:
> 
> +1 to Jan's "clustered" vs "non clustered".
> 
> If we clean up terminology, I suggest we also clarify the meaning and use
> of Slice vs Shard vs Leader vs Replica vs Core. Here's my understanding:
> 
> I consider Slice == Shard (and would happily drop Slice): a logical concept
> of a specific subset of a collection.
> A Shard then has one or multiple copies of the data called Replicas (if a
> shard has no copy of the data there's an issue). The Leader is one such
> Replica. A shard with a replication factor of 1 has a single Replica that
> happens to be the Leader. "Replica" does therefore not imply "replication".
> A Core is an in memory instantiation of a disk index representing a
> Replica. I believe that often the on disk index is referred to as "Core" as
> well (I'm not bothered by this, there's no associated confusion IMO).
> 
> Overseer is a central place where a fair bit of the cluster management
> logic is implemented today (Collection API, Autoscaling, Cluster state
> change). It is therefore a cluster manager. Note that a different
> implementation of "Clustered Solr" (a.k.a. SolrCloud) can most likely be
> done without the need of a central process in addition to the already
> centralized storage backend (currently ZooKeeper). In other words, Overseer
> is not IMO the defining characteristic of SolrCloud, it is one
> implementation choice, and there are others. To keep in mind for clarity
> and to guide renaming.
> 
> On Fri, Jun 19, 2020 at 3:23 PM j.s.  wrote:
> 
>> hi
>> 
>> solr is very helpful.
>> 
>> On 6/18/20 9:50 PM, Rahul Goswami wrote:
>>> So +1 on "slave" being the problematic term IMO, not "master".
>> 
>> but you cannot have a master without a slave, n'est-ce pas?
>> 
>> i think it is better to use the metaphor of copying rather than one of
>> hierarchy. language has so many (unintended) consequences ...
>> 
>> good luck!
>> 



Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Ilan Ginzburg
+1 to Jan's "clustered" vs "non clustered".

If we clean up terminology, I suggest we also clarify the meaning and use
of Slice vs Shard vs Leader vs Replica vs Core. Here's my understanding:

I consider Slice == Shard (and would happily drop Slice): a logical concept
of a specific subset of a collection.
A Shard then has one or multiple copies of the data called Replicas (if a
shard has no copy of the data there's an issue). The Leader is one such
Replica. A shard with a replication factor of 1 has a single Replica that
happens to be the Leader. "Replica" does therefore not imply "replication".
A Core is an in memory instantiation of a disk index representing a
Replica. I believe that often the on disk index is referred to as "Core" as
well (I'm not bothered by this, there's no associated confusion IMO).

Overseer is a central place where a fair bit of the cluster management
logic is implemented today (Collection API, Autoscaling, Cluster state
change). It is therefore a cluster manager. Note that a different
implementation of "Clustered Solr" (a.k.a. SolrCloud) can most likely be
done without the need of a central process in addition to the already
centralized storage backend (currently ZooKeeper). In other words, Overseer
is not IMO the defining characteristic of SolrCloud, it is one
implementation choice, and there are others. To keep in mind for clarity
and to guide renaming.
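
The relationships described above (Collection, Shard/Slice, Replica, Leader,
Core) can be written down as a small data model. This is only a Python sketch
of that reading of the terms, not Solr's actual classes; every class, field
and sample name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Replica:
    """One physical copy of a shard's data, backed by a core on some node."""
    core_name: str
    node: str
    is_leader: bool = False

@dataclass
class Shard:
    """Logical subset of a collection; 'Slice' is treated as a synonym."""
    name: str
    replicas: List[Replica] = field(default_factory=list)

    def leader(self) -> Replica:
        # The leader is simply one of the replicas.
        leaders = [r for r in self.replicas if r.is_leader]
        if len(leaders) != 1:
            raise ValueError(f"shard {self.name} needs exactly one leader")
        return leaders[0]

@dataclass
class Collection:
    name: str
    shards: List[Shard] = field(default_factory=list)

# Replication factor 1: the single replica is also the leader.
shard1 = Shard("shard1", [Replica("films_shard1_replica_n1", "node1:8983", is_leader=True)])
films = Collection("films", [shard1])
print(films.shards[0].leader().core_name)
```

Modelling the leader as a flag on a replica, rather than as a separate type,
mirrors the point that "Replica" does not imply "replication".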

On Fri, Jun 19, 2020 at 3:23 PM j.s.  wrote:

> hi
>
> solr is very helpful.
>
> On 6/18/20 9:50 PM, Rahul Goswami wrote:
> > So +1 on "slave" being the problematic term IMO, not "master".
>
> but you cannot have a master without a slave, n'est-ce pas?
>
> i think it is better to use the metaphor of copying rather than one of
> hierarchy. language has so many (unintended) consequences ...
>
> good luck!
>


Index files on Windows fileshare

2020-06-19 Thread Fiz N
Hello Solr experts,

I am using the standalone version of Solr 8.5 on a Windows machine.

1) I want to index all types of files under different directories in the
file share.

2) I need to index the absolute path of each file and store it in a Solr
field. I need that information so that the end user can click and open the
file (pop-up).

Could you please tell me how to go about this?
This is for POC purposes; once we finalize the solution, we will move
ahead with a more stable approach.

Thanks
Fiz Nadian.
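
One possible starting point for the two requirements above is a small crawler
that walks the share and builds one document per file, with the absolute path
as a stored field. The Python sketch below only collects the documents;
content extraction (for example through Solr's extracting request handler,
which uses Apache Tika) and the actual upload are left out. The field names
and the helper name are assumptions, not part of any existing schema.

```python
import os
import tempfile
from pathlib import Path

def collect_file_docs(root):
    """Walk root recursively and return one dict per file with its absolute path."""
    docs = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = str(Path(dirpath, filename).resolve())
            docs.append({
                "id": full_path,           # hypothetical unique key
                "file_path_s": full_path,  # hypothetical stored path field
                "file_name_s": filename,   # hypothetical file name field
            })
    return docs

# Demo on a throwaway directory; on the real system, root would be the share,
# for example a UNC path or a mapped drive.
with tempfile.TemporaryDirectory() as root:
    nested = Path(root, "reports")
    nested.mkdir()
    Path(nested, "q1.txt").write_text("sample")
    docs = collect_file_docs(root)

for doc in docs:
    print(doc["file_path_s"])
```

Using the absolute path as the document id keeps re-indexing idempotent, and
the stored path field is what a UI could turn into a clickable link.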


Re: Gettings interestingTerms from solr.MoreLikeThisHandler using SolrJ

2020-06-19 Thread Shawn Heisey

On 6/18/2020 5:31 AM, Zander, Sebastian wrote:

> In the returning QueryResponse I can't find the interestingTerms.
> I would really like to grab it this way, before calling another time.
> Any advice? I'm running Solr 8.5.2


If you can send the full json or XML response, I think I can show you 
how to parse it with SolrJ.  I don't have easy access to production Solr 
servers, so it's a little difficult for me to try it out myself.


Thanks,
Shawn


Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread j.s.

hi

solr is very helpful.

On 6/18/20 9:50 PM, Rahul Goswami wrote:

> So +1 on "slave" being the problematic term IMO, not "master".


but you cannot have a master without a slave, n'est-ce pas?

i think it is better to use the metaphor of copying rather than one of 
hierarchy. language has so many (unintended) consequences ...


good luck!


Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Jan Høydahl
> Alt F:  "Managed Clustering" vs. "Manual Clustering" Mode

I don’t view a set of M/S nodes as «cluster» mode, since Solr is really not 
doing any cluster features. It’s like spinning up two standalone MySql servers 
with a cron job that runs a SQL from one to the other to update itself. It 
provides replication, but the client app still needs to know about each 
individual MySql DB and choose to manually send INSERTs to the «leader» and the 
client also would need to know up front that those two servers are part of a 
«shard» so you could add LB across them. When the first MySql crashes, the cron 
script would start failing and it would not recover, until you manually decide 
in your APP that the «replica» is to become «leader». Solr Master/Slave is the 
same, nothing is really clustered from the application’s point of view.

So perhaps just

Alt G: "Clustered" vs «Non-clustered» 

For non-clustered mode we can then refer to shards, where each shard will have 
a leader that is the one you have to index to, and one or more replicas that 
act like PULL replicas, no other types supported in non-clustered mode.

PS: I’d rather not include ZK in the naming, since we try to hide ZK and 
perhaps replace it with something else.

Jan

> On 19 Jun 2020 at 04:40, Trey Grainger wrote:
> 
>> Let’s instead find a new good name for the cluster type. Standalone kind
>> of works for me, but I see it can be confused with single-node.
> 
> Yeah, I've typically referred to it as "standalone", but I don't think it's
> descriptive enough. I can see why some people have been calling it
> "master/slave" mode in lieu of a more descriptive alternative. I think a
> new name (other than "standalone" or "legacy") would be superb.
> 
>> We have also discussed replacing SolrCloud (which is a terrible name) with
>> something more descriptive.
>> 
>> Today: SolrCloud vs Master/slave
>> Alt A: SolrCloud vs Standalone
>> Alt B: SolrCloud vs Legacy
>> Alt C: Clustered vs Independent
>> Alt D: Clustered vs Manual mode
> 
> +1 SolrCloud is even less descriptive and IMHO just sounds silly at this
> point.
> 
> re: "Clustered" vs Independent/Manual. The thing I don't like about that is
> that you typically have clusters in both modes. I think the key distinction
> is whether Solr "manages" the cluster automatically for you or whether you
> manage it manually yourself.
> 
> What do you think about:
> Alt E: "Managed Clustering" vs. "Unmanaged Clustering" Mode
> Alt F:  "Managed Clustering" vs. "Manual Clustering" Mode
> ?
> 
> I think I prefer option F.
> 
> Trey Grainger
> Founder, Searchkernel
> https://searchkernel.com
> 
> On Thu, Jun 18, 2020 at 5:59 PM Jan Høydahl  wrote:
> 
>> I support Mike Drob and Trey Grainger. We should re-use the leader/replica
>> terminology from Cloud. Even if you hand-configure a master/slave cluster
>> and orchestrate what doc goes to which node/shard, and hand-code your shards
>> parameter, you will still have a cluster where you’d send updates to the
>> leader of each shard and the replicas would replicate the index from the
>> leader.
>> 
>> Let’s instead find a new good name for the cluster type. Standalone kind of
>> works for me, but I see it can be confused with single-node. We have also
>> discussed replacing SolrCloud (which is a terrible name) with something more
>> descriptive.
>> 
>> Today: SolrCloud vs Master/slave
>> Alt A: SolrCloud vs Standalone
>> Alt B: SolrCloud vs Legacy
>> Alt C: Clustered vs Independent
>> Alt D: Clustered vs Manual mode
>> 
>> Jan
>> 
>>> On 18 Jun 2020 at 15:53, Mike Drob wrote:
>>> 
>>> I personally think that using Solr cloud terminology for this would be fine
>>> with leader/follower. The leader is the one that accepts updates, followers
>>> cascade the updates somehow. The presence of ZK or election doesn’t really
>>> change this detail.
>>> 
>>> However, if folks feel that it’s confusing, then I can’t tell them that
>>> they’re not confused. Especially when they’re working with others who have
>>> less Solr experience than we do and are less familiar with the intricacies.
>>> 
>>> Primary/Replica seems acceptable. Coordinator instead of Overseer seems
>>> acceptable.
>>> 
>>> Would love to see this in 9.0!
>>> 
>>> Mike
>>> 
>>> On Thu, Jun 18, 2020 at 8:25 AM John Gallagher
>>>  wrote:
>>> 
>>>> While on the topic of renaming roles, I'd like to propose finding a better
>>>> term than "overseer" which has historical slavery connotations as well.
>>>> Director, perhaps?
>>>> 
>>>> 
>>>> John Gallagher
>>>> 
>>>> On Thu, Jun 18, 2020 at 8:48 AM Jason Gerlowski 
>>>> wrote:
>>>> 
>>>>> +1 to rename master/slave, and +1 to choosing terminology distinct
>>>>> from what's used for SolrCloud.  I could be happy with several of the
>>>>> proposed options.  Since a good few have been proposed though, maybe
>>>>> an eventual vote thread is the most organized way to aggregate the
>>>>> opinions here.
>>>>> 
>>>>> I'm less positive about the 

Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread gnandre
Another alternative for master-slave nodes might be parent-child nodes.
This was adopted in Python too afaik.

On Fri, Jun 19, 2020, 2:07 AM gnandre  wrote:

> What about blacklist and whitelist for shards? May I suggest blocklist and
> safelist?
>
> On Fri, Jun 19, 2020, 1:45 AM Thomas Corthals 
> wrote:
>
>> Since "overseer" is also problematic, I'd like to propose "orchestrator"
>> as
>> an alternative.
>>
>> Thomas
>>
>> On Fri, 19 Jun 2020 at 04:34, Walter Underwood wrote:
>>
>> > We don’t get to decide whether “master” is a problem. The rest of the
>> world
>> > has already decided that it is a problem.
>> >
>> > Our task is to replace the terms “master” and “slave” in Solr.
>> >
>> > wunder
>> > Walter Underwood
>> > wun...@wunderwood.org
>> > http://observer.wunderwood.org/  (my blog)
>> >
>> > > On Jun 18, 2020, at 6:50 PM, Rahul Goswami 
>> > wrote:
>> > >
>> > > I agree with Phill, Noble and Ilan above. The problematic term is
>> "slave"
>> > > (not master) which I am all for changing if it causes less regression
>> > than
>> > > removing BOTH master and slave. Since some people have pointed out
>> Github
>> > > changing the "master" terminology, in my personal opinion, it was not
>> a
>> > > measured response to addressing the bigger problem we are all trying
>> to
>> > > tackle. There is no concept of a "slave" branch, and "master" by
>> itself
>> > is
>> > > a pretty generic term (Is someone having "mastery" over a skill a bad
>> > > thing?). I fear all it would end up achieving in the end with Github
>> is a
>> > > mess of broken build scripts at best.
>> > > So +1 on "slave" being the problematic term IMO, not "master".
>> > >
>> > > On Thu, Jun 18, 2020 at 8:19 PM Phill Campbell
>> > >  wrote:
>> > >
>> > >> Master - Worker
>> > >> Master - Peon
>> > >> Master - Helper
>> > >> Master - Servant
>> > >>
>> > >> The term that is not wanted is “slave”. The term “master” is not a
>> > problem
>> > >> IMO.
>> > >>
>> > >>> On Jun 18, 2020, at 3:59 PM, Jan Høydahl 
>> > wrote:
>> > >>>
>> > >>> I support Mike Drob and Trey Grainger. We should re-use the
>> > >> leader/replica
>> > >>> terminology from Cloud. Even if you hand-configure a master/slave
>> > cluster
>> > >>> and orchestrate what doc goes to which node/shard, and hand-code
>> your
>> > >> shards
>> > >>> parameter, you will still have a cluster where you’d send updates to
>> > the
>> > >> leader of
>> > >>> each shard and the replicas would replicate the index from the
>> leader.
>> > >>>
>> > >>> Let’s instead find a new good name for the cluster type. Standalone
>> > kind
>> > >> of works
>> > >>> for me, but I see it can be confused with single-node. We have also
>> > >> discussed
>> > >>> replacing SolrCloud (which is a terrible name) with something more
>> > >> descriptive.
>> > >>>
>> > >>> Today: SolrCloud vs Master/slave
>> > >>> Alt A: SolrCloud vs Standalone
>> > >>> Alt B: SolrCloud vs Legacy
>> > >>> Alt C: Clustered vs Independent
>> > >>> Alt D: Clustered vs Manual mode
>> > >>>
>> > >>> Jan
>> > >>>
>> > >>>> On 18 Jun 2020 at 15:53, Mike Drob wrote:
>> > >>>>
>> > >>>> I personally think that using Solr cloud terminology for this would be
>> > >>>> fine with leader/follower. The leader is the one that accepts updates,
>> > >>>> followers cascade the updates somehow. The presence of ZK or election
>> > >>>> doesn’t really change this detail.
>> > >>>>
>> > >>>> However, if folks feel that it’s confusing, then I can’t tell them
>> > >>>> that they’re not confused. Especially when they’re working with others
>> > >>>> who have less Solr experience than we do and are less familiar with
>> > >>>> the intricacies.
>> > >>>>
>> > >>>> Primary/Replica seems acceptable. Coordinator instead of Overseer
>> > >>>> seems acceptable.
>> > >>>>
>> > >>>> Would love to see this in 9.0!
>> > >>>>
>> > >>>> Mike
>> > >>>>
>> > >>>> On Thu, Jun 18, 2020 at 8:25 AM John Gallagher
>> > >>>>  wrote:
>> > >>>>
>> > > While on the topic of renaming roles, I'd like to propose finding
>> a
>> > >> better
>> > > term than "overseer" which has historical slavery connotations as
>> > well.
>> > > Director, perhaps?
>> > >
>> > >
>> > > John Gallagher
>> > >
>> > > On Thu, Jun 18, 2020 at 8:48 AM Jason Gerlowski <
>> > gerlowsk...@gmail.com
>> > >>>
>> > > wrote:
>> > >
>> > >> +1 to rename master/slave, and +1 to choosing terminology
>> distinct
>> > >> from what's used for SolrCloud.  I could be happy with several of
>> > the
>> > >> proposed options.  Since a good few have been proposed though,
>> maybe
>> > >> an eventual vote thread is the most organized way to aggregate
>> the
>> > >> opinions here.
>> > >>
>> > >> I'm less positive about the prospect of changing the name of our
>> > >> primary git branch.  Most projects that contributors might come
>> > from,
>> > >> most tutorials out there to learn git, most tools 

Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread gnandre
What about blacklist and whitelist for shards? May I suggest blocklist and
safelist?

On Fri, Jun 19, 2020, 1:45 AM Thomas Corthals  wrote:

> Since "overseer" is also problematic, I'd like to propose "orchestrator" as
> an alternative.
>
> Thomas
>
> On Fri, 19 Jun 2020 at 04:34, Walter Underwood wrote:
>
> > We don’t get to decide whether “master” is a problem. The rest of the
> world
> > has already decided that it is a problem.
> >
> > Our task is to replace the terms “master” and “slave” in Solr.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> > > On Jun 18, 2020, at 6:50 PM, Rahul Goswami 
> > wrote:
> > >
> > > I agree with Phill, Noble and Ilan above. The problematic term is
> "slave"
> > > (not master) which I am all for changing if it causes less regression
> > than
> > > removing BOTH master and slave. Since some people have pointed out
> Github
> > > changing the "master" terminology, in my personal opinion, it was not a
> > > measured response to addressing the bigger problem we are all trying to
> > > tackle. There is no concept of a "slave" branch, and "master" by itself
> > is
> > > a pretty generic term (Is someone having "mastery" over a skill a bad
> > > thing?). I fear all it would end up achieving in the end with Github
> is a
> > > mess of broken build scripts at best.
> > > So +1 on "slave" being the problematic term IMO, not "master".
> > >
> > > On Thu, Jun 18, 2020 at 8:19 PM Phill Campbell
> > >  wrote:
> > >
> > >> Master - Worker
> > >> Master - Peon
> > >> Master - Helper
> > >> Master - Servant
> > >>
> > >> The term that is not wanted is “slave”. The term “master” is not a
> > problem
> > >> IMO.
> > >>
> > >>> On Jun 18, 2020, at 3:59 PM, Jan Høydahl 
> > wrote:
> > >>>
> > >>> I support Mike Drob and Trey Grainger. We should re-use the
> > >> leader/replica
> > >>> terminology from Cloud. Even if you hand-configure a master/slave
> > cluster
> > >>> and orchestrate what doc goes to which node/shard, and hand-code your
> > >> shards
> > >>> parameter, you will still have a cluster where you’d send updates to
> > the
> > >> leader of
> > >>> each shard and the replicas would replicate the index from the
> leader.
> > >>>
> > >>> Let’s instead find a new good name for the cluster type. Standalone
> > kind
> > >> of works
> > >>> for me, but I see it can be confused with single-node. We have also
> > >> discussed
> > >>> replacing SolrCloud (which is a terrible name) with something more
> > >> descriptive.
> > >>>
> > >>> Today: SolrCloud vs Master/slave
> > >>> Alt A: SolrCloud vs Standalone
> > >>> Alt B: SolrCloud vs Legacy
> > >>> Alt C: Clustered vs Independent
> > >>> Alt D: Clustered vs Manual mode
> > >>>
> > >>> Jan
> > >>>
> >  On 18 Jun 2020 at 15:53, Mike Drob wrote:
> > 
> >  I personally think that using Solr cloud terminology for this would
> be
> > >> fine
> >  with leader/follower. The leader is the one that accepts updates,
> > >> followers
> >  cascade the updates somehow. The presence of ZK or election doesn’t
> > >> really
> >  change this detail.
> > 
> >  However, if folks feel that it’s confusing, then I can’t tell them
> > that
> >  they’re not confused. Especially when they’re working with others
> who
> > >> have
> >  less Solr experience than we do and are less familiar with the
> > >> intricacies.
> > 
> >  Primary/Replica seems acceptable. Coordinator instead of Overseer
> > seems
> >  acceptable.
> > 
> >  Would love to see this in 9.0!
> > 
> >  Mike
> > 
> >  On Thu, Jun 18, 2020 at 8:25 AM John Gallagher
> >   wrote:
> > 
> > > While on the topic of renaming roles, I'd like to propose finding a
> > >> better
> > > term than "overseer" which has historical slavery connotations as
> > well.
> > > Director, perhaps?
> > >
> > >
> > > John Gallagher
> > >
> > > On Thu, Jun 18, 2020 at 8:48 AM Jason Gerlowski <
> > gerlowsk...@gmail.com
> > >>>
> > > wrote:
> > >
> > >> +1 to rename master/slave, and +1 to choosing terminology distinct
> > >> from what's used for SolrCloud.  I could be happy with several of
> > the
> > >> proposed options.  Since a good few have been proposed though,
> maybe
> > >> an eventual vote thread is the most organized way to aggregate the
> > >> opinions here.
> > >>
> > >> I'm less positive about the prospect of changing the name of our
> > >> primary git branch.  Most projects that contributors might come
> > from,
> > >> most tutorials out there to learn git, most tools built on top of
> > git
> > >> - the majority are going to assume "master" as the main branch.  I
> > >> appreciate the change that Github is trying to effect in changing
> > the
> > >> default for new projects, but it'll be a long time before that
> > >> competes with the huge bulk of projects, documentation, etc.