Shard has the interesting additional implication that it is part of a composite index made up of many sub-indexes.
A Lucene index could be a complete index or a shard. I would presume the same of what might be called a core.

On Thu, Jan 14, 2010 at 3:21 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:

> Uri,
>
> > "core" to represent a single index and "shard" to be
> > represented by a single core
>
> Can you elaborate on what you mean, isn't a core a single index
> too? It seems like shard was used to represent a remote index
> (perhaps?). Though here I'd prefer "remote core", because to the
> uninitiated Solr outsider it's immediately obvious (i.e. they
> need only know what a core is, in the Solr glossary or term
> dictionary).
>
> In Google vernacular, which is where the name shard came from, a
> "shard" is basically a local sub-index
> http://research.google.com/archive/googlecluster.html where
> there would be many "shards" per server. However that's a
> digression at this point.
>
> I personally prefer relatively straightforward names, that are
> self-evident, rather than inventing new language for fairly
> simple concepts. Slice, even though it comes from our buddy
> Yonik, probably doesn't make any immediate sense to external
> users when compared with the word shard. Of course software
> projects have a tendency to create their own words to somewhat
> mystify users into believing in some sort of magic occurring
> underneath. If that's what we're after, it's cool, I mean that
> makes sense. And I don't mean to be derogatory here, however this
> is an open source project created in part to educate users on
> search and be made as easily accessible as possible, to the
> greatest number of users possible. I think Doug did a great job
> of this when Lucene started with amazingly succinct code for
> fairly complex concepts (e.g., anti-mystification of search).
>
> Jason
>
> On Thu, Jan 14, 2010 at 2:58 PM, Uri Boness <ubon...@gmail.com> wrote:
> > Although Jason has some valid points here, I'm with Yonik here.
> > I do believe that we've gotten used to the terms "core" to represent a
> > single index and "shard" to be represented by a single core. A "node"
> > seems to indicate a machine or a JVM. Changing any of these (informal
> > perhaps) definitions will only cause confusion. That's why I think a
> > "slice" is a good solution now... first it's a new term to a new view of
> > the index (logical shards AFAIK don't really exist yet) so people won't
> > need to get used to it, but it's also descriptive and intuitive. I do
> > like Jason's idea about having a protocol attached to the URLs.
> >
> > Cheers,
> > Uri
> >
> > Jason Rutherglen wrote:
> >>>
> >>> But I've kind of gotten used to thinking of shards as the
> >>> actual physical queryable things...
> >>>
> >>
> >> I think a mistake was made referring to Solr cores as shards.
> >> It's the same thing with 2 different names. Slices adds yet
> >> another name which seems to imply the same thing yet again. I'd
> >> rather see disambiguation here, and call them cores (partially
> >> because that's what's in the code and on the wiki), and cores
> >> only. It's a Solr specific term, it's going to be confused with
> >> microprocessor cores, but at least there's only one name, which
> >> as search people, we know creates fewer posting lists :).
> >>
> >> Logical groupings of cores can occur, which can be aptly named
> >> core groups. This way I can submit a query to a core group, and
> >> it's reasonable to assume I'm hitting N cores. Further, cores
> >> could point to a logical or physical entity via a URL. (As a
> >> side note, I've always found it odd that the shards param to
> >> RequestHandler lacks the protocol, what if I want to use HTTPS
> >> for example?).
> >>
> >> So there could be http://host/solr/core1 (physical),
> >> core://megacorename (logical),
> >> coregroup://supergreatcoregroupname (a group of cores) in the
> >> "shards" parameter (whose name can perhaps be changed for
> >> clarity in a future release). Then people can mix and match and
> >> we won't have many different XML elements floating around. We'd
> >> have a simple list of URLs that are transposed into a real
> >> physical network request.
> >>
> >>
> >> On Thu, Jan 14, 2010 at 12:56 PM, Yonik Seeley
> >> <yo...@lucidimagination.com> wrote:
> >>
> >>>
> >>> On Thu, Jan 14, 2010 at 1:38 PM, Yonik Seeley
> >>> <yo...@lucidimagination.com> wrote:
> >>>
> >>>>
> >>>> On Thu, Jan 14, 2010 at 12:46 PM, Yonik Seeley
> >>>> <yo...@lucidimagination.com> wrote:
> >>>>
> >>>>>
> >>>>> I'm actually starting to lean toward "slice" instead of "logical
> >>>>> shard".
> >>>>>
> >>>
> >>> Alternate terminology could be "index" for the actual physical Lucene
> >>> index (and also enough of the URL that unambiguously identifies it),
> >>> and then "shard" could be the logical entity.
> >>>
> >>> But I've kind of gotten used to thinking of shards as the actual
> >>> physical queryable things...
> >>>
> >>> -Yonik
> >>> http://www.lucidimagination.com
> >>>
> >>
> >
>

--
Ted Dunning, CTO
DeepDyve