Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Andrzej Bialecki
On 2010-01-16 21:11, Yonik Seeley wrote: Agreed - but it could be as simple as qualifying this with "from shardX on node2". Right - it's pretty clear there are both physical and logical shards... but it's less clear to me at this point if distinguishing them in the vocabulary helps or hurts.

Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Yonik Seeley
On Sun, Jan 17, 2010 at 9:06 AM, Andrzej Bialecki a...@getopt.org wrote: On 2010-01-16 21:11, Yonik Seeley wrote: If we were building from scratch perhaps - but it seems like if we can just model what people do today with Solr (but just make it a lot easier), that's a good start.  The opaque

Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Ted Dunning
Jason V and Jason R have done just that. Great idea. Cool work. But a unified management interface would *really* be nice. On Sun, Jan 17, 2010 at 6:06 AM, Andrzej Bialecki a...@getopt.org wrote: Well, then if we don't intend to support updates in this iteration then perhaps there is no

Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Ted Dunning
+1 Hadoop still calls it a copy of a block if you have a replication factor of 1. Why not? (For that matter, I still call it an integer if it has a value of 1.) On Sun, Jan 17, 2010 at 6:06 AM, Andrzej Bialecki a...@getopt.org wrote: I originally started off with replica too... but there may

Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Ted Dunning
Control is easily retained if you make pluggable the selection of shards to which you want to do the horizontal broadcast. The shard management layer shouldn't know or care what query you are doing and in most cases it should just use the trivial all-shards selection policy. On Sun, Jan 17, 2010
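
A minimal sketch of the pluggable shard selection described above, not code from the Solr Cloud branch. The names `all_shards`, `broadcast`, and `run_on_shard` are hypothetical and chosen only for illustration. The management layer receives an opaque callable and a selection policy, so it never inspects the query itself; the default policy is the trivial all-shards one.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical type: a policy maps the known shard names to the subset to query.
ShardSelector = Callable[[Sequence[str]], List[str]]


def all_shards(shards: Sequence[str]) -> List[str]:
    """Trivial policy: broadcast to every shard."""
    return list(shards)


def broadcast(
    shards: Sequence[str],
    run_on_shard: Callable[[str], object],
    selector: ShardSelector = all_shards,
) -> Dict[str, object]:
    """Run an opaque per-shard call on the shards picked by the policy.

    The query lives inside run_on_shard, so this layer does not know
    or care what is being asked.
    """
    targets = selector(shards)
    return {shard: run_on_shard(shard) for shard in targets}


if __name__ == "__main__":
    shards = ["shard1", "shard2", "shard3"]
    # Default policy: every shard is queried.
    print(broadcast(shards, lambda s: f"results from {s}"))
    # Custom policy plugged in by the caller: skip shard2.
    print(broadcast(shards, lambda s: f"results from {s}",
                    selector=lambda all_: [s for s in all_ if s != "shard2"]))
```

A caller that wants tighter control swaps in its own selector; everything else stays unchanged.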

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Yonik Seeley
On Fri, Jan 15, 2010 at 7:36 PM, Andrzej Bialecki a...@getopt.org wrote: Hi, My 0.02 PLN on the subject ... Terminology --- First the terminology: reading your emails I have a feeling that my head is about to explode. We have to agree on the vocabulary, otherwise we have no hope

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Mark Miller
Andrzej Bialecki wrote: I avoided the word collection, because Solr deploys various cores under collectionX names, leading users to assume that core == collection. Global index is two words but it's unambiguous. I'm fine with the collection if we clarify the definition and avoid using this

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Yonik Seeley
On Sat, Jan 16, 2010 at 2:40 PM, Andrzej Bialecki a...@getopt.org wrote: I avoided the word collection, because Solr deploys various cores under collectionX names, leading users to assume that core == collection. For distributed search, it's already common to name the cores the same thing for

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Ted Dunning
My experience with Katta is that my developers very quickly adopted "index" to mean the aggregate of all the shards, which is exactly what Andrzej is proposing. Confusion with the "index contains shards, nodes host shards" terminology has been minimal. On Sat, Jan 16, 2010 at 11:40 AM, Andrzej Bialecki
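
To make the "index contains shards, nodes host shards" vocabulary concrete, here is a small illustrative model. It is an assumption-laden sketch, not Katta's or Solr's actual data structures; the class names `Index` and `Node` and the helper `nodes_hosting` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Index:
    """The aggregate of all shards (the 'global index')."""
    name: str
    shards: List[str] = field(default_factory=list)


@dataclass
class Node:
    """A search node, which hosts zero or more shards."""
    name: str
    hosted: Set[str] = field(default_factory=set)


def nodes_hosting(shard: str, nodes: Iterable[Node]) -> List[str]:
    """Return the sorted names of the nodes hosting the given shard."""
    return sorted(n.name for n in nodes if shard in n.hosted)


if __name__ == "__main__":
    index = Index("products", ["shard1", "shard2"])
    nodes = [Node("node1", {"shard1"}), Node("node2", {"shard1", "shard2"})]
    for shard in index.shards:
        print(shard, "is hosted on", nodes_hosting(shard, nodes))
```

In this model a shard hosted on more than one node is simply a replicated shard; the index itself stays a purely logical grouping.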

Solr Cloud wiki and branch notes

2010-01-15 Thread Jason Rutherglen
Here are some rough notes after running the unit tests, reviewing some of the code (though not understanding it), and reviewing the wiki page http://wiki.apache.org/solr/SolrCloud We need a protocol in the URL; otherwise it's inflexible. I'm overwhelmed with all the ?? question areas of the

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Yonik Seeley
On Fri, Jan 15, 2010 at 4:12 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: The page is huge, which signals to me maybe we're trying to do too much This is really about doing not-so-much in the very near term, while thinking ahead to the longer term. Revamping distributed search could

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Jason Rutherglen
This is really about doing not-so-much in the very near term, while thinking ahead to the longer term. Let's have a page dedicated to release 1.0 of cloud? I feel uncomfortable editing the existing wiki because I don't know what the plans are for the first release. I need to revisit Katta as my

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Andrzej Bialecki
Hi, My 0.02 PLN on the subject ... Terminology --- First the terminology: reading your emails I have a feeling that my head is about to explode. We have to agree on the vocabulary, otherwise we have no hope of reaching any consensus. I propose the following vocabulary that has been

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Ted Dunning
On Fri, Jan 15, 2010 at 4:36 PM, Andrzej Bialecki a...@getopt.org wrote: My 0.02 PLN on the subject ... Polish currency seems pretty strong lately. There are a lot of good ideas for this small sum. Terminology * (global) search index * index shard: * partitioning: * search node: *