Thinking more about your usage…
It sounds like you will use your own ID scheme, so the payload for getChildren()
will be much smaller. This is probably more doable than I originally thought.
Further, if you run into performance problems you can decide not to use a
ServiceProvider (which uses a cache internally). Instead, you can query
ZooKeeper directly each time by calling:
ServiceDiscovery.queryForInstance(String name, String id)
You can treat the ID lookup as a binary search: if the ID you try isn't
available, probe again bsearch-style until you find the nearest existing one
(see the sketch below).
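A minimal sketch of that probe (my illustration; only queryForInstance itself
is Curator API, and the findNearestInstance helper, the Void payload, and the
known minimum ID are assumptions). It also assumes instances register
contiguous numeric IDs starting at minId, so that "does this ID exist?" is
monotone and a binary search is valid; with scattered IDs you'd fall back to
getChildren() plus a sort. Package names are the Netflix-era ones
(org.apache.curator.x.discovery in later Apache releases):

import com.netflix.curator.x.discovery.ServiceDiscovery;
import com.netflix.curator.x.discovery.ServiceInstance;

// Find the instance with the largest ID <= wantedId, probing ZK with
// queryForInstance() on each step instead of holding a local cache.
static ServiceInstance<Void> findNearestInstance(
        ServiceDiscovery<Void> discovery, String name, int wantedId, int minId)
        throws Exception {
    int lo = minId;
    int hi = wantedId;
    ServiceInstance<Void> best = null;
    while (lo <= hi) {
        int mid = (lo + hi) >>> 1;
        ServiceInstance<Void> candidate =
                discovery.queryForInstance(name, Integer.toString(mid));
        if (candidate != null) {
            best = candidate;  // exists; try a closer (higher) ID
            lo = mid + 1;
        } else {
            hi = mid - 1;      // missing, so (by contiguity) all higher IDs are too
        }
    }
    return best;  // null if nothing is registered at or below wantedId
}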
Just a thought…
-JZ
On Jan 28, 2013, at 3:43 PM, Jordan Zimmerman <[email protected]>
wrote:
> It's really up to you. The only thing to be concerned with is the Jute
> transport limit. Of course, you can always increase it. See jute.maxbuffer
> here: http://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html
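>
> For example (an illustration, not taken from the linked page): jute.maxbuffer
> is a plain Java system property, and it has to be raised on the server JVM as
> well as on any client that reads large responses:
>
> // Client side: set before the ZooKeeper client classes are loaded.
> // 4194304 (4MB) is an arbitrary example value.
> System.setProperty("jute.maxbuffer", "4194304");
>
> // Server side: pass the same value as a JVM flag when starting ZK, e.g.
> //   java -Djute.maxbuffer=4194304 ...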
>
> For Curator-Discovery you can calculate the space you need. See here:
> https://github.com/Netflix/curator/wiki/Service-Discovery - Curator-Discovery
> writes nodes with their UUID name. Each UUID node will be 36 bytes. So, a
> getChildren() call on 10K nodes is 10K * 36 bytes, or ~360K. So, you're well under
> the 1MB limit there. The main issue is if you use a ServiceCache the initial
> load will require 10K+ ZK calls. This will probably be acceptable. On a
> gigabit network this won't take too long. Bear in mind that there will also
> be 10K+ watchers in ZK. Again, ZK should handle this fine.
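>
> For completeness, a minimal sketch of the ServiceCache path described above
> ("myService" is a placeholder, and discovery is assumed to be an
> already-started ServiceDiscovery<Void>):
>
> import com.netflix.curator.x.discovery.ServiceCache;
> import com.netflix.curator.x.discovery.ServiceInstance;
> import java.util.List;
>
> // Building and starting the cache is where the initial 10K+ ZK calls and
> // the per-node watchers come from.
> ServiceCache<Void> cache = discovery.serviceCacheBuilder()
>         .name("myService")
>         .build();
> cache.start();
> // After start(), reads come from memory rather than from ZK.
> List<ServiceInstance<Void>> instances = cache.getInstances();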
>
> Do you expect to grow an order of magnitude from 10K? If so, you might
> consider clustering. If 10K is the limit, you should be fine.
>
> -JZ
>
> On Jan 28, 2013, at 3:34 PM, Yasin <[email protected]> wrote:
>
>> This will work if we treat each rack as a different service. But then
>> service retrieval becomes problematic. Even though I treat each rack as a
>> different service, I still have to think of them all as a single service. We
>> still need to keep that ID numbering scheme. I mean rack1 might have IDs
>> from 1..32, and rack2 might have 33..64. So I still need to find the node
>> whose ID is best for the client, i.e. the one with the same number or the
>> next-smallest one.
>> Another approach would be to classify the nodes into, say, 10 clusters, and
>> put each node under the appropriate znode. For example,
>> root/cluster1_1000/node1,...,root/cluster1_1000/node1000,
>> root/cluster1001_2000/node1,...,root/cluster1001_2000/node1000, and so on. If
>> the service gets "getInstance(245)" it will know that the desired node will
>> be in root/cluster1_1000/. Then I will use some heuristics to retrieve
>> some nodes, sort them by their IDs, and find the most appropriate one.
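>>
>> A sketch of that lookup (the path layout and 1000-per-cluster size are the
>> ones proposed above; the helper name is mine):
>>
>> // Map a 1-based node ID to its cluster znode, 1000 nodes per cluster.
>> static String clusterPathFor(int id) {
>>     int clusterSize = 1000;
>>     int low = ((id - 1) / clusterSize) * clusterSize + 1;  // 245 -> 1
>>     int high = low + clusterSize - 1;                      // 245 -> 1000
>>     return "/root/cluster" + low + "_" + high;             // "/root/cluster1_1000"
>> }
>>
>> A getChildren() on that one znode then returns at most 1000 names to sort
>> and scan for the same or next-smallest ID.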
>> How about this idea?
>>
>> Thanks
>>
>> Yasin
>>
>>
>>
>