Re: What roles do "even" nodes play in the ensemble
Awesome, thanks guys. Your patience and input are greatly appreciated.

On Wed, 2010-08-25 at 21:30 -0700, Henry Robinson wrote:
> Todd -
>
> No, this is not the case. There are no 'backup' or 'failover' nodes in
> ZooKeeper. All servers that can vote are working as part of the
> cluster until they fail. You need a majority of your voting servers
> alive.
>
> If you have three servers, a majority is of size two. The number of
> nodes that can fail before a majority is no longer alive is one.
> If you have four servers, a majority is of size three. The number of
> nodes that can fail before a majority is no longer alive is one.
> If you have five servers, a majority is of size three. The number of
> nodes that can fail before a majority is no longer alive is two.
>
> This is why four servers is worse than three for availability. In both
> cases, two servers have to fail before the cluster is no longer
> available. However if failures are independently distributed, this is
> more likely to happen in a cluster of four nodes than a cluster of
> three (think of it as 'more things available to go wrong').
>
> If you have four servers and one dies, the 'majority' that still needs
> to be alive is still three - it doesn't drop down to two. The majority
> is of all voting servers, alive or dead.
>
> Hope this helps -
>
> Henry
Re: What roles do "even" nodes play in the ensemble
Todd -

No, this is not the case. There are no 'backup' or 'failover' nodes in ZooKeeper. All servers that can vote are working as part of the cluster until they fail. You need a majority of your voting servers alive.

If you have three servers, a majority is of size two. The number of nodes that can fail before a majority is no longer alive is one.
If you have four servers, a majority is of size three. The number of nodes that can fail before a majority is no longer alive is one.
If you have five servers, a majority is of size three. The number of nodes that can fail before a majority is no longer alive is two.

This is why four servers are worse than three for availability. In both cases, two servers have to fail before the cluster is no longer available. However, if failures are independently distributed, this is more likely to happen in a cluster of four nodes than in a cluster of three (think of it as 'more things available to go wrong').

If you have four servers and one dies, the 'majority' that still needs to be alive is still three - it doesn't drop down to two. The majority is of all voting servers, alive or dead.

Hope this helps -

Henry

On 25 August 2010 21:01, Todd Nine wrote:
> Thanks Dave. I've been using Cassandra, so I'm trying to get my head
> around the configuration/operational differences with ZK. You state
> that using 4 would actually decrease my reliability. Can you explain
> that further? I was under the impression that a 4th node would act as a
> non voting read only node until one of the other 3 fails. I thought
> that this extra node would give me some breathing room by allowing any
> node to fail and still have 3 voting nodes. Is this not the case?
>
> Thanks,
>
> Todd

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
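Henry's availability argument can be checked with a little arithmetic. The sketch below is an editorial illustration, not something posted in the thread: it assumes every voting server fails independently with the same probability `p` (the value 0.1 is arbitrary), and the names `majority` and `p_unavailable` are made up for this example.

```python
from math import comb


def majority(n: int) -> int:
    """Size of a strict majority among n voting servers."""
    return n // 2 + 1


def p_unavailable(n: int, p: float) -> float:
    """Probability that fewer than a majority of n voters are alive,
    assuming each server fails independently with probability p."""
    max_tolerated = n - majority(n)  # failures the ensemble can survive
    # Sum the binomial probabilities of every failure count that is too high.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(max_tolerated + 1, n + 1)
    )


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, majority(n), n - majority(n), round(p_unavailable(n, 0.1), 5))
```

With `p = 0.1` this gives roughly 0.028 for three servers, 0.052 for four and 0.009 for five: the four-server ensemble is the most likely of the three to lose its majority, which is the "more things available to go wrong" effect.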
Re: What roles do "even" nodes play in the ensemble
Thanks Dave. I've been using Cassandra, so I'm trying to get my head around the configuration/operational differences with ZK. You state that using 4 would actually decrease my reliability. Can you explain that further? I was under the impression that a 4th node would act as a non-voting, read-only node until one of the other 3 fails. I thought that this extra node would give me some breathing room by allowing any node to fail and still have 3 voting nodes. Is this not the case?

Thanks,

Todd

On Wed, 2010-08-25 at 21:13 -0600, Ted Dunning wrote:
> Just use 3 nodes. Life will be better.
>
> You can configure the fourth node in the event of one of the first
> three failing and bring it on line. Then you can re-configure and
> restart each of the others one at a time. This gives you flexibility
> because you have 4 nodes, but doesn't decrease your reliability the
> way that using a four node cluster would. If you need to do
> maintenance on one node, just configure that node out as if it had
> failed.
Re: What roles do "even" nodes play in the ensemble
Just use 3 nodes. Life will be better.

If one of the first three fails, you can configure the fourth node and bring it online. Then you can re-configure and restart each of the others one at a time. This gives you flexibility because you have 4 nodes, but doesn't decrease your reliability the way that using a four-node cluster would. If you need to do maintenance on one node, just configure that node out as if it had failed.

On Wed, Aug 25, 2010 at 4:26 PM, Dave Wright wrote:
> You can certainly serve more reads with a 4th node, but I'm not sure
> what you mean by "it won't have a voting role". It still participates
> in voting for leaders as do all non-observers regardless of whether it
> is an even or odd number. With zookeeper there is no voting on each
> transaction, only leader changes.
>
> -Dave Wright
Re: What roles do "even" nodes play in the ensemble
You can certainly serve more reads with a 4th node, but I'm not sure what you mean by "it won't have a voting role". It still participates in voting for leaders, as do all non-observers, regardless of whether it is an even or odd number. With zookeeper there is no voting on each transaction, only leader changes.

-Dave Wright

On Wed, Aug 25, 2010 at 6:22 PM, Todd Nine wrote:
> Do I get any read performance increase (similar to an observer) since
> the node will not have a voting role?
Re: What roles do "even" nodes play in the ensemble
Do I get any read performance increase (similar to an observer) since the node will not have a voting role?

On Wed, 2010-08-25 at 15:18 -0700, Henry Robinson wrote:
> Dave is correct - if you have N nodes you need (N/2) + 1 votes (i.e. a
> majority) in the standard case to get a vote to pass.
>
> Adding a fourth voting node to a three node cluster will cause the size of a
> majority to jump from 2 to 3. The number of nodes that need to fail before
> you can no longer get a majority is 2 in both cases - so you don't get any
> reliability for adding a new voting node to a odd-numbered cluster.
>
> The new node will always act as a voter unless you explicitly configure it
> as an observer.
>
> Henry
Re: What roles do "even" nodes play in the ensemble
Dave is correct - if you have N nodes you need (N/2) + 1 votes (rounding N/2 down; i.e. a majority) in the standard case to get a vote to pass.

Adding a fourth voting node to a three-node cluster will cause the size of a majority to jump from 2 to 3. The number of nodes that need to fail before you can no longer get a majority is 2 in both cases - so you don't get any additional reliability from adding a new voting node to an odd-numbered cluster.

The new node will always act as a voter unless you explicitly configure it as an observer.

Henry

On 25 August 2010 15:11, Dave Wright wrote:
> I'm not an expert on voting, so there may be a better answer, but from my
> understanding all 4 nodes participate in the voting and you need a majority
> of 3 to elect a leader.
>
> -Dave

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
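The (N/2) + 1 rule is easy to tabulate. The snippet below is an illustrative sketch added for reference, with made-up function names; it uses integer division for N/2 and, following the thread, counts the majority against all configured voting servers rather than only the live ones.

```python
def majority(n: int) -> int:
    """Votes needed among n configured voting servers: floor(n/2) + 1."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many voters can be down while a majority is still alive."""
    return n - majority(n)


def has_quorum(configured: int, alive: int) -> bool:
    """The majority is taken over all configured voters, alive or dead."""
    return alive >= majority(configured)


if __name__ == "__main__":
    print("voters  majority  tolerated failures")
    for n in range(1, 8):
        print(f"{n:>6}  {majority(n):>8}  {tolerated_failures(n):>18}")
```

Going from 3 to 4 voters raises the majority from 2 to 3 but leaves the tolerated failures at 1; only the fifth voter raises that to 2.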
Re: What roles do "even" nodes play in the ensemble
I'm not an expert on voting, so there may be a better answer, but from my understanding all 4 nodes participate in the voting and you need a majority of 3 to elect a leader.

-Dave

On Wed, Aug 25, 2010 at 6:09 PM, Todd Nine wrote:
> Thanks for that Dave. If I do not configure it as an observer just a
> normal member, what will the last even node to join do?
>
> 1. Will it participate as a voter on startup? (I'm assuming not, just
>    read only)
>
> 2. If one of the voter nodes 1 through 3 dies, does it become a voter?
>
> todd
> SENIOR SOFTWARE ENGINEER
>
> todd nine | spidertracks ltd | 117a the square
> po box 5203 | palmerston north 4441 | new zealand
> P: +64 6 353 3395 | M: +64 210 255 8576
> E: t...@spidertracks.co.nz W: www.spidertracks.com
Re: What roles do "even" nodes play in the ensemble
Thanks for that Dave. If I do not configure it as an observer, just as a normal member, what will the last even node to join do?

1. Will it participate as a voter on startup? (I'm assuming not, just read only)

2. If one of the voter nodes 1 through 3 dies, does it become a voter?

todd
SENIOR SOFTWARE ENGINEER

todd nine | spidertracks ltd | 117a the square
po box 5203 | palmerston north 4441 | new zealand
P: +64 6 353 3395 | M: +64 210 255 8576
E: t...@spidertracks.co.nz W: www.spidertracks.com

On Wed, 2010-08-25 at 17:57 -0400, Dave Wright wrote:
> > 1. When the 4th ZK node joins the cluster, does it take on the observer
> > role since a quorum cannot be reached with the new node? Can I still
> > connect my clients to it and create/remove nodes and receive events?
>
> No, it joins as a normal member unless you've configured it as an
> observer. Note that with 4 nodes you now need 3 running to get a
> majority, which is why even numbers aren't recommended.
>
> > 2. In the event 1 of the 3 voting nodes fails, will this 4th node become
> > a voting member of the ensemble?
>
> If configured as an observer it remains an observer.
>
> > 3. When a new node comes online, it may have a different ip than the
> > previous node. Do I need to update all node configurations and perform
> > a rolling restart, or will simply connecting the new node to the
> > existing ensemble make all nodes aware it is running?
>
> Unfortunately ZK doesn't have any kind of dynamic configuration like
> that currently. You need to update all the config files and restart
> the ensemble.
>
> -Dave Wright
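For reference, the observer setup mentioned in this exchange is a static configuration choice. The fragment below is an illustrative sketch only, based on the ZooKeeper observer documentation as best understood; the hostnames and the choice of server 4 as the observer are placeholders, not values from the thread.

```
# zoo.cfg on every server in the ensemble (hostnames are hypothetical)
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=zk4.example.com:2888:3888:observer

# additionally, only in the zoo.cfg of server 4 itself
peerType=observer
```

Without the `:observer` suffix and the `peerType` line, the fourth server would join as an ordinary voter, and the majority would be 3 of 4.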
Re: What roles do "even" nodes play in the ensemble
> 1. When the 4th ZK node joins the cluster, does it take on the observer
> role since a quorum cannot be reached with the new node? Can I still
> connect my clients to it and create/remove nodes and receive events?

No, it joins as a normal member unless you've configured it as an observer. Note that with 4 nodes you now need 3 running to get a majority, which is why even numbers aren't recommended.

> 2. In the event 1 of the 3 voting nodes fails, will this 4th node become
> a voting member of the ensemble?

If configured as an observer it remains an observer.

> 3. When a new node comes online, it may have a different ip than the
> previous node. Do I need to update all node configurations and perform
> a rolling restart, or will simply connecting the new node to the
> existing ensemble make all nodes aware it is running?

Unfortunately ZK doesn't have any kind of dynamic configuration like that currently. You need to update all the config files and restart the ensemble.

-Dave Wright
What roles do "even" nodes play in the ensemble
Hey guys,

Forgive me if this is documented somewhere, but I can't find an answer. Our application is not enormous, so we will be using 4 application nodes that will also initially run Zookeeper. As our load increases, Zookeeper will be moved to nodes that only run ZK and no other processes. Given that we will initially only have 4 nodes in our cluster, I have a few questions around the semantics of an even number of nodes.

1. When the 4th ZK node joins the cluster, does it take on the observer role since a quorum cannot be reached with the new node? Can I still connect my clients to it and create/remove nodes and receive events?

2. In the event 1 of the 3 voting nodes fails, will this 4th node become a voting member of the ensemble?

3. When a new node comes online, it may have a different IP than the previous node. Do I need to update all node configurations and perform a rolling restart, or will simply connecting the new node to the existing ensemble make all nodes aware it is running?

Thanks,
Todd
Re: Non Hadoop scheduling frameworks
Thanks for the feedback. I'm probably going to modify quartz to work with Zookeeper to start and launch jobs. Architecturally, I don't think persisting Jobs or trigger history in ZK is a very good idea: it turns ZK into a persistent data store, which it is not designed for. I was thinking I could change the core APIs in the following way.

Implement leader/follower election as a standalone module. Is this already done somewhere? I know there's a recipe, but if the code is done that's less for me to do.

Implement an abstract JobStore implementation (ZooKeeperJobStore) with the following properties:

Default Case
1. All calls that deal with returning triggers will use the follower/leader semantics. All nodes (including the leader) will be followers. They will only be returned jobs they should run for the call acquireNextTrigger.
2. All calls that write triggers will write them to the datastore and to a trigger queue in ZK.
3. The leader will pick up triggers from the queue and distribute them to the next available node via the per-node ZK trigger queues. Each operation will attempt to be wisely partitioned. In the first implementation, it will simply schedule the job on the node that has the fewest executions near the time specified for the trigger. In the next release, I could use average job duration semantics to try to avoid scheduling overlapping jobs, especially for long-running jobs.

Failover
1. The leader will scan all current followers when a follower leaves, or after a new leader is designated.
2. For any node with jobs that is not currently a follower, its triggers will be re-written to the trigger queue from above.
3. The redistribution semantics from above will fire.

Does this sound reasonable? After performing more research I think job semantics such as partitioning and parallel processing are outside the scope of how the scheduler should work. Those semantics are more internal to the job itself, and I think they should remain outside of the scope of this project.

todd
SENIOR SOFTWARE ENGINEER

todd nine | spidertracks ltd |

On Tue, 2010-08-24 at 04:20 +, Ted Dunning wrote:
> These are pretty easy to solve with ZK. Ephemerality, exclusive create,
> atomic update and file versions allow you to implement most of the semantics
> you need.
>
> I don't know of any recipes available for this, but they would be worthy
> additions to ZK.
>
> On Mon, Aug 23, 2010 at 11:33 PM, Todd Nine wrote:
>
> > Solving UC1 and UC2 via zookeeper or some other framework if one is
> > recommended. We don't run Hadoop, just ZK and Cassandra as we don't have a
> > need for map/reduce. I'm searching for any existing framework that can
> > perform standard time based scheduling in a distributed environment. As I
> > said earlier, Quartz is the closest model to what we're looking for, but it
> > can't be used in a distributed parallel environment. Any suggestions for a
> > system that could accomplish this would be helpful.
> >
> > Thanks,
> > Todd
> >
> > On 24 August 2010 11:27, Mahadev Konar wrote:
> >
> > > Hi Todd,
> > > Just to be clear, are you looking at solving UC1 and UC2 via zookeeper?
> > > Or is this a broader question for scheduling on cassandra nodes? For the
> > > latter this probably isnt the right mailing list.
> > >
> > > Thanks
> > > mahadev
> > >
> > > On 8/23/10 4:02 PM, "Todd Nine" wrote:
> > >
> > > Hi all,
> > > We're using Zookeeper for Leader Election and system monitoring. We're
> > > also using it for synchronizing our cluster wide jobs with barriers.
> > > We're running into an issue where we now have a single job, but each
> > > node can fire the job independently of others with different criteria
> > > in the job. In the event of a system failure, another node in our
> > > application cluster will need to fire this Job. I've used quartz
> > > previously (we're running Java 6), but it simply isn't designed for the
> > > use case we have. I found this article on cloudera.
> > >
> > > http://www.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/
> > >
> > > I've looked at both plugins, but they require hadoop. We're not
> > > currently running hadoop, we only have Cassandra. Here are the 2 basic
> > > use cases we need to support.
> > >
> > > UC1: Synchronized Jobs
> > > 1. A job is fired across all nodes
> > > 2. The nodes wait until the barrier is entered by all participants
> > > 3. The nodes process the data and leave
> > > 4. On all nodes leaving the barrier, the Leader node marks the job as
> > > complete.
> > >
> > > UC2: Multiple Jobs per Node
> > > 1. A Job is scheduled for a future time on a specific node (usually the
> > > same node that's creating the trigger)
> > > 2. A Trigger can be overwritten and cancelled without the job firing
> > > 3. In the event of a node failure, the Leader will take all pending jobs
> > > from the failed node, and partition them acro
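The placement and failover rules in Todd's proposal ("schedule the job on the node that has the fewest executions near the trigger time", then re-queue the triggers of a departed node) can be sketched in a few lines. This is a purely in-memory illustration under stated assumptions: it uses no ZooKeeper or Quartz API, the names `pick_node` and `redistribute` and the 60-second window are hypothetical, and ties are broken by node name for determinism.

```python
from typing import Dict, Iterable, List

# Hypothetical in-memory model: node id -> scheduled fire times (epoch seconds).
Schedule = Dict[str, List[float]]


def pick_node(schedule: Schedule, fire_time: float, window: float = 60.0) -> str:
    """Return the node with the fewest executions within +/- window of fire_time."""
    if not schedule:
        raise ValueError("no nodes available")

    def nearby(node: str) -> int:
        return sum(1 for t in schedule[node] if abs(t - fire_time) <= window)

    # sorted() makes tie-breaking deterministic (lowest node id wins).
    return min(sorted(schedule), key=nearby)


def redistribute(schedule: Schedule, live_nodes: Iterable[str],
                 window: float = 60.0) -> Schedule:
    """Reassign triggers owned by nodes that are no longer live followers."""
    live = set(live_nodes)
    orphaned = sorted(
        t for node, times in schedule.items() if node not in live for t in times
    )
    result: Schedule = {node: list(schedule.get(node, [])) for node in live}
    for t in orphaned:
        result[pick_node(result, t, window)].append(t)
    return result


if __name__ == "__main__":
    schedule = {"a": [100.0, 110.0], "b": [105.0], "c": [500.0]}
    print(pick_node(schedule, 100.0))          # least loaded node near t=100
    print(redistribute(schedule, ["a", "b"]))  # node "c" has left
```

In a real implementation the leader would perform this placement while reading from and writing to the ZK trigger queues; that part is deliberately left out here.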
Re: Searching more ZooKeeper content
I am definitely a +1 on this, given that it's powered by Solr.

Thanks
mahadev

On 8/25/10 9:22 AM, "Alex Baranau" wrote:
> Hello guys,
>
> Over at http://search-hadoop.com we index ZooKeeper project's mailing lists,
> wiki, web site, source code, javadoc, jira...
>
> Would the community be interested in a patch that replaces the
> Google-powered search with that from search-hadoop.com, set to search only
> ZooKeeper project by default?
>
> We look into adding this search service for all Hadoop's sub-projects.
>
> Assuming people are for this, any suggestions for how the search should
> function by default or any specific instructions for how the search box
> should be modified would be great!
>
> Thank you,
> Alex Baranau.
>
> P.S. HBase community already accepted our proposal (please refer to
> https://issues.apache.org/jira/browse/HBASE-2886) and new version (0.90)
> will include new search box. Also the patch is available for TIKA (we are in
> the process of discussing some details now):
> https://issues.apache.org/jira/browse/TIKA-488. ZooKeeper's site looks much
> like Avro's for which we also created patch recently (
> https://issues.apache.org/jira/browse/AVRO-626).
Re: Size of a znode in memory
Hi Maarten,

The usual memory footprint of a znode is around 40-80 bytes.

I think Ben is planning to document a way to calculate the approximate memory footprint of your zk servers given a set of updates and their sizes.

thanks
mahadev

On 8/25/10 11:49 AM, "Maarten Koopmans" wrote:
> Hi,
>
> Is there a way to know/measure the size of a znode? My average znode has a
> name of 32 bytes and user data of max 128 bytes.
>
> Or is the only way to run a smoke test and watch the heap growth via jconsole
> or so?
>
> Thanks, Maarten
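A back-of-envelope estimate can be built from the numbers in this exchange. The sketch below rests on an assumption the thread does not confirm: that the 40-80 bytes is per-znode overhead that comes on top of the name and the user data. The function name and the one-million-znode figure are hypothetical, and JVM object overhead is ignored, so a smoke test with heap monitoring remains the more reliable measurement.

```python
def estimate_heap_bytes(znodes: int, name_bytes: int, data_bytes: int,
                        overhead_bytes: int) -> int:
    """Rough total: per-znode overhead plus name plus user data.

    ASSUMPTION: overhead_bytes excludes the name and data sizes.
    """
    return znodes * (overhead_bytes + name_bytes + data_bytes)


if __name__ == "__main__":
    n = 1_000_000  # hypothetical znode count
    low = estimate_heap_bytes(n, name_bytes=32, data_bytes=128, overhead_bytes=40)
    high = estimate_heap_bytes(n, name_bytes=32, data_bytes=128, overhead_bytes=80)
    print(f"{low / 1e6:.0f} MB to {high / 1e6:.0f} MB")
```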
Size of a znode in memory
Hi, Is there a way to know/measure the size of a znode? My average znode has a name of 32 bytes and user data of max 128 bytes. Or is the only way to run a smoke test and watch the heap growth via jconsole or so? Thanks, Maarten
Searching more ZooKeeper content
Hello guys,

Over at http://search-hadoop.com we index the ZooKeeper project's mailing lists, wiki, web site, source code, javadoc, jira...

Would the community be interested in a patch that replaces the Google-powered search with that from search-hadoop.com, set to search only the ZooKeeper project by default?

We are looking into adding this search service for all of Hadoop's sub-projects.

Assuming people are for this, any suggestions for how the search should function by default or any specific instructions for how the search box should be modified would be great!

Thank you,
Alex Baranau.

P.S. The HBase community already accepted our proposal (please refer to https://issues.apache.org/jira/browse/HBASE-2886) and the new version (0.90) will include the new search box. A patch is also available for TIKA (we are in the process of discussing some details now): https://issues.apache.org/jira/browse/TIKA-488. ZooKeeper's site looks much like Avro's, for which we also created a patch recently (https://issues.apache.org/jira/browse/AVRO-626).