@Ryan

Hmmm. Interesting. 

Let’s take a step back for a second.
Which do you prefer: a push model or a pull/poll model?

That would kind of dictate the decisions you would make in terms of design. 

To your point about moving away from ZK, it would mean putting those features
into HBase directly. It can be done, but if you won’t fix coprocessors, then
why re-invent the wheel for a system that really isn’t a standalone system like
an Oracle database? (Look at it this way… when was the last time you installed
an HBase instance where you didn’t have HDFS?)

So if you did actually do this, the features found in ZK would probably be
tossed into the HMaster and you would end up requiring a quorum of HMasters,
just like you have in ZK.
The downside of moving away… by staying with ZK you keep the ability to
interact with other systems that also use ZK, so you could potentially evolve
into a system where you can run a single query against multiple data sources.
(Someone is doing that, right?)

And if you think about it… a push model would be more efficient and if you’re 
going to have the RS push it to something… it would most likely be ZK. 
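
To make the push vs. pull/poll distinction concrete, here is a minimal sketch
in Python. Everything in it (StatsStore, Region, poll) is hypothetical and
in-memory; it is not the HBase coprocessor API or the ZK client API, only an
illustration of where the work happens in each model, assuming per-region row
counts keyed by (table, region) as described in the original question.

```python
class StatsStore:
    """Hypothetical in-memory stand-in for the shared store (ZK or a stats table)."""

    def __init__(self):
        # (table, region) -> row count last reported by that region
        self.region_rows = {}

    def push(self, table, region, row_count):
        # Push model: the region reports its own count whenever it changes.
        self.region_rows[(table, region)] = row_count

    def table_rows(self, table):
        # Table-wide value computed on the fly from the per-region records.
        return sum(n for (t, _), n in self.region_rows.items() if t == table)


class Region:
    """Hypothetical region that tracks its own row count."""

    def __init__(self, table, name, store=None):
        self.table = table
        self.name = name
        self.store = store  # None means this region never pushes
        self.rows = 0

    def _report(self):
        if self.store is not None:
            self.store.push(self.table, self.name, self.rows)

    def insert(self, n=1):
        self.rows += n
        self._report()

    def delete(self, n=1):
        self.rows = max(0, self.rows - n)
        self._report()


def poll(regions, table):
    # Pull/poll model: the caller visits every region each time it wants a total.
    return sum(r.rows for r in regions if r.table == table)
```

With push, a reader only touches the store; with poll, every read fans out to
all regions of the table. Both give the same total in this sketch.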

Sorry, I’m just being lazy…


-Mike

PS. On the Ganglia topic… why not just store the data in HBase and have
Ganglia or your own D3 view hit HBase instead?

On Sep 5, 2014, at 7:17 PM, Ryan Rawson <ryano...@gmail.com> wrote:

> I guess my thought is that it'd be nice to minimize dependency on ZK,
> and eventually remove it altogether.  It just adds too much
> deployment complexity, and code complexity -- about 10000 lines of
> code.
> 
> I do like the notion of HBase self-hosting its own performance data,
> it's what Oracle and other databases do.  Ganglia is annoying to
> install, and often isn't installed.
> 
> On Fri, Sep 5, 2014 at 11:10 AM, Michael Segel
> <michael_se...@hotmail.com> wrote:
>> @Ted,
>> 
>> Yes, that’s the general idea or rather a specific use case for what I was 
>> thinking.
>> So it would be a different mechanism to help manage the information.
>> I would think that it would result in faster access to the information.
>> 
>> This would be very important if one were to do some query optimization… and
>> by using ZK… you could think beyond just HBase, and do a query that joins
>> data from both HBase and non-HBase systems.
>> 
>> Just a thought… ;-)
>> 
>> -Mike
>> 
>> On Sep 5, 2014, at 2:29 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>> 
>>> This reminds me of
>>> HBASE-7958 Statistics per-column family per-region
>>> 
>>> Cheers
>>> 
>>> 
>>> On Thu, Sep 4, 2014 at 6:23 PM, Mikhail Antonov <olorinb...@gmail.com>
>>> wrote:
>>> 
>>>> I think ZK isn't the best possible storage for statistics. A separate stats
>>>> table may be a better solution.
>>>> 
>>>> -Mikhail
>>>> 
>>>> 
>>>> 2014-09-04 15:48 GMT-07:00 Michael Segel <michael_se...@hotmail.com>:
>>>> 
>>>>> So suppose I want to capture metadata about a table across all of the
>>>>> regions for that table.
>>>>> 
>>>>> Has anyone used a coprocessor to capture a region’s statistics and push
>>>>> them up to ZK where it’s stored by (table, region, <metadata object>) and
>>>>> then a table-wide value is also stored based on a computational update?
>>>>> 
>>>>> So if I wanted to store the row counts for each region of a table, each
>>>>> region would update its record in ZK on each insert / delete (can you
>>>>> easily remove a tombstone?) and then update the computational value?
>>>>> (Assuming you could lock those values for a short enough time to do the
>>>>> quick computation. If not, then it can be computed on the fly)
>>>>> 
>>>>> Has this been done?
>>>>> 
>>>>> Thoughts?
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Thanks,
>>>> Michael Antonov
>>>> 
>> 
> 
