What is it all about? HBase sucks. Too many problems for newcomers, and a few weeks of warm-up just to begin with! Is it really supported by Microsoft employees?!
And, SEO of course:
===================

--
Fuad Efendi
416-993-2060
Tokenizer Inc., Canada
Data Mining, Search Engines
http://www.tokenizer.ca

On 11-07-29 7:49 PM, "Otis Gospodnetic" <[email protected]> wrote:

>Hi,
>
>I'm for publishing all performance metrics in JMX (in addition to
>exposing it wherever else you guys decide). That's because JMX is
>probably the easiest for our SPM for HBase [1] to get to HBase
>performance metrics and I suspect we are not alone.
>
>Otis
>[1] http://sematext.com/spm/hbase-performance-monitoring/index.html
>----
>Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
>Hadoop ecosystem search :: http://search-hadoop.com/
>
>
>>________________________________
>>From: Andrew Purtell <[email protected]>
>>To: Doug Meil <[email protected]>; "[email protected]"
>><[email protected]>
>>Sent: Friday, July 29, 2011 4:34 PM
>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>
>>> I'd rather see this output being able to be captured by something the
>>>sink that Todd suggested, rather than focusing on shell access.
>>
>>
>>I don't agree.
>>
>>
>>Look at what we have existing and proposed:
>>
>> - Java API access to server and region load information, that the
>>   shell uses
>>
>> - A proposal to dump some stats into log files, that then has to be
>>   scraped
>>
>> - A proposal (by the FB guys) to export some JSON via a HTTP servlet
>>
>>This is not good design, this is a bunch of random shit stuck together.
>>
>>Note that what Todd proposed does not preclude adding Java client API
>>support for retrieving it.
>>
>>At a minimum all of this information must be accessible via the Java
>>client API, to enable programmatic monitoring and analysis use cases.
>>I'll add the shell support if nobody else cares about it, that is a
>>relatively small detail, but one I think is important.
>>
>>Best regards,
>>
>>
>> - Andy
>>
>>
>>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>(via Tom White)
>>
>>
>>>________________________________
>>>From: Doug Meil <[email protected]>
>>>To: "[email protected]" <[email protected]>;
>>>"[email protected]" <[email protected]>
>>>Sent: Friday, July 29, 2011 11:39 AM
>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>
>>>
>>>I'd rather see this output being able to be captured by something the
>>>sink that Todd suggested, rather than focusing on shell access.
>>>HServerLoad is super-summary at the RS level, and both the items in
>>>4089 and 4147 are proposed to be "summarized" but still have reasonable
>>>detail (e.g., even table/CF summary there could be dozens of entries
>>>given a reasonably complex system).
>>>
>>>
>>>On 7/29/11 1:15 PM, "Andrew Purtell" <[email protected]> wrote:
>>>
>>>>There is also the matter of HServerLoad and how that is used by the
>>>>shell and master UI to report on cluster status.
>>>>
>>>>I'd like the shell to be able to let the user explore all of these
>>>>different reports interactively.
>>>>
>>>>At the very least, they should all be handled the same way.
>>>>
>>>>And then there is Riley's work over at FB on a slow query log. How does
>>>>that fit in?
>>>>
>>>>Best regards,
>>>>
>>>>
>>>> - Andy
>>>>
>>>>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>>(via Tom White)
>>>>
>>>>
>>>>>________________________________
>>>>>From: Todd Lipcon <[email protected]>
>>>>>To: [email protected]
>>>>>Sent: Friday, July 29, 2011 9:58 AM
>>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>>
>>>>>What I'd prefer is something like:
>>>>>
>>>>>interface BlockCacheReportSink {
>>>>>  public void reportStats(BlockCacheReport report);
>>>>>}
>>>>>
>>>>>class LoggingBlockCacheReportSink {
>>>>>  ... {
>>>>>    log it with whatever formatting you want
>>>>>  }
>>>>>}
>>>>>
>>>>>then a configuration which could default to the logging
>>>>>implementation, but orgs could easily substitute their own
>>>>>implementation. For example, I could see wanting to do an
>>>>>implementation where it keeps local RRD graphs of some stats, or
>>>>>pushes them to a central management server.
>>>>>
>>>>>The assumption is that BlockCacheReport is a fairly straightforward
>>>>>"struct" with the non-formatted information available.
>>>>>
>>>>>-Todd
>>>>>
>>>>>On Fri, Jul 29, 2011 at 4:15 AM, Doug Meil
>>>>><[email protected]> wrote:
>>>>>
>>>>>>
>>>>>> Hi Folks-
>>>>>>
>>>>>> You probably already my email yesterday on this...
>>>>>> https://issues.apache.org/jira/browse/HBASE-4089 (block cache report)
>>>>>>
>>>>>> ...and I just created this one...
>>>>>> https://issues.apache.org/jira/browse/HBASE-4147 (StoreFile query
>>>>>> report)
>>>>>>
>>>>>> What I'd like to run past the dev-list is this: if Hbase had periodic
>>>>>> summary usage statistics, where should they go? What I'd like to
>>>>>> throw out for discussion is that I'm suggesting that it should simply
>>>>>> go to the log files and users can slice and dice this on their own.
>>>>>> No UI (I.e., JSPs), no JMX, etc.
>>>>>>
>>>>>>
>>>>>> The summary out the output is this:
>>>>>> BlockCacheReport: on configured interval, print out summary of
>>>>>> blockcache (at table/CF level) to log file. This one is
>>>>>> point-in-time, not delta.
>>>>>>
>>>>>> StoreFile read report: on configured interval, print out summary of
>>>>>> StoreFile accesses and how much time was spent reading each StoreFile
>>>>>> to log file.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> Doug
>>>>>>
>>>>>
>>>>>
>>>>>--
>>>>>Todd Lipcon
>>>>>Software Engineer, Cloudera
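
For readers following the thread, Todd's pluggable-sink idea can be sketched roughly as below. This is only an illustration under stated assumptions, not code from HBASE-4089 or from HBase itself: the configuration key `blockcache.report.sink.class`, the fields of `BlockCacheReport`, the plain `Map` standing in for the real configuration object, and the `InMemoryBlockCacheReportSink` class are all hypothetical. Only the names `BlockCacheReportSink`, `LoggingBlockCacheReportSink`, `BlockCacheReport`, and `reportStats` come from the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

public class BlockCacheSinkSketch {

    // Hypothetical configuration key; not an actual HBase setting.
    static final String SINK_KEY = "blockcache.report.sink.class";

    /** Plain "struct" of unformatted numbers; the fields are illustrative guesses. */
    static final class BlockCacheReport {
        final String table;
        final String columnFamily;
        final long blockCount;
        final long sizeBytes;

        BlockCacheReport(String table, String columnFamily, long blockCount, long sizeBytes) {
            this.table = table;
            this.columnFamily = columnFamily;
            this.blockCount = blockCount;
            this.sizeBytes = sizeBytes;
        }
    }

    /** The extension point proposed in the thread. */
    interface BlockCacheReportSink {
        void reportStats(BlockCacheReport report);
    }

    /** Default sink: formats the report and writes it to the log. */
    static final class LoggingBlockCacheReportSink implements BlockCacheReportSink {
        private static final Logger LOG = Logger.getLogger("BlockCacheReport");

        @Override
        public void reportStats(BlockCacheReport r) {
            LOG.info(String.format("table=%s cf=%s blocks=%d bytes=%d",
                    r.table, r.columnFamily, r.blockCount, r.sizeBytes));
        }
    }

    /** Stand-in for a site-specific sink (RRD graphs, central server); it only keeps reports in memory. */
    static final class InMemoryBlockCacheReportSink implements BlockCacheReportSink {
        final List<BlockCacheReport> received = new ArrayList<>();

        @Override
        public void reportStats(BlockCacheReport r) {
            received.add(r);
        }
    }

    /** Instantiates the configured sink class, falling back to the logging sink. */
    static BlockCacheReportSink createSink(Map<String, String> conf)
            throws ReflectiveOperationException {
        String className =
                conf.getOrDefault(SINK_KEY, LoggingBlockCacheReportSink.class.getName());
        return Class.forName(className)
                .asSubclass(BlockCacheReportSink.class)
                .getDeclaredConstructor()
                .newInstance();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        BlockCacheReport report = new BlockCacheReport("usertable", "cf1", 1200L, 78643200L);

        // No key set: the logging sink is used.
        createSink(conf).reportStats(report);

        // Key set: the substituted implementation receives the same raw report.
        conf.put(SINK_KEY, InMemoryBlockCacheReportSink.class.getName());
        createSink(conf).reportStats(report);
    }
}
```

Because the report carries raw values rather than a formatted string, the same object could also back a Java client API call or a JMX attribute, which is the concern Andy and Otis raise above.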
