Re: HBASE-4089 & HBASE-4147 - on the topic of ops output

Doug Meil Sun, 31 Jul 2011 16:17:50 -0700

re:  "Whirr"

I'm sure Cloudera would love to hear your feedback on Whirr, but please
address it to them directly and not on the Hbase dist-list.


re:  "Is it really-really supported by Microsoft employees?!"

It is really, really not.


As you pointed out, and as is cited in the Apache HBase book, HBase was
inspired by BigTable.  And as for the refund that Ryan suggested, I'm
sure Google would be happy to make good on it.  Watch out for the exchange
rate at the border.




On 7/31/11 6:38 PM, "Fuad Efendi" <[email protected]> wrote:

>Great,
>
>
>
>My VP is former MS-98 :)))
>
>
>
>(BTW, many thanks to http://www.cloudera.com/ employees; please make Whirr
>really "rrrrr..." "open source"? I don't see any meaning... reinventing a
>bike is cheaper nowadays!!!)
>
>
>
>
>
>
>And, SEO of course: (remember: Google is SMARTER!!! You are just "clone".)
>
>==========================================================================
>
>
>-- 
>Fuad Efendi
>416-993-2060
>Tokenizer Inc., Canada
>Data Mining, Search Engines
>http://www.tokenizer.ca
>
>
>
>
>
>
>
>
>On 11-07-31 6:21 PM, "Ryan Rawson" <[email protected]> wrote:
>
>>You should ask for your money back!!
>>
>>On Sun, Jul 31, 2011 at 3:10 PM, Fuad Efendi <[email protected]> wrote:
>>> What is it all about? HBase sucks. Too many problems to newcomers,
>>> few-weeks-warm-up to begin with!!!!!!!!!!!!!!!! Is it really-really
>>> supported by Microsoft employees?!
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> And, SEO of course:
>>> ===================
>>>
>>>
>>> --
>>> Fuad Efendi
>>> 416-993-2060
>>> Tokenizer Inc., Canada
>>> Data Mining, Search Engines
>>> http://www.tokenizer.ca
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 11-07-29 7:49 PM, "Otis Gospodnetic" <[email protected]>
>>>wrote:
>>>
>>>>Hi,
>>>>
>>>>I'm for publishing all performance metrics in JMX (in addition to
>>>>exposing it wherever else you guys decide).  That's because JMX is
>>>>probably the easiest for our SPM for HBase [1] to get to HBase
>>>>performance metrics and I suspect we are not alone.
>>>>
>>>>Otis
>>>>[1] http://sematext.com/spm/hbase-performance-monitoring/index.html
>>>>----
>>>>Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
>>>>Hadoop ecosystem search :: http://search-hadoop.com/
>>>>
>>>>
>>>>
>>>>>________________________________
>>>>>From: Andrew Purtell <[email protected]>
>>>>>To: Doug Meil <[email protected]>; "[email protected]"
>>>>><[email protected]>
>>>>>Sent: Friday, July 29, 2011 4:34 PM
>>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>>
>>>>>> I'd rather see this output being able to be captured by something
>>>>>>like the sink that Todd suggested, rather than focusing on shell access.
>>>>>
>>>>>
>>>>>I don't agree.
>>>>>
>>>>>
>>>>>Look at what we have existing and proposed:
>>>>>
>>>>>    - Java API access to server and region load information, that the
>>>>>      shell uses
>>>>>
>>>>>    - A proposal to dump some stats into log files, that then has to be
>>>>>      scraped
>>>>>
>>>>>    - A proposal (by the FB guys) to export some JSON via a HTTP servlet
>>>>>
>>>>>This is not good design, this is a bunch of random shit stuck together.
>>>>>
>>>>>Note that what Todd proposed does not preclude adding Java client API
>>>>>support for retrieving it.
>>>>>
>>>>>At a minimum all of this information must be accessible via the Java
>>>>>client API, to enable programmatic monitoring and analysis use cases.
>>>>>I'll add the shell support if nobody else cares about it, that is a
>>>>>relatively small detail, but one I think is important.
>>>>>
>>>>>Best regards,
>>>>>
>>>>>
>>>>>    - Andy
>>>>>
>>>>>
>>>>>Problems worthy of attack prove their worth by hitting back. - Piet
>>>>>Hein
>>>>>(via Tom White)
>>>>>
>>>>>
>>>>>>________________________________
>>>>>>From: Doug Meil <[email protected]>
>>>>>>To: "[email protected]" <[email protected]>;
>>>>>>"[email protected]" <[email protected]>
>>>>>>Sent: Friday, July 29, 2011 11:39 AM
>>>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>>>
>>>>>>
>>>>>>I'd rather see this output being able to be captured by something like
>>>>>>the sink that Todd suggested, rather than focusing on shell access.
>>>>>>HServerLoad is super-summary at the RS level, and both the items in
>>>>>>4089 and 4147 are proposed to be "summarized" but still have reasonable
>>>>>>detail (e.g., even with a table/CF summary there could be dozens of
>>>>>>entries given a reasonably complex system).
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>On 7/29/11 1:15 PM, "Andrew Purtell" <[email protected]> wrote:
>>>>>>
>>>>>>>There is also the matter of HServerLoad and how that is used by the
>>>>>>>shell and master UI to report on cluster status.
>>>>>>>
>>>>>>>I'd like the shell to be able to let the user explore all of these
>>>>>>>different reports interactively.
>>>>>>>
>>>>>>>At the very least, they should all be handled the same way.
>>>>>>>
>>>>>>>And then there is Riley's work over at FB on a slow query log. How does
>>>>>>>that fit in?
>>>>>>>
>>>>>>>Best regards,
>>>>>>>
>>>>>>>
>>>>>>>   - Andy
>>>>>>>
>>>>>>>Problems worthy of attack prove their worth by hitting back. - Piet
>>>>>>>Hein
>>>>>>>(via Tom White)
>>>>>>>
>>>>>>>
>>>>>>>>________________________________
>>>>>>>>From: Todd Lipcon <[email protected]>
>>>>>>>>To: [email protected]
>>>>>>>>Sent: Friday, July 29, 2011 9:58 AM
>>>>>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>>>>>
>>>>>>>>What I'd prefer is something like:
>>>>>>>>
>>>>>>>>interface BlockCacheReportSink {
>>>>>>>>  void reportStats(BlockCacheReport report);
>>>>>>>>}
>>>>>>>>
>>>>>>>>class LoggingBlockCacheReportSink implements BlockCacheReportSink {
>>>>>>>>  public void reportStats(BlockCacheReport report) {
>>>>>>>>    // log it with whatever formatting you want
>>>>>>>>  }
>>>>>>>>}
>>>>>>>>
>>>>>>>>then a configuration which could default to the logging
>>>>>>>>implementation, but orgs could easily substitute their own
>>>>>>>>implementation. For example, I could see wanting to do an
>>>>>>>>implementation where it keeps local RRD graphs of some stats, or
>>>>>>>>pushes them to a central management server.
>>>>>>>>
>>>>>>>>The assumption is that BlockCacheReport is a fairly straightforward
>>>>>>>>"struct" with the non-formatted information available.
>>>>>>>>
>>>>>>>>-Todd
>>>>>>>>
>>>>>>>>On Fri, Jul 29, 2011 at 4:15 AM, Doug Meil
>>>>>>>><[email protected]>wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Folks-
>>>>>>>>>
>>>>>>>>> You probably already saw my email yesterday on this...
>>>>>>>>>  https://issues.apache.org/jira/browse/HBASE-4089 (block cache
>>>>>>>>>  report)
>>>>>>>>>
>>>>>>>>> ...and I just created this one...
>>>>>>>>>  https://issues.apache.org/jira/browse/HBASE-4147 (StoreFile
>>>>>>>>>  query report)
>>>>>>>>>
>>>>>>>>> What I'd like to run past the dev-list is this:  if Hbase had
>>>>>>>>> periodic summary usage statistics, where should they go?  What
>>>>>>>>> I'd like to throw out for discussion is that I'm suggesting that
>>>>>>>>> it should simply go to the log files and users can slice and
>>>>>>>>> dice this on their own.  No UI (i.e., JSPs), no JMX, etc.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The summary of the output is this:
>>>>>>>>> BlockCacheReport:  on configured interval, print out summary of
>>>>>>>>> blockcache (at table/CF level) to log file. This one is
>>>>>>>>> point-in-time, not delta.
>>>>>>>>>
>>>>>>>>> StoreFile read report:  on configured interval, print out summary
>>>>>>>>> of StoreFile accesses and how much time was spent reading each
>>>>>>>>> StoreFile to log file.
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>> Doug
>>>>>>>>>
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>--
>>>>>>>>Todd Lipcon
>>>>>>>>Software Engineer, Cloudera
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>>
>>>
>
>
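
For reference, the pluggable sink that Todd sketches in the quoted thread
could be fleshed out roughly as below. This is only a minimal sketch and not
code from HBASE-4089 or HBASE-4147: the fields of BlockCacheReport, the
format() helper, the InMemoryBlockCacheReportSink class and the SinkDemo
driver are all assumptions made up for illustration. It shows the default
logging sink next to an alternative sink that keeps the raw reports, which
is the sort of thing a Java client API or the shell could read from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.logging.Logger;

// Hypothetical "struct": raw, unformatted numbers keyed by "table/CF".
final class BlockCacheReport {
    final long timestampMillis;
    final Map<String, Long> cachedBytesByTableCf;

    BlockCacheReport(long timestampMillis, Map<String, Long> cachedBytesByTableCf) {
        this.timestampMillis = timestampMillis;
        // Copy into a sorted map so the output order is stable.
        this.cachedBytesByTableCf = new TreeMap<>(cachedBytesByTableCf);
    }
}

// The extension point from the thread: one method, one report per call.
interface BlockCacheReportSink {
    void reportStats(BlockCacheReport report);
}

// Default behaviour: format the report and write it to the log.
final class LoggingBlockCacheReportSink implements BlockCacheReportSink {
    private static final Logger LOG = Logger.getLogger("BlockCacheReport");

    // Formatting lives in the sink, not in the report (assumed layout).
    static String format(BlockCacheReport report) {
        StringBuilder sb = new StringBuilder("BlockCacheReport ts=" + report.timestampMillis);
        for (Map.Entry<String, Long> e : report.cachedBytesByTableCf.entrySet()) {
            sb.append(' ').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    @Override
    public void reportStats(BlockCacheReport report) {
        LOG.info(format(report));
    }
}

// Alternative sink (an assumption): keep raw reports for programmatic access.
final class InMemoryBlockCacheReportSink implements BlockCacheReportSink {
    private final List<BlockCacheReport> reports = new ArrayList<>();

    @Override
    public synchronized void reportStats(BlockCacheReport report) {
        reports.add(report);
    }

    synchronized List<BlockCacheReport> snapshot() {
        return new ArrayList<>(reports);
    }
}

class SinkDemo {
    public static void main(String[] args) {
        Map<String, Long> stats = new TreeMap<>();
        stats.put("usertable/cf1", 1024L);
        BlockCacheReport report = new BlockCacheReport(System.currentTimeMillis(), stats);

        // The same report can go to any sink; which one is used would be a
        // configuration decision, defaulting to the logging sink.
        BlockCacheReportSink sink = new LoggingBlockCacheReportSink();
        sink.reportStats(report);
    }
}
```

Because the report carries only unformatted values, swapping the logging
sink for a different implementation does not require touching the code that
produces the report.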
