On Wed, Nov 4, 2009 at 10:39 PM, Greg Cottman <[email protected]> wrote:
> Hi everyone,
>
> To this end, I would invite people who feel like sharing to give me a
> paragraph or two on what they are doing with HBase. Of course, I don't want
> anyone to give away their eleven secret herbs and spices or tell me what
> Ingredient X is. :-) I am more interested in metadata and semantics.
>
> To give you an idea of questions that I wonder about:
>
> * Are you using a natural or synthetic key?
>

Keys are URLs that have been MD5'd, so there is a good spread across the
namespace, and then Base64'd (I don't know why the latter is done).

> * Are you using HBase index tables or maintaining your own?
>

No.

> * Do you have multiple data tables in your HBase server?
>

Yes. About 50 tables.

> * How many rows of data are in each HBase table?
>

Between 3 and 20 million rows in each.

> * What type of data are you storing in each record?
>

Some of the tables hold Wikipedia content and its derivatives: mimetypes,
inlinks, alternate URLs, etc. Other tables hold other indexing pipeline
input and intermediate product.

> * Are you using column families to localize data or store name/value
> pairs?
>

Both.

> * Are there columns like name, address, etc., that are present in
> each row?
>

Sort of.

> * Are you running HBase on your own servers or on Amazon EC2?
>

Own. Between 100 and 110.

> * Are you using Hadoop to run map/reduce functions against HBase?
>

Yes.

> * How does your client interact with HBase? Java API, REST,
> Stargate, Thrift, other (please specify), etc.
>

Java, REST, and Thrift.

The above responses describe a cluster from 6 months ago.

St.Ack

>
> Anyone who is interested in responding can do so to the list or directly to
> me. I will keep your responses but not your name or company. Feel free to
> answer some or all of the questions, or add your own information that you
> feel is pertinent to how you are using HBase. I will give it a week and
> then collate the responses into an integrated summary that I will publish
> back to this list.
>
> I should declare that I have no official HBase standing. I'm just very
> curious about NoSQL databases as an emerging technology, and HBase in
> particular. The 'net shows a general consensus that HBase is an early
> NoSQL leader but no-one discusses specifics. Some empirical data would be
> very interesting.
>
> Thanks in advance,
>
> Greg.
>
> Greg Cottman
> Technical Architect
> Quest Software, Australia
> Tel: +61 3 9811 8057
>
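
For anyone who wants to try the row-key scheme mentioned in the reply above (MD5 of the URL, then Base64), here is a minimal sketch in Python. It only illustrates the idea and is not the code that cluster ran: the function name `row_key`, the UTF-8 encoding of the URL, and the use of the standard Base64 alphabet are all assumptions. One visible effect of the Base64 step is that the 16 raw digest bytes become a 24-character printable ASCII key; whether that was the motivation is not stated.

```python
import base64
import hashlib


def row_key(url: str) -> bytes:
    """Hypothetical helper: build a row key as Base64(MD5(url))."""
    # MD5 spreads similar URLs evenly across the key space (16 raw bytes).
    digest = hashlib.md5(url.encode("utf-8")).digest()
    # Base64 turns the raw bytes into 24 printable ASCII bytes.
    return base64.b64encode(digest)


if __name__ == "__main__":
    # Two URLs that sort next to each other get unrelated keys.
    for url in ("http://example.com/a", "http://example.com/b"):
        print(url, row_key(url).decode("ascii"))
```

Because the key is a hash, lookups need the original URL, and scans over the table no longer follow URL order.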
