Re: Beginner question about querying records

Stack Sun, 04 Apr 2010 15:16:16 -0700

2010/4/4 Onur AKTAS <[email protected]>:
>
> Thank you very much for your answers.. I'm checking the document that you 
> gave.
> In short, unless massive traffic, massive data size and massive
> scale are needed, stick with a regular RDBMS; then, if we grow to terabytes
> of data to be queried, we can switch to NoSQL databases.
> Thanks so much.


Well, the above is basically the case for HBase.  We should be better
at smaller scales than we currently are, but that is another story.

But generalizing your lesson to all NoSQL is another matter.  The
category is broad, covering myriad database types.

St.Ack


>
>> Date: Sat, 3 Apr 2010 20:56:21 -0700
>> Subject: Re: Beginner question about querying records
>> From: [email protected]
>> To: [email protected]
>>
>> 2010/4/3 Onur AKTAS <[email protected]>:
>> >
>> > Hi all,
>> > I'm thinking of switching from an RDBMS to a NoSQL database, but I have lots
>> > of unanswered questions.
>> > Please correct me if I'm wrong: is HBase not suitable for small
>> > environments? If we have 1 million records and no cluster, or maybe 2
>> > machines, is it not needed?
>>
>> It'll work, but that's not what it's built for.  You'll be better off
>> sticking with your current RDBMS if your dataset is that size, and
>> going by the rest of your questions below.
>>
>> > As far as I know, HBase does not support querying itself, but Pig can be
>> > used to perform SQL-like queries. It is a multidimensional hashmap
>> > distributed across the network for fast access by key. So if we need to
>> > query something, we need to index it ourselves.
>>
>> Yes.
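To make the "multidimensional hashmap" picture concrete, here is a minimal in-memory sketch in Python. It is not HBase client code: `table`, `put`, `get` and `rows_where` are hypothetical names, and the nesting (row key, then column, then timestamp) is only an approximation of the data model. It shows why a lookup by key is cheap while a query on a value has to visit every row unless you keep an index yourself.

```python
from collections import defaultdict

# Toy model only: row key -> column ("family:qualifier") -> timestamp -> value
table = defaultdict(lambda: defaultdict(dict))

def put(row, column, value, ts):
    """Store one version of a cell."""
    table[row][column][ts] = value

def get(row, column):
    """Return the newest value of a cell, or None if the cell is absent."""
    versions = table.get(row, {}).get(column, {})
    return versions[max(versions)] if versions else None

def rows_where(column, predicate):
    """No index exists, so every row is visited (a full scan)."""
    matches = []
    for row in table:
        value = get(row, column)
        if value is not None and predicate(value):
            matches.append(row)
    return sorted(matches)

put("user1", "info:age", 29, ts=1)
put("user1", "info:age", 30, ts=2)   # a newer version of the same cell
put("user2", "info:age", 45, ts=1)

newest_age = get("user1", "info:age")
older_than_30 = rows_where("info:age", lambda age: age > 30)
```

`get` goes straight to one row, whereas `rows_where` walks the whole table.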
>>
>>
>> > 1) If we have a user list, and a potential "Give me all people 
>> > above/beyond age 30" query, then do we need to create an index from the 
>> > beginning of the first data as:
>> > above_30_list : value: [ A, B, C ]
>> > beyond_30_list : value: [ X, Y, Z ]   ?
>>
>> Yes. Or, if you can tolerate getting the answer offline, run a MapReduce
>> job against the table.
>>
>> Or, if this is the only query you'll be running, think about how you
>> could design the primary key so you can answer this question: e.g.
>> userid_age.
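A rough Python sketch of the key-design idea, using a sorted list as a stand-in for a table whose rows are kept in key order. Everything here is an assumption for illustration: `make_key` and `scan` are hypothetical helpers, and the key puts a zero-padded age first (the reverse of the order written above), because a range over sorted keys only works on the leading part of the key.

```python
from bisect import bisect_left
from typing import List, Optional

def make_key(age: int, user_id: str) -> str:
    """Hypothetical key layout: zero-padded age first, then the user id."""
    return f"{age:03d}_{user_id}"

users = [("alice", 45), ("bob", 29), ("carol", 31), ("dave", 45), ("erin", 30)]

# Rows kept sorted by key, standing in for a key-ordered table.
rows = sorted(make_key(age, uid) for uid, age in users)

def scan(start_key: str, stop_key: Optional[str] = None) -> List[str]:
    """Return keys in [start_key, stop_key), like a start/stop-row scan."""
    lo = bisect_left(rows, start_key)
    hi = len(rows) if stop_key is None else bisect_left(rows, stop_key)
    return rows[lo:hi]

above_30 = scan(make_key(31, ""))                      # age 31 and up
exactly_45 = scan(make_key(45, ""), make_key(46, ""))  # age 45 only
```

With this layout, "above 30" and "exactly 45" are both contiguous key ranges, so neither needs a separate index. The zero padding matters: without it, "100" would sort before "30".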
>>
>>
>> > 2) What if we need just the people aged 45? Do we then need to get all of
>> > above_30 and scan them one by one?
>> > 3) If we need many different queries, should we create keys like the ones
>> > I wrote above for every potential query, and write the data to all of
>> > those indexes when inserting?
>>
>> Effectively yes.
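What "entering the data into all the indexes when inserting" looks like can be sketched in a few lines of Python. This is a toy, in-memory illustration under stated assumptions, not HBase code: `users` stands in for the primary table, `age_index` for a hand-maintained index table, and `put_user` and `users_with_age` are hypothetical helpers.

```python
from collections import defaultdict

users = {}                     # primary "table": user id -> record
age_index = defaultdict(set)   # hand-maintained index: age -> user ids

def put_user(user_id, name, age):
    """Write the record and keep the age index in step with it."""
    old = users.get(user_id)
    if old is not None:
        # Drop the stale index entry before writing the new one.
        age_index[old["age"]].discard(user_id)
    users[user_id] = {"name": name, "age": age}
    age_index[age].add(user_id)

def users_with_age(age):
    """Answer the query from the index alone, without scanning users."""
    return sorted(age_index.get(age, ()))

put_user("u1", "Alice", 45)
put_user("u2", "Bob", 29)
put_user("u3", "Carol", 45)
put_user("u2", "Bob", 45)      # an update moves u2 between index entries

aged_45 = users_with_age(45)
aged_29 = users_with_age(29)
```

Each extra query pattern means another structure like `age_index` to update on every write. In a real distributed store, the two writes are also not automatically atomic, which is part of the cost of this approach.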
>>
>> > 4) Is parallelizing the scan across the cluster what HBase or the
>> > MapReduce technique does to solve this issue?
>> > In short, I'm willing to switch to HBase for my applications, and I'm
>> > wondering how I can do all these kinds of operations in HBase with better
>> > performance than I get in an RDBMS.
>>
>> HBase is about scaling.  To achieve scale, the model is changed.
>> Moving your RDBMS schema to HBase will take some thought, and not all
>> of it will make it across.  For a considered thesis on NoSQL vs RDBMS
>> modeling, see http://j.mp/2PjPB.
>>
>> St.Ack
>>
>>
>> > Thanks so much.
>> >
>> >
>> >
>
