Hi Pavel,

I think there is one more option to add to the list:
Maintain all orders in the users table in a single column family named 'orders', with each order stored as a separate column (member) of that family. Use the order id as the column name (e.g. orders:12345). The cell value is a serialization of the order object, so the order class needs to implement org.apache.hadoop.io.Writable.

Naama

On Mon, Jul 14, 2008 at 7:24 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> Answers inline.
>
> J-D
>
> On Mon, Jul 14, 2008 at 12:16 PM, Pavel Lysov <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
>
> Hey!
>
> > I found that I cannot stop thinking in the RDBMS way while designing
> > tables for the application I am working on, so I need your help. Could
> > you please take a look at the tables below and advise which approach
> > you think is doable and good enough?
> >
> > There should be a USERS table, I think, something simple for now:
> >
> > USER_ID:
> >   profile:
> >     email
> >     first_name
> >     last_name
> >
> > Then we need to store a huge list of each user's orders, and here is
> > where I start to doubt. Can the same USERS table hold many orders? Does
> > HBase (Bigtable) allow a schema like the following?
> >
> > USER_ID:
> >   profile:
> >     first_name
> >     last_name
> >   orders:
> >     order_1:
> >       date
> >       details
> >       product
> >       price
> >     order_2:
> >       date
> >       details
> >       product
> >       price
>
> You can't do that.
>
> > If the idea above is bad (I couldn't find an API that creates nested
> > column families, so I assume it is not possible), there could be
> > another table for orders:
> >
> > ORDER_ID:
> >   user:
> >     id
> >     first_name
> >     last_name
> >   order:
> >     date
> >     details
> >     product
> >     price
> >
> > This way, getting the orders for a certain user requires additional
> > work, so the third variant would have a composite row key, composed of
> > USER_ID and ORDER_ID:
> >
> > USER_ID__ORDER_ID:
> >   order:
> >     date
> >     details
> >     product
> >     price
> >   profile:
> >     first_name
> >     last_name
>
> This last version will effectively group a user's orders together. Having
> the date in the row key just after the user id would even sort them by
> date, which is not bad.
>
> > The last table variant will be scanned using HScannerInterface, so it's
> > relatively easy to get all orders for a given user, I think. Do you
> > think it is fine to create this kind of composite row key?
> >
> > Here's where I am. What am I missing? Could you please share your
> > thoughts on the table design? Would you design them another way?
>
> Another schema would be to have the order data in your user table like
> this (but what you already have isn't bad):
>
> USER_ID:
>   profile:
>     email
>     first_name
>     last_name
>   order_date:
>     all ORDER_IDs
>   order_details:
>     all ORDER_IDs
>   etc.
>
> > Thank you!
> > Pavel

--
oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo
"If you want your children to be intelligent, read them fairy tales. If you
want them to be more intelligent, read them more fairy tales." (Albert Einstein)
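P.S. To make the 'orders' family option at the top of this message a bit more concrete, here is a rough sketch of what the order object could look like. It is only an illustration under some assumptions: the Order class, its field types (price kept as integer cents, date as epoch milliseconds) and the helpers columnName, toBytes and fromBytes are all made up for this example. To keep the sketch free of Hadoop dependencies, the class does not actually declare "implements Writable"; its write and readFields methods simply have the same signatures as the two methods of org.apache.hadoop.io.Writable, so adding the interface in a real project should be straightforward.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical order record; the fields follow the schema discussed in the thread. */
class Order {
    long date;        // assumption: order date as epoch milliseconds
    String details;
    String product;
    long priceCents;  // assumption: price stored as integer cents

    Order() {}

    Order(long date, String details, String product, long priceCents) {
        this.date = date;
        this.details = details;
        this.product = product;
        this.priceCents = priceCents;
    }

    // Same signature as Writable.write(DataOutput).
    public void write(DataOutput out) throws IOException {
        out.writeLong(date);
        out.writeUTF(details);
        out.writeUTF(product);
        out.writeLong(priceCents);
    }

    // Same signature as Writable.readFields(DataInput); reads fields in write order.
    public void readFields(DataInput in) throws IOException {
        date = in.readLong();
        details = in.readUTF();
        product = in.readUTF();
        priceCents = in.readLong();
    }

    /** Column name for an order, e.g. "orders:12345" (family "orders", member = order id). */
    static String columnName(long orderId) {
        return "orders:" + orderId;
    }

    /** Serializes this order into the bytes that would be stored as the cell value. */
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds an order from a cell value produced by toBytes(). */
    static Order fromBytes(byte[] bytes) {
        try {
            Order order = new Order();
            order.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return order;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, the row for a user would hold one cell per order: the column is Order.columnName(orderId) and the value is order.toBytes(). Reading all of a user's orders is then a single row fetch restricted to the 'orders' family, followed by Order.fromBytes on each cell.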

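P.P.S. For the composite row key variant discussed in the quoted thread (user id first, then the date, then the order id), here is a small sketch of why that layout groups and sorts a user's orders. It is not HBase code: a java.util.TreeMap stands in for the table, on the assumption that rows are kept sorted by key and that keys are plain ASCII. The OrderRowKeys class, the "__" separator, the yyyyMMdd date format and the zero-padded order id are all choices made for this example, and it assumes user ids never contain the separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical helper for USER_ID__DATE__ORDER_ID row keys. */
class OrderRowKeys {
    /**
     * Builds the row key. The date is a fixed-width yyyyMMdd string and the
     * order id is zero-padded, so lexicographic order matches date order.
     */
    static String rowKey(String userId, String yyyymmdd, long orderId) {
        return userId + "__" + yyyymmdd + "__" + String.format("%010d", orderId);
    }

    /**
     * Returns the row keys of one user's orders, oldest first, using a
     * prefix range scan over the sorted table.
     */
    static List<String> scanUser(SortedMap<String, String> table, String userId) {
        String start = userId + "__";
        // '`' is the ASCII character right after '_', so this is the first
        // key that can no longer begin with the prefix userId + "__".
        String stop = userId + "_`";
        return new ArrayList<>(table.subMap(start, stop).keySet());
    }
}
```

For example, after inserting rows for users "u1", "u10" and "u2" into a TreeMap, scanUser(table, "u1") returns only the "u1" rows, ordered by date. A real scanner would be given the same start row and would stop at the first key that no longer has the user's prefix.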