Re: How to de-nomarlize for this situation in HBASE Table

2013-01-18 Thread Doug Meil

Hi there, 

I'd recommend reading the Schema Design chapter in the RefGuide because
there are some good tips and hard-learned lessons.

http://hbase.apache.org/book.html#schema

Also, all your examples use composite row keys (not a surprise, since it is a
very common pattern), so I would like to draw your attention to this patch for
building composite row keys. Feedback is appreciated, because HBase currently
has no utility support for this.

https://issues.apache.org/jira/browse/HBASE-7221

(Also, WibiData and Sematext have done good work on key-generation
utilities too.)
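
As a rough illustration of what composite row key building involves, here is a
plain-Python sketch. It is not the HBASE-7221 patch and not an HBase API; the
field layout, the code tables, and the function names are all hypothetical. The
idea is to concatenate fixed-width byte fields so that keys sort by their
leading component and can be split back into their parts.

```python
import struct

# Hypothetical layout, assumed for illustration only:
#   8-byte big-endian client number | 1-byte record kind | 1-byte address type
KIND = {"CLIENT": 0, "PHYSICAL": 1, "EMAIL": 2, "TEL": 3}
ADDR_TYPE = {"NONE": 0, "HOME": 1, "OFFICE": 2, "OTHER": 3}
_LAYOUT = struct.Struct(">QBB")  # big-endian, so byte order matches numeric order


def build_row_key(client_number, kind, addr_type="NONE"):
    """Pack the three parts into a fixed-width 10-byte key."""
    return _LAYOUT.pack(client_number, KIND[kind], ADDR_TYPE[addr_type])


def parse_row_key(key):
    """Split a key produced by build_row_key back into its parts."""
    client_number, kind, addr_type = _LAYOUT.unpack(key)
    kind_name = next(k for k, v in KIND.items() if v == kind)
    type_name = next(k for k, v in ADDR_TYPE.items() if v == addr_type)
    return client_number, kind_name, type_name


keys = sorted([
    build_row_key(42, "TEL", "HOME"),
    build_row_key(7, "PHYSICAL", "OFFICE"),
    build_row_key(42, "CLIENT"),
])
# Byte-wise sorting places the row of client 7 before the rows of client 42.
print([parse_row_key(k) for k in keys])
```

Fixed-width fields are one possible choice here; they avoid the need for a
separator byte that might also occur inside a field value.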





Re: How to de-nomarlize for this situation in HBASE Table

2013-01-17 Thread Sonal Goyal
What are your data access patterns?

Best Regards,
Sonal
Real Time Analytics for BigData https://github.com/sonalgoyal/crux
Nube Technologies http://www.nubetech.co

http://in.linkedin.com/in/sonalgoyal




On Fri, Jan 18, 2013 at 9:04 AM, Ramasubramanian Narayanan 
ramasubramanian.naraya...@gmail.com wrote:

 Hi,

 I have the following relational tables and want to denormalize them into a
 single HBase table. Please help me understand how this could be done.


 1. Client Master Table
 2. Physical Address Table (any number of addresses can be captured against
 each client ID)
 3. Email Address Table (any number of addresses can be captured against
 each client ID)
 4. Telephone Address Table (any number of addresses can be captured against
 each client ID)


 Tables 2 to 4 each have multiple fields, such as the address type
 (home, office, etc.), bad address, good address, communication address,
 time to call, etc.

 Please help me clarify the following:

 1. Can we bring all of this into a single HBase table?
 2. Having fields like phone number 1, phone number 2, etc. is not a good
 approach for this scenario...
 3. Can we keep everything in the same table by writing multiple rows for
 the same customer, each with a different rowkey?
 For example:
 For Client Records    - Rowkey can be Client Number + DOB
 For Physical Address  - Rowkey can be Client Number + PHYSICAL + Type of Address
 For Email Address     - Rowkey can be Client Number + EMAIL + Type of Address
 For Telephone Address - Rowkey can be Client Number + TEL + Type of Address

 regards,
 Rams



Re: How to de-nomarlize for this situation in HBASE Table

2013-01-17 Thread Ramasubramanian Narayanan
Hi Sonal,

1. Fetch all demographic details of a customer based on client ID.
2. Fetch a particular type of address, along with the other demographic
details, for a client: for example, the HOME physical address, the HOME
telephone address, or the OFFICE email address.

regards,
Rams




Re: How to de-nomarlize for this situation in HBASE Table

2013-01-17 Thread Sonal Goyal
How about using the client ID as the rowkey, with column families for physical
address, email address, and telephone address? Within each column family you
could have various qualifiers. For example, in physical address you could have
home street, office street, etc.
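
A minimal sketch of that layout, using nested Python dictionaries as a stand-in
for an HBase table. This is a model only, not the HBase client API; the family
names, qualifier names, sample values, and helper functions are hypothetical,
and the `info` family for client master data is an added assumption. There is
one row per client ID, one column family per kind of contact detail, and one
qualifier per value, so both access patterns described in this thread become
reads of a single row.

```python
# table[row_key][column_family][qualifier] -> value (model only, not HBase)
table = {
    "client-1001": {
        "info": {"name": "A. Example"},  # assumed family for client master data
        "physical": {"home_street": "1 Main St", "office_street": "9 Park Ave"},
        "email": {"home": "a@example.org", "office": "a@work.example.org"},
        "telephone": {"home": "555-0100"},
    }
}


def get_row(table, client_id):
    """Access pattern 1: everything stored for one client."""
    return table.get(client_id, {})


def get_by_type(table, client_id, family, addr_type):
    """Access pattern 2: cells of one family whose qualifier has a type prefix."""
    cells = get_row(table, client_id).get(family, {})
    return {q: v for q, v in cells.items()
            if q == addr_type or q.startswith(addr_type + "_")}


print(get_by_type(table, "client-1001", "physical", "home"))
print(get_by_type(table, "client-1001", "email", "office"))
```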

Best Regards,
Sonal
Real Time Analytics for BigData https://github.com/sonalgoyal/crux
Nube Technologies http://www.nubetech.co

http://in.linkedin.com/in/sonalgoyal







Re: How to de-nomarlize for this situation in HBASE Table

2013-01-17 Thread Ramasubramanian Narayanan
Hi Sonal,

In that case, the problem is how to store multiple sets of physical addresses
in the same column family. What rowkey should be used for this scenario?

A physical address contains the following fields (we need to store
multiple physical addresses like this):
Physical address type : Home/office/other/etc
Address line 1:
..
..
Address line 4:
State :
City:
Country:

regards,
Rams





Re: How to de-nomarlize for this situation in HBASE Table

2013-01-17 Thread Sonal Goyal
A rowkey is associated with the complete row, so you could use the client ID
as the rowkey. HBase allows different qualifiers within a column family, so
you could potentially do the following:

1. Have qualifiers such as home address street 1, home address street 2,
home address city, office address street 1, etc. under the physical address
column family.
2. If you access the entire address rather than city and state individually,
you could concatenate the complete address and save it in one qualifier under
the physical address family, using qualifiers like home, office, extra.
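
The two options can be compared with a small sketch. It models the physical
address family of one client as a plain Python dictionary (not the HBase API);
the qualifier naming scheme, the field names, and the `|` separator are
assumptions made for illustration. Because qualifiers are ordinary names
supplied at write time rather than part of the table definition, an additional
address type such as `other` only adds new qualifiers.

```python
FIELDS = ("street1", "street2", "city", "state", "country")
SEP = "|"  # assumed separator; it must not occur inside a field value


def per_field_cells(addr_type, address):
    """Option 1: one qualifier per field, e.g. 'home_city'."""
    return {f"{addr_type}_{name}": address[name] for name in FIELDS}


def concatenated_cell(addr_type, address):
    """Option 2: one qualifier per address type holding the whole address."""
    return {addr_type: SEP.join(address[name] for name in FIELDS)}


def split_concatenated(value):
    """Recover the individual fields from an option-2 value."""
    return dict(zip(FIELDS, value.split(SEP)))


home = {"street1": "1 Main St", "street2": "", "city": "Springfield",
        "state": "IL", "country": "US"}

family = {}
family.update(per_field_cells("home", home))  # option 1 layout
whole = concatenated_cell("home", home)       # option 2 layout
print(family["home_city"])
print(whole["home"])
```

Option 1 allows a single field to be read or updated on its own; option 2 keeps
the cell count low but requires the reader to split the value again.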

A good link to get started is
http://hbase.apache.org/book/datamodel.html#conceptual.view

Best Regards,
Sonal
Real Time Analytics for BigData https://github.com/sonalgoyal/crux
Nube Technologies http://www.nubetech.co

http://in.linkedin.com/in/sonalgoyal






Re: How to de-nomarlize for this situation in HBASE Table

2013-01-17 Thread Ramasubramanian Narayanan
Hi,

Is there any other way besides using HOME/Work/etc.? We expect some 10 such
types may come in the future, hence the question.

regards,
Rams
