RE: short name for columns

2015-01-19 Thread Bulvik, Noam
I opened https://issues.apache.org/jira/browse/PHOENIX-1598
This feature can be used with prefix encoding; there is [no] contradiction between 
these two features.

-Original Message-
From: James Taylor [mailto:jamestay...@apache.org]
Sent: Monday, January 19, 2015 7:00 PM
To: user
Subject: Re: short name for columns

Good idea. Phoenix doesn't do that today. I'm hoping that HBase can come up 
with better block encodings that factor this kind of information out without 
perf taking a hit. They actually have one (TRIE), but I'm not sure how stable 
it is. Also, I'm not sure how well the existing encodings do for this (maybe 
good enough?).

Please file a JIRA. Thanks,

James

On Mon, Jan 19, 2015 at 7:41 AM, Anil Gupta anilgupt...@gmail.com wrote:
 You mean support for column aliases?
 If yes, then +1 for that.

 Sent from my iPhone

 On Jan 19, 2015, at 3:49 AM, Bulvik, Noam noam.bul...@teoco.com wrote:

 Hi,



 Do you plan to support assigning short names to columns as a Phoenix
 feature? That is, when a table is created through Phoenix DDL, a
 metadata table would map each column name to a short name (such as
 a, b, c, … aa, bb, …). Each time a query is run, the SQL written by
 the user would be converted to the short names to query the DB, and
 the short names would be converted back to the real names in the
 result set.

 This may save a lot of space, because the name of a column is part of
 each row saved in the files.



 Regards,

 Noam

Information in this e-mail and its attachments is confidential and privileged 
under the TEOCO confidentiality terms that can be reviewed here: 
http://www.teoco.com/email-disclaimer


RE: short name for columns

2015-01-19 Thread Vasudevan, Ramkrishna S
Hi

Currently the encoding feature tries to avoid as much duplication as possible 
in the row keys, family names, and column qualifier names.  Suppose there are 
two cells:

Row1/cf1:qual1/val1
Row1/cf1:qual2/val2 

Then we try to find the part that is common to both keys.  The first key is 
stored as it is, but for the second key we do not write the common part, from 
'Row1' through 'qual', because the row and CF are the same.  Even within the 
qualifier names, 'qual' is common.

So the more repetitive the key values are, the better the encoding.  So maybe, 
in the Phoenix layer, if we find that column names are long and do not follow a 
repetitive naming structure, we could rename the column qualifiers to make use 
of the above encoding capability.
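
The prefix sharing described above can be illustrated with a small Python 
sketch. This is only a toy model: the function names are made up for this 
example, keys are treated as flat byte strings, and HBase's real encoders work 
on its binary cell format and are more involved.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_encode(keys):
    """Encode each key as (shared_prefix_length, suffix) relative to the previous key."""
    encoded = []
    prev = b""
    for key in keys:
        shared = common_prefix_len(prev, key)
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the original keys from (shared_prefix_length, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


# The two cells from the example above, flattened to row/cf:qualifier.
keys = [b"Row1/cf1:qual1", b"Row1/cf1:qual2"]
encoded = prefix_encode(keys)
print(encoded)  # [(0, b'Row1/cf1:qual1'), (13, b'2')]
```

The second key shrinks to a prefix length plus the single byte '2', which is 
why qualifiers that share long prefixes encode well and long, dissimilar 
qualifiers do not.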

Regards
Ram



Re: short name for columns

2015-01-19 Thread James Taylor
Thanks, Noam. I opened HBASE-12883 as well. I think this kind of pure
storage optimization should be done at the HBase level.

James

On Mon, Jan 19, 2015 at 11:07 PM, Bulvik, Noam noam.bul...@teoco.com wrote:
 I opened https://issues.apache.org/jira/browse/PHOENIX-1598
 This feature can be used with prefix encoding; there is [no] contradiction between 
 these two features.



Re: short name for columns

2015-01-19 Thread Anil Gupta
You mean support for column aliases?
If yes, then +1 for that.

Sent from my iPhone



short name for columns

2015-01-19 Thread Bulvik, Noam
Hi,

Do you plan to support assigning short names to columns as a Phoenix feature? 
That is, when a table is created through Phoenix DDL, a metadata table would 
map each column name to a short name (such as a, b, c, ... aa, bb, ...). Each 
time a query is run, the SQL written by the user would be converted to the 
short names to query the DB, and the short names would be converted back to 
the real names in the result set.

This may save a lot of space, because the name of a column is part of each row 
saved in the files.
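
As a rough illustration of the proposal, here is a minimal Python sketch. It 
is not Phoenix code: ColumnNameMap, short_name, and the column and table names 
are hypothetical, the in-memory dictionaries stand in for the metadata table, 
and the naming sequence (a ... z, aa, ab, ...) is one assumed scheme rather 
than the exact "aa, bb" pattern mentioned above. The regex-based rewrite is 
deliberately naive; a real implementation would rewrite the parsed query.

```python
import re


def short_name(index: int) -> str:
    """Return the index-th short name: 0 -> 'a', 25 -> 'z', 26 -> 'aa', 27 -> 'ab'."""
    name = ""
    index += 1  # bijective base-26 is easiest with a 1-based index
    while index > 0:
        index, rem = divmod(index - 1, 26)
        name = chr(ord("a") + rem) + name
    return name


class ColumnNameMap:
    """Hypothetical stand-in for the metadata table: real name <-> short name."""

    def __init__(self, columns):
        self.to_short = {real: short_name(i) for i, real in enumerate(columns)}
        self.to_real = {short: real for real, short in self.to_short.items()}
        # Longest names first so that a name is never matched as part of another.
        ordered = sorted(self.to_short, key=len, reverse=True)
        self._pattern = re.compile(
            r"\b(" + "|".join(re.escape(c) for c in ordered) + r")\b"
        )

    def rewrite_query(self, sql: str) -> str:
        """Replace whole-word real column names with their short names."""
        return self._pattern.sub(lambda m: self.to_short[m.group(1)], sql)

    def restore_row(self, row: dict) -> dict:
        """Map the short names in a result row back to the real names."""
        return {self.to_real[short]: value for short, value in row.items()}


names = ColumnNameMap(
    ["customer_identifier", "transaction_timestamp", "transaction_amount"]
)
sql = (
    "SELECT customer_identifier, transaction_amount "
    "FROM sales WHERE transaction_amount > 100"
)
print(names.rewrite_query(sql))
# SELECT a, c FROM sales WHERE c > 100
print(names.restore_row({"a": 42, "c": 250}))
# {'customer_identifier': 42, 'transaction_amount': 250}
```

In this toy example each stored qualifier would shrink from about 20 bytes to 
a single byte, which is where the space saving would come from.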

Regards,
Noam


Re: short name for columns

2015-01-19 Thread James Taylor
Good idea. Phoenix doesn't do that today. I'm hoping that HBase can
come up with better block encodings that factor this kind of
information out without perf taking a hit. They actually have one
(TRIE), but I'm not sure how stable it is. Also, I'm not sure how well
the existing encodings do for this (maybe good enough?).

Please file a JIRA. Thanks,

James
