RE: short name for columns
I opened https://issues.apache.org/jira/browse/PHOENIX-1598.

This feature can be used together with prefix encoding; there is no contradiction between these two features.

-----Original Message-----
From: James Taylor [mailto:jamestay...@apache.org]
Sent: Monday, January 19, 2015 7:00 PM
To: user
Subject: Re: short name for columns

> Good idea. Phoenix doesn't do that today. [...] Please file a JIRA.
RE: short name for columns
Hi,

Currently the encoding feature tries to avoid duplicates as much as possible in the row keys, family names, and column qualifier names. Suppose there are two cells:

    Row1/cf1:qual1/val1
    Row1/cf1:qual2/val2

We try to find the part that both keys have in common. The first key is stored as it is, but for the second key we do not write the common part, 'Row1' through 'qual': the row and CF are the same, and even within the qualifier name 'qual' is common. So the more repetitive the parts of the keys are, the better the encoding gets.

So maybe in the Phoenix layer, if we find that column names are long and follow a non-repetitive naming structure, we could rename the column qualifiers to make use of this encoding capability.

Regards,
Ram

-----Original Message-----
From: James Taylor [mailto:jamestay...@apache.org]
Sent: Monday, January 19, 2015 10:30 PM
To: user
Subject: Re: short name for columns

> Good idea. Phoenix doesn't do that today. [...] Please file a JIRA.
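The prefix-style encoding Ram describes in this message can be illustrated with a small sketch. It is a deliberate simplification and not HBase's actual implementation: each key is treated as one plain string (HBase works on bytes and handles row, family, and qualifier as separate fields), and the function names are hypothetical. Each key is stored as the number of leading characters it shares with the previous key plus the remaining suffix.

```python
def shared_prefix_length(a, b):
    """Return how many leading characters a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_encode(keys):
    """Encode each key as (shared_length, suffix) relative to the previous key."""
    encoded = []
    previous = ""
    for key in keys:
        shared = shared_prefix_length(previous, key)
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the original keys from (shared_length, suffix) pairs."""
    keys = []
    previous = ""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


# The two cell keys from the message, without their values.
keys = ["Row1/cf1:qual1", "Row1/cf1:qual2"]
encoded = prefix_encode(keys)
print(encoded)  # [(0, 'Row1/cf1:qual1'), (13, '2')]
```

The second key shrinks to a single stored character because 'Row1/cf1:qual' is shared. Long qualifier names that have nothing in common with their neighbours would share only the row and family part, which is the case where renaming qualifiers could help.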
Re: short name for columns
Thanks, Noam. I opened HBASE-12883 as well. I think this kind of pure storage optimization should be done at the HBase level.

James

On Mon, Jan 19, 2015 at 11:07 PM, Bulvik, Noam noam.bul...@teoco.com wrote:
> I opened https://issues.apache.org/jira/browse/PHOENIX-1598
> This feature can be used together with prefix encoding; there is no contradiction between these two features.
Re: short name for columns
You mean to have support for aliases for columns? If yes, then +1 for that.

On Jan 19, 2015, at 3:49 AM, Bulvik, Noam noam.bul...@teoco.com wrote:
> Do you plan to support assigning short names to columns as a Phoenix feature? [...]
short name for columns
Hi,

Do you plan to support assigning short names to columns as a Phoenix feature? That is, when a table is created using Phoenix DDL, a metadata table would map each column name to a short name (like a, b, c ... aa, bb). For each query, the SQL that the user writes would be converted to the short names to query the database, and the short names would be converted back to the real names in the result set.

This may save a lot of space, because the name of a column is part of each row saved in the files.

Regards,
Noam
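To make the proposal concrete, here is a minimal sketch of the mapping idea. It is not Phoenix code: the class and function names are hypothetical, the a, b, ..., z, aa, ab, ... naming scheme is an assumption (the message only gives "a, b, c ... aa, bb" as an illustration), and rows are modelled as plain dictionaries instead of SQL statements and HBase cells.

```python
from itertools import count, product
from string import ascii_lowercase


def short_names():
    """Yield a, b, ..., z, aa, ab, ... (an assumed naming scheme)."""
    for length in count(1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


class ColumnAliasMap:
    """Hypothetical stand-in for the metadata table in the proposal."""

    def __init__(self, column_names):
        # Assign short names in declaration order; zip stops when the
        # column list is exhausted, so the infinite generator is safe.
        self.to_short = dict(zip(column_names, short_names()))
        self.to_long = {short: name for name, short in self.to_short.items()}

    def encode_row(self, row):
        """Translate real column names to short names before storing."""
        return {self.to_short[name]: value for name, value in row.items()}

    def decode_row(self, stored):
        """Translate short names back to real names for the result set."""
        return {self.to_long[short]: value for short, value in stored.items()}


aliases = ColumnAliasMap(["customer_name", "order_total"])
stored = aliases.encode_row({"customer_name": "Ann", "order_total": 42})
print(stored)                      # {'a': 'Ann', 'b': 42}
print(aliases.decode_row(stored))  # {'customer_name': 'Ann', 'order_total': 42}
```

The user only ever sees the real names; the short names exist only in what is stored, which is where the space saving described in the message would come from.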
Re: short name for columns
Good idea. Phoenix doesn't do that today. I'm hoping that HBase can come up with better block encodings that factor this kind of information out without perf taking a hit. They actually have one (TRIE), but I'm not sure how stable it is. Also, I'm not sure how well the existing encodings do for this (maybe good enough?). Please file a JIRA.

Thanks,
James

On Mon, Jan 19, 2015 at 7:41 AM, Anil Gupta anilgupt...@gmail.com wrote:
> You mean to have a support for aliases for columns? If yes, then +1 for that.