[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)
    [ https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019540#comment-16019540 ]

Manaswini commented on HBASE-13788:
-----------------------------------

@stack - Any update on this?

> Shell commands do not support column qualifiers containing colon (:)
> ---------------------------------------------------------------------
>
>                 Key: HBASE-13788
>                 URL: https://issues.apache.org/jira/browse/HBASE-13788
>             Project: HBase
>          Issue Type: Bug
>          Components: shell
>    Affects Versions: 0.98.0, 0.96.0, 1.0.0, 1.1.0
>            Reporter: Dave Latham
>            Assignee: Manaswini
>         Attachments: Hbase-13788-testcases.docx, hbase-13788-v1.patch
>
>
> The shell interprets the colon within the qualifier as a delimiter to a
> FORMATTER instead of part of the qualifier itself.
> Example from the mailing list:
> Hmph, I may have spoken too soon. I know I tested this at one point and
> it worked, but now I'm getting different results:
> On the new cluster, I created a duplicate test table:
> hbase(main):043:0> create 'content3', {NAME => 'x', BLOOMFILTER =>
> 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION =>
> 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', BLOCKSIZE => '65536',
> IN_MEMORY => 'false', BLOCKCACHE => 'true'}
> Then I pull some data from the imported table:
> hbase(main):045:0> scan 'content', {LIMIT=>1,
> STARTROW=>'A:9223370612089311807:twtr:57013379'}
> ROW COLUMN+CELL
>
> A:9223370612089311807:twtr:570133798827921408
> column=x:twitter:username, timestamp=1424775595345, value=BERITA &
> INFORMASI!
> Then put it:
> hbase(main):046:0> put
> 'content3','A:9223370612089311807:twtr:570133798827921408',
> 'x:twitter:username', 'BERITA & INFORMASI!'
> But then when I query it, I see that I've lost the column qualifier
> ":username":
> hbase(main):046:0> scan 'content3'
> ROW COLUMN+CELL
> A:9223370612089311807:twtr:570133798827921408 column=x:twitter,
> timestamp=1432745301788, value=BERITA & INFORMASI!
> Even though I'm missing one of the qualifiers, I can at least filter on
> columns in this sample table.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
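
To illustrate the behaviour reported above, here is a minimal plain-Ruby sketch of a parser that treats whatever follows the second colon as a converter name. The helper name legacy_parse_column is hypothetical and this is not the HBase shell's actual code; it only assumes the family:qualifier[:CONVERTER] convention described in the issue, and shows why 'x:twitter:username' ends up with the qualifier 'twitter'.

```ruby
# Hypothetical illustration only; not the HBase shell's real parser.
# Assumes the legacy convention family:qualifier[:CONVERTER].
def legacy_parse_column(spec)
  # Split off the column family at the first colon.
  family, rest = spec.split(':', 2)
  # The next colon is read as a converter delimiter, so a qualifier
  # that itself contains a colon gets cut in two.
  qualifier, converter = rest.to_s.split(':', 2)
  { family: family, qualifier: qualifier, converter: converter }
end

parsed = legacy_parse_column('x:twitter:username')
puts "family=#{parsed[:family]} qualifier=#{parsed[:qualifier]} converter=#{parsed[:converter]}"
# prints: family=x qualifier=twitter converter=username
```

Under this assumed convention there is no way to tell a colon inside a qualifier from the converter delimiter, which is why the discussion in this thread moves the formatting directive into a separate argument.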
[jira] [Comment Edited] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)
    [ https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937536#comment-15937536 ]

Manaswini edited comment on HBASE-13788 at 3/23/17 1:49 AM:
------------------------------------------------------------

As per @Stack's suggestion, I've added an ordinal option, i.e. FORMATTER just lists one conversion per column mentioned in COLUMN, e.g. FORMATTER => ['toInt'].

Now the custom formatting can be specified in two ways:
1. By column: specify a converter for each column, keyed by its column qualifier.
2. By position: omit the column qualifier, in which case the column qualifier is derived from the COLUMNS specification and the converters are applied in the order the columns appear there.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:

hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'], FORMATTER => {'cf:qualifier1' => 'toInt', 'cf:qualifier2' => 'c(org.apache.hadoop.hbase.util.Bytes).toInt'}}

or

hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'], FORMATTER => ['toInt', 'c(org.apache.hadoop.hbase.util.Bytes).toInt']}

stack - I've attached the patch and the test cases I tested it with. Could you review and let me know if any improvements are needed?

Thanks!
Mansi


was (Author: mmaharana):
As per @Stack's suggestion, I've added an ordinal option, i.e. FORMATTER just lists one conversion per column mentioned in COLUMN, e.g. FORMATTER => ['toInt'].

Now the custom formatting can be specified in two ways:
1. By column: specify a converter for each column, keyed by its column qualifier.
2. By position: omit the column qualifier, in which case the column qualifier is derived from the COLUMNS specification and the converters are applied in the order the columns appear there.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:

hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'], FORMATTER => {'cf:qualifier1' => 'toInt', 'cf:qualifier2' => 'c(org.apache.hadoop.hbase.util.Bytes).toInt'}}

or

hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'], FORMATTER => ['toInt', 'c(org.apache.hadoop.hbase.util.Bytes).toInt']}

stack - I've attached the patch and the test cases I tested it with. Could you review and let me know if any improvements are needed?

Thanks!
Mansi

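
The two FORMATTER forms described in this comment can be reduced to a single lookup table keyed by column name. The following is a rough plain-Ruby sketch of that idea; resolve_formatters is a hypothetical name, and the error handling is an assumption rather than anything taken from the attached patch.

```ruby
# Hypothetical sketch; names and error handling are assumptions,
# not the code from hbase-13788-v1.patch.
def resolve_formatters(columns, formatter)
  case formatter
  when nil
    {} # no FORMATTER given: nothing to convert
  when Hash
    formatter.dup # by-column form: already keyed by 'family:qualifier'
  when Array
    # Ordinal form: the n-th converter applies to the n-th column.
    if formatter.length > columns.length
      raise ArgumentError, 'more formatters than columns'
    end
    columns.first(formatter.length).zip(formatter).to_h
  else
    raise ArgumentError, "unsupported FORMATTER: #{formatter.inspect}"
  end
end

columns = ['cf:qualifier1', 'cf:qualifier2']
by_name = resolve_formatters(columns, { 'cf:qualifier1' => 'toInt' })
by_position = resolve_formatters(
  columns, ['toInt', 'c(org.apache.hadoop.hbase.util.Bytes).toInt']
)
puts by_name.inspect
puts by_position.inspect
```

Because the keys are complete column names, a qualifier containing a colon needs no special treatment in either form.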
[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)
    [ https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937536#comment-15937536 ]

Manaswini commented on HBASE-13788:
-----------------------------------

As per @Stack's suggestion, I've added an ordinal option, i.e. FORMATTER just lists one conversion per column mentioned in COLUMN, e.g. FORMATTER => ['toInt'].

Now the custom formatting can be specified in two ways:
1. By column: specify a converter for each column, keyed by its column qualifier.
2. By position: omit the column qualifier, in which case the column qualifier is derived from the COLUMNS specification and the converters are applied in the order the columns appear there.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:

hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'], FORMATTER => {'cf:qualifier1' => 'toInt', 'cf:qualifier2' => 'c(org.apache.hadoop.hbase.util.Bytes).toInt'}}

or

hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'], FORMATTER => ['toInt', 'c(org.apache.hadoop.hbase.util.Bytes).toInt']}

stack - I've attached the patch and the test cases I tested it with. Could you review and let me know if any improvements are needed?

Thanks!
Mansi

[jira] [Updated] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)
    [ https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manaswini updated HBASE-13788:
------------------------------
    Attachment: hbase-13788-v1.patch
                Hbase-13788-testcases.docx

[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)
    [ https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851734#comment-15851734 ]

Manaswini commented on HBASE-13788:
-----------------------------------

Thanks for the review @Stack.

Yes, you are right: "x:twitter:username" is not there in my dataset, especially if I use put to get the data in.

With existing code:

hbase(main):009:0> create 'content3', {NAME => 'x', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
0 row(s) in 1.4300 seconds

=> Hbase::Table - content3
hbase(main):010:0> put 'content3','A:9223370612089311807:twtr:570133798827921409','x:twitter:username', 'BERITA & INFORMASI!'
0 row(s) in 0.1350 seconds

hbase(main):011:0> scan 'content3'
ROW                                COLUMN+CELL
 A:9223370612089311807:twtr:57013  column=x:twitter, timestamp=1486141128707, value=BERITA & INFORMASI!
 3798827921409
1 row(s) in 0.0150 seconds

Ordinal is a great idea, but what if the user has a hundred columns in COLUMNS and only wants to format a few? I think having both options will be useful. I'll need to dig further into the code to see if that's doable with minimal changes/impact.

[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)
    [ https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828585#comment-15828585 ]

Manaswini commented on HBASE-13788:
-----------------------------------

Thank you [~busbey] & stack.

I've played with the code and implemented the idea of passing the formatting directive as a separate argument instead of appending it to the column qualifier. Below are a few examples for the existing and the changed code:

hbase(main):003:0> scan 'content3'
ROW                                            COLUMN+CELL
 A:9223370612089311807:twtr:57013379882792140  column=x:twitter:username, timestamp=1484764832937, value=BERITA & INFORMASI!
 9
1 row(s)

With existing code (cf:qualifier:[CONVERTER]):

hbase(main):003:0> get 'content3','A:9223370612089311807:twtr:570133798827921409',{COLUMN => 'x:twitter:toInt'}
COLUMN                                         CELL
 x:twitter                                     timestamp=1484755000607, value=839305
1 row(s) in 0.0140 seconds

hbase(main):003:0> scan 'content3' ,{COLUMN => 'x:twitter:toInt'}
ROW                                            COLUMN+CELL
 A:9223370612089311807:twtr:57013379           column=x:twitter, timestamp=1484755000607, value=839305
 8827921409
1 row(s) in 0.4960 seconds

hbase(main):003:0> scan 'content3' ,{COLUMN => 'x:twitter:username'}
ROW                                            COLUMN+CELL

ERROR: undefined method `username' for class `#'

With changed code (separate FORMATTER tag):

hbase(main):017:0> get 'content3','A:9223370612089311807:twtr:570133798827921409',{COLUMN => ['x:twitter:username'] , FORMATTER => {'x:twitter:username'=>'toInt'}}
COLUMN                                         CELL
 x:twitter:username                            timestamp=1484754351711, value=839305
1 row(s)

hbase(main):017:0> scan 'content3',{COLUMN => ['x:twitter:username'] , FORMATTER => {'x:twitter:username'=>'toInt'}}
ROW                                            COLUMN+CELL
 A:9223370612089311807:twtr:57013379882792140  column=x:twitter:username, timestamp=1484764832937, value=839305
 9
1 row(s)

hbase(main):017:0> scan 'content3',{COLUMN => ['x:twitter:username'] }
ROW                                            COLUMN+CELL
 A:9223370612089311807:twtr:57013379882792140  column=x:twitter:username, timestamp=1484764832937, value=BERITA & INFORMASI!
 9

Does this look good? Once I get a thumbs-up from stack, I shall build the patch, add test cases for get and scan along with proper comments, and send it over for code review.

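
Converter names in this thread appear in two spellings, a bare method name such as 'toInt' and a class-qualified form such as 'c(org.apache.hadoop.hbase.util.Bytes).toInt'. As a minimal sketch of how such a spec could be split into a class name and a method name, consider the plain-Ruby helper below. The name parse_converter_spec is hypothetical, and defaulting a bare name to org.apache.hadoop.hbase.util.Bytes is an assumption made for illustration; this is not the shell's actual implementation.

```ruby
# Assumption for illustration: bare converter names resolve against Bytes.
DEFAULT_CONVERTER_CLASS = 'org.apache.hadoop.hbase.util.Bytes'

# Hypothetical helper: split a converter spec into [class_name, method_name].
def parse_converter_spec(spec, default_class = DEFAULT_CONVERTER_CLASS)
  # Class-qualified form: c(<class name>).<method name>
  match = /\Ac\((.+)\)\.(.+)\z/.match(spec)
  match ? [match[1], match[2]] : [default_class, spec]
end

puts parse_converter_spec('toInt').join('#')
puts parse_converter_spec('c(org.apache.hadoop.hbase.util.Bytes).toInt').join('#')
# both lines print: org.apache.hadoop.hbase.util.Bytes#toInt
```

Since the spec is passed as its own value under FORMATTER, it never has to be separated from a column name, so colons in the qualifier do not interfere with it.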
[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)
    [ https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15815347#comment-15815347 ]

Manaswini commented on HBASE-13788:
-----------------------------------

I'd like to work on it.

> Shell commands do not support column qualifiers containing colon (:)
> ---------------------------------------------------------------------
>
>                 Key: HBASE-13788
>                 URL: https://issues.apache.org/jira/browse/HBASE-13788
>             Project: HBase
>          Issue Type: Bug
>          Components: shell
>    Affects Versions: 0.98.0, 0.96.0, 1.0.0, 1.1.0
>            Reporter: Dave Latham
>            Assignee: Pankaj Kumar
>
> The shell interprets the colon within the qualifier as a delimiter to a
> FORMATTER instead of part of the qualifier itself.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)