[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)

2017-05-22 Thread Manaswini (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019540#comment-16019540
 ] 

Manaswini commented on HBASE-13788:
---

@stack - Any update on this?

> Shell commands do not support column qualifiers containing colon (:)
> 
>
> Key: HBASE-13788
> URL: https://issues.apache.org/jira/browse/HBASE-13788
> Project: HBase
> Issue Type: Bug
> Components: shell
> Affects Versions: 0.98.0, 0.96.0, 1.0.0, 1.1.0
> Reporter: Dave Latham
> Assignee: Manaswini
> Attachments: Hbase-13788-testcases.docx, hbase-13788-v1.patch
>
>
> The shell interprets the colon within the qualifier as a delimiter to a 
> FORMATTER instead of part of the qualifier itself.
> Example from the mailing list:
> Hmph, I may have spoken too soon. I know I tested this at one point and
> it worked, but now I'm getting different results:
> On the new cluster, I created a duplicate test table:
> hbase(main):043:0> create 'content3', {NAME => 'x', BLOOMFILTER =>
> 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION =>
> 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', BLOCKSIZE => '65536',
> IN_MEMORY => 'false', BLOCKCACHE => 'true'}
> Then I pull some data from the imported table:
> hbase(main):045:0> scan 'content', {LIMIT=>1,
> STARTROW=>'A:9223370612089311807:twtr:57013379'}
> ROW  COLUMN+CELL
> 
> A:9223370612089311807:twtr:570133798827921408
> column=x:twitter:username, timestamp=1424775595345, value=BERITA &
> INFORMASI!
> Then put it:
> hbase(main):046:0> put
> 'content3','A:9223370612089311807:twtr:570133798827921408',
> 'x:twitter:username', 'BERITA & INFORMASI!'
> But then when I query it, I see that I've lost the column qualifier
> ":username":
> hbase(main):046:0> scan 'content3'
> ROW  COLUMN+CELL
>  A:9223370612089311807:twtr:570133798827921408 column=x:twitter,
>  timestamp=1432745301788, value=BERITA & INFORMASI!
> Even though I'm missing one of the qualifiers, I can at least filter on
> columns in this sample table.
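
The behaviour reported above can be reproduced with a minimal sketch. It assumes the shell splits a column spec at the first colon into family and remainder, and then splits the remainder once more to peel off a converter name. `parse_column_legacy` is a hypothetical name used only for illustration; it is not the HBase shell's actual code.

```ruby
# Hypothetical stand-in for the shell's column parsing (an assumption,
# not the real HBase implementation).
def parse_column_legacy(spec)
  # First colon separates the column family from everything else.
  family, rest = spec.split(':', 2)
  return [family, nil, nil] if rest.nil?

  # Second colon is taken as the start of a converter name, so anything
  # after it is no longer part of the qualifier.
  qualifier, converter = rest.split(':', 2)
  [family, qualifier, converter]
end

family, qualifier, converter = parse_column_legacy('x:twitter:username')
puts "family=#{family} qualifier=#{qualifier} converter=#{converter.inspect}"
```

Under that assumption, `username` ends up in the converter slot and the stored qualifier is only `twitter`, which matches the `column=x:twitter` seen in the scan.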



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)

2017-03-22 Thread Manaswini (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937536#comment-15937536
 ] 

Manaswini edited comment on HBASE-13788 at 3/23/17 1:49 AM:


As per @Stack's suggestion, I've added an ordinal option, i.e. FORMATTER simply 
lists one conversion per column named in COLUMNS, e.g. FORMATTER => ['toInt'].

Now the custom formatting can be specified in two ways:

 1. By column: a formatter is specified for each column, keyed by its 
family:qualifier name.
 2. By position: no column names are given; the formatters are applied to the 
columns in the order in which they appear in the COLUMNS specification.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
 
 hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'],
  FORMATTER => {'cf:qualifier1' => 'toInt',
  'cf:qualifier2' => 'c(org.apache.hadoop.hbase.util.Bytes).toInt'}}
 
  or
 
 hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'],
  FORMATTER => ['toInt', 'c(org.apache.hadoop.hbase.util.Bytes).toInt']}
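
A minimal sketch of how the two forms could be reduced to one per-column mapping. The helper name `normalize_formatters` and its error handling are assumptions made for illustration; the attached patch may do this differently.

```ruby
# Hypothetical helper (not taken from the patch): turn either FORMATTER form
# into a Hash of "family:qualifier" => converter name.
def normalize_formatters(columns, formatter)
  case formatter
  when nil
    {} # no custom formatting requested
  when Hash
    formatter.dup # already keyed by column name
  when Array
    if formatter.length > columns.length
      raise ArgumentError, 'more formatters than columns'
    end
    # Pair the i-th formatter with the i-th column in COLUMNS.
    Hash[columns.first(formatter.length).zip(formatter)]
  else
    raise ArgumentError, 'FORMATTER must be a Hash or an Array'
  end
end

columns = ['cf:qualifier1', 'cf:qualifier2']
by_name = normalize_formatters(columns, 'cf:qualifier1' => 'toInt',
                                        'cf:qualifier2' => 'toInt')
by_position = normalize_formatters(columns, ['toInt', 'toInt'])
puts by_name == by_position
```

The by-name form stays useful when only a few of many columns need formatting, since the positional form has to list a converter for every column up to the last one it formats.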

stack - I've attached the patch and the test cases I tested it against. Could 
you review and let me know if any improvements are needed? 


Thanks!
Mansi 












[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)

2017-03-22 Thread Manaswini (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937536#comment-15937536
 ] 

Manaswini commented on HBASE-13788:
---

As per @Stack's suggestion, I've added an ordinal option, i.e. FORMATTER simply 
lists one conversion per column named in COLUMNS, e.g. FORMATTER => ['toInt'].

Now the custom formatting can be specified in two ways:

 1. By column: a formatter is specified for each column, keyed by its 
family:qualifier name.
 2. By position: no column names are given; the formatters are applied to the 
columns in the order in which they appear in the COLUMNS specification.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
 
 hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'],
  FORMATTER => {'cf:qualifier1' => 'toInt',
  'cf:qualifier2' => 'c(org.apache.hadoop.hbase.util.Bytes).toInt'}}
 
  or
 
 hbase> scan 't1', {COLUMN => ['cf:qualifier1','cf:qualifier2'],
  FORMATTER => ['toInt', 'c(org.apache.hadoop.hbase.util.Bytes).toInt']}

stack - I've attached the patch and the test cases I tested it against. Could 
you review and let me know if any improvements are needed? 


Thanks!
Mansi 








[jira] [Updated] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)

2017-03-22 Thread Manaswini (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manaswini updated HBASE-13788:
--
Attachment: hbase-13788-v1.patch
Hbase-13788-testcases.docx






[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)

2017-02-03 Thread Manaswini (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851734#comment-15851734
 ] 

Manaswini commented on HBASE-13788:
---

Thanks for the review @Stack. 

Yes, you are right: "x:twitter:username" does not end up in my dataset, 
particularly when I use put to load the data.

With the existing code:
hbase(main):009:0> create 'content3', {NAME => 'x', BLOOMFILTER => 'NONE', 
REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS 
=> '0', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', 
BLOCKCACHE => 'true'}
0 row(s) in 1.4300 seconds

=> Hbase::Table - content3
hbase(main):010:0> put 'content3','A:9223370612089311807:twtr:570133798827921409',
'x:twitter:username', 'BERITA & INFORMASI!'
0 row(s) in 0.1350 seconds

hbase(main):011:0> scan 'content3'
ROW                                             COLUMN+CELL
 A:9223370612089311807:twtr:570133798827921409  column=x:twitter, timestamp=1486141128707, value=BERITA & INFORMASI!
1 row(s) in 0.0150 seconds
  
Ordinal is a great idea, but what if the user has a hundred columns in COLUMNS 
and only wants to format a few? I think having both options will be useful. 
I'll need to dig further into the code to see if that's doable with minimal 
changes/impact.







[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)

2017-01-18 Thread Manaswini (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828585#comment-15828585
 ] 

Manaswini commented on HBASE-13788:
---

Thank you [~busbey] & stack

I've played with the code and implemented the idea of passing the formatting 
directive as a separate argument instead of appending it to the column 
qualifier. Below are a few examples with the existing and the changed code:

hbase(main):003:0> scan 'content3'
ROW                                             COLUMN+CELL
 A:9223370612089311807:twtr:570133798827921409  column=x:twitter:username, timestamp=1484764832937, value=BERITA & INFORMASI!
1 row(s)

With existing code (cf:qualifier:[CONVERTER])
hbase(main):003:0> get 'content3','A:9223370612089311807:twtr:570133798827921409',
{COLUMN => 'x:twitter:toInt'}
COLUMN      CELL
 x:twitter  timestamp=1484755000607, value=839305
1 row(s) in 0.0140 seconds

hbase(main):003:0> scan 'content3', {COLUMN => 'x:twitter:toInt'}
ROW                                             COLUMN+CELL
 A:9223370612089311807:twtr:570133798827921409  column=x:twitter, timestamp=1484755000607, value=839305
1 row(s) in 0.4960 seconds

hbase(main):003:0> scan 'content3', {COLUMN => 'x:twitter:username'}
ROW                                             COLUMN+CELL

ERROR: undefined method `username' for class `#'
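
A rough sketch of why the last scan fails, assuming the trailing segment of the column spec is looked up as a method on a converter class. `FakeBytes` is a made-up stand-in for org.apache.hadoop.hbase.util.Bytes that models only `toInt`, and `apply_converter` is a hypothetical name.

```ruby
# Made-up stand-in for the Bytes utility class (assumption: only toInt is
# modelled, as a big-endian signed 32-bit read of the first four bytes).
module FakeBytes
  def self.toInt(bytes)
    bytes.unpack('l>').first
  end
end

# Hypothetical converter dispatch: the name is resolved as a method, so a
# qualifier fragment such as "username" raises NameError.
def apply_converter(name, bytes)
  FakeBytes.method(name).call(bytes)
end

puts apply_converter('toInt', [42].pack('l>'))

begin
  apply_converter('username', 'BERITA & INFORMASI!')
rescue NameError => e
  puts "ERROR: #{e.message}"
end
```

The exact wording of the error depends on the Ruby version, but the failure mode is the same: `username` is treated as a converter, not as part of the qualifier.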

-

With changed code (separate FORMATTER tag)
hbase(main):017:0> get 'content3','A:9223370612089311807:twtr:570133798827921409',
{COLUMN => ['x:twitter:username'], FORMATTER => {'x:twitter:username'=>'toInt'}}
COLUMN               CELL
 x:twitter:username  timestamp=1484754351711, value=839305
1 row(s)

hbase(main):017:0> scan 'content3', {COLUMN => ['x:twitter:username'],
FORMATTER => {'x:twitter:username'=>'toInt'}}
ROW                                             COLUMN+CELL
 A:9223370612089311807:twtr:570133798827921409  column=x:twitter:username, timestamp=1484764832937, value=839305
1 row(s)

hbase(main):017:0> scan 'content3', {COLUMN => ['x:twitter:username']}
ROW                                             COLUMN+CELL
 A:9223370612089311807:twtr:570133798827921409  column=x:twitter:username, timestamp=1484764832937, value=BERITA & INFORMASI!
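
The changed behaviour shown above can be pictured with a small sketch. It assumes the column spec is split only at the first colon and the formatter is looked up by the full column name; `split_column` and `formatter_for` are hypothetical names, not code from the patch.

```ruby
# Hypothetical helpers illustrating the separate-FORMATTER idea.

# Split only at the first colon, so later colons stay in the qualifier.
def split_column(spec)
  family, qualifier = spec.split(':', 2)
  [family, qualifier]
end

# Look the formatter up by full column name; nil means "use the default".
def formatter_for(column, formatters)
  formatters[column]
end

family, qualifier = split_column('x:twitter:username')
puts "family=#{family} qualifier=#{qualifier}"

formatters = { 'x:twitter:username' => 'toInt' }
puts formatter_for('x:twitter:username', formatters).inspect
puts formatter_for('x:twitter:other', formatters).inspect
```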



Does this look good? 

Once I get a thumbs-up from stack, I shall build the patch, add test cases for 
get and scan along with proper comments, and send it over for code review.



[jira] [Commented] (HBASE-13788) Shell commands do not support column qualifiers containing colon (:)

2017-01-10 Thread Manaswini (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15815347#comment-15815347
 ] 

Manaswini commented on HBASE-13788:
---

I'd like to work on it. 

> Shell commands do not support column qualifiers containing colon (:)
> 
>
> Key: HBASE-13788
> URL: https://issues.apache.org/jira/browse/HBASE-13788
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 0.98.0, 0.96.0, 1.0.0, 1.1.0
>Reporter: Dave Latham
>Assignee: Pankaj Kumar
>


