[ 
https://issues.apache.org/jira/browse/NIFI-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432069#comment-15432069
 ] 

ASF GitHub Bot commented on NIFI-2620:
--------------------------------------

GitHub user apsaltis opened a pull request:

    https://github.com/apache/nifi/pull/914

    Adding support for Binary Row Keys 

    Adding support for Binary Row Keys for both PutHBaseCell and PutHBaseJSON. 
This also involved changing PutFlowFile and PutColumn to carry byte[] values 
rather than only strings. These changes are per JIRA NIFI-2620.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apsaltis/nifi nifi-2620

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/914.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #914
    
----
commit 34659ab69a4bfa6ca0e21f10d158c5d07f44cf26
Author: Andrew Psaltis <[email protected]>
Date:   2016-08-23T02:34:45Z

    Adding support for Binary Row Keys for both PutHBaseCell and PutHBaseJSON. 
This also involved making changes to PutFlowFile and PutColumn to carry around 
byte[] and not all strings. These changes are per JIRA NIFI-2620

----


> Add ability to write Row Identifier as binary in hbase using the PutHbaseCell
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-2620
>                 URL: https://issues.apache.org/jira/browse/NIFI-2620
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 1.0.0, 0.7.0, 0.6.1
>            Reporter: Andrew Psaltis
>            Assignee: Andrew Psaltis
>
> Today the PutHBaseCell processor makes the assumption that all row keys are 
> text. However, this does not work when the row key in the HBase table is 
> binary. 
> If the row key is specified in the binary string format, such as:
> \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x 
> 00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 
> \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x 
> 00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 
> \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x 
> 00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 
> \x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x 
> 00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 
> \x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x 
> 00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 
> \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x 
> 00\x00\x00\x00\x00\x00\x01\x00\x00\x01\x00 
> \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x 
> 00\x00\x01\x00\x00\x01\x00\x00\x00\x00\x01 
> \x01\x01\x00\x01\x00\x01\x01\x01\x00\x00\x 
> 00\x00\x00\x00\x01\x01\x01\x01\x00\x00\x00 
> \x00\x00\x00\x01\x01\x00\x01\x00\x01\x00\x 
> 00\x01\x01\x01\x01\x00\x00\x01\x01\x01\x00 
> \x01\x00\x00
> which is the textual representation that the HBase CLI would return, NiFi 
> calls getBytes on that string, producing the bytes of the literal characters. 
> HBase then displays each '\' as its hex code, \x5C, so the stored row key 
> looks like:
> \x5Cx00\x5Cx00 ...
> To address this, the proposed solution is to:
> * Add a "toBytesBinary" method to HBaseClientService, similar to the ones 
> already added [1]. 
> * Update PutFlowFile and PutColumn to pass around mostly byte[] instead of 
> the strings they use today.
> For this JIRA, only Text and Binary support will be added for the row key.
> [1] 
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-hbase_1_1_2-client-service-bundle/nifi-hbase_1_1_2-client-service/src/main/java/org/apache/nifi/hbase/HBase_1_1_2_ClientService.java#L427
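
The difference between the two encodings described above can be sketched in
a few lines of Java. This is only an illustration under stated assumptions:
the class name RowKeyDemo and its local toBytesBinary helper are hypothetical,
written here so the example runs without HBase on the classpath. HBase's own
org.apache.hadoop.hbase.util.Bytes.toBytesBinary performs this kind of
parsing; the helper below is a simplified stand-in and is not the code from
the pull request.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of NiFi or HBase.
final class RowKeyDemo {

    // Simplified stand-in for HBase's binary-string parsing: every "\xHH"
    // escape becomes one byte, any other character is copied as a single
    // byte (assumes the input is otherwise plain ASCII).
    static byte[] toBytesBinary(String in) {
        byte[] buf = new byte[in.length()];
        int size = 0;
        for (int i = 0; i < in.length(); i++) {
            char ch = in.charAt(i);
            boolean isEscape = ch == '\\'
                    && i + 3 < in.length()
                    && in.charAt(i + 1) == 'x'
                    && Character.digit(in.charAt(i + 2), 16) >= 0
                    && Character.digit(in.charAt(i + 3), 16) >= 0;
            if (isEscape) {
                buf[size++] = (byte) Integer.parseInt(in.substring(i + 2, i + 4), 16);
                i += 3; // skip the rest of the escape sequence
            } else {
                buf[size++] = (byte) ch;
            }
        }
        return Arrays.copyOf(buf, size);
    }

    public static void main(String[] args) {
        // Java literal for the text \x00\x01\x00
        String rowKey = "\\x00\\x01\\x00";

        // Treating the row key as text keeps the backslashes, 'x' and digits.
        byte[] asText = rowKey.getBytes(StandardCharsets.UTF_8);

        // Treating it as a binary string yields the three intended bytes.
        byte[] asBinary = toBytesBinary(rowKey);

        System.out.println(asText.length);   // prints 12
        System.out.println(asBinary.length); // prints 3
    }
}
```

With the text encoding, the row key written to HBase is twelve bytes long and
does not match the three-byte key in the table, which is the mismatch the
issue describes.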



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
