[jira] [Commented] (NIFI-12825) Implement processor to get row key ranges for HBase regions

ASF subversion and git services (Jira) Wed, 06 Mar 2024 15:58:45 -0800


    [ 
https://issues.apache.org/jira/browse/NIFI-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824219#comment-17824219
 ]


ASF subversion and git services commented on NIFI-12825:
--------------------------------------------------------

Commit bee65b8447303a49a5a244aed027ea387c96a2d8 in nifi's branch 
refs/heads/main from Emilio Setiadarma
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=bee65b8447 ]

NIFI-12825: implemented ListHBaseRegions processor

Signed-off-by: Matt Burgess <[email protected]>

This closes #8439


> Implement processor to get row key ranges for HBase regions
> -----------------------------------------------------------
>
>                 Key: NIFI-12825
>                 URL: https://issues.apache.org/jira/browse/NIFI-12825
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: Emilio Setiadarma
>            Assignee: Emilio Setiadarma
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A common way to parallelize scan operations against HBase is to scan by row 
> key range. HBase splits each table into regions, each of which serves a 
> contiguous range of row keys. These ranges do not overlap, and together they 
> cover every row key in the table.
> Currently, the manual way to parallelize scans by row key range is to open 
> the HBase shell and run the "list_regions" command to obtain the ranges. The 
> main downside of this approach is that row key ranges are not static: a 
> region may split, creating two regions that divide the original row key 
> range in the middle.
> Letting NiFi obtain the row key range of each HBase region would make it 
> easier to build a flow that scans HBase in parallel by row key range. Once 
> the ranges are known, they can easily be fed into a scanning processor (e.g. 
> ScanHBase).
>  
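
The region behaviour described in the issue can be illustrated with a small,
self-contained sketch. It does not use the HBase client or the
ListHBaseRegions processor; regions are modelled as plain (start_key, end_key)
byte tuples, and the helper names (covers_key_space, split_region,
find_region) are hypothetical, introduced here only for illustration. It
assumes HBase's usual convention that the start key is inclusive, the end key
is exclusive, and an empty key marks the open end of the table. The split key
is supplied by the caller; how HBase chooses it is not modelled.

```python
from typing import List, Optional, Tuple

# Hypothetical model: a region is (start_key, end_key).
# start is inclusive, end is exclusive; b"" means "open-ended".
Region = Tuple[bytes, bytes]


def covers_key_space(regions: List[Region]) -> bool:
    """Return True if the regions tile the key space with no gaps or overlaps."""
    if not regions:
        return False
    ordered = sorted(regions, key=lambda r: r[0])
    # The first region must start at the open beginning, the last must be open-ended.
    if ordered[0][0] != b"" or ordered[-1][1] != b"":
        return False
    # Each region must begin exactly where the previous one ends.
    return all(prev[1] == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


def split_region(regions: List[Region], index: int, split_key: bytes) -> List[Region]:
    """Return a new list in which regions[index] is replaced by its two halves."""
    start, end = regions[index]
    inside = start < split_key and (end == b"" or split_key < end)
    if not inside:
        raise ValueError("split key must fall strictly inside the region")
    return regions[:index] + [(start, split_key), (split_key, end)] + regions[index + 1:]


def find_region(regions: List[Region], row_key: bytes) -> Optional[Region]:
    """Return the region whose range contains row_key, or None."""
    for start, end in regions:
        if start <= row_key and (end == b"" or row_key < end):
            return (start, end)
    return None


if __name__ == "__main__":
    regions = [(b"", b"g"), (b"g", b"p"), (b"p", b"")]
    print(covers_key_space(regions))
    # After a split, a range list captured earlier no longer matches the table.
    regions = split_region(regions, 1, b"k")
    for start, end in regions:
        print(start, end)
```

Each tuple in the resulting list corresponds to one independent scan range,
which is why a list captured by hand from the shell goes stale after a split
while a list fetched at flow run time does not.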



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
