davidzollo commented on code in PR #9938:
URL: https://github.com/apache/seatunnel/pull/9938#discussion_r2438417944


##########
seatunnel-connectors-v2/connector-hbase/src/main/java/org/apache/seatunnel/connectors/seatunnel/hbase/source/HbaseSourceSplitEnumerator.java:
##########
@@ -152,23 +172,60 @@ private void assignSplit(int taskId) {
         context.signalNoMoreSplits(taskId);
     }
 
-    /** Get all splits of table */
-    private Set<HbaseSourceSplit> getTableSplits() {
-        List<HbaseSourceSplit> splits = new ArrayList<>();
+    @VisibleForTesting
+    public Set<HbaseSourceSplit> getTableSplits() {
 
         try {
             RegionLocator regionLocator = hbaseClient.getRegionLocator(hbaseParameters.getTable());
             byte[][] startKeys = regionLocator.getStartKeys();
             byte[][] endKeys = regionLocator.getEndKeys();
-            if (startKeys.length != endKeys.length) {
-                throw new IOException(
-                        "Failed to create Splits for HBase table {}. HBase start keys and end keys not equal."
-                                + hbaseParameters.getTable());
-            }
+            List<HbaseSourceSplit> splits = new ArrayList<>();
+            boolean isBinaryRowkey = hbaseParameters.isBinaryRowkey();
+            byte[] userStartRowkey =
+                    HBaseUtil.convertRowKey(hbaseParameters.getStartRowkey(), isBinaryRowkey);
+            byte[] userEndRowkey =
+                    HBaseUtil.convertRowKey(hbaseParameters.getEndRowkey(), isBinaryRowkey);
+            HBaseUtil.validateRowKeyRange(userStartRowkey, userEndRowkey);
 
             int i = 0;
             while (i < startKeys.length) {
-                splits.add(new HbaseSourceSplit(i, startKeys[i], endKeys[i]));
+                byte[] regionStartKey = startKeys[i];
+                byte[] regionEndKey = endKeys[i];
+                if (userEndRowkey.length > 0
+                        && Bytes.compareTo(userEndRowkey, regionStartKey) <= 0

Review Comment:
   I think this should be `Bytes.compareTo(userEndRowkey, regionStartKey) < 0`.
   The range given by `start key` and `end key` should be half-open,
   `[start_key, end_key)`, so `end_key` is not inclusive. When the `end_rowkey`
   configured by the user is exactly equal to the start key of a region, the
   condition `Bytes.compareTo(userEndRowkey, regionStartKey) <= 0` causes that
   region to be skipped.
   What do you think?
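   To make the boundary case concrete, here is a minimal standalone sketch. It
   is not the connector code: the class and method names are hypothetical,
   `Arrays.compareUnsigned` (Java 9+) stands in for HBase's `Bytes.compareTo`,
   and only the part of the condition visible in this diff is modeled. It shows
   how the two comparisons behave when the user's end rowkey equals a region's
   start key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of the SeaTunnel code base.
public class RegionBoundarySketch {

    // Stand-in for Bytes.compareTo: lexicographic comparison of unsigned bytes.
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Condition as written in the diff: skip the region when userEnd <= regionStart.
    static boolean skippedWithLessOrEqual(byte[] userEnd, byte[] regionStart) {
        return userEnd.length > 0 && compare(userEnd, regionStart) <= 0;
    }

    // Suggested condition: skip the region only when userEnd < regionStart.
    static boolean skippedWithStrictLess(byte[] userEnd, byte[] regionStart) {
        return userEnd.length > 0 && compare(userEnd, regionStart) < 0;
    }

    public static void main(String[] args) {
        // Boundary case: the user's end rowkey equals the region's start key.
        byte[] regionStart = "row500".getBytes(StandardCharsets.UTF_8);
        byte[] userEnd = "row500".getBytes(StandardCharsets.UTF_8);

        System.out.println("<= 0 skips region: " + skippedWithLessOrEqual(userEnd, regionStart));
        System.out.println("<  0 skips region: " + skippedWithStrictLess(userEnd, regionStart));
    }
}
```

   The two conditions differ only in this equality case: `<= 0` skips the
   region and `< 0` keeps it. Which one is right depends on whether the
   configured end rowkey is meant to be inclusive or exclusive in the
   resulting scan.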


