[ 
https://issues.apache.org/jira/browse/HBASE-17905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965119#comment-15965119
 ] 

Yi Liang commented on HBASE-17905:
----------------------------------

Hi Ted, thanks for reviewing. The following is my change:
+    if (partition < 0)
+      partition = partition * -1 + -2
+    if (partition < 0)
+      partition = 0

From the Spark doc, I got the info below:
{noformat}
public abstract class Partitioner extends Object
implements scala.Serializable
An object that defines how the elements in a key-value pair RDD are partitioned 
by key. Maps each key to a partition ID, from 0 to numPartitions - 1.
{noformat}
So if the returned partition is less than 0, the <k,v> pair may not be sent to any 
partition.

There are 2 situations in which this can happen:
(1) The table does not exist: partition = -1, and after partition = partition * -1 + -2 
it is still -1. It needs to return 0 instead.
(2) The table exists and the row key is less than the first region's start key, which 
is HConstants.EMPTY_BYTE_ARRAY. But it seems a row key can never be less than 
HConstants.EMPTY_BYTE_ARRAY.
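
To make the arithmetic in the two situations concrete, here is a minimal sketch. It assumes the raw partition value comes from a java.util.Arrays.binarySearch-style lookup over the region start keys (that part is not shown in the change above, so treat it as an assumption). PartitionFix, rawIndex, adjust, and the string keys are made up for illustration; real row keys are byte arrays.

```scala
import java.util.{Arrays, Comparator}

object PartitionFix {
  // Plain string ordering; a stand-in for HBase's row-key byte ordering (assumption).
  private val order: Comparator[String] = new Comparator[String] {
    override def compare(a: String, b: String): Int = a.compareTo(b)
  }

  // Hypothetical helper: raw lookup result. When rowKey is not itself a start key,
  // binarySearch returns -(insertionPoint) - 1, which is negative.
  def rawIndex(startKeys: Array[String], rowKey: String): Int =
    Arrays.binarySearch(startKeys, rowKey, order)

  // The adjustment from the change, written as a pure function.
  def adjust(raw: Int): Int = {
    var partition = raw
    if (partition < 0) partition = partition * -1 + -2 // index of the region containing the key
    if (partition < 0) partition = 0                   // still negative: fall back to partition 0
    partition
  }

  def main(args: Array[String]): Unit = {
    val existing = Array("", "g", "p")  // first region starts at the empty key
    val missing = Array.empty[String]   // no start keys, as when the table does not exist
    println(adjust(rawIndex(existing, "k"))) // raw is -3, adjusted to 1
    println(adjust(rawIndex(missing, "k")))  // raw is -1, first step gives -1 again, clamped to 0
  }
}
```

With an empty first start key, any row key is either found at index 0 or has an insertion point of at least 1, so situation (2) would not produce a negative value in this sketch; only the empty start-key array does.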



> [hbase-spark]  bulkload does not work when table not exist
> ----------------------------------------------------------
>
>                 Key: HBASE-17905
>                 URL: https://issues.apache.org/jira/browse/HBASE-17905
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yi Liang
>            Assignee: Yi Liang
>         Attachments: HBASE-17905-V1.patch
>
>
> When using the HBase-Spark bulkload API, a table name argument is required, and the 
> bulkload runs successfully only if the table exists in HBase. If the table does not 
> exist, the bulkload fails, and it does not even report any 
> errors or throw an exception.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
