[
https://issues.apache.org/jira/browse/HBASE-17905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965119#comment-15965119
]
Yi Liang commented on HBASE-17905:
----------------------------------
Hi Ted, thanks for reviewing. The following is my change:
+ if (partition < 0)
+ partition = partition * -1 + -2
+ if (partition < 0)
+ partition = 0
From the Spark doc, I got the info below:
{noformat}
public abstract class Partitioner extends Object
implements scala.Serializable
An object that defines how the elements in a key-value pair RDD are partitioned
by key. Maps each key to a partition ID, from 0 to numPartitions - 1.
{noformat}
So if the partition returned is less than 0, this <k,v> pair may not be sent to
any partition.
There are 2 situations in which this can happen:
(1) The table does not exist, so partition = -1; after partition = partition * -1 + -2
it is still -1. It needs to return 0 instead.
(2) The table exists and the row key is less than the first region's start key, which
is HConstants.EMPTY_BYTE_ARRAY; but it seems a row key can never be less than
HConstants.EMPTY_BYTE_ARRAY.
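The arithmetic in the two cases can be checked with a small standalone sketch. It assumes the partition index comes from a binary search over the sorted region start keys, which returns -(insertionPoint) - 1 when the key is not found. The class name PartitionClampSketch, the helper compareBytes (a stand-in for an unsigned byte comparator), and both method names are hypothetical and are not taken from the patch.

```java
import java.util.Arrays;

public class PartitionClampSketch {
    // Hypothetical stand-in for an unsigned lexicographic byte[] comparator.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Behaviour before the extra check: map a "not found" result
    // -(insertionPoint) - 1 to insertionPoint - 1, i.e. the region whose
    // start key is the greatest one below the row key.
    static int rawPartition(byte[][] startKeys, byte[] rowKey) {
        int partition = Arrays.binarySearch(startKeys, rowKey, PartitionClampSketch::compareBytes);
        if (partition < 0) {
            partition = partition * -1 + -2;
        }
        return partition;
    }

    // Behaviour with the extra check: anything still negative goes to partition 0.
    static int clampedPartition(byte[][] startKeys, byte[] rowKey) {
        int partition = rawPartition(startKeys, rowKey);
        if (partition < 0) {
            partition = 0;
        }
        return partition;
    }
}
```

With no start keys at all (assumed here to model the missing table), the search returns -1, rawPartition leaves it at -1, and clampedPartition returns 0. With start keys {empty, "m"}, a row key "a" gives -2 from the search and 0 after the adjustment, so the clamp never triggers for an existing table.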
> [hbase-spark] bulkload does not work when table not exist
> ----------------------------------------------------------
>
> Key: HBASE-17905
> URL: https://issues.apache.org/jira/browse/HBASE-17905
> Project: HBase
> Issue Type: Bug
> Reporter: Yi Liang
> Assignee: Yi Liang
> Attachments: HBASE-17905-V1.patch
>
>
> When using the HBase-Spark bulkload API, a table name argument is required. The
> bulkload runs successfully only if the table exists in HBase. If the table does
> not exist, the bulkload does not run successfully, and it does not even report
> any errors or throw an exception.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)