[ https://issues.apache.org/jira/browse/HBASE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010168#comment-15010168 ]
Hudson commented on HBASE-14340:
--------------------------------
FAILURE: Integrated in HBase-Trunk_matrix #476 (See
[https://builds.apache.org/job/HBase-Trunk_matrix/476/])
HBASE-14340 Add second bulk load option to Spark Bulk Load to send puts
(apurtell: rev ca1048415bd842bb725357c4005da70788a79b02)
* hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/FamiliesQualifiersValues.scala
* hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/HBaseRDDFunctions.scala
* hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/BulkLoadPartitioner.scala
* hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/BulkLoadSuite.scala
* hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/ByteArrayWrapper.scala
* hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/HBaseContext.scala
> Add second bulk load option to Spark Bulk Load to send puts as the value
> ------------------------------------------------------------------------
>
> Key: HBASE-14340
> URL: https://issues.apache.org/jira/browse/HBASE-14340
> Project: HBase
> Issue Type: New Feature
> Components: spark
> Reporter: Ted Malaska
> Assignee: Ted Malaska
> Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-14340.1.patch, HBASE-14340.2.patch
>
>
> The initial bulk load option for Spark bulk load sends values one by one
> through the shuffle. This is similar to how the original MR bulk load
> worked.
> However, the MR bulk loader has more than one bulk load option. There is a
> second option that allows all the Column Families, Qualifiers, and Values
> of a row to be combined on the map side.
> This only works if the row is not super wide.
> But if the row is not super wide, this method of sending values through the
> shuffle will reduce the data and work the shuffle has to deal with.
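
The difference between the two options described above can be sketched in plain Scala, without Spark or HBase on the classpath. This is only an illustration of the record shapes that would enter the shuffle, not the code from the patch: the object name, the Cell case class, and both functions are hypothetical, and String stands in for the byte arrays a real bulk load would use.

```scala
// Hypothetical sketch: plain Scala collections stand in for an RDD.
// None of these names come from the hbase-spark module.
object ThinRowShuffleSketch {
  // One cell of a row: row key, column family, qualifier, value.
  final case class Cell(row: String, family: String, qualifier: String, value: String)

  // Option 1: every cell becomes its own shuffle record,
  // keyed by (row, family, qualifier).
  def cellPerRecord(cells: Seq[Cell]): Seq[((String, String, String), String)] =
    cells.map(c => ((c.row, c.family, c.qualifier), c.value))

  // Option 2: all cells of a row are combined before the shuffle,
  // so each row becomes a single record keyed by the row key alone.
  def rowPerRecord(cells: Seq[Cell]): Seq[(String, Seq[(String, String, String)])] =
    cells.groupBy(_.row).toSeq.sortBy(_._1).map { case (row, cs) =>
      (row, cs.map(c => (c.family, c.qualifier, c.value)))
    }

  // Sample input: five cells spread over two narrow rows.
  val cells: Seq[Cell] = Seq(
    Cell("row1", "f1", "a", "v1"),
    Cell("row1", "f1", "b", "v2"),
    Cell("row1", "f2", "a", "v3"),
    Cell("row2", "f1", "a", "v4"),
    Cell("row2", "f2", "c", "v5")
  )
}
```

With the sample input, cellPerRecord yields five records while rowPerRecord yields two, which is the reduction in shuffle records the description refers to. The trade-off is that each combined record must hold a whole row in memory, which is why the approach only suits rows that are not very wide.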
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)