[ https://issues.apache.org/jira/browse/HBASE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009797#comment-15009797 ]
Ted Malaska commented on HBASE-14340:
-------------------------------------

Thank you, Andrew. Let me know if there are any other JIRAs you would like me to look at.

Thanks again.

On Tuesday, November 17, 2015, Andrew Purtell (JIRA) <j...@apache.org>

> Add second bulk load option to Spark Bulk Load to send puts as the value
> ------------------------------------------------------------------------
>
>                 Key: HBASE-14340
>                 URL: https://issues.apache.org/jira/browse/HBASE-14340
>             Project: HBase
>          Issue Type: New Feature
>          Components: spark
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14340.1.patch, HBASE-14340.2.patch
>
>
> The initial bulk load option for Spark bulk load sends values over one by one
> through the shuffle. This is similar to how the original MR bulk load
> worked.
> However, the MR bulk loader has more than one bulk load option. There is a
> second option that allows all the column families, qualifiers, and values
> of a row to be combined on the map side.
> This only works if the row is not super wide.
> But if the row is not super wide, this method of sending values through the
> shuffle will reduce the data and work the shuffle has to deal with.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
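
The difference between the two bulk load options described in the issue can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use the HBase or Spark APIs, the sample cells and the function names `shuffle_records_per_cell` and `shuffle_records_per_row` are hypothetical, and all cells are assumed to sit in a single map-side partition. It shows how many records would enter the shuffle when values are sent one by one versus when each row is combined first.

```python
from collections import defaultdict

# Hypothetical sample cells: (row_key, column_family, qualifier, value).
cells = [
    ("row1", "cf1", "a", "1"),
    ("row1", "cf1", "b", "2"),
    ("row1", "cf2", "c", "3"),
    ("row2", "cf1", "a", "4"),
]


def shuffle_records_per_cell(cells):
    """First option: every cell becomes its own shuffle record."""
    return [
        ((row, family, qualifier), value)
        for row, family, qualifier, value in cells
    ]


def shuffle_records_per_row(cells):
    """Second option: combine a row's families, qualifiers, and values
    into one record before the shuffle (assumes one partition)."""
    rows = defaultdict(list)
    for row, family, qualifier, value in cells:
        rows[row].append((family, qualifier, value))
    # One record per row; columns sorted so the output is deterministic.
    return [(row, sorted(columns)) for row, columns in sorted(rows.items())]


per_cell = shuffle_records_per_cell(cells)
per_row = shuffle_records_per_row(cells)

print(len(per_cell), "records shuffled cell by cell")
print(len(per_row), "records shuffled row by row")
```

In this sketch the row key travels once per row instead of once per cell, which is where the reduction in shuffle data comes from. The cost is that each combined record holds an entire row, which is why the issue restricts the second option to rows that are not very wide.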