[jira] [Updated] (HBASE-14340) Add second bulk load option to Spark Bulk Load to send puts as the value
[ https://issues.apache.org/jira/browse/HBASE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Balazs Meszaros updated HBASE-14340:
    Fix Version/s: connector-1.0.0
      Component/s: hbase-connectors

> Add second bulk load option to Spark Bulk Load to send puts as the value
>
>                 Key: HBASE-14340
>                 URL: https://issues.apache.org/jira/browse/HBASE-14340
>             Project: HBase
>          Issue Type: New Feature
>          Components: hbase-connectors, spark
>            Reporter: Theodore michael Malaska
>            Assignee: Theodore michael Malaska
>            Priority: Minor
>             Fix For: 3.0.0, connector-1.0.0
>
>         Attachments: HBASE-14340.1.patch, HBASE-14340.2.patch
>
>
> The initial bulk load option for Spark bulk load sends values one by one
> through the shuffle. This is similar to how the original MR bulk load
> worked.
> However, the MR bulk loader has more than one bulk load option. There is a
> second option that allows all the Column Families, Qualifiers, and Values
> of a row to be combined on the map side.
> This only works if the row is not super wide.
> But if the row is not super wide, this method of sending values through the
> shuffle will reduce the data and work the shuffle has to deal with.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HBASE-14340) Add second bulk load option to Spark Bulk Load to send puts as the value
[ https://issues.apache.org/jira/browse/HBASE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Malaska updated HBASE-14340:
    Attachment: HBASE-14340.2.patch

Fixed the copy-paste issue. It was my mistake. The code was right on my laptop, but I had made the patch out of sync or something.

Thanks for finding that.

> Add second bulk load option to Spark Bulk Load to send puts as the value
>
>                 Key: HBASE-14340
>                 URL: https://issues.apache.org/jira/browse/HBASE-14340
>             Project: HBase
>          Issue Type: New Feature
>          Components: spark
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14340.1.patch, HBASE-14340.2.patch
>
>
> The initial bulk load option for Spark bulk load sends values one by one
> through the shuffle. This is similar to how the original MR bulk load
> worked.
> However, the MR bulk loader has more than one bulk load option. There is a
> second option that allows all the Column Families, Qualifiers, and Values
> of a row to be combined on the map side.
> This only works if the row is not super wide.
> But if the row is not super wide, this method of sending values through the
> shuffle will reduce the data and work the shuffle has to deal with.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HBASE-14340) Add second bulk load option to Spark Bulk Load to send puts as the value
[ https://issues.apache.org/jira/browse/HBASE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14340:
    Fix Version/s: 2.0.0

lgtm, except for a double write (cut-paste?) bug. See RB

> Add second bulk load option to Spark Bulk Load to send puts as the value
>
>                 Key: HBASE-14340
>                 URL: https://issues.apache.org/jira/browse/HBASE-14340
>             Project: HBase
>          Issue Type: New Feature
>          Components: spark
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14340.1.patch
>
>
> The initial bulk load option for Spark bulk load sends values one by one
> through the shuffle. This is similar to how the original MR bulk load
> worked.
> However, the MR bulk loader has more than one bulk load option. There is a
> second option that allows all the Column Families, Qualifiers, and Values
> of a row to be combined on the map side.
> This only works if the row is not super wide.
> But if the row is not super wide, this method of sending values through the
> shuffle will reduce the data and work the shuffle has to deal with.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HBASE-14340) Add second bulk load option to Spark Bulk Load to send puts as the value
[ https://issues.apache.org/jira/browse/HBASE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Malaska updated HBASE-14340:
    Attachment: HBASE-14340.1.patch

Initial patch

> Add second bulk load option to Spark Bulk Load to send puts as the value
>
>                 Key: HBASE-14340
>                 URL: https://issues.apache.org/jira/browse/HBASE-14340
>             Project: HBase
>          Issue Type: New Feature
>          Components: spark
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>            Priority: Minor
>         Attachments: HBASE-14340.1.patch
>
>
> The initial bulk load option for Spark bulk load sends values one by one
> through the shuffle. This is similar to how the original MR bulk load
> worked.
> However, the MR bulk loader has more than one bulk load option. There is a
> second option that allows all the Column Families, Qualifiers, and Values
> of a row to be combined on the map side.
> This only works if the row is not super wide.
> But if the row is not super wide, this method of sending values through the
> shuffle will reduce the data and work the shuffle has to deal with.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
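
The difference between the two options in the issue description can be illustrated with a small sketch. It is not the code from the attached patches and does not use any HBase or Spark API. It is plain Python that only counts the records each approach would hand to a shuffle. The sample cells and the function names (`per_cell_records`, `per_row_records`) are hypothetical, and the grouping step assumes all cells of a row are available together on the map side.

```python
from collections import defaultdict

# Hypothetical sample cells: (row_key, column_family, qualifier, value)
CELLS = [
    ("row1", "cf1", "a", "1"),
    ("row1", "cf1", "b", "2"),
    ("row1", "cf2", "c", "3"),
    ("row2", "cf1", "a", "4"),
]


def per_cell_records(cells):
    """First option: one shuffle record per cell, keyed by (row, family, qualifier)."""
    return [((row, fam, qual), val) for row, fam, qual, val in cells]


def per_row_records(cells):
    """Second option: combine all cells of a row first, one shuffle record per row."""
    rows = defaultdict(list)
    for row, fam, qual, val in cells:
        rows[row].append((fam, qual, val))
    # Sort rows and the cells inside each row so the output order is deterministic.
    return [(row, sorted(vals)) for row, vals in sorted(rows.items())]


if __name__ == "__main__":
    print(len(per_cell_records(CELLS)))  # 4 records, one per cell
    print(len(per_row_records(CELLS)))   # 2 records, one per row
```

Each per-row record holds every cell of its row in memory at once, which is why the description limits this option to rows that are not very wide.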