[ https://issues.apache.org/jira/browse/CASSANDRA-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Pak updated CASSANDRA-7776:
--------------------------------
Fix Version/s: 2.1.1
> Allow multiple MR jobs to concurrently write to the same column family from
> the same node using CqlBulkOutputFormat
> -------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-7776
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7776
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Reporter: Paul Pak
> Assignee: Paul Pak
> Priority: Minor
> Labels: cql3, hadoop
> Fix For: 2.1.1
>
> Attachments: trunk-7776-v1.txt
>
>
> After the sstable files are written, every file in the specified output
> directory is loaded (transferred) to the remote Cassandra cluster. If multiple
> jobs on a node write to the same table (i.e. the same directory), each load
> process transfers every sstable file it finds there, so the same files are
> transferred multiple times.
> Furthermore, if directory cleanup of successful outputs is enabled
> ([CASSANDRA-7777|https://issues.apache.org/jira/browse/CASSANDRA-7777]),
> contention between writes and loads can cause errors.
> This can be remedied simply by using a unique output directory for each MR
> job.
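
The remedy described above could be sketched roughly as follows. This is only an illustration, not the code in trunk-7776-v1.txt: the class UniqueOutputDirs, the method uniqueOutputDir, and the base/<uuid>/<keyspace>/<table> layout are all hypothetical. The layout assumes that the loader identifies the keyspace and table from the last two path components, so the unique component is placed above them.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

/** Hypothetical helper; not part of Cassandra or of the attached patch. */
final class UniqueOutputDirs {
    private UniqueOutputDirs() {}

    /**
     * Builds base/<random-uuid>/<keyspace>/<table>.
     * Each call yields a different directory, so two jobs writing to the
     * same table on the same node never share sstable files.
     * Assumption: the loader reads keyspace and table names from the two
     * trailing path components, so the UUID sits above them.
     */
    static Path uniqueOutputDir(Path base, String keyspace, String table) {
        return base.resolve(UUID.randomUUID().toString())
                   .resolve(keyspace)
                   .resolve(table);
    }

    public static void main(String[] args) {
        // Illustrative base directory under the JVM temp dir.
        Path base = Paths.get(System.getProperty("java.io.tmpdir"), "bulk-output");

        // Two jobs targeting the same keyspace and table get distinct directories.
        Path jobA = uniqueOutputDir(base, "ks", "events");
        Path jobB = uniqueOutputDir(base, "ks", "events");

        System.out.println(jobA);
        System.out.println(jobB);
    }
}
```

With one directory per job, a load process only sees the sstables its own job wrote, and cleaning up after a successful load cannot remove files another job is still writing.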
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)