Input and Output column families should be configured independently
-------------------------------------------------------------------
Key: CASSANDRA-1227
URL: https://issues.apache.org/jira/browse/CASSANDRA-1227
Project: Cassandra
Issue Type: Improvement
Components: Hadoop
Affects Versions: 0.7
Reporter: Bryan Tower
Fix For: 0.7

I would like to use a ColumnFamilyInputFormat and a ColumnFamilyRecordReader
to map data from Cassandra into a job, operate on that data, and then write a
summary of the results from the Reducer. Both ColumnFamilyInputFormat and
ColumnFamilyOutputFormat read the column family from the same property in the
job configuration object (ConfigHelper.COLUMNFAMILY_CONFIG). This means that I
cannot read from one Cassandra column family and write to a different one in
the same job with the existing code.

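The collision can be sketched as follows. This is only an illustration under
stated assumptions: a plain Map stands in for the Hadoop job configuration, and
the class name, the two lookup methods, and the column family names
"RawEvents" and "EventSummaries" are hypothetical. Only the key string is
taken from this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a Map stands in for the Hadoop job configuration.
public class SharedKeyCollision {
    // The single key both formats consult, as quoted in this issue.
    static final String COLUMNFAMILY_CONFIG = "cassandra.input.columnfamily";

    // Hypothetical stand-in for the lookup done on the input side.
    static String inputFormatColumnFamily(Map<String, String> conf) {
        return conf.get(COLUMNFAMILY_CONFIG);
    }

    // Hypothetical stand-in for the lookup done on the output side.
    static String outputFormatColumnFamily(Map<String, String> conf) {
        return conf.get(COLUMNFAMILY_CONFIG);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(COLUMNFAMILY_CONFIG, "RawEvents");      // intended input
        conf.put(COLUMNFAMILY_CONFIG, "EventSummaries"); // intended output, overwrites the input

        // Both sides now resolve to whichever value was set last.
        System.out.println("input  -> " + inputFormatColumnFamily(conf));
        System.out.println("output -> " + outputFormatColumnFamily(conf));
    }
}
```

Because there is only one key, the second put replaces the first, so the job
has no way to name two different column families.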
I changed ColumnFamilyOutputFormat to read from
"cassandra.output.columnfamily" instead of the "cassandra.input.columnfamily"
property that it was using before.

I also renamed the COLUMNFAMILY_CONFIG property and its related methods to
include the word "input", and added corresponding output versions of each of
the relevant properties that should be configured for the
ColumnFamilyOutputFormat.
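
A minimal sketch of the separated configuration, not the actual patch: a plain
Map again stands in for the Hadoop job configuration, and the class name,
constant names, helper method names, and column family names are assumptions
made for illustration. The two key strings are the ones quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a Map stands in for the Hadoop job configuration.
public class SeparateColumnFamilyKeys {
    // Key strings as quoted in this issue; the constant names are assumed.
    static final String INPUT_COLUMNFAMILY_CONFIG = "cassandra.input.columnfamily";
    static final String OUTPUT_COLUMNFAMILY_CONFIG = "cassandra.output.columnfamily";

    // Hypothetical helpers: one setter/getter pair per direction.
    static void setInputColumnFamily(Map<String, String> conf, String columnFamily) {
        conf.put(INPUT_COLUMNFAMILY_CONFIG, columnFamily);
    }

    static void setOutputColumnFamily(Map<String, String> conf, String columnFamily) {
        conf.put(OUTPUT_COLUMNFAMILY_CONFIG, columnFamily);
    }

    static String getInputColumnFamily(Map<String, String> conf) {
        return conf.get(INPUT_COLUMNFAMILY_CONFIG);
    }

    static String getOutputColumnFamily(Map<String, String> conf) {
        return conf.get(OUTPUT_COLUMNFAMILY_CONFIG);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setInputColumnFamily(conf, "RawEvents");       // read side
        setOutputColumnFamily(conf, "EventSummaries"); // write side

        // Each direction keeps its own value because the keys differ.
        System.out.println("input  -> " + getInputColumnFamily(conf));
        System.out.println("output -> " + getOutputColumnFamily(conf));
    }
}
```

With distinct keys, setting the output column family no longer disturbs the
input one, so a single job can read from one column family and write to
another.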