[ https://issues.apache.org/jira/browse/CASSANDRA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902285#action_12902285 ]
Stu Hood commented on CASSANDRA-1368:
-------------------------------------
> But ColumnOrSuperColumn isn't a whole lot better
This argument applies equally well to our client API.
Switching to JSON in our client API would arguably have less effect than
switching to JSON here, since client interactions are more frequently
bottlenecked by network latency, while a streaming API should always be
bottlenecked on throughput. Smaller objects are better in both locations, but
gain us more benefit here.
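
As a rough illustration of the size argument, the sketch below encodes one column both as JSON text and as a length-prefixed binary record and compares the byte counts. The field names and the binary layout are assumptions made up for this example; they are not Avro's wire format or Cassandra's actual encoding.

```python
import json
import struct


def encode_json(name: bytes, value: bytes, timestamp: int) -> bytes:
    """Encode one column as a JSON object (hypothetical field names)."""
    record = {
        "name": name.decode("utf-8"),
        "value": value.decode("utf-8"),
        "timestamp": timestamp,
    }
    return json.dumps(record).encode("utf-8")


def encode_binary(name: bytes, value: bytes, timestamp: int) -> bytes:
    """Encode the same column as a length-prefixed binary record.

    Hypothetical layout: 2-byte name length, name, 4-byte value length,
    value, 8-byte signed timestamp, all big-endian.
    """
    return (
        struct.pack(">H", len(name)) + name
        + struct.pack(">I", len(value)) + value
        + struct.pack(">q", timestamp)
    )


if __name__ == "__main__":
    args = (b"col1", b"hello", 1282700000000)
    print("json bytes:  ", len(encode_json(*args)))
    print("binary bytes:", len(encode_binary(*args)))
```

On a throughput-bound stream, the per-record difference is paid on every record, which is why the choice of encoding matters more here than on a latency-bound client call.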
> and has the drawback of inflicting Yet Another Serialization Format
Considering that the entire interaction with Avro is ~20 lines of code (most of
which is simply creating dictionaries, which you would have to do for JSON
serialization anyway), I don't think we're inconveniencing folks.
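
To make the "creating dictionaries" point concrete, here is a minimal sketch of the dictionary-building step a streaming script would need with either format. The record layout and the helper names `make_mutation` and `emit` are hypothetical, not the schema from the attached patches; JSON is used for the final write only because it is in the standard library, and an Avro-based script would hand the same dictionaries to an Avro writer instead.

```python
import json
import sys


def make_mutation(key, column_name, value, timestamp):
    """Build a plain dictionary describing one column write.

    The nesting and field names are assumptions for illustration only.
    """
    return {
        "key": key,
        "mutation": {
            "column": {
                "name": column_name,
                "value": value,
                "timestamp": timestamp,
            }
        },
    }


def emit(records, out=sys.stdout):
    """Write one serialized record per line, as a streaming job writes to stdout."""
    for record in records:
        out.write(json.dumps(record, sort_keys=True) + "\n")


if __name__ == "__main__":
    emit([make_mutation("row1", "col1", "hello", 1282700000000)])
```

Only the body of `emit` depends on the serialization format; the dictionary construction is the same work in both cases.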
> Add output support for Hadoop Streaming
> ---------------------------------------
>
> Key: CASSANDRA-1368
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1368
> Project: Cassandra
> Issue Type: New Feature
> Components: Hadoop
> Reporter: Stu Hood
> Fix For: 0.7 beta 2
>
> Attachments: 0001-Switch-to-Cloudera-s-Distribution-of-Hadoop.patch,
> 0002-Add-an-Avro-OutputReader-and-Resolver-for-Hadoop-Str.patch,
> 0003-Apply-the-deprecated-OutputFormat-interface-to-allow.patch,
> 0004-Add-Streaming-example-shell-scripts.patch
>
>
> Hadoop Streaming is a framework that allows mapreduce jobs to be written in
> languages other than Java, by performing simple IPC on stdin/stdout.
> Adding output support for Hadoop Streaming to Cassandra would mean that users
> could write very simple scripts in dynamic languages to load data into
> Cassandra. Once our Hadoop OutputFormat has stabilized a bit, we might also
> be able to use this code to provide scalable bulk loading.