[ 
https://issues.apache.org/jira/browse/CASSANDRA-342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745349#action_12745349
 ] 

Jeff Hodges commented on CASSANDRA-342:
---------------------------------------

So, my biggest problem with this patch right now is the boot-up code and the 
way it combines with the local-only query code. Whenever a MapReduce task is 
being run, it forces us into booting a brand new Cassandra instance that 
assumes the data is already there and ready for the taking. This is all sorts 
of bad news. 

There does not seem to be a way of getting to the internals of Cassandra we 
need (reading from and writing to the disk and memtable, figuring out which 
keys are on which nodes, etc.) without also having to boot all of the various 
Cassandra services. 

I'm looking for input on how we can get around that. 

FYI, the HBase way is to have HBase already running on the machine and to open 
a connection to it from another process. That connection is created with the 
information from the InputSplit (on the map task machines) or from the config 
files (on the initial machine that creates the InputSplits).
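
To make that concrete, here is a rough sketch of what a split carrying 
connection information could look like. Everything in it is hypothetical: the 
class name KeyRangeSplit, its fields, and the toBytes/fromBytes helpers are 
made up for illustration and are not in any of the attached patches. It uses 
only java.io so it runs on its own; the write/readFields/getLocations methods 
just mirror the shape of Hadoop's Writable and InputSplit, which a real 
version would extend or implement.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical split: names and fields are assumptions for illustration only.
class KeyRangeSplit {
    String startKey; // first key of the range this map task should read
    String endKey;   // last key of the range
    String host;     // node assumed to hold the range and to be already running
    int port;        // port the map task would connect to on that node

    KeyRangeSplit() {} // no-arg constructor so a split can be rebuilt from bytes

    KeyRangeSplit(String startKey, String endKey, String host, int port) {
        this.startKey = startKey;
        this.endKey = endKey;
        this.host = host;
        this.port = port;
    }

    // Same shape as Hadoop's Writable.write(DataOutput).
    void write(DataOutput out) throws IOException {
        out.writeUTF(startKey);
        out.writeUTF(endKey);
        out.writeUTF(host);
        out.writeInt(port);
    }

    // Same shape as Hadoop's Writable.readFields(DataInput).
    void readFields(DataInput in) throws IOException {
        startKey = in.readUTF();
        endKey = in.readUTF();
        host = in.readUTF();
        port = in.readInt();
    }

    // Same role as InputSplit.getLocations(): a hint for scheduling the map
    // task on the node that holds the data.
    String[] getLocations() {
        return new String[] { host };
    }

    // Helper standing in for the framework shipping the split to a map task.
    static byte[] toBytes(KeyRangeSplit split) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        split.write(out);
        out.flush();
        return buffer.toByteArray();
    }

    // Helper standing in for the map task rebuilding the split it was handed.
    static KeyRangeSplit fromBytes(byte[] bytes) throws IOException {
        KeyRangeSplit split = new KeyRangeSplit();
        split.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return split;
    }
}
```

The point is that the map task would only need the host, port and key range 
from the split to open a connection to a node that is already up, instead of 
booting its own instance.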

> hadoop integration
> ------------------
>
>                 Key: CASSANDRA-342
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-342
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-342.-Set-up-for-the-hadoop-commits.patch, 
> 0001-the-stupid-version-of-hadoop-support.patch, 
> 0002-CASSANDRA-342.-Working-hadoop-support.patch, 
> 0003-CASSNADRA-342.-Adding-the-WordCount-example.patch, 
> v2-squashed-commits-for-hadoop-stupid.patch
>
>
> Some discussion on -dev: 
> http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200907.mbox/%[email protected]%3e

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
