[
https://issues.apache.org/jira/browse/CASSANDRA-342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745349#action_12745349
]
Jeff Hodges commented on CASSANDRA-342:
---------------------------------------
My biggest problem with this patch right now is the boot-up code and the way
it combines with the local-only query code. It forces us to boot a brand-new
Cassandra instance, and only when a MapReduce task is being run, that assumes
the data is already there and ready for the taking. This is all sorts of bad
news.
There does not seem to be a way of getting to the internals of Cassandra we
need (reading from and writing to the disk and memtable, figuring out what keys
are on what nodes, etc.) without also having to boot all of the various
Cassandra services.
I'm looking for input on how we can get around that.
FYI, the HBase way is to have HBase already running on the machine and to open
a connection to it from another process. That connection is set up with the
information from the InputSplit (on the map task machines) and from the config
files (on the initial machine that creates the InputSplits).
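
To make that pattern concrete, here is a rough sketch of its shape in plain
Java. Every name in it (SplitSketch, KeyRangeSplit, NodeClient, NodeConnector,
createSplits, readSplit) is hypothetical and made up for illustration. None of
it is Cassandra, HBase, or Hadoop API, and it is not what the attached patches
do. The only point is that the split carries enough information for the map
task to connect to a node that is already running, instead of booting one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not Cassandra, HBase, or Hadoop APIs.
public class SplitSketch {

    /** What a split would carry to the map task: which node to talk to and which keys to read. */
    public static final class KeyRangeSplit {
        public final String host;
        public final int port;
        public final String startKey;
        public final String endKey;

        public KeyRangeSplit(String host, int port, String startKey, String endKey) {
            this.host = host;
            this.port = port;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Stand-in for a client connection to a storage node that is already running. */
    public interface NodeClient extends AutoCloseable {
        List<String> scan(String startKey, String endKey);

        @Override
        void close();
    }

    /** Stand-in for whatever actually opens that connection. */
    public interface NodeConnector {
        NodeClient connect(String host, int port);
    }

    /**
     * Initial machine: build one split per node. Each entry of nodeRanges is
     * {host, startKey, endKey}, assumed here to come from the config files.
     */
    public static List<KeyRangeSplit> createSplits(List<String[]> nodeRanges, int port) {
        List<KeyRangeSplit> splits = new ArrayList<>();
        for (String[] r : nodeRanges) {
            splits.add(new KeyRangeSplit(r[0], port, r[1], r[2]));
        }
        return splits;
    }

    /** Map task machine: connect to the node named in the split and read only its key range. */
    public static List<String> readSplit(KeyRangeSplit split, NodeConnector connector) {
        try (NodeClient client = connector.connect(split.host, split.port)) {
            return client.scan(split.startKey, split.endKey);
        }
    }
}
```

With this shape the map task never touches the storage internals directly. It
only needs a client for a service that is already up, which is the part the
current patch cannot do without booting everything.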
> hadoop integration
> ------------------
>
> Key: CASSANDRA-342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-342
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Attachments: 0001-CASSANDRA-342.-Set-up-for-the-hadoop-commits.patch,
> 0001-the-stupid-version-of-hadoop-support.patch,
> 0002-CASSANDRA-342.-Working-hadoop-support.patch,
> 0003-CASSNADRA-342.-Adding-the-WordCount-example.patch,
> v2-squashed-commits-for-hadoop-stupid.patch
>
>
> Some discussion on -dev:
> http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200907.mbox/%[email protected]%3e