[
https://issues.apache.org/jira/browse/CASSANDRA-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901463#comment-13901463
]
Sylvain Lebresne commented on CASSANDRA-6704:
---------------------------------------------
bq. Everything CQL is right, and everything else is wrong?
I don't think that's really what people mean here. I believe the concern (maybe
I should say "my" concern, I'm really speaking in my own name here) is that it
would be a bad idea for C* to have 2 APIs (thrift and CQL) that continue to
evolve with sets of features that fundamentally do the same thing but have
different implementations. In practice, the project doesn't want to maintain 2
APIs: we don't have infinite development resources, and this is confusing for
users in the long run.
Thrift is the legacy API. We've promised to maintain it in its current state
indefinitely (which *is* a non-negligible drain on the project resources btw),
and we are even fine exposing some new features through it when that requires
very little maintenance effort (CAS for instance), but the C* API moving
forward, the one we are developing and not just maintaining, is CQL.
This ticket seems non-trivial and thrift-only by design and so, for the reason
I just expressed, I do not think that it's a good idea for the C* project, and I
agree that we should focus on tickets like CASSANDRA-4914 instead (and granted,
no-one has had the time to focus on that yet, but that's really just proving my
point that development resources are never infinite. As a side note and for
what it's worth, I do intend to make that ticket one of my priorities for 3.0 (if
no-one else beats me to it of course)).
> Create wide row scanners
> ------------------------
>
> Key: CASSANDRA-6704
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6704
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Edward Capriolo
> Assignee: Edward Capriolo
>
> The BigTable white paper demonstrates the use of scanners to iterate over
> rows and columns.
> http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf
> Because Cassandra does not have a primary sorting on row keys, scanning over
> ranges of row keys is less useful.
> However, we can use the scanner concept to operate on wide rows. For example,
> many times a user wishes to do some custom processing inside a row and does
> not wish to carry the data across the network to do this processing.
> I have already implemented thrift methods to compile dynamic groovy code into
> Filters as well as some code that uses a Filter to page through and process
> data on the server side.
> https://github.com/edwardcapriolo/cassandra/compare/apache:trunk...trunk
> The following is a working code snippet.
> {code}
> @Test
> public void test_scanner() throws Exception
> {
> ColumnParent cp = new ColumnParent();
> cp.setColumn_family("Standard1");
> ByteBuffer key = ByteBuffer.wrap("rscannerkey".getBytes());
> for (char a='a'; a < 'g'; a++){
> Column c1 = new Column();
> c1.setName((a+"").getBytes());
> c1.setValue(new byte [0]);
> c1.setTimestamp(System.nanoTime());
> server.insert(key, cp, c1, ConsistencyLevel.ONE);
> }
>
> FilterDesc d = new FilterDesc();
> d.setSpec("GROOVY_CLASS_LOADER");
> d.setName("limit3");
> d.setCode("import org.apache.cassandra.dht.* \n" +
> "import org.apache.cassandra.thrift.* \n" +
> "public class Limit3 implements SFilter { \n " +
> "public FilterReturn filter(ColumnOrSuperColumn col, List<ColumnOrSuperColumn> filtered) {\n"+
> " filtered.add(col);\n"+
> " return filtered.size()< 3 ? FilterReturn.FILTER_MORE : FilterReturn.FILTER_DONE;\n"+
> "} \n" +
> "}\n");
> server.create_filter(d);
>
>
> ScannerResult res = server.create_scanner("Standard1", "limit3", key,
> ByteBuffer.wrap("a".getBytes()));
> Assert.assertEquals(3, res.results.size());
> }
> {code}
> I am going to be working on this code over the next few weeks but I wanted to
> get the concept out early so the design can see some criticism.
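
For readers without the patched Thrift server at hand, the filter contract used in the snippet above can be approximated in plain Java. This is only a sketch under stated assumptions: `ScannerSketch`, `scan`, and the use of plain strings for columns are hypothetical stand-ins, and only the `filter` method shape and the `FILTER_MORE` / `FILTER_DONE` return values are taken from the ticket. It does not reflect how the actual patch pages through data on the server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScannerSketch {
    // The two return values used by the Groovy filter in the ticket.
    enum FilterReturn { FILTER_MORE, FILTER_DONE }

    // Hypothetical stand-in for the ticket's SFilter; columns are plain strings here.
    interface SFilter {
        FilterReturn filter(String column, List<String> filtered);
    }

    // Same logic as Limit3 in the ticket: collect columns until three have been kept.
    static final SFilter LIMIT3 = (col, filtered) -> {
        filtered.add(col);
        return filtered.size() < 3 ? FilterReturn.FILTER_MORE : FilterReturn.FILTER_DONE;
    };

    // Walks an already-sorted row starting at `start` and stops as soon as
    // the filter reports FILTER_DONE, so the rest of the row is never visited.
    static List<String> scan(List<String> sortedRow, String start, SFilter f) {
        List<String> out = new ArrayList<>();
        for (String col : sortedRow) {
            if (col.compareTo(start) < 0) continue; // skip columns before the start column
            if (f.filter(col, out) == FilterReturn.FILTER_DONE) break;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("a", "b", "c", "d", "e", "f");
        System.out.println(scan(row, "a", LIMIT3)); // prints [a, b, c]
    }
}
```

The point of the early `break` is the one the description makes: the filter decides when enough data has been seen, so only the kept columns would need to cross the network.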