[jira] [Commented] (CASSANDRA-6704) Create wide row scanners

Sylvain Lebresne (JIRA) Fri, 14 Feb 2014 10:25:14 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901737#comment-13901737
 ]


Sylvain Lebresne commented on CASSANDRA-6704:
---------------------------------------------

bq. Which is why I should be able to scratch my own itch.

I'm sorry but I disagree. When you open a ticket on this JIRA, you're not 
really in "scratching my own itch in my own backyard" territory anymore, you're 
saying "I'm suggesting this for the Cassandra project". And the Cassandra 
project is about more than just everyone scratching their own itches in 
isolation because that's a crappy way to develop software: we're trying to 
build a coherent piece of software. Don't get me wrong, itching can be a good 
motivation and the start of new ideas, but itching doesn't give you an inherent 
right to get something committed.

bq. Also no one ever said this has to be a thrift only feature. I just chose 
to build the POC in thrift because this was easier for me.

Fair enough, but what I'm saying is "CQL is the Cassandra API moving forward" 
(that's the direction the project has been following for more than a year now) 
and so adding something to CQL, and optionally to thrift (the legacy API) if 
that's trivial and relatively maintenance free, is fine. But adding something to 
thrift and "maybe later to CQL but it's unclear how" is kind of not ok, since it 
goes against the direction of the project.

And well, so far, all you've provided us is a thrift-only POC and asked for 
criticism on the design. So I'm saying that, as is, that's kind of not really 
ok.  If you have something to suggest that is a good fit for CQL and just 
started with Thrift out of familiarity with it, then please do go on, but since 
CQL is the important part as far as the Cassandra project is concerned, I'll 
reserve judgement until the important part is here to see. If we then need less 
than 200 lines of additional code on top of that hypothetical solution to 
support Thrift too, then why not, I probably won't object to that.




> Create wide row scanners
> ------------------------
>
>                 Key: CASSANDRA-6704
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6704
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>
> The BigTable white paper demonstrates the use of scanners to iterate over 
> rows and columns. 
> http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf
> Because Cassandra does not have a primary sorting on row keys, scanning over 
> ranges of row keys is less useful. 
> However we can use the scanner concept to operate on wide rows. For example 
> many times a user wishes to do some custom processing inside a row and does 
> not wish to carry the data across the network to do this processing. 
> I have already implemented thrift methods to compile dynamic groovy code into 
> Filters as well as some code that uses a Filter to page through and process 
> data on the server side.
> https://github.com/edwardcapriolo/cassandra/compare/apache:trunk...trunk
> The following is a working code snippet.
> {code}
>     @Test
>     public void test_scanner() throws Exception
>     {
>       ColumnParent cp = new ColumnParent();
>       cp.setColumn_family("Standard1");
>       ByteBuffer key = ByteBuffer.wrap("rscannerkey".getBytes());
>       for (char a='a'; a < 'g'; a++){
>         Column c1 = new Column();
>         c1.setName((a+"").getBytes());
>         c1.setValue(new byte [0]);
>         c1.setTimestamp(System.nanoTime());
>         server.insert(key, cp, c1, ConsistencyLevel.ONE);
>       }
>       
>       FilterDesc d = new FilterDesc();
>       d.setSpec("GROOVY_CLASS_LOADER");
>       d.setName("limit3");
>       d.setCode("import org.apache.cassandra.dht.* \n" +
>               "import org.apache.cassandra.thrift.* \n" +
>           "public class Limit3 implements SFilter { \n " +
>           "public FilterReturn filter(ColumnOrSuperColumn col, " +
>           "List<ColumnOrSuperColumn> filtered) {\n"+
>           " filtered.add(col);\n"+
>           " return filtered.size()< 3 ? FilterReturn.FILTER_MORE : " +
>           "FilterReturn.FILTER_DONE;\n"+
>           "} \n" +
>         "}\n");
>       server.create_filter(d);
>       
>       
>       ScannerResult res = server.create_scanner("Standard1", "limit3", key,
>           ByteBuffer.wrap("a".getBytes()));
>       Assert.assertEquals(3, res.results.size());
>     }
> {code}
> I am going to be working on this code over the next few weeks but I wanted to 
> get the concept out early so the design can see some criticism.
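
For readers following the quoted description, here is a minimal, self-contained sketch of the scanner idea, assuming a row is simply a sorted list of column names held in memory. SFilter, FilterReturn, FILTER_MORE and FILTER_DONE are names taken from the quoted snippet; their shapes here, along with the WideRowScannerSketch class and its scan method, are hypothetical stand-ins and not the code in the linked branch.

```java
import java.util.ArrayList;
import java.util.List;

final class WideRowScannerSketch {

    // Hypothetical stand-in for the FilterReturn type named in the quoted snippet.
    enum FilterReturn { FILTER_MORE, FILTER_DONE }

    // Hypothetical stand-in for SFilter: inspect one column, optionally add it
    // to 'filtered', and report whether the scan should continue.
    interface SFilter<C> {
        FilterReturn filter(C col, List<C> filtered);
    }

    // Walk the already sorted columns of one row, starting at 'start', and
    // hand each column to the filter until it returns FILTER_DONE.
    static List<String> scan(List<String> sortedColumns, String start, SFilter<String> filter) {
        List<String> filtered = new ArrayList<>();
        for (String col : sortedColumns) {
            if (col.compareTo(start) < 0) {
                continue; // column sorts before the requested start
            }
            if (filter.filter(col, filtered) == FilterReturn.FILTER_DONE) {
                break; // the filter has everything it wants
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        List<String> row = List.of("a", "b", "c", "d", "e", "f");
        // Same logic as the Groovy "limit3" filter in the quoted test.
        SFilter<String> limit3 = (col, filtered) -> {
            filtered.add(col);
            return filtered.size() < 3 ? FilterReturn.FILTER_MORE : FilterReturn.FILTER_DONE;
        };
        System.out.println(scan(row, "a", limit3)); // prints [a, b, c]
    }
}
```

The point of the design is that only the filtered result leaves the loop; in the proposal that loop would run on the server, so the unfiltered columns would not cross the network.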



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
