[ 
https://issues.apache.org/jira/browse/CASSANDRA-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902430#comment-13902430
 ] 

Benedict commented on CASSANDRA-6704:
-------------------------------------

bq. You started the answer with "sorta". You are still allowing a user to put 
code in the execution path. It is the exact same problem, if you let someone 
compile dynamic code or you let the admin put the jar in a folder. You give 
someone the potential to break something. All dynamic compiling does is make 
the result faster to break and faster to fix.

This is very different. You're equating admins setting up the system with users 
querying it, and they are not the same. In your system they may be, since you 
may control all access paths to the database. But this is not the common case, 
and we should not assume it is. Sandboxing seems absolutely essential; failing 
that, the feature needs to be disabled by default, in which case why not just 
have them drop in an extra jar?

bq. Please do not imply that this feature is not coherent, or bad which has 
been done several times already. This is a good feature.

I am not suggesting it is an incoherent or bad feature in isolation, and I 
don't think anybody is. When I say coherent, I mean how it fits in with the 
overall progress and development of the project, and how users interact with 
the database. This is a pretty left-field introduction that doesn't fit 
cleanly with anything we have currently. I do not intend to give the impression 
I am judging the feature itself negatively; in fact I think it's pretty neat. I 
just think whether neatness is enough is up for discussion.

bq. I am not asking other developers who make features with no votes to put 
changes and in forks you should not ask me to do the same.

I am sorry, but I don't follow this.

The other developers here are expressing concern that this new feature will 
place a future burden on them that you will not be able to alleviate. I think, 
given the nature and scope of the change, that is a reasonable concern, and one 
that should not be brushed aside lightly. It is not that anybody is singling out 
"your changes" - this concern would be true regardless of the person suggesting 
it and, 
frankly, were it not for your position in the community there probably would 
not have been anywhere near this level of serious engagement with the 
discussion.

They are also expressing concern that it will create a polluted vision of the 
future of C*, with multiple conflicting ways to achieve something, both of 
which are nontrivial to understand, creating further demands on them and the wider 
community in trying to explain all of these features to newcomers, with the 
confusion potentially further exacerbating the negative perception of 
Cassandra's ease of use.



> Create wide row scanners
> ------------------------
>
>                 Key: CASSANDRA-6704
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6704
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>
> The BigTable white paper demonstrates the use of scanners to iterate over 
> rows and columns. 
> http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf
> Because Cassandra does not have a primary sorting on row keys, scanning over 
> ranges of row keys is less useful. 
> However, we can use the scanner concept to operate on wide rows. For example, 
> a user often wishes to do some custom processing inside a row and does 
> not wish to carry the data across the network to do this processing. 
> I have already implemented thrift methods to compile dynamic groovy code into 
> Filters as well as some code that uses a Filter to page through and process 
> data on the server side.
> https://github.com/edwardcapriolo/cassandra/compare/apache:trunk...trunk
> The following is a working code snippet.
> {code}
>     @Test
>     public void test_scanner() throws Exception
>     {
>       ColumnParent cp = new ColumnParent();
>       cp.setColumn_family("Standard1");
>       ByteBuffer key = ByteBuffer.wrap("rscannerkey".getBytes());
>       for (char a='a'; a < 'g'; a++){
>         Column c1 = new Column();
>         c1.setName((a+"").getBytes());
>         c1.setValue(new byte [0]);
>         c1.setTimestamp(System.nanoTime());
>         server.insert(key, cp, c1, ConsistencyLevel.ONE);
>       }
>       
>       FilterDesc d = new FilterDesc();
>       d.setSpec("GROOVY_CLASS_LOADER");
>       d.setName("limit3");
>       d.setCode("import org.apache.cassandra.dht.* \n" +
>               "import org.apache.cassandra.thrift.* \n" +
>           "public class Limit3 implements SFilter { \n " +
>           "public FilterReturn filter(ColumnOrSuperColumn col, " +
>           "List<ColumnOrSuperColumn> filtered) {\n"+
>           " filtered.add(col);\n"+
>           " return filtered.size()< 3 ? FilterReturn.FILTER_MORE : " +
>           "FilterReturn.FILTER_DONE;\n"+
>           "} \n" +
>         "}\n");
>       server.create_filter(d);
>       
>       
>       ScannerResult res = server.create_scanner("Standard1", "limit3", key, 
> ByteBuffer.wrap("a".getBytes()));
>       Assert.assertEquals(3, res.results.size());
>     }
> {code}
> I am going to be working on this code over the next few weeks, but I wanted to 
> get the concept out early so the design can see some criticism.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
