[ https://issues.apache.org/jira/browse/CASSANDRA-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056100#comment-14056100 ]
Sylvain Lebresne commented on CASSANDRA-7395:
---------------------------------------------

I'm sorry to disagree but I'd rather not depend on AbstractType/TypeParser, even at first. We shouldn't redo the mistakes of the past by making internal stuff part of the API. And AbstractType/TypeParser are very much internal classes. Also, regarding using ByteBuffer for arguments: while I mentioned it myself initially, I don't think it's a good idea anymore. We should have proper Java types right away if we can help it.

Which leads me to the following suggestion: reuse the Java driver (we probably won't use a whole lot of it, probably mostly the DataType class and the related ones). It already knows about all CQL types with a well-defined mapping to Java types, handles collections, UDT, ... And its APIs are meant to be stable/public. And though we should ignore that for this ticket, we will be able to reuse the mapper for UDTs, which is neat.

The other point I'd like us to consider is the fact that this ticket is only a first step, but I'm strongly hoping we can get CASSANDRA-7526 not too long after this. So I think we should make as many concepts consistent between the two as possible. Which kind of means not relying on Java annotations but rather keeping concepts in CQL as much as possible. Typically, we could support a syntax like:
{noformat}
CREATE FUNCTION sum (a bigint, b bigint) AS my.company.Functions.sum;
{noformat}
From that, we'll just make sure the function pointed to takes two integers and returns one. Granted, it's slightly less quick to define each function than to have them automatically defined from the class itself if you have crap-tons of them, but I don't think that's a big deal.
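A minimal Java sketch of that signature check follows. It assumes that CQL {{bigint}} maps to Java {{long}} and that the target is a static method. The {{SignatureCheck}} helper and the body of {{Functions.sum}} are hypothetical illustrations, not existing Cassandra code. The proposed syntax does not declare a return type, so the expected return type is passed in explicitly here as a further assumption.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical user class, as it might be named by:
//   CREATE FUNCTION sum (a bigint, b bigint) AS my.company.Functions.sum;
final class Functions {
    static long sum(long a, long b) {
        return a + b;
    }
}

// Hypothetical helper: checks that a Java method matches a declared signature.
final class SignatureCheck {
    // Returns true if `owner` declares a static method `name` that takes exactly
    // `argTypes` (in order) and returns `returnType`.
    static boolean matches(Class<?> owner, String name,
                           Class<?> returnType, Class<?>... argTypes) {
        try {
            // getDeclaredMethod requires an exact parameter-type match.
            Method m = owner.getDeclaredMethod(name, argTypes);
            return Modifier.isStatic(m.getModifiers())
                && m.getReturnType() == returnType;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

With this helper, a declaration of {{sum (a bigint, b bigint)}} would be accepted only if {{SignatureCheck.matches(Functions.class, "sum", long.class, long.class, long.class)}} holds, and rejected otherwise (for example, if the Java method took two {{int}} arguments).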
If one wants to update a function definition, we can have a specific syntax (follow-up ticket):
{noformat}
UPDATE FUNCTION sum (a bigint, b bigint) AS my.company.Functions.sum2;
{noformat}
Again, the advantages are that it's explicit and will neatly extend to CASSANDRA-7526. Having function definitions not be extraneous to CQL also means that we can enforce some security rules (that is, have per-user rights to define/update/remove functions). This also makes it clear when, say, notifications should be sent for newly added functions, etc...

Regarding bundles/namespaces, I agree it's good to have some and I'm not sure what would be the best way to make them fit with what's above. Maybe it's enough to say that if you define a function {{Math.sum}} it's part of the {{Math}} namespace without having to define namespaces explicitly. Or maybe we want a specific syntax to create them, I'm not sure. But here again, I'd rather have us think about CASSANDRA-7526. Defining bundles by using Java annotations will not work there, and so I'd rather define the notion in CQL directly.

bq. Additionally, it allows for some optimizations. For example, a collectionLength() function could simply deserialize the first four bytes.

While true, the optimizations that can be done without deserializing are not that numerous. I don't see very many outside of the length of collections, in fact. So I'd be fine just saying that we provide those out of the box (I'll note that for functions that want to work on the raw bytes of a value, the proper way to do it is to declare the function on blob, and to use the textAsBlob, intAsBlob, ... functions).

bq. @UDF(deterministic = false)

I'll note that imo we should ignore this entirely for UDFs: let's just execute every UDF at execution time (i.e. always use isPure() == false internally). The fact that we have a distinction internally is questionable in the first place anyway.
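The collectionLength() shortcut quoted above can be sketched in Java as follows. This assumes, as the quoted remark implies, that a serialized collection starts with a 4-byte big-endian element count. The {{CollectionLength}} class is hypothetical and is not an existing Cassandra class.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of reading a collection's size without
// deserializing its elements.
final class CollectionLength {
    // Reads the leading 4-byte element count (assumed layout) using an
    // absolute get, so the buffer's position is left unchanged.
    static int collectionLength(ByteBuffer serialized) {
        return serialized.getInt(serialized.position());
    }
}
```

Only the first four bytes are touched, which is the whole point of the optimization: the elements that follow are never decoded.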
> Support for pure user-defined functions (UDF)
> ---------------------------------------------
>
>                 Key: CASSANDRA-7395
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7395
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>              Labels: cql
>             Fix For: 3.0
>
>         Attachments: 7395-v2.diff, 7395.diff
>
>
> We have some tickets for various aspects of UDF (CASSANDRA-4914,
> CASSANDRA-5970, CASSANDRA-4998) but they all suffer from various degrees of
> ocean-boiling.
> Let's start with something simple: allowing pure user-defined functions in
> the SELECT clause of a CQL query. That's it.
> By "pure" I mean, must depend only on the input parameters. No side effects.
> No exposure to C* internals. Column values in, result out.
> http://en.wikipedia.org/wiki/Pure_function

--
This message was sent by Atlassian JIRA
(v6.2#6252)