We don't have aggregate functions with multiple arguments, so I can't provide any pointers. It's unclear what semantics you're trying to achieve with the multiple arguments. Can you give a concrete example? Based on your other example, you'd want to do a GROUP BY, like this:
select sum(col) from table group by country; On Mon, May 9, 2016 at 4:57 PM, Swapna Swapna <[email protected]> wrote: > Hi James/Team, > > > myaggFunc(col1,col2) > > I tried implementing this new aggregate function with multiple (2, to start > with) columns , expressions as arguments. > > And its giving me this error: > > index (1) must be less than size (1) > > > My function definition: > > @FunctionParseNode.BuiltInFunction(name = myaggFunc.NAME, nodeClass = > MyAggParseNode.class, args = { > @FunctionParseNode.Argument(), > @FunctionParseNode.Argument()}) > > is there any example that I can refer, which accepts multiple fields > as arguments to function. > > Any pointers would really help. > > Thank you, > > > > > > On Wed, May 4, 2016 at 12:05 AM, Swapna Swapna <[email protected]> > wrote: > > > Hi James, > > > > the new ones are in similar lines to existing aggregate functions: > > > > I misinterpreted this definition, thanks for clarifying : > > *A reference to a column is also an expression* > > > > Regards > > Swapna > > > > On Tue, May 3, 2016 at 11:39 PM, James Taylor <[email protected]> > > wrote: > > > >> Hi Swapna, > >> All our aggregate functions allow expressions as arguments and it > wouldn't > >> make sense to have these new ones be different. A reference to a column > is > >> also an expression. It doesn't change the HBase data model being sparse. > >> > >> I think the next step should be for you to submit a patch that the > >> community can take a look at, as it's too difficult to discuss this > >> without > >> that. > >> > >> Thanks, > >> James > >> > >> On Tuesday, May 3, 2016, Swapna Swapna <[email protected]> wrote: > >> > >> > Hi James, > >> > > >> > Thanks for your swift response. > >> > > >> > I wouldn't be able to use the expression in the below query rather I > >> would > >> > have to provide the columns (as arguments) which I'm interested in to > >> > perform the aggregation on respective provided columns. > >> > > >> > myaggFunc(col1,col2) > >> > > >> > the reason being, the hbase data is sparsed and I would not know the > >> column > >> > values. Data fetch is based on a row key. > >> > > >> > expression example: > >> > > >> > ID=1 OR NAME='Hi' > >> > > >> > Regards > >> > > >> > Swapna > >> > > >> > > >> > > >> > On Tue, May 3, 2016 at 7:17 PM, James Taylor <[email protected] > >> > <javascript:;>> wrote: > >> > > >> > > Hi Swapna, > >> > > The return type is typically derived from looking at the return > types > >> of > >> > > each of the input arguments and choosing what'll work without losing > >> > > precision. For example, take a look at this loop in > ExpressionCompiler > >> > that > >> > > determines this for expressions that are added together: > >> > > > >> > > new ArithmeticExpressionFactory() { > >> > > @Override > >> > > public Expression create(ArithmeticParseNode node, > >> > > List<Expression> children) throws SQLException { > >> > > boolean foundDate = false; > >> > > Determinism determinism = Determinism.ALWAYS; > >> > > PDataType theType = null; > >> > > for(int i = 0; i < children.size(); i++) { > >> > > > >> > > Your probably already doing this, but make sure you don't assume the > >> > > arguments are column references, but allow them to be any > expression. > >> > > > >> > > Also, it'd be great to see what you've got so far without handling > >> > multiple > >> > > arguments to your function (in the form of a pull request) so folks > >> can > >> > get > >> > > you feedback on your work so far. > >> > > > >> > > Thanks, and we appreciate the contributions! > >> > > > >> > > James > >> > > > >> > > On Tue, May 3, 2016 at 12:59 PM, Swapna Swapna < > >> [email protected] > >> > <javascript:;>> > >> > > wrote: > >> > > > >> > > > Sure, > >> > > > > >> > > > Hbase data that I have is: > >> > > > > >> > > > rowkey us uk > >> > > > 20161001 3 4 > >> > > > 20161002 1 2 > >> > > > > >> > > > > >> > > > select myaggFunc(us) from table : // this is returning output > as > >> : > >> > > > 4 > >> > > > select myaggFunc(uk) from table : // this is returning output > as > >> : > >> > > > 6 > >> > > > > >> > > > In similar to that, i'm visualizing the query like: select > >> > > > myaggFunc1(us,uk) > >> > > > from table; //with multiple columns > >> > > > > >> > > > to return output: (based on the aggregation logic, below results > >> are > >> > > for > >> > > > sum aggregation) > >> > > > us 4 > >> > > > uk 6 > >> > > > > >> > > > > >> > > > > >> > > > On Tue, May 3, 2016 at 11:33 AM, James Taylor < > >> [email protected] > >> > <javascript:;>> > >> > > > wrote: > >> > > > > >> > > > > Removing user list (please don't cross post) > >> > > > > > >> > > > > Can you give us a full example of the query you have in mind? > >> > > > > > >> > > > > Thanks, > >> > > > > James > >> > > > > > >> > > > > On Tue, May 3, 2016 at 11:14 AM, Swapna Swapna < > >> > [email protected] <javascript:;> > >> > > > > >> > > > > wrote: > >> > > > > > >> > > > > > Hi, > >> > > > > > > >> > > > > > I'm trying to implement aggregate function on multiple columns > >> (as > >> > an > >> > > > > > arguments) like: > >> > > > > > > >> > > > > > myaggFunc(col1,col2) > >> > > > > > > >> > > > > > And I would want to return the results by each column after > >> > applying > >> > > > > > aggregate operation. > >> > > > > > > >> > > > > > The output would be something like: > >> > > > > > > >> > > > > > col1, count ( aggregate of all records for col1) > >> > > > > > col2, count > >> > > > > > > >> > > > > > Inorder to return the results in the above format, what is the > >> > return > >> > > > > data > >> > > > > > type (of the method) should I have to choose? > >> > > > > > > >> > > > > > Thanks > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > > > > >
