Hi James, the new ones are along similar lines to the existing aggregate functions:
I misinterpreted this definition, thanks for clarifying:

*A reference to a column is also an expression*

Regards
Swapna

On Tue, May 3, 2016 at 11:39 PM, James Taylor <[email protected]> wrote:

> Hi Swapna,
> All our aggregate functions allow expressions as arguments and it wouldn't
> make sense to have these new ones be different. A reference to a column is
> also an expression. It doesn't change the HBase data model being sparse.
>
> I think the next step should be for you to submit a patch that the
> community can take a look at, as it's too difficult to discuss this without
> that.
>
> Thanks,
> James
>
> On Tuesday, May 3, 2016, Swapna Swapna <[email protected]> wrote:
>
> > Hi James,
> >
> > Thanks for your swift response.
> >
> > I wouldn't be able to use an expression in the query below; rather, I
> > would have to provide the columns (as arguments) that I'm interested in,
> > to perform the aggregation on the respective columns.
> >
> > myaggFunc(col1,col2)
> >
> > The reason being, the HBase data is sparse and I would not know the
> > column values. Data fetch is based on a row key.
> >
> > expression example:
> >
> > ID=1 OR NAME='Hi'
> >
> > Regards
> >
> > Swapna
> >
> > On Tue, May 3, 2016 at 7:17 PM, James Taylor <[email protected]> wrote:
> >
> > > Hi Swapna,
> > > The return type is typically derived from looking at the return types
> > > of each of the input arguments and choosing what'll work without losing
> > > precision. For example, take a look at this loop in ExpressionCompiler
> > > that determines this for expressions that are added together:
> > >
> > >     new ArithmeticExpressionFactory() {
> > >         @Override
> > >         public Expression create(ArithmeticParseNode node,
> > >                 List<Expression> children) throws SQLException {
> > >             boolean foundDate = false;
> > >             Determinism determinism = Determinism.ALWAYS;
> > >             PDataType theType = null;
> > >             for(int i = 0; i < children.size(); i++) {
> > >
> > > You're probably already doing this, but make sure you don't assume the
> > > arguments are column references, but allow them to be any expression.
> > >
> > > Also, it'd be great to see what you've got so far without handling
> > > multiple arguments to your function (in the form of a pull request) so
> > > folks can give you feedback on your work so far.
> > >
> > > Thanks, and we appreciate the contributions!
> > >
> > > James
> > >
> > > On Tue, May 3, 2016 at 12:59 PM, Swapna Swapna <[email protected]> wrote:
> > >
> > > > Sure,
> > > >
> > > > The HBase data that I have is:
> > > >
> > > > rowkey     us   uk
> > > > 20161001   3    4
> > > > 20161002   1    2
> > > >
> > > > select myaggFunc(us) from table; // this is returning output as: 4
> > > > select myaggFunc(uk) from table; // this is returning output as: 6
> > > >
> > > > Similar to that, I'm visualizing the query like:
> > > >
> > > > select myaggFunc1(us,uk) from table; // with multiple columns
> > > >
> > > > to return output (based on the aggregation logic; the results below
> > > > are for sum aggregation):
> > > >
> > > > us 4
> > > > uk 6
> > > >
> > > > On Tue, May 3, 2016 at 11:33 AM, James Taylor <[email protected]> wrote:
> > > >
> > > > > Removing user list (please don't cross post)
> > > > >
> > > > > Can you give us a full example of the query you have in mind?
> > > > >
> > > > > Thanks,
> > > > > James
> > > > >
> > > > > On Tue, May 3, 2016 at 11:14 AM, Swapna Swapna <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm trying to implement an aggregate function on multiple columns
> > > > > > (as arguments) like:
> > > > > >
> > > > > > myaggFunc(col1,col2)
> > > > > >
> > > > > > And I would want to return the results by each column after
> > > > > > applying the aggregate operation.
> > > > > >
> > > > > > The output would be something like:
> > > > > >
> > > > > > col1, count (aggregate of all records for col1)
> > > > > > col2, count
> > > > > >
> > > > > > In order to return the results in the above format, what return
> > > > > > data type (of the method) should I choose?
> > > > > >
> > > > > > Thanks
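
For readers of this thread: the idea of deriving a function's return type from the types of its arguments, choosing one that works without losing precision, can be sketched roughly as below. This is only an illustration under stated assumptions, not Phoenix code. `NumType` and `resultType` are hypothetical names, and the enum is a simplified stand-in for real SQL types (it is not Phoenix's `PDataType`). It assumes the types can be ordered from narrowest to widest so that widening never loses precision.

```java
import java.util.Arrays;
import java.util.List;

public class ReturnTypeSketch {
    // Hypothetical, simplified numeric types, declared from narrowest to widest.
    // Assumption: every value of a narrower type fits in any wider type.
    enum NumType { SMALLINT, INTEGER, LONG, DECIMAL }

    // Picks the widest type among the argument types, so that the result
    // can hold any argument's value without losing precision.
    static NumType resultType(List<NumType> argTypes) {
        if (argTypes.isEmpty()) {
            throw new IllegalArgumentException("at least one argument type is required");
        }
        NumType widest = argTypes.get(0);
        for (NumType t : argTypes) {
            // Enum compareTo follows declaration order, i.e. narrow to wide.
            if (t.compareTo(widest) > 0) {
                widest = t;
            }
        }
        return widest;
    }

    public static void main(String[] args) {
        // Mixing INTEGER and LONG arguments yields LONG.
        System.out.println(resultType(Arrays.asList(NumType.INTEGER, NumType.LONG)));
    }
}
```

The sketch only looks at the type of each argument, not at whether the argument is a column reference or a more general expression, which matches the point made above that arguments may be any expression.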
