RE: Query regarding Scalar Functions implementation

Harri Kinnunen Wed, 02 Oct 2013 01:43:57 -0700

Hi,

I'm not sure if we're talking about implementation specifics or what would be 
visible to the end user.

But if this is about end user experience, I'd say the functionality should 
reflect the one in Oracle: 
SELECT 5/2 FROM DUAL;
Returns
2.5

Even:
select CAST(10 AS INTEGER)/CAST(3 AS INTEGER) FROM DUAL
returns:
3.33333333333333

Doing:
select DUMP(CAST(10 AS INTEGER)/CAST(2 AS INTEGER)) FROM DUAL
Reports the datatype as "NUMBER(precision, scale)".

So ... I guess we don't have the luxury of "NUMBER(p,s)". But somehow we should 
not force users to do explicit casts to achieve the "obvious results" (as 
defined above :) ).
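
To make the difference concrete, here is a minimal Java sketch of the two 
behaviours. The class and method names are hypothetical and are not part of 
Drill; it only assumes that an integer-typed output truncates the way Java's 
int division does, while a Float8-style output widens first.

```java
// Hypothetical sketch: these names are illustrative, not Drill's function API.
final class DivisionSketch {
    // Integer-typed output: Java int division truncates toward zero.
    static int divideTruncating(int left, int right) {
        return left / right;
    }

    // Float8-style output: widen one operand first so the fraction survives.
    static double divideWidening(int left, int right) {
        return (double) left / right;
    }

    public static void main(String[] args) {
        System.out.println(divideTruncating(5, 2)); // prints 2
        System.out.println(divideWidening(5, 2));   // prints 2.5
    }
}
```

The second variant is the one that matches the Oracle results quoted above 
without the user writing an explicit cast.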

Cheers,
Harri

-----Original Message-----
From: Jason Altekruse [mailto:[email protected]] 
Sent: 2 October 2013 1:57
To: drill-dev
Subject: Re: Query regarding Scalar Functions implementation

Hello All,

I would assume we would want to follow the conventions of most programming 
languages. If users are interested in a decimal result, they would have to 
explicitly cast one of the arguments to a float or float8.

With regard to mismatched types, there are two ways I can think of doing it.
We could define a bunch of overloaded methods for each combination, but it 
seems like we have to define each twice for different arrangements of the 
types, such as with mult(float, float8) and mult(float8, float).

I think the way we will want to do it is add additional logic to the code 
generation portion of the query, rather than define a bunch of different 
functions.

For example, as new batches arrive at an operator, if they have a new schema we 
generate code to process the particular types of value vectors involved in the 
operation. I think at this step we should be able to add a cast to one of the 
parameters to direct to a function that defines an operation between two 
operands of the same type.

Example:
incoming types int, float
- cast first parameter to a float

Deciding which one to cast seems to be pretty standard, as seen here in the SQL 
Server documentation. They just define a strict hierarchy of types.

http://technet.microsoft.com/en-us/library/ms190309.aspx
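
A minimal Java sketch of that idea follows. Everything in it is an assumption 
made for illustration: the enum, its ordering (INT below FLOAT4 below FLOAT8) 
and the CastPlanner class are hypothetical and do not come from Drill or from 
the SQL Server page. It only shows how a strict ordering decides which operand 
gets the cast before a same-type function is called.

```java
// Hypothetical sketch of precedence-based implicit casting; not Drill code.
enum NumericType {
    // Declared from lowest to highest precedence (an assumed ordering).
    INT, FLOAT4, FLOAT8;

    // The operand with the lower precedence is promoted to the higher one.
    static NumericType commonType(NumericType left, NumericType right) {
        return left.compareTo(right) >= 0 ? left : right;
    }
}

final class CastPlanner {
    // Describes which operand, if any, needs a cast so that both sides
    // reach a function defined for two operands of the same type.
    static String plan(NumericType left, NumericType right) {
        NumericType target = NumericType.commonType(left, right);
        if (left != target) {
            return "cast left to " + target;
        }
        if (right != target) {
            return "cast right to " + target;
        }
        return "no cast";
    }

    public static void main(String[] args) {
        // Mirrors the example above: incoming types int, float.
        System.out.println(plan(NumericType.INT, NumericType.FLOAT4));
    }
}
```

With a rule like this, only one function per type needs to exist for each 
operation, and the generated code inserts the cast for mixed inputs.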

The only problem I could see with this approach is that the Drill Funcs take 
the value holders as parameters, so we will have to define casting rules 
between the various types. Not sure what this will do for code inlining. A 
major goal of the templates and code generation was allowing UDFs while keeping 
the whole system fast.

It would also be possible to define additional methods on the various value 
vectors to allow extraction of values directly into different types, such as a 
double extraction method on the float vectors. This might aid inlining, as we 
handle a bit more of the logic while dealing with primitives (rather than 
pulling out a value, sticking it in a holder object and then casting the holder 
to a different object type).
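
As a rough Java sketch of the extraction idea, assuming a toy vector backed by 
a plain float array: the FloatVectorSketch class and its getAsDouble method are 
hypothetical stand-ins, not Drill's value vector API.

```java
// Hypothetical stand-in for a float value vector; not Drill's vector API.
final class FloatVectorSketch {
    private final float[] values;

    FloatVectorSketch(float... values) {
        this.values = values.clone();
    }

    // Native accessor: returns the stored float unchanged.
    float get(int index) {
        return values[index];
    }

    // Widening accessor: returns a primitive double directly,
    // with no holder object created in between.
    double getAsDouble(int index) {
        return values[index];
    }

    public static void main(String[] args) {
        FloatVectorSketch floats = new FloatVectorSketch(1.5f, 2.0f);
        double[] doubles = {2.0, 4.0};
        // Multiply a float column by a double column using only primitives.
        for (int i = 0; i < doubles.length; i++) {
            System.out.println(floats.getAsDouble(i) * doubles[i]);
        }
    }
}
```

Whether the JIT would inline such an accessor in Drill's generated code is 
something that would need to be measured; the sketch only shows the shape of 
the call.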

-Jason

On Tue, Oct 1, 2013 at 1:06 AM, Yash Sharma <[email protected]> wrote:

> Hi Team,
> I had two questions regarding the implementation of Scalar Functions.
>
> 1. What would be the Output type of Division func (given: Input types 
> are all Integers)
>
> Currently I have provided an implementation of the DIVISION func which 
> has input/output params as:
>         @Param  IntHolder left;
>         @Param  IntHolder right;
>         @Output IntHolder out;
>
> Now, the issue is the data type of the output field:
> the output type will be integer if left divides evenly by right, while
> the output type would be decimal if the division leaves a remainder.
>
> So my question is,
> Do I have to provide 3 overloaded methods for division with different 
> @Output types (IntHolder, Float4Holder, Float8Holder)?
> Or shall I have a Float8 output type irrespective of the inputs?
>
> Other functions like add, multiply & subtract won't have this issue.
> It's only an issue with division.
>
>
> 2. What would be the input type for any Scalar func (given: Input 
> types might not always be Integers).
>
> Inputs could also be of different data types, such as Float4Holder & 
> Float8Holder, so would we have to provide overloaded methods for 
> different combinations of input types?
> This would be the case with all scalar functions (+, -, *, /).
>
> Any Suggestions?
> Thanks,
> Yash Sharma
