[ 
https://issues.apache.org/jira/browse/LENS-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177377#comment-15177377
 ] 

Amareshwari Sriramadasu commented on LENS-788:
----------------------------------------------

Looking at this again, I feel the filter should not be pushed. 

For example: cube select usersport.name, revenue where time_range_in(dt,x,y) and 
usersport.name in ('Cricket').

If the filter is pushed, it would actually remove entries corresponding to 
usersport.name in ('Cricket'), but put out other values. I feel that is not 
what the user intends. The query should do an array-contains check instead of a 
simple "in" check. So, if dimension flattening is enabled, the "in" and "=" 
filters should be converted to array_contains.
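
To make the difference concrete, here is a minimal Python sketch of one reading of the two behaviours. It does not use Lens or Hive; the sample rows and the helper names (USER_SPORTS, FACTS, flatten, aggregate) are made up for illustration. flatten stands in for collect_set(name) grouped by user_id, and the keep predicate stands in for array_contains on the flattened column.

```python
from collections import defaultdict

# Hypothetical bridge rows (user_id, sport name); not taken from the Lens test data.
USER_SPORTS = [(1, "CRICKET"), (1, "FOOTBALL"), (2, "FOOTBALL"), (3, "CRICKET")]
# Hypothetical fact rows (user_id, msr2).
FACTS = [(1, 10), (2, 20), (3, 30)]


def flatten(rows):
    """Mimic collect_set(name) ... group by user_id on the bridge table."""
    names_by_user = defaultdict(set)
    for user_id, name in rows:
        names_by_user[user_id].add(name)
    return {user: frozenset(names) for user, names in names_by_user.items()}


def aggregate(flattened, keep=lambda names: True):
    """Inner-join facts to the flattened bridge, filter on the array, sum msr2 per array."""
    totals = defaultdict(int)
    for user_id, msr2 in FACTS:
        names = flattened.get(user_id)
        if names is not None and keep(names):
            totals[names] += msr2
    return dict(totals)


# Filter pushed below the flattening: each user's array keeps only 'CRICKET'.
pushed = aggregate(flatten([row for row in USER_SPORTS if row[1] == "CRICKET"]))

# Filter applied after flattening, as an array-contains check on the full array.
contains = aggregate(flatten(USER_SPORTS), keep=lambda names: "CRICKET" in names)

print(pushed)
print(contains)
```

With this sample data, the pushed filter reports user 1 under an array holding only CRICKET, while the array-contains variant keeps user 1's full set of sports in the output.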

Query : select usersports.name, sum(msr2) from basecube where " + 
TWO_DAYS_RANGE + " and usersports.name = 'CRICKET'
The rewritten query would look like the following:
SELECT ( usersports . balias0 ), sum(( basecube . msr2 )) FROM 
TestQueryRewrite.c1_testfact1_base basecube join TestQueryRewrite.c1_usertable 
userdim on basecube.userid = userdim.id join  (select user_interests.user_id as 
user_id,collect_set(( usersports . name )) as balias0 from 
TestQueryRewrite.c1_user_interests_tbl user_interests join 
TestQueryRewrite.c1_sports_tbl usersports on user_interests.sport_id = 
usersports.id group by user_interests.user_id) usersports on userdim.id = 
usersports.user_id WHERE ((((( basecube . dt ) = '2016-02-29-19' ) or (( 
basecube . dt ) = '2016-02-29-20' ) or (( basecube . dt ) = '2016-02-29-21' ) 
or (( basecube . dt ) = '2016-02-29-22' ) or (( basecube . dt ) = 
'2016-02-29-23' ) or (( basecube . dt ) = '2016-03-01' ) or (( basecube . dt ) 
= '2016-03-02-00' ) or (( basecube . dt ) = '2016-03-02-01' ) or (( basecube . 
dt ) = '2016-03-02-02' ) or (( basecube . dt ) = '2016-03-02-03' ) or (( 
basecube . dt ) = '2016-03-02-04' ) or (( basecube . dt ) = '2016-03-02-05' ) 
or (( basecube . dt ) = '2016-03-02-06' ) or (( basecube . dt ) = 
'2016-03-02-07' ) or (( basecube . dt ) = '2016-03-02-08' ) or (( basecube . dt 
) = '2016-03-02-09' ) or (( basecube . dt ) = '2016-03-02-10' ) or (( basecube 
. dt ) = '2016-03-02-11' ) or (( basecube . dt ) = '2016-03-02-12' ) or (( 
basecube . dt ) = '2016-03-02-13' ) or (( basecube . dt ) = '2016-03-02-14' ) 
or (( basecube . dt ) = '2016-03-02-15' ) or (( basecube . dt ) = 
'2016-03-02-16' ) or (( basecube . dt ) = '2016-03-02-17' ) or (( basecube . dt 
) = '2016-03-02-18' ))))  and (array_contains( usersports . balias0, 'CRICKET' )) 
GROUP BY ( usersports . balias0 )

Makes sense? 


> Option to do flattening of columns on bridge tables later
> ---------------------------------------------------------
>
>                 Key: LENS-788
>                 URL: https://issues.apache.org/jira/browse/LENS-788
>             Project: Apache Lens
>          Issue Type: Improvement
>          Components: cube
>    Affects Versions: 2.4
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 2.6
>
>
> With support to flatten (aggregate) columns from bridge table joins added in 
> LENS-752, this is an enhancement to apply the aggregate after applying filters 
> or expressions over the columns.
> For example:
> With the schema example given in LENS-752, if the user queries revenue with a filter:
> cube select usersport.name, revenue where time_range_in(dt,x,y) and 
> usersport.name in ('Cricket').
> The query should have an option to apply filter before aggregate or not.
> Similarly :
> cube select (case when usersport.name='cricket' then 'CKT' when 
> usersport.name='football' then 'FB' else 'NA' end), revenue where 
> time_range_in(dt,x,y).
> The query should have an option to apply the expression before aggregate or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
