[
https://issues.apache.org/jira/browse/LENS-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177377#comment-15177377
]
Amareshwari Sriramadasu commented on LENS-788:
----------------------------------------------
Looking at this again, I feel the filter should not be pushed.
For example: cube select usersport.name, revenue where time_range_in(dt,x,y) and
usersport.name in ('Cricket').
If the filter is pushed, it would actually remove entries corresponding to
usersport.name in ('Cricket'), but put out other values. I feel that is not the
user's intention. The query should do an array-contains check instead of a
simple "in" check. So, if dimension flattening is enabled, the filters "in" and
"=" should get converted to array_contains.
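
To make the difference concrete, here is a small Python sketch of one reading
of the two strategies. It is not Lens code: the sample rows and the helper
names (flatten, revenue_by_sports, pushed_filter, array_contains_filter) are
all made up for illustration, and the set-based flatten() only emulates what
collect_set would do on the bridge table.

```python
from collections import defaultdict

# Hypothetical bridge-table rows: (user_id, sport_name).
USER_SPORTS = [
    (1, "Cricket"),
    (1, "Football"),
    (2, "Football"),
    (3, "Cricket"),
]

# Hypothetical fact rows: (user_id, revenue).
FACTS = [(1, 10.0), (2, 5.0), (3, 7.0)]


def flatten(rows):
    """Emulate collect_set(name) grouped by user_id."""
    sports = defaultdict(set)
    for user_id, name in rows:
        sports[user_id].add(name)
    return sports


def revenue_by_sports(flattened, keep):
    """Inner-join facts to the flattened bridge, keep the rows whose
    sport set passes `keep`, and sum revenue per flattened value."""
    totals = defaultdict(float)
    for user_id, revenue in FACTS:
        names = flattened.get(user_id)
        if names is not None and keep(names):
            totals[tuple(sorted(names))] += revenue
    return dict(totals)


def pushed_filter(value):
    """Filter applied to bridge rows BEFORE flattening."""
    flattened = flatten([row for row in USER_SPORTS if row[1] == value])
    return revenue_by_sports(flattened, lambda names: True)


def array_contains_filter(value):
    """Flatten first, then keep rows whose array contains the value."""
    flattened = flatten(USER_SPORTS)
    return revenue_by_sports(flattened, lambda names: value in names)


if __name__ == "__main__":
    print("pushed filter: ", pushed_filter("Cricket"))
    print("array_contains:", array_contains_filter("Cricket"))
```

In this toy data, the pushed filter collapses every matching user to the single
value ('Cricket',), while the array_contains variant keeps user 1's full set
('Cricket', 'Football') as its own group.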
Query : select usersports.name, sum(msr2) from basecube where " +
TWO_DAYS_RANGE + " and usersports.name = 'CRICKET'
The rewritten query would look like the following:
SELECT ( usersports . balias0 ), sum(( basecube . msr2 )) FROM
TestQueryRewrite.c1_testfact1_base basecube join TestQueryRewrite.c1_usertable
userdim on basecube.userid = userdim.id join (select user_interests.user_id as
user_id,collect_set(( usersports . name )) as balias0 from
TestQueryRewrite.c1_user_interests_tbl user_interests join
TestQueryRewrite.c1_sports_tbl usersports on user_interests.sport_id =
usersports.id group by user_interests.user_id) usersports on userdim.id =
usersports.user_id WHERE ((((( basecube . dt ) = '2016-02-29-19' ) or ((
basecube . dt ) = '2016-02-29-20' ) or (( basecube . dt ) = '2016-02-29-21' )
or (( basecube . dt ) = '2016-02-29-22' ) or (( basecube . dt ) =
'2016-02-29-23' ) or (( basecube . dt ) = '2016-03-01' ) or (( basecube . dt )
= '2016-03-02-00' ) or (( basecube . dt ) = '2016-03-02-01' ) or (( basecube .
dt ) = '2016-03-02-02' ) or (( basecube . dt ) = '2016-03-02-03' ) or ((
basecube . dt ) = '2016-03-02-04' ) or (( basecube . dt ) = '2016-03-02-05' )
or (( basecube . dt ) = '2016-03-02-06' ) or (( basecube . dt ) =
'2016-03-02-07' ) or (( basecube . dt ) = '2016-03-02-08' ) or (( basecube . dt
) = '2016-03-02-09' ) or (( basecube . dt ) = '2016-03-02-10' ) or (( basecube
. dt ) = '2016-03-02-11' ) or (( basecube . dt ) = '2016-03-02-12' ) or ((
basecube . dt ) = '2016-03-02-13' ) or (( basecube . dt ) = '2016-03-02-14' )
or (( basecube . dt ) = '2016-03-02-15' ) or (( basecube . dt ) =
'2016-03-02-16' ) or (( basecube . dt ) = '2016-03-02-17' ) or (( basecube . dt
) = '2016-03-02-18' )))) and (array_contains( usersports . name, 'CRICKET' ))
GROUP BY ( usersports . balias0 )
Makes sense?
> Option to do flattening of columns on bridge tables later
> ---------------------------------------------------------
>
> Key: LENS-788
> URL: https://issues.apache.org/jira/browse/LENS-788
> Project: Apache Lens
> Issue Type: Improvement
> Components: cube
> Affects Versions: 2.4
> Reporter: Amareshwari Sriramadasu
> Assignee: Amareshwari Sriramadasu
> Fix For: 2.6
>
>
> With support to flatten (aggregate) columns from bridge table joins added in
> LENS-752, this is an enhancement to apply the aggregate after applying filters
> or expressions over the columns.
> For example:
> With the schema example given in LENS-752, if a user queries revenue with a filter:
> cube select usersport.name, revenue where time_range_in(dt,x,y) and
> usersport.name in ('Cricket').
> The query should have an option to apply the filter before the aggregate or not.
> Similarly :
> cube select (case when usersport.name='cricket' then 'CKT' when
> usersport.name='football' then 'FB' else 'NA' end), revenue where
> time_range_in(dt,x,y).
> The query should have an option to apply the expression before the aggregate or not.