[ 
https://issues.apache.org/jira/browse/LENS-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578715#comment-14578715
 ] 

Rajat Khandelwal commented on LENS-582:
---------------------------------------

The first queried range, 19 to 21, spans only two days, so it can't be 
answered by weekly facts, Sunday or not. 

As far as the "Sunday" issue is concerned, Lens just requires you to roll up 
your data on this boundary, and that is all. There are no other restrictions. 

I hope your data model is not creating one fact for each update period. I 
understand that's a valid case, but a more practical case is that the weekly 
data is just a rollup of the daily data. Then the fact would have both daily 
and weekly update periods, and any query on day boundaries would be answerable. 

The error occurs because Lens first checks whether a fact can cover the given 
range. A fact can cover the range only if its update periods can cover the 
range. A fact which has only a weekly update period can't be expected to cover 
arbitrary day-boundary ranges. A fact which has both daily and weekly update 
periods *can*. 
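
To make the rule above concrete, here is a minimal Python sketch of such a 
coverage check. It is only an illustration, not Lens's actual code: the 
function names are hypothetical, the range is assumed to be half-open 
[start, end), and weeks are assumed to start on Sunday as discussed here. 

```python
from datetime import date

def is_week_start(d: date) -> bool:
    # date.weekday() returns Monday=0 ... Sunday=6; weeks are assumed to start on Sunday.
    return d.weekday() == 6

def can_cover(start: date, end: date, update_periods: set) -> bool:
    """Simplified model: can [start, end) be tiled exactly by the fact's update periods?"""
    if start >= end:
        return False
    if "DAILY" in update_periods:
        # Daily partitions can tile any day-boundary range.
        return True
    if "WEEKLY" in update_periods:
        # Weekly-only: both ends must sit on a week boundary.
        return is_week_start(start) and is_week_start(end)
    return False

# The two ranges from the issue description.
print(can_cover(date(2015, 1, 19), date(2015, 1, 21), {"WEEKLY"}))
print(can_cover(date(2015, 1, 19), date(2015, 1, 21), {"DAILY", "WEEKLY"}))
print(can_cover(date(2015, 2, 8), date(2015, 3, 1), {"WEEKLY"}))
```

In this model the 19-to-21 range fails for a weekly-only fact and passes once 
daily is added, while the Sunday-to-Sunday range passes with weekly alone. 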

So if you create the fact with two update periods, it can cover any 
day-boundary range. Then, partitions will be picked accordingly. Weekly 
partitions will be picked as much as possible, and the rest will be covered by 
daily partitions. 
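
The "weekly as much as possible, daily for the rest" idea can be sketched 
roughly as below. This is a simplified greedy model under the same assumptions 
(half-open range, Sunday-based weeks); `pick_partitions` is a hypothetical 
name, and the real resolution in Lens may differ in its details. 

```python
from datetime import date, timedelta

def pick_partitions(start: date, end: date) -> list:
    """Tile [start, end) with weekly partitions where a full
    Sunday-to-Sunday week fits, and daily partitions elsewhere."""
    picked = []
    cur = start
    while cur < end:
        week_end = cur + timedelta(days=7)
        # weekday() == 6 means Sunday; take a weekly partition only if the whole week fits.
        if cur.weekday() == 6 and week_end <= end:
            picked.append(("WEEKLY", cur))
            cur = week_end
        else:
            picked.append(("DAILY", cur))
            cur += timedelta(days=1)
    return picked

# Friday 2015-02-06 up to (not including) Tuesday 2015-02-17.
for period, first_day in pick_partitions(date(2015, 2, 6), date(2015, 2, 17)):
    print(period, first_day.isoformat())
```

For that range the sketch picks two daily partitions before the first Sunday, 
one weekly partition starting 2015-02-08, and two daily partitions after it. 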

With this knowledge, you can even game the system by registering daily as an 
update period without adding any actual daily partitions. You'd have to set 
fail on partial = false while querying, and Lens will form your weekly 
sub-range for you. Anything before the first Sunday and after the last Sunday 
will be ignored. 
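
As a rough picture of which part of a range survives in that setup, the sketch 
below trims a range to its Sunday-aligned sub-range. The function name is 
hypothetical and the half-open [start, end) convention is an assumption; it 
only illustrates the date arithmetic, not what Lens does internally. 

```python
from datetime import date, timedelta

def weekly_subrange(start: date, end: date):
    """Return the largest Sunday-aligned (first, last) inside [start, end),
    or None if no full week fits."""
    # Move forward to the next Sunday (0 days if start is already a Sunday).
    first = start + timedelta(days=(6 - start.weekday()) % 7)
    # Move backward to the previous Sunday (0 days if end is already a Sunday).
    last = end - timedelta(days=(end.weekday() + 1) % 7)
    return (first, last) if first < last else None

print(weekly_subrange(date(2015, 2, 6), date(2015, 2, 25)))
print(weekly_subrange(date(2015, 1, 19), date(2015, 1, 21)))
```

The first call keeps 2015-02-08 to 2015-02-22 and drops the days on either 
side; the second returns None because the 19-to-21 range contains no full 
week. 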



> in lens query fact table update period weekly throws error if start and end 
> date is not sunday
> ----------------------------------------------------------------------------------------------
>
>                 Key: LENS-582
>                 URL: https://issues.apache.org/jira/browse/LENS-582
>             Project: Apache Lens
>          Issue Type: Bug
>          Components: build
>            Reporter: Biru Kumar
>            Assignee: Rajat Khandelwal
>
> Lens query
> {noformat}
> lens-shell>query execute cube select  avg(servedImpressions) from 
> user_activity where time_range_in(dt, '2015-01-19', '2015-01-21')
> Launching query failed cause: Driver :org.apache.lens.driver.hive.HiveDriver 
> Cause :No candidate fact table available to answer the query, because 
> {"brief":"No fact update periods for given 
> range","details":{"user_attributestore_er_fact_supply_site_burn,user_attributestore_er_fact_demandcategory_click,user_attributestore_er_fact_supplycategory_visits,user_attributestore_er_fact_supply_site_impressions_rendered,user_attributestore_er_fact_adgroup_click,user_attributestore_er_fact_adgroup_impression_time_install,user_attributestore_er_fact_app_impression_time_install,user_attributestore_er_fact_supply_site_impressions_served,user_attributestore_er_fact_adgroup_burn,hive_fact_user_curation_good_traffic,user_attributestore_er_fact_app_visits,user_attributestore_er_fact_app_click,user_attributestore_er_fact_supply_site_click,user_attributestore_er_fact_adgroup_impressions_rendered":[{"cause":"COLUMN_NOT_FOUND","missingColumns":["servedimpressions"]}],"user_attributestore_er_fact_adgroup_view":[{"cause":"NO_FACT_UPDATE_PERIODS_FOR_GIVEN_RANGE"}]}}
> {noformat}
> Fact table user_attributestore_er_fact_adgroup_view has the column 
> servedImpressions, and its update period is weekly.
> In the above query I have selected start and end dates that do not fall on a 
> Sunday.
> Below is the definition of fact table user_attributestore_er_fact_adgroup_view:
> {noformat}
> lens-shell>describe fact user_attributestore_er_fact_adgroup_view
> columns :
> column :
>    name : userid  type : string
>    name : timestamp  type : timestamp
>    name : adgroupguid  type : string
>    name : servedimpressions  type : bigint
> properties :
> property :
>    name : cube.table.user_attributestore_er_fact_adgroup_view.weight  value : 
> 0.1
>    name : cube.table.type  value : FACT
>    name : 
> cube.fact.user_attributestore_er_fact_adgroup_view.uh1_hdfs.updateperiods  
> value : WEEKLY
>    name : cube.fact.is.aggregated  value : false
>    name : cube.fact.user_attributestore_er_fact_adgroup_view.cubename  value 
> : user_activity
>    name : transient_lastDdlTime  value : 1431973737
>    name : cube.fact.user_attributestore_er_fact_adgroup_view.storages  value 
> : uh1_hdfs
> storageTables :
> storageTable :
>    updatePeriods :   updatePeriod :  WEEKLY
>     storageName : uh1_hdfs  tableDesc :   partCols :   column :
>    name : dt  type : string  comment : Date partition field
>     tableParameters :   property :
>    name : cube.storagetable.partition.timeline.cache.WEEKLY.dt.latest  value 
> : 2015-W08     name : conf.attributestore.schema  value : 
> \granularity\:\weekly_dailycumulative\\erinfo\:\source\:null\erid\:\8589934594\\entityType\:\ADGROUP\\entityAlias\:\adGroupGuid\\relationshipType\:\VIEW\\fields\:\alias\:\servedImpressions\\name\:\count\\type\:\LONG\
> \stores\:\viewvisit\
>      name : cube.storagetable.partition.timeline.cache.present  value : true  
>    name : EXTERNAL  value : TRUE     name : cube.storagetable.time.partcols  
> value : dt     name : 
> cube.storagetable.partition.timeline.cache.WEEKLY.dt.first  value : 2015-W07  
>    name : transient_lastDdlTime  value : 1432301397     name : 
> cube.storagetable.partition.timeline.cache.WEEKLY.dt.holes.size  value : 0    
>  name : conf.debugmode  value : true     name : 
> cube.storagetable.partition.timeline.cache.WEEKLY.dt.storage.class  value : 
> org.apache.lens.cube.metadata.timeline.EndsAndHolesPartitionTimeline
>     serdeParameters :   property :
>    name : serialization.format  value : 1
>     timePartCols :  dt
>   external : true  tableLocation : 
> hdfs://hostname:8020/user/hive/warehouse/user.db/uh1_hdfs_user_attributestore_er_fact_adgroup_view
>   inputFormat : 
> com.inmobi.user.analytics.storage.inputformat.UserAttributeThriftInputFormat  
> outputFormat : org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat  
> serdeClassName : 
> com.inmobi.user.analytics.storage.serde.UserAttributeThriftSerde  
> storageHandlerName :   numBuckets : 0  compressed : false
> name : user_attributestore_er_fact_adgroup_view
> cubeName : user_activity
> weight : 0.1
> {noformat} 
> However, the query below runs successfully because the start and end dates 
> are Sundays.
> {noformat}
> lens-shell>query execute cube select avg(servedImpressions) from 
> user_activity where time_range_in(dt, '2015-02-08', '2015-03-01')
> _c0
> 2.7083333333333335
> 1 rows process in (20) seconds.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
