[
https://issues.apache.org/jira/browse/LENS-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578715#comment-14578715
]
Rajat Khandelwal commented on LENS-582:
---------------------------------------
The first queried range (2015-01-19 to 2015-01-21) spans less than a week, so
weekly facts can't answer it, Sunday or not.
As far as the "Sunday" issue is concerned, Lens only requires that your data be
rolled up on this boundary. There are no other restrictions.
I hope your data model is not creating one fact for each update period. I
understand that's a valid case, but a more practical case is that the weekly
data is just a rollup of the daily data. The fact would then have both daily
and weekly update periods, and any query on day boundaries would be answerable.
The error occurs because Lens first checks whether a fact can cover the given
range. A fact can cover the range only if its update periods can cover the
range. A fact that has only a weekly update period can't be expected to cover
arbitrary day-boundary ranges. A fact that has both daily and weekly update
periods *can*.
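As a rough illustration of that check, here is a sketch in Python. It is not
Lens's actual code: can_cover and is_boundary are made-up names, only DAILY and
WEEKLY are modelled, and weeks are assumed to start on Sunday.

```python
from datetime import date

def is_boundary(period: str, d: date) -> bool:
    """Whether d is a valid partition boundary for the given update period."""
    if period == "DAILY":
        return True
    if period == "WEEKLY":
        return d.weekday() == 6  # Sunday (Monday is 0), an assumption
    raise ValueError(f"unknown update period: {period}")

def can_cover(update_periods, start: date, end: date) -> bool:
    """True if both endpoints fall on a boundary of some update period."""
    return all(
        any(is_boundary(p, d) for p in update_periods)
        for d in (start, end)
    )

# Weekly-only fact, range from the failing query: not coverable.
print(can_cover(["WEEKLY"], date(2015, 1, 19), date(2015, 1, 21)))
# Same range with daily added: coverable.
print(can_cover(["DAILY", "WEEKLY"], date(2015, 1, 19), date(2015, 1, 21)))
```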
So if you create the fact with two update periods, it can cover any
day-boundary range. Partitions will then be picked accordingly: weekly
partitions are used as much as possible, and the rest is covered by daily
partitions.
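To make the picking order concrete, here is a small Python sketch of a greedy
cover. It is only an illustration under assumptions: pick_partitions is a
hypothetical name, the range is treated as [start, end), and a week is taken
to run from Sunday to the following Sunday. The real resolver in Lens may
differ in detail.

```python
from datetime import date, timedelta

def pick_partitions(start: date, end: date):
    """Cover [start, end) with whole Sunday-to-Sunday weeks where possible,
    and single days for whatever is left over."""
    picked = []
    cur = start
    while cur < end:
        week_end = cur + timedelta(days=7)
        if cur.weekday() == 6 and week_end <= end:  # Sunday, full week fits
            picked.append(("WEEKLY", cur))
            cur = week_end
        else:
            picked.append(("DAILY", cur))
            cur += timedelta(days=1)
    return picked

for period, day in pick_partitions(date(2015, 2, 6), date(2015, 2, 17)):
    print(period, day.isoformat())
```

For the range 2015-02-06 to 2015-02-17 this picks two daily partitions, one
weekly partition starting 2015-02-08, and then two more daily partitions.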
With this knowledge, you can even game the system by registering daily as an
update period without adding any actual daily partitions. You'd have to set
fail on partial = false while querying, and Lens will form the weekly
sub-range for you. Anything before the first Sunday and after the last Sunday
will be ignored.
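The effective range in that setup can be worked out by hand. The Python sketch
below shows the arithmetic only; weekly_subrange is a hypothetical helper, not
a Lens API, and it assumes a [start, end) range with Sunday week boundaries.

```python
from datetime import date, timedelta

def weekly_subrange(start: date, end: date):
    """Largest Sunday-aligned [first, last) inside [start, end), or None."""
    # First Sunday on or after start (weekday(): Monday is 0, Sunday is 6).
    first = start + timedelta(days=(6 - start.weekday()) % 7)
    # Last Sunday on or before end.
    last = end - timedelta(days=(end.weekday() + 1) % 7)
    return (first, last) if first < last else None

print(weekly_subrange(date(2015, 2, 6), date(2015, 2, 17)))
print(weekly_subrange(date(2015, 1, 19), date(2015, 1, 21)))
```

For 2015-01-19 to 2015-01-21 there is no whole week inside the range, so even
with partial results allowed there would be no weekly data to return.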
> in lens query fact table update period weekly throws error if start and end
> date is not sunday
> ----------------------------------------------------------------------------------------------
>
> Key: LENS-582
> URL: https://issues.apache.org/jira/browse/LENS-582
> Project: Apache Lens
> Issue Type: Bug
> Components: build
> Reporter: Biru Kumar
> Assignee: Rajat Khandelwal
>
> Lens query
> {noformat}
> lens-shell>query execute cube select avg(servedImpressions) from
> user_activity where time_range_in(dt, '2015-01-19', '2015-01-21')
> Launching query failed cause: Driver :org.apache.lens.driver.hive.HiveDriver
> Cause :No candidate fact table available to answer the query, because
> {"brief":"No fact update periods for given
> range","details":{"user_attributestore_er_fact_supply_site_burn,user_attributestore_er_fact_demandcategory_click,user_attributestore_er_fact_supplycategory_visits,user_attributestore_er_fact_supply_site_impressions_rendered,user_attributestore_er_fact_adgroup_click,user_attributestore_er_fact_adgroup_impression_time_install,user_attributestore_er_fact_app_impression_time_install,user_attributestore_er_fact_supply_site_impressions_served,user_attributestore_er_fact_adgroup_burn,hive_fact_user_curation_good_traffic,user_attributestore_er_fact_app_visits,user_attributestore_er_fact_app_click,user_attributestore_er_fact_supply_site_click,user_attributestore_er_fact_adgroup_impressions_rendered":[{"cause":"COLUMN_NOT_FOUND","missingColumns":["servedimpressions"]}],"user_attributestore_er_fact_adgroup_view":[{"cause":"NO_FACT_UPDATE_PERIODS_FOR_GIVEN_RANGE"}]}}
> {noformat}
> The fact table user_attributestore_er_fact_adgroup_view has the column
> servedImpressions, and its update period is weekly.
> In the above query, I selected start and end dates that do not fall on a
> Sunday.
> Below is the definition of fact table user_attributestore_er_fact_adgroup_view:
> {noformat}
> lens-shell>describe fact user_attributestore_er_fact_adgroup_view
> columns :
> column :
> name : userid type : string
> name : timestamp type : timestamp
> name : adgroupguid type : string
> name : servedimpressions type : bigint
> properties :
> property :
> name : cube.table.user_attributestore_er_fact_adgroup_view.weight value :
> 0.1
> name : cube.table.type value : FACT
> name :
> cube.fact.user_attributestore_er_fact_adgroup_view.uh1_hdfs.updateperiods
> value : WEEKLY
> name : cube.fact.is.aggregated value : false
> name : cube.fact.user_attributestore_er_fact_adgroup_view.cubename value
> : user_activity
> name : transient_lastDdlTime value : 1431973737
> name : cube.fact.user_attributestore_er_fact_adgroup_view.storages value
> : uh1_hdfs
> storageTables :
> storageTable :
> updatePeriods : updatePeriod : WEEKLY
> storageName : uh1_hdfs tableDesc : partCols : column :
> name : dt type : string comment : Date partition field
> tableParameters : property :
> name : cube.storagetable.partition.timeline.cache.WEEKLY.dt.latest value
> : 2015-W08 name : conf.attributestore.schema value :
> \granularity\:\weekly_dailycumulative\\erinfo\:\source\:null\erid\:\8589934594\\entityType\:\ADGROUP\\entityAlias\:\adGroupGuid\\relationshipType\:\VIEW\\fields\:\alias\:\servedImpressions\\name\:\count\\type\:\LONG\
> \stores\:\viewvisit\
> name : cube.storagetable.partition.timeline.cache.present value : true
> name : EXTERNAL value : TRUE name : cube.storagetable.time.partcols
> value : dt name :
> cube.storagetable.partition.timeline.cache.WEEKLY.dt.first value : 2015-W07
> name : transient_lastDdlTime value : 1432301397 name :
> cube.storagetable.partition.timeline.cache.WEEKLY.dt.holes.size value : 0
> name : conf.debugmode value : true name :
> cube.storagetable.partition.timeline.cache.WEEKLY.dt.storage.class value :
> org.apache.lens.cube.metadata.timeline.EndsAndHolesPartitionTimeline
> serdeParameters : property :
> name : serialization.format value : 1
> timePartCols : dt
> external : true tableLocation :
> hdfs://hostname:8020/user/hive/warehouse/user.db/uh1_hdfs_user_attributestore_er_fact_adgroup_view
> inputFormat :
> com.inmobi.user.analytics.storage.inputformat.UserAttributeThriftInputFormat
> outputFormat : org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serdeClassName :
> com.inmobi.user.analytics.storage.serde.UserAttributeThriftSerde
> storageHandlerName : numBuckets : 0 compressed : false
> name : user_attributestore_er_fact_adgroup_view
> cubeName : user_activity
> weight : 0.1
> {noformat}
> However, the query below runs successfully because its start and end dates
> are both Sundays.
> {noformat}
> lens-shell>query execute cube select avg(servedImpressions) from
> user_activity where time_range_in(dt, '2015-02-08', '2015-03-01')
> _c0
> 2.7083333333333335
> 1 rows process in (20) seconds.
> {noformat}