"value" is required in a "property" tag. A property tag is name and value. This special property defines that whenever queried on order_time time dimension, try to answer with facts that have "ot" as partition column. If not, it falls back to delivery_time time dimension. There again, the mapping is checked, which is "dt", so facts that have "dt" as partition column are preferred. This can happen in a chain until a valid combination is found which can answer the query.
On Wed, Aug 10, 2016 at 12:13 AM Tao Yan <[email protected]> wrote: > So, the field 'value' can be used to define both alias and real values? > > On Tue, Aug 9, 2016 at 11:40 AM, Rajat Khandelwal <[email protected]> > wrote: > >> Yes, that is correct. >> >> On Wed, Aug 10, 2016, 00:09 Tao Yan <[email protected]> wrote: >> >>> Thanks for the explanation. I got the idea of tentative partitions. >>> >>> Is the property >>> *<property name="cube.timedim.partition.order_time" value="ot" />* >>> referring to the dim_attribute >>> *<dim_attribute name="order_time" _type="TIMESTAMP" /> *? >>> >>> Is the value 'ot' an alias of the partition column name? >>> >>> >>> On Tue, Aug 9, 2016 at 12:47 AM, Rajat Khandelwal <[email protected]> >>> wrote: >>> >>>> The idea of partitions is that there is a stream of data that keeps >>>> flowing in, each line of which has delivery time and order time both. The >>>> data is partitioned and stored in partitions later and facts are created on >>>> top of that partitioned data. So if data is partitioned in order time and >>>> the time range queried is also order time, then we can pick the partitions >>>> directly. But if the data is partitioned on delivery time, as is the case >>>> for this fact, and the query is on order time, then you pick some tentative >>>> partitions of delivery time, but you still have to filter over order time. >>>> Here the purpose of picking partitions is to reduce the amount of data >>>> read. When you pick delivery time partitions, it doesn't guarantee what >>>> order times data has, it only provides a tentative range. Hence, the >>>> partitions are picked based on the relation of time dimensions, but the >>>> extra filter still has to be supplied. >>>> >>>> Hope it's clear. >>>> >>>> On Tue, Aug 9, 2016 at 10:30 AM Rajat Khandelwal <[email protected]> >>>> wrote: >>>> >>>>> That's the original time range asked in the cube query. >>>>> >>>>> On Tue, Aug 9, 2016, 01:46 Tao Yan <[email protected]> wrote: >>>>> >>>>>> Hi Rajat, >>>>>> >>>>>> Thanks for the explain. I have a clear understanding about dim >>>>>> relation now, however, how does the following condition generated: >>>>>> >>>>>> and ((sales.order_time) >= '2015-04-13 03:00:00') and >>>>>> ((sales.order_time) < '2015-04-13 04:00:00'))) >>>>>> >>>>>> order_time is a dim attributes defined in sales cube, and it does not >>>>>> have any relationships with properties or other dim attributes (The >>>>>> property cube.timedim.partition.order_time does, but is it the same >>>>>> thing?), I wonder how the query could pick up this column. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> >>>>>> On Mon, Aug 8, 2016 at 4:12 AM, Rajat Khandelwal <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> I've created a jira for this: >>>>>>> https://issues.apache.org/jira/browse/LENS-1269 >>>>>>> >>>>>>> You should be able to make change in your data locally and see the >>>>>>> non-empty results: >>>>>>> >>>>>>> cat >>>>>>> /Users/rajat.khandelwal/Git/lens/lens-dist/target/apache-lens-2.7.0-SNAPSHOT-bin/apache-lens-2.7.0-SNAPSHOT-bin/client/examples/data/sales_aggr_fact2_local3/sales-aggr-fact2-local3.data >>>>>>> 2015-04-13 03:00:00,2015-04-13 04:00:00,1,1,1,1,1,1,5,0 >>>>>>> 2015-04-13 03:00:00,2015-04-13 04:00:00,2,1,2,2,2,1,8,2 >>>>>>> >>>>>>> >>>>>>> On Mon, Aug 8, 2016 at 2:46 PM Rajat Khandelwal <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> More details on how it converts to the fallback range: >>>>>>>> >>>>>>>> Range asked: >>>>>>>> (order_time, '2015-04-13-03', '2015-04-13-04') >>>>>>>> >>>>>>>> Timedim Relation : >>>>>>>> >>>>>>>> <property name="cube.timedim.relation.order_time" >>>>>>>> value="delivery_time+[-20 days,-1 hour]" /> >>>>>>>> >>>>>>>> <property name="cube.timedim.relation.order_time" >>>>>>>> value="delivery_time+[-2 hours,-1hour]" /> >>>>>>>> >>>>>>>> The cube defines the first relation, the fact defines the second >>>>>>>> relation. A fact can optionally override the timedim relation defined >>>>>>>> by cube. So the second one is picked : >>>>>>>> >>>>>>>> Derivation: >>>>>>>> >>>>>>>> 2015-04-13-03 <=ot < 2015-04-13-04 >>>>>>>> dt - 2 hours <= ot <= dt - 1 hour => ot + 1 hour <= dt <= ot + 2 >>>>>>>> hour >>>>>>>> => 2015-04-13-04 <= dt < 2015-04-13-06. >>>>>>>> >>>>>>>> Hope it's clearer now. >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Aug 8, 2016 at 1:29 PM Rajat Khandelwal <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi >>>>>>>>> >>>>>>>>> So it seems right now the query is designed to return 0 rows. The >>>>>>>>> query is there just to demonstrate the fallback feature. >>>>>>>>> >>>>>>>>> Cube query is >>>>>>>>> >>>>>>>>> cube select customer_city_name, store_cost from sales where >>>>>>>>> time_range_in(order_time, >>>>>>>>> '2015-04-13-03', '2015-04-13-04') >>>>>>>>> Driver query comes out to be >>>>>>>>> >>>>>>>>> SELECT (customer_city.name), sum((sales.store_cost)) FROM >>>>>>>>> a.local_sales_aggr_fact2 sales join a.local_city_table customer_city >>>>>>>>> on sales.customer_city_id = customer_city.id and (customer_city.dt = >>>>>>>>> 'latest') WHERE (((((sales.dt) = '2015-04-13-04') or ((sales.dt) = >>>>>>>>> '2015-04-13-05')) and ((sales.order_time) >= '2015-04-13 03:00:00') >>>>>>>>> and ((sales.order_time) < '2015-04-13 04:00:00'))) GROUP BY >>>>>>>>> (customer_city.name) >>>>>>>>> >>>>>>>>> Then I modified the driver query and ran: >>>>>>>>> >>>>>>>>> SELECT (customer_city.name), sales.order_time, sales.store_cost FROM >>>>>>>>> a.local_sales_aggr_fact2 sales join a.local_city_table customer_city >>>>>>>>> on sales.customer_city_id = customer_city.id and (customer_city.dt = >>>>>>>>> 'latest') WHERE (((((sales.dt) = '2015-04-13-04') or ((sales.dt) = >>>>>>>>> '2015-04-13-05')))) >>>>>>>>> >>>>>>>>> Found the following results: >>>>>>>>> >>>>>>>>> Bangalore^A2015-04-13 00:00:00^A0.0 >>>>>>>>> Hyderabad^A2015-04-13 00:00:00^A2.0 >>>>>>>>> >>>>>>>>> It's apparent that both the rows will be filtered out by the >>>>>>>>> translated query. >>>>>>>>> >>>>>>>>> I'll look deeper and replace the example query with something that >>>>>>>>> demonstrates the fallback feature and also returns non-empty results. >>>>>>>>> >>>>>>>>> Let us know if you have any confusion understanding the feature >>>>>>>>> itself. I think the video should have covered it. >>>>>>>>> >>>>>>>>> >>>>>>>>> Regards >>>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> *Tao Yan* >>>>>> Software Engineer >>>>>> Data Analytics Infrastructure Tools and Services >>>>>> >>>>>> >>>>>> >>>>>> 206.250.5345 >>>>>> [email protected] >>>>>> https://www.linkedin.com/in/taousc >>>>>> >>>>> >>> >>> >>> -- >>> >>> *Tao Yan* >>> Software Engineer >>> Data Analytics Infrastructure Tools and Services >>> >>> >>> >>> 206.250.5345 >>> [email protected] >>> https://www.linkedin.com/in/taousc >>> >> > > > -- > > *Tao Yan* > Software Engineer > Data Analytics Infrastructure Tools and Services > > > > 206.250.5345 > [email protected] > https://www.linkedin.com/in/taousc >
