Problem with that is, that the meaning of range changes with granularity. While with approach 1, it's not the case.
A Range of 2015-01-01 to 2015-01-02 would mean 2 days, 2015-01-01-00 to 2015-01-02-00 would mean 2 days and 1 hour, 2015-01-01-00-00 to 2015-01-02-00-00 would mean 2 days and 1 minute, and 2015-01-01-00-00-00 to 2015-01-02-00-00-00 would mean 2 days and 1 second. While approach 1 has no such problem. A given range doesn't change with a change in granularity. Every one of these examples would mean 1 day. Here I've begun the argument with day granularity and went deeper. The same argument can be started from a Yearly granularity and there the different would be more prominent. Regards On Wed, Aug 5, 2015 at 5:36 PM Himanshu Gahlaut <[email protected]> wrote: > Say granularity is an ordered list of nodes where every node in list is > parent of its next node. > > For a System X: all granularity could be (Year => Month => Day). > For System Y: all granularity could be (Year => Month => Day => Hour => > Minutes => Seconds => Milliseconds => Microseconds => Picosecond => > Femtosecond => Attosecond) > > As long as every node has a logic defined to compute the lower limit and > upper limit value, implementation should be able to fetch all the data > under its parent node. > > E.G: > > To get all the data for 2015-10-31, implementation needs to convert this > into 2015-10-31-LowerLimitHourNode to 2015-10-31-UpperLimitHourNode to > fetch data from hourly partitions, where lower limit hour = 00 and upper > limit hour = 23. Similarly any granularity can be handled at implementation > level. > > > I guess a banking system in real world would be worried about very fine > granularities of time. > > Even if we assume banking system has day as the finest granularity. A user > can still ask for statement from the month of April, 2015 to October, 2015 > without specifying the specific date fields. In this case bank will have to > execute the logic to fetch all transactions from April to October with both > inclusive and also handling its finest granularity which is assumed as day. > In this case a banking system would convert 2015-04 to 2015-10 into > 2015-04-01 to 2015-10-31 to fetch all data from daily partitions, where 01 > is the lower limit for day node in month of April and 31 is the higher > limit for day node in month of October. > > If data is not available in daily partitions and bank's underlying system > supports further fine time granularities, it will convert 2015-04 to > 2015-10 into 2015-04-01-00 to 2015-10-31-23 to fetch data from hourly > partitions. > > Similarly if data is not available at hourly partitions, it will go to the > next level. As long as an upper limit and lower limit is defined for each > node, implementation can handle any time granularity. > > Meta point is that in real world, when a user will communicate to get > statement of April to October. His intentions will never be to exclude > October. Exclusion of October will be a surprise for him. > > Regards, > Himanshu > > On Wed, Aug 5, 2015 at 4:56 PM, amareshwarisr . <[email protected]> > wrote: > > > When a system has to understand time at all granularities, and user can > > give time at all granularities, shouldn't we chose a specification which > is > > easy and error prone. > > > > The example of banking system is not worried about finer granularities of > > time. > > > > On Wed, Aug 5, 2015 at 4:33 PM, Himanshu Gahlaut < > > [email protected]> wrote: > > > > > This has two parts: > > > > > > (1) Interfacing with user shall be cognitive. Something which user can > > > relate to his natural thought process. > > > > > > E.g.: In a banking system, if a user goes to a banker to get his bank > > > statement. He will say "give me statement from 11th October 2015 to > 23rd > > > October 2015". This clearly communicates in simplified language that > user > > > is interested in bank statement from 11th October to 23rd October. If > > > banker now does not return data of 23rd October, it will be a surprise > > for > > > user. > > > > > > Banker can inform every user that To date is exclusive, so if you need > > > statement till 23rd October you should come and tell me "give me > > statement > > > from 11th October to 24th October". I believe a user will immediately > > reply > > > => If I need statement only till 23rd October, why shall I communicate > > 11th > > > October to 24th October. If you still insist, next time the user will > > come > > > and say "give me statement from 11th October to 24th October with 24th > > > exclusive". System just made it harder for user to communicate with > > banker. > > > > > > (2) Implementation to achieve cognitive behaviour > > > > > > With To date inclusive, banker needs to make sure that it gives all > > > transaction records for 23rd October 2015 (To date) at the finest > > > granularity available with the bank. This will satisfy users > requirement > > > without any element of surprise. > > > > > > If the finest granularity is millisecond. It will have a higher limit > to > > it > > > which is 999 in most of the systems. Banker will fetch all partitions > > until > > > higher limit of finest granularity for 23rd October and return the > > > statement to user. > > > > > > If there was a leap second/millisecond, it is upto the banker to handle > > and > > > give data for it. When a user wants statement from 11th October to 23rd > > > October, user should not worry about the leap second/millisecond which > > > happened on any of the days from 11th October to 23rd October > > (inclusive). > > > > > > Regards, > > > Himanshu > > > > > > On Wed, Aug 5, 2015 at 3:16 PM, amareshwarisr . <[email protected] > > > > > wrote: > > > > > > > I prefer time-range end date to be exclusive of the end date, because > > of > > > > the following reasons: > > > > > > > > 1. Giving a month range end as start of next month is easy, as you > need > > > not > > > > remember and get confused with number of days in month : 30, 31, > 28/29 > > > (in > > > > february leap year) - wont miss ending days because of mistakes. > > > > > > > > 2. When you are giving a any range, you need not worry about what > > > > granularity should the time be specified. > > > > For ex, for including time upto 2015-08-01 05th. > > > > > > > > Should user specify the following with inclusive ranges? > > > > > 2015-08-01-04 (would be correct, assuming whole of 4 th hour will > be > > > > included) > > > > > 2015-08-01-04-59 (would be correct, but what if there is leap > second > > > :p ) > > > > > 2015-08-01-04-59-000 (wrong, would include only 000th milli second > > and > > > > not others) > > > > > 2015-08-01-04-59-999 (not sure if we will have leap milli seconds) > > > > > > > > Looking forward to see other arguments on making timerange end date > > > > inclusive. > > > > > > > > Thanks > > > > Amareshwari > > > > > > > > On Wed, Aug 5, 2015 at 2:50 PM, Rajat Khandelwal <[email protected]> > > > > wrote: > > > > > > > > > Currently the time_range_in clause is inclusive of the start date > and > > > > > exclusive of the end date. i.e. time_range_in(time_dimension, > date1, > > > > date2) > > > > > translates to date1<=time_dimension<date2 > > > > > > > > > > Recently there have been multiple discussions regarding this > approach > > > in > > > > > our internal forums at InMobi. There have been arguments that as an > > > > > analytics system, it should accept end date inclusive. Starting > this > > > > thread > > > > > to know opinions of other people regarding both approaches. I'm > > > defining > > > > > and briefly discussing both approaches below: > > > > > > > > > > Approach 1: End date exclusive. > > > > > Disclaimer: I personally prefer this approach and would shamelessly > > > argue > > > > > in its favor. I'll give links to some of the articles that I very > > much > > > > > agree with (no need to reinvent the wheel of argument) > > > > > > > > > > 1. Views of Anders Kaseorg, MIT PhD student in CS; Cofounder of > > > > Ksplice, > > > > > Inc > > > > > < > > > > > > > > > > > > > > > https://www.quora.com/Why-are-Python-ranges-half-open-exclusive-instead-of-closed-inclusive/answer/Anders-Kaseorg > > > > > > > > > > > 2. Views of Edgar Dijkstra, Creator of Dijkstra's algorithm > > > > > <http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF> > > > > > 3. Both arguments are for generic ranges. I believe time range > is > > > > also a > > > > > range and don't see why we need to handle that as a special > case. > > > > > > > > > > > > > > > Approach 2: End date inclusive. > > > > > > > > > > 1. Lens query language is a bit like sql and the BETWEEN clause > in > > > sql > > > > > is inclusive of both start and end date. > > > > > 2. Most analytic systems --- including google analytics --- > follow > > > > this > > > > > convention > > > > > > > > > > > > > > > I'm posting the question on a bigger forum > > > > > < > > > > > > > > > > > > > > > https://www.quora.com/Should-Time-ranges-in-an-analytical-system-be-half-open-or-closed > > > > > > > > > > > as well in hopes of getting more opinions. I request everyone to > post > > > > their > > > > > views here regarding both approaches. > > > > > > > > > > Regards > > > > > > > > > > > > > > > -- > > > _____________________________________________________________ > > > The information contained in this communication is intended solely for > > the > > > use of the individual or entity to whom it is addressed and others > > > authorized to receive it. It may contain confidential or legally > > privileged > > > information. If you are not the intended recipient you are hereby > > notified > > > that any disclosure, copying, distribution or taking any action in > > reliance > > > on the contents of this information is strictly prohibited and may be > > > unlawful. If you have received this communication in error, please > notify > > > us immediately by responding to this email and then delete it from your > > > system. The firm is neither liable for the proper and complete > > transmission > > > of the information contained in this communication nor for any delay in > > its > > > receipt. > > > > > > > -- > _____________________________________________________________ > The information contained in this communication is intended solely for the > use of the individual or entity to whom it is addressed and others > authorized to receive it. It may contain confidential or legally privileged > information. If you are not the intended recipient you are hereby notified > that any disclosure, copying, distribution or taking any action in reliance > on the contents of this information is strictly prohibited and may be > unlawful. If you have received this communication in error, please notify > us immediately by responding to this email and then delete it from your > system. The firm is neither liable for the proper and complete transmission > of the information contained in this communication nor for any delay in its > receipt. >
