I submitted a code review for the external (phase 1) changes. The
change
includes a interval constructor that takes two arguments of the same
type.
For example:
let $v1 := interval(date("2013-01-01"), date("20130505"))
let $v2 := interval(time("00:01:01"), time("213901049+0800"))
let $v3 := interval(datetime("2013-01-01T00:01:01"),
datetime("20130505T213901049+0800"))
return { "v1": $v1, "v2": $v2, "v3": $v3 }
Do we want to keep the previous functions for declaring an interval?
The
these functions are unique for each interval type and support string
arguments in addition to the specific type (date, time, datetime).
let $v1 := interval-from-date(date("2013-01-01"), date("20130505"))
let $v2 := interval-from-time(time("00:01:01"),
time("213901049+0800"))
let $v3 := interval-from-datetime(datetime("2013-01-01T00:01:01"),
datetime("20130505T213901049+0800"))
return { "v1": $v1, "v2": $v2, "v3": $v3 }
They could be helpful, but I am don't think they are necessary.
Removing
them would consolidate the code path for intervals. Thoughts?
On Fri, Jan 29, 2016 at 12:15 AM, Mike Carey <[email protected]>
wrote:
+1 on all points below! :-)
On 1/28/16 11:00 PM, Till Westmann wrote:
Sounds good to me. It will change the binary representation of
intervals, so it’s not backwards compatible.
But it seems that the tag-first representation is the way it should
have
been anyway and it’s a better way to go forward to support more
generic
intervals. (And I prefer layout changes rather now than later ..)
Cheers,
Till
On 28 Jan 2016, at 21:46, Eldon Carman wrote:
Based on the discussion, I have suggestion for a two phase
approach to
support generic intervals.
Phase 1: External
The first phase will focus on areas seen by a user, either though
ADM or
AQL. The new format will also be open to supporting other interval
types,
although currently will only support date, time and datetime. Here
are a
few of the action items:
- ADM printer and parser changed to support the new generic style
format
- interval(date("2012-01-01”), date(”2013-04-01”))
- Add an interval AQL constructor to support the above format
- Alter interval byte structure to support any interval type
- byte tag, T start, T end
- where T is currently only date, time and datetime
I created a ticket to track status for Phase 1:
https://issues.apache.org/jira/browse/ASTERIXDB-1281
Phase 2: Internal
Phase two will focus on items under the hood. These items are only
seen
by
a developer and need to be update to support new interval types.
The
complete task list will require some investigation. In addition to
picking
what types of intervals should be supported.
Phase 1 can start immediately while Phase 2 can wait until needed.
Thoughts?
On Tue, Jan 26, 2016 at 2:14 PM, Eldon Carman <[email protected]>
wrote:
While its a little more work to implement up front, the following
format
would be generic and support alternate interval types in the
future. It
would be nice to have consistency between AQL and ADM, although
not
required.
interval(date("2012-01-01”), date(”2013-04-01”))
As we move to supporting generic intervals, the byte storage
format
will
need to be updated. Currently an interval is represented by:
start
(long),
end (long), type tag (byte). To support other types, the type tag
should be
at the beginning of the byte sequence. This way the tag can be
used to
determine the data length of each item in the interval.
Should the changes to AQL and ADM include this interval storage
change
(moving the type tag to the first byte of the interval storage
format)?
On Tue, Jan 26, 2016 at 12:42 PM, Till Westmann
<[email protected]>
wrote:
That’s actually a nice and generic serialization.
I think that we should do this similarly in ADM and AQL.
I.e. instead of using
interval-from-date("2012-01-01”, ”2013-04-01”)
(note the two parameters) in AQL and
interval-date("2012-01-01, 2013-04-01")
(not the single parameter) in ADM we should use
interval(date("2012-01-01”), date(”2013-04-01”))
for both. That would have a number of advantages:
1) It is consistent between AQL and ADM.
2) It is consistent with the JSON serialization.
3) It reduces the number of magic parsers.
4) It keeps the interval orthogonal to the type used in the
interval.
On 4): While we don’t support intervals of other types than
date,
time,
and datetime so far, I think that we should change that and so
this
would
be a good step in that direction as well.
The disadvantages are
1) Incompatible AQL change
2) Incompatible ADM change
Thoughts?
Cheers,
Till
On 26 Jan 2016, at 11:46, Eldon Carman wrote:
I found that the lossless-JSON and clean-JSON printers were not
being
used.
After connecting them to the respective JSON printer, I ran the
query
again.
lossless-JSON result:
{ "orderedlist": [ { "date-interval": { "interval": { "start":
{
"date":
"2012-01-01" }, "end": { "date": "2013-04-01" }}} }, {
"time-interval": {
"interval": { "start": { "time": "12:23:34.456Z" }, "end": {
"time":
"15:34:45.567Z" }}} }, { "datetime-interval": { "interval": {
"start": {
"datetime": "2012-01-01T04:23:34.456Z" }, "end": { "datetime":
"2013-04-01T15:34:45.567Z" }}} } ] }
clean-JSON result:
[ { "date-interval": { "interval": { "start": "2012-01-01",
"end":
"2013-04-01"}} }, { "time-interval": { "interval": { "start":
"12:23:34.456Z", "end": "15:34:45.567Z"}} }, {
"datetime-interval": {
"interval": { "start": "2012-01-01T04:23:34.456Z", "end":
"2013-04-01T15:34:45.567Z"}} } ]
Is this what you would have expected?
On Mon, Jan 25, 2016 at 7:07 PM, Eldon Carman
<[email protected]>
wrote:
Thanks Chris for adding a fourth option. This option would
focus our
updates to only the ADM output.
Yes, both lossless-JSON and clean-JSON outputs would need to
be
check
also.
On Mon, Jan 25, 2016 at 5:58 PM, Chris Hillery
<[email protected]>
wrote:
I would vote for:
d. Update the serialized format to output
"interval-from-date" and
put
both
dates in quotes.
I like the function name interval-from-date() better, and I
don't
think
there's any need to maintain backwards compatibility with the
old
name
which clearly never worked.
Couple thoughts, though: The serialized format really should
be
"ADM",
not
"AQL". As such I don't think it should reference functions at
all.
We
already do this for many datatypes, such as uuid("...") and
datetime("..."). Are those truly "Functions"? Are they
"constructors",
and
is that different? In any case, the answer for interval types
should be
consistent with that.
Final note: quite possibly the lossless-JSON and clean-JSON
outputs for
intervals are broken as well, and should be fixed.
Ceej
aka Chris Hillery
On Mon, Jan 25, 2016 at 5:36 PM, Till Westmann
<[email protected]>
wrote:
Voting for a. Seems to be the least redundant option.
Cheers,
Till
On 25 Jan 2016, at 16:47, Eldon Carman wrote:
The interval field value printed in the ADM results can not
be
used to
create an interval.
Intervals have several functions that are used to construct
an
interval:
interval-from-date/time/datetime
and interval-start-from-date/time/datetime. It appears that
this
is
the
only way to create an interval. Thus, a user must use one of
these
function
to create an interval.
The following query shows how to create three intervals.
Query:
let $di := {"date-interval":
interval-from-date("2012-01-01",
"2013-04-01")}
let $ti := {"time-interval":
interval-from-time("12:23:34.456Z",
"233445567+0800")}
let $dti := {"datetime-interval":
interval-from-datetime("2012-01-01T12:23:34.456+08:00",
"20130401T153445567Z")}
return [$di, $ti, $dti];
Result:
{ "date-interval": interval-date("2012-01-01, 2013-04-01")
}, {
"time-interval": interval-time("12:23:34.456Z,
15:34:45.567Z")
}, {
"datetime-interval":
interval-datetime("2012-01-01T04:23:34.456Z,
2013-04-01T15:34:45.567Z") } ]
Notice the results show interval-date("date, date") which
is
different
than
the functions that are used to create a date interval.
Notice
that
interval-date does not exists in AsterixDB and that the
input is
a
single
string of dates separated by a comma. Below are some ideas on
how to
create
a round-trip for intervals.
Options for round tripping:
a: Rename "interval-from-date" to "interval-date" and
update the
output to
put both dates in quotes.
b: Add alias for "interval-from-date" to "interval-date"
and
update
the
output to put both dates in quotes.
c: Create an interval date constructor (called
interval-date)
that
can
parse the string "date, date".
The same process should be used for intervals with time and
datetime.
Thoughts?