True. I was mainly thinking of the web console; usually one will want
one format for an app, I totally agree.
For the console it might be fun to be able to switch around.
I was thinking that either way the total effort put into computing the
final (formatted) result would be the same, and also that the per-node
investment in computing it would be the same - so that the time
complexity would be the same.
However, I agree that this would potentially increase the overall
retrieval latency since we'd be doing serial just-in-time formatting as
the pickups occur.
I'm fine either way - I was just thinking maybe things would be easier
(from a boundary-finding standpoint) the binary way.
(Not sure!)
Cheers,
Mike
On 4/15/16 8:25 PM, Till Westmann wrote:
I think that it’s a trade-off. Either we do the work when the job is
evaluated or when the job is picked up. If we did it on pick-up, we could
pick it up more than once in different formats, but I don't think that
many applications would need that (the web console might, as somebody
sitting in front of it might want to look at it). The nice thing about
the current solution is that we can do the serialization easily in
parallel and the pickup can happen sequentially and we don't have to
interleave that with more computation.
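A minimal sketch of the trade-off described above, not AsterixDB code. Plain Python dicts stand in for binary ADM records, and all function names here (evaluate_eager, pickup_eager, pickup_lazy) are hypothetical. The eager path formats each partition in parallel at evaluation time and fixes one format; the lazy path keeps the unformatted records and formats them sequentially at pickup, in whatever format is requested.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def to_json(record):
    return json.dumps(record, sort_keys=True)


def to_csv(record):
    # Column order is the sorted key order (an assumption for this sketch).
    return ",".join(str(record[key]) for key in sorted(record))


FORMATTERS = {"json": to_json, "csv": to_csv}


def format_partition(partition, fmt):
    formatter = FORMATTERS[fmt]
    return [formatter(record) for record in partition]


def evaluate_eager(partitions, fmt):
    """Format at evaluation time: partitions are serialized in parallel."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda p: format_partition(p, fmt), partitions))


def pickup_eager(stored):
    """Pickup only concatenates already-formatted partitions."""
    return [line for partition in stored for line in partition]


def pickup_lazy(partitions, fmt):
    """Format at pickup: any format, but formatting runs during pickup."""
    return [line for p in partitions for line in format_partition(p, fmt)]


partitions = [[{"one": 1}], [{"one": 2}]]
stored = evaluate_eager(partitions, "csv")
eager_csv = pickup_eager(stored)
lazy_json = pickup_lazy(partitions, "json")
```

With the lazy path the same stored result can be picked up as CSV once and as JSON later, at the cost of doing the formatting work on every pickup.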
On 15 Apr 2016, at 17:39, Mike Carey wrote:
In a more perfect world, the query results would perhaps be persisted
in binary ADM form still, and would be just-in-time reformatted when
they are picked up for delivery back to the requester. At least that
seems like it would be better... No?
On 4/15/16 5:22 PM, Ildar Absalyamov wrote:
I agree that the example where CSV is embedded into the returned JSON
looks quirky (and I am not a big fan of it either).
I believe the trade-off here is the following: do we want to keep the
number of API calls needed just to get the data to a minimum, or do we
want to logically separate metadata (like plans, execution time metrics,
etc.) from the data at the endpoint level?
I have tried to address the former case, but left an option to make this
logical separation if the user is willing to do that (via the
include-results parameter). There is no real way to do it the other way
around, since the plans, etc. are generated before the query is scheduled
and before any results could be returned.
On Apr 15, 2016, at 17:13, Till Westmann <ti...@apache.org> wrote:
Yes, this API is not ideal for "just getting the data". However, Ildar’s
goal was to separate the data from the HTML and to build an API that can
be the basis for the Web-interface - and I think that the API looks good
for that :)
I'm wondering if an endpoint to get the data should be an option on this
one or a different endpoint. The reason is that all of the additional
request metadata that we can ask for (plan, metrics, warnings, ..) cannot
easily be returned with such an API. An API that plays well with curl
might even put the format into the URI, e.g.:
curl 'http://host:19100/query/csv?statement=select+element+1+as+one;' > one.csv
Thoughts? Trade-offs?
Cheers,
Till
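A small sketch of how a client could build such a format-in-the-URI request. The /query/<format> path and the "statement" parameter follow the curl example in this message and are a proposal under discussion, not an existing endpoint; build_query_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_query_url(host, fmt, statement, port=19100):
    """Build a URL for the proposed (hypothetical) /query/<format> endpoint."""
    # urlencode escapes spaces as '+' and ';' as '%3B', so the statement
    # survives both the shell and the HTTP layer unchanged.
    query = urlencode({"statement": statement})
    return f"http://{host}:{port}/query/{fmt}?{query}"


url = build_query_url("host", "csv", "select element 1 as one;")
print(url)
```

Encoding the statement this way avoids having to quote the URL by hand when the query contains characters such as ';' or '&'.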
On 15 Apr 2016, at 16:48, Cameron Samak wrote:
That hop is exactly what I think should be (optionally) avoidable though,
because
1. The user still needs to parse both JSON (to get the URL) along with
the other format (i.e. CSV).
Consider curl {myquery} > myoutput.csv. That's harder with the proposed
API.
2. It's an unnecessary round trip back to the server (which, depending
on the environment, can be significant, esp. with quick queries).
Understood for the result distribution + serialization.
Cameron
On Fri, Apr 15, 2016 at 4:24 PM, Till Westmann <ti...@apache.org>
wrote:
I had a misunderstanding that I think I have cleared up now. I believed
that we don’t have the separation into tuples anymore after result
distribution and that we only have bytes that we pass to the client. In
that case, limiting in the HTTP server would have had to choose between
a) limiting based on the number of bytes or
b) re-establishing tuple boundaries.
However, even though result distribution has serialized the tuples to
whatever format (ADM, JSON, CSV), we still send frames and so we should
be able to separate the tuples (and limit the number that we return).
So I think that it should be feasible to add that (feature creep is
coming ... :) )
Cheers,
Till
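A minimal sketch of limiting on tuple boundaries rather than bytes, as described above. It assumes a frame can be modeled as a list of already-serialized tuples; the real frame layout is different, and limit_tuples is a hypothetical name.

```python
from itertools import islice
from typing import Iterable, Iterator, List

# Assumption: a frame is modeled as a list of serialized tuples (bytes).
Frame = List[bytes]


def limit_tuples(frames: Iterable[Frame], limit: int) -> Iterator[bytes]:
    """Yield at most `limit` serialized tuples, never splitting a tuple."""
    tuples = (t for frame in frames for t in frame)
    return islice(tuples, limit)


frames = [[b'{"id": 1}', b'{"id": 2}'], [b'{"id": 3}'], [b'{"id": 4}']]
first_three = list(limit_tuples(frames, 3))
```

Because the generator is lazy, frames past the limit are never read.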
On 15 Apr 2016, at 14:55, Mike Carey wrote:
I read this much more simply: Can we enhance the API, in the case where
you start with a handle and know that the results are ready now, to fetch
the results in blocks instead of as one giant result? So still computing
the giant result - just not pushing it all back at once - seems like it
might help?
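A sketch of fetching by handle in blocks, assuming the full result is already computed and stored. ResultStore, fetch_block, and the handle string are hypothetical names used for illustration only, not part of any existing API.

```python
class ResultStore:
    """In-memory stand-in for completed query results, keyed by handle."""

    def __init__(self):
        self._results = {}

    def put(self, handle, records):
        self._results[handle] = list(records)

    def fetch_block(self, handle, offset, count):
        """Return (block, next_offset, done) for one block of the result."""
        records = self._results[handle]
        block = records[offset:offset + count]
        next_offset = offset + len(block)
        return block, next_offset, next_offset >= len(records)


store = ResultStore()
store.put("handle-1", range(5))

blocks = []
offset, done = 0, False
while not done:
    block, offset, done = store.fetch_block("handle-1", offset, 2)
    blocks.append(block)
```

The client only holds one block at a time, which is the property that would keep a browser tab from being handed the whole result at once.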
On 4/15/16 2:48 PM, Till Westmann wrote:
Hi Wail,
I’m not completely sure that I understand how to implement the idea. If
we do this only in the API, it might be tricky to get the boundaries
between records right (e.g. if we do indentation on the server).
However, if we want to push this into the query engine, we need to
understand enough of the query/statements to put the limit clause in.
Both approaches don't look great to me.
What did you have in mind?
Cheers,
Till
On 15 Apr 2016, at 13:19, Wail Alkowaileet wrote:
Hi Ildar,
If there's one thing I would love to have, it is getting partial results
instead of the whole result at once. This can be beneficial for result
pagination. When I use the AsterixDB UI, 50% of the time my tab crashes
(I forget to limit the result).
Thanks...
On Fri, Apr 15, 2016 at 1:23 AM, Ildar Absalyamov <
ildar.absalya...@gmail.com> wrote:
Hi Devs,
Recently there have been a number of conversations about the future of
our REST (aka HTTP) API. I summarized these discussions in an outline of
the new API design:
https://cwiki.apache.org/confluence/display/ASTERIXDB/New+HTTP+API+Design
The need to refactor the existing API came from different directions (and
from different people), and is explained in the motivation section. Thus
I believe it’s about time to make an effort and improve the existing API,
so that it will not drag us down in the future. However, during the
transition step I believe it would be better to keep the existing API
endpoints, so that we would not break people’s current experimental
setups.
It would be good to hear feedback from the folks who have been
contributing to that part of the system recently.
Best regards,
Ildar
--
*Regards,*
Wail Alkowaileet