LakshSingla commented on code in PR #13952:
URL: https://github.com/apache/druid/pull/13952#discussion_r1203307982
##########
processing/src/main/java/org/apache/druid/query/QueryToolChest.java:
##########
@@ -330,4 +331,32 @@ public Sequence<Object[]> resultsAsArrays(QueryType query, Sequence<ResultType>
{
throw new UOE("Query type '%s' does not support returning results as arrays", query.getType());
}
+
+ /**
+ * Converts a sequence of this query's ResultType into a sequence of {@link FrameSignaturePair}. The array signature
+ * is the one give by {@link #resultArraySignature(Query)}. If the toolchest doesn't support this method, then it can
+ * return an empty optional. It is the duty of the callees to throw an appropriate exception in that case or use an
+ * alternative fallback approach
+ *
+ * Check documentation of {@link #resultsAsArrays(Query, Sequence)} as the behaviour of the rows represented by the
+ * frame sequence is identical.
+ *
+ * Each Frame has a separate {@link RowSignature} because for some query types like the Scan query, every
+ * column in the final result might not be present in the individual ResultType (and subsequently Frame). Therefore,
+ * this is done to preserve the space by not populating the column in that particular Frame and omitting it from its
+ * signature
+ *
+ * @param query Query being executed by the toolchest. Used to determine the rowSignature of the Frames
+ * @param resultSequence results of the form returned by {@link #mergeResults(QueryRunner)}
+ * @param memoryLimitBytes Limit the memory results. Throws {@link ResourceLimitExceededException} if the result exceed
+ * the memoryLimitBytes
+ */
+ public Optional<Sequence<FrameSignaturePair>> resultsAsFrames(
Review Comment:
I have updated the code with the suggestion here.
While this is a much better approach and gives callers a lot more autonomy,
it suffers from one issue that we also see with MSQ ingestions containing
large data sketches: the Frame size can be insufficient to hold a single row.
We could raise the limit here to something like 128MB; however, that would
mean potentially materializing up to 128MB more than we require (instead of
controlling it at the granularity of a single row).
Is there any way to counteract this?
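
To make the trade-off concrete, here is a minimal sketch of the failure mode. It is not Druid's frame API: `FrameSizeSketch`, `packIntoFrames`, and `RowTooLargeException` are hypothetical names, and rows are modelled only by their size in bytes. Rows are packed greedily into fixed-capacity frames, and a row larger than an empty frame cannot be placed at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Druid.
public class FrameSizeSketch
{
  /** Thrown when a single row cannot fit even in an empty frame. */
  public static class RowTooLargeException extends RuntimeException
  {
    public RowTooLargeException(long rowBytes, long frameBytes)
    {
      super("Row of " + rowBytes + " bytes exceeds frame capacity of " + frameBytes + " bytes");
    }
  }

  /**
   * Greedily packs rows (given by their sizes in bytes) into frames of a fixed capacity.
   * Returns the number of bytes actually used in each frame.
   */
  public static List<Long> packIntoFrames(long[] rowSizes, long frameCapacityBytes)
  {
    List<Long> usedPerFrame = new ArrayList<>();
    long used = 0;
    for (long rowSize : rowSizes) {
      if (rowSize > frameCapacityBytes) {
        // No frame of this capacity can ever hold the row.
        throw new RowTooLargeException(rowSize, frameCapacityBytes);
      }
      if (used + rowSize > frameCapacityBytes) {
        // Current frame is full; close it and start a new one.
        usedPerFrame.add(used);
        used = 0;
      }
      used += rowSize;
    }
    if (used > 0) {
      usedPerFrame.add(used);
    }
    return usedPerFrame;
  }

  public static void main(String[] args)
  {
    long mb = 1024L * 1024L;
    // The 12 MB row stands in for a large data sketch (an assumed size).
    long[] rows = {2 * mb, 12 * mb, 3 * mb};
    try {
      packIntoFrames(rows, 8 * mb);
    }
    catch (RowTooLargeException e) {
      System.out.println(e.getMessage());
    }
    // With a 128 MB capacity everything fits, but in a single, mostly empty frame.
    System.out.println(packIntoFrames(rows, 128 * mb));
  }
}
```

With the smaller capacity the large row is rejected outright. With the 128MB capacity it fits, but a frame reserved at that size is mostly unused, which is the over-materialization concern described above.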
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]