Hi Morrisa,

Can you share more information regarding what type of function "formatDate"
is and how did you configure the return type of that function?
For the question on the first query If the return type is String, then ASC
on a string value should be on alphabetical ordering.

However on the third query, if the GROUP BY and ORDER BY are both operating
on the same input to your UDF it shouldn't be part of the output columns.
This looks like a bug to me.

--
Rong

On Thu, Jul 11, 2019 at 11:45 AM Morrisa Brenner <
morrisa.bren...@klaviyo.com> wrote:

> Hi Flink folks,
>
> We have a custom date formatting function that we use to format the output
> of columns containing dates. Ideally what we want is to format the output
> in the select statement but be able to order by the underlying datetime (so
> that and output with formatted dates "February 2019" and "April 2019" is
> guaranteed to have the rows sorted in time order rather than alphabetical
> order).
>
> When I go to add the unformatted column to the order by, however, that
> gets appended as an extra column to the select statement during the query
> planning process within Calcite. (In the order by parsing, it's considering
> this a different column from the one in the select statement.) When the
> group by column is different in the same way but there's no order by
> column, the extra column isn't added. I've included a couple of simple
> examples below.
>
> Is this the intended behavior of the query planner? Does anyone know of a
> way around this without needing to change the formatting so that it makes
> the output dates correctly sortable?
>
> Thanks for your help!
>
> Morrisa
>
>
>
> Example query and output with order by using formatted date:
>
> SELECT
>
> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'),
>
> sum(`testTable`.`count`)
>
> FROM `testTable`
>
> GROUP BY formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH')
>
> ORDER BY formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH') ASC
>
> Month
>
> SUM VALUE
>
> April 2019
>
> 1052
>
> February 2019
>
> 1
>
>
> Example query and output without order by but group by using unformatted
> date:
>
> SELECT
>
> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'),
>
> sum(`testTable`.`count`)
>
> FROM `testTable`
>
> GROUP BY floor(`testTable`.`timestamp` TO MONTH)
>
> Month
>
> SUM VALUE
>
> February 2019
>
> 1
>
> April 2019
>
> 1052
>
> We would like to enforce the ordering, so although this output is what we
> want, I don't think we can use this solution.
>
> Example query and output with order by using unformatted date:
>
> SELECT
>
> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'),
>
> sum(`testTable`.`count`)
>
> FROM `testTable`
>
> GROUP BY floor(`testTable`.`timestamp` TO MONTH)
>
> ORDER BY floor(`testTable`.`timestamp` TO MONTH) ASC
>
> Month
>
> SUM VALUE
>
> February 2019
>
> 1
>
> 2/1/2019 12:00 AM
>
> April 2019
>
> 1052
>
> 4/1/2019 12:00 AM
>
>
> --
> Morrisa Brenner
> Software Engineer
> 225 Franklin St, Boston, MA 02110
> klaviyo.com <https://www.klaviyo.com>
> [image: Klaviyo Logo]
>

Reply via email to