alamb commented on issue #4462:
URL:
https://github.com/apache/arrow-datafusion/issues/4462#issuecomment-1374538949
Hi @melgenek
> Hi there! I am new to Datafusion and would like to learn more about it.
That is great! Welcome!
> My main question is: what level of compatibility between Postgres and
> Datafusion would you like to check?
I think for this ticket we should strive for "the same level as is currently
verified using the python integration tests" as much as possible.
> The values are close, but not in a text form representation.
Yes, this is a classic "floating point rounding error" situation, which is
why it is typically not a great idea to compare floating point values directly.
In terms of how to implement this, would something like this work
(initially):
1. For some specific files (maybe those that start with
`postgres_validated_`), the datafusion sqllogic-rs test runner would run the
file against both postgres and datafusion and compare the results. This file
might initially contain only the existing queries in the python based
integration test, but we could expand it over time.
2. For both datafusion and postgres, apply some sort of normalization to
floating point values prior to printing (for example, round them to a smaller
number of significant figures; see
https://github.com/apache/arrow-datafusion/blob/3cc607de4ce6e9e1fd537091e471858c62f58653/datafusion/core/tests/sqllogictests/src/normalize.rs#L78-L95).
Maybe we could add that to the compare directive.
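
To make the normalization in item 2 concrete, here is a minimal sketch that
assumes the values are available as `f64` before printing. The names
`normalize_float` and `floats_match` are hypothetical and are not taken from
the existing test runner. It renders each value in scientific notation with a
fixed number of significant figures, so that two engines that differ only in
the last few digits produce the same text:

```rust
/// Hypothetical helper: render `value` with `sig_figs` significant figures.
/// Scientific notation is used so the digit count does not depend on the
/// magnitude of the value.
fn normalize_float(value: f64, sig_figs: usize) -> String {
    // One digit sits before the decimal point, so the precision passed to
    // the formatter is `sig_figs - 1` (clamped at zero).
    let precision = sig_figs.saturating_sub(1);
    format!("{:.*e}", precision, value)
}

/// Hypothetical helper: treat two results as equal when their normalized
/// text forms are identical.
fn floats_match(left: f64, right: f64, sig_figs: usize) -> bool {
    normalize_float(left, sig_figs) == normalize_float(right, sig_figs)
}
```

With this sketch, `floats_match(0.1 + 0.2, 0.3, 6)` is true even though the two
values differ in their last bits, while `floats_match(0.3, 0.31, 6)` is false.
Comparing the formatted strings, rather than rounding and then comparing the
`f64` values, keeps the check aligned with what the test files would actually
contain.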
I also left some feedback on
https://github.com/apache/arrow-datafusion/pull/4834#pullrequestreview-1239686856
-- does that make sense?