[
https://issues.apache.org/jira/browse/IMPALA-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326181#comment-17326181
]
ASF subversion and git services commented on IMPALA-10662:
----------------------------------------------------------
Commit 9355b25e118bd057396bdc36aee47d6e54c7d5cf in impala's branch
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=9355b25 ]
IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
In EE tests HS2 returned results with smaller precision than Beeswax for
FLOAT/DOUBLE/TIMESTAMP types. These differences are not inherent to the
HS2 protocol - the results are returned with full precision in Thrift
and lose precision during conversion in client code.
This patch changes the conversion in HS2 to match Beeswax and removes
test section DBAPI_RESULTS that was used to handle the differences:
- float/double: print method is changed from str() to "{:.16}".format()
- timestamp: impyla's cursor is created with convert_types=False to
avoid conversion to datetime.datetime (which has only
microsec precision)
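
The float/double change can be illustrated with a minimal sketch. The
helper names below are hypothetical and are not taken from the patch. The
12-digit format stands in for str() on Python 2, which printed floats with
12 significant digits; on Python 3, str() already prints the shortest
round-tripping representation, so the difference would not show up there.

```python
def format_double_short(value):
    # Hypothetical stand-in for Python 2's str(float): 12 significant digits.
    return "{:.12}".format(value)


def format_double_full(value):
    # Up to 16 significant digits; trailing zeros are dropped.
    return "{:.16}".format(value)


third = 1.0 / 3.0
print(format_double_short(third))  # 0.333333333333
print(format_double_full(third))   # 0.3333333333333333
print(format_double_full(0.1))     # 0.1
```

Values that need fewer digits, such as 0.1, print the same either way, so
only results with long fractional parts are affected.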
Note that FLOAT/DOUBLE are still different in impala-shell, this change
only deals with EE tests.
Testing:
- ran the changed tests
Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2
Reviewed-on: http://gerrit.cloudera.org:8080/17325
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Remove HS2 vs Beeswax differences in EE tests
> ---------------------------------------------
>
> Key: IMPALA-10662
> URL: https://issues.apache.org/jira/browse/IMPALA-10662
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Reporter: Csaba Ringhofer
> Assignee: Csaba Ringhofer
> Priority: Major
>
> Beeswax and HS2 result sets have some differences in precision:
> TIMESTAMP:
> - HS2 rounds to microsecond precision, while Beeswax returns nanosec
> - only differs in EE tests, impala-shell returns timestamps the same way for
> both protocols
> - the cause is that impyla (and DB-API in general) converts timestamps to
> datetime.datetime by default, which has only microsec precision
> DOUBLE/FLOAT:
> - HS2 rounds to fewer digits
> - differs both in EE tests and impala-shell
> - the cause is that by default Python formats floats with less precision
> than the stringstream we use in C++
> - there is no clearly good way to format a double; string<->double
> conversions are inherently lossy
> - probably there is a standard way to do this in SQL, but I don't know about
> it
> This difference is handled in .test files with the DBAPI_RESULTS section,
> which lets us specify different outputs for the protocols.
> I think that it would be better to remove DBAPI_RESULTS and simply format
> these types in HS2 to match the Beeswax output. The default conversions
> can be tested in impyla, while the EE tests should focus on Impala and use
> the maximum precision possible.
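
The TIMESTAMP point in the description comes down to datetime.datetime
storing at most six fractional digits. A minimal sketch follows, using only
the standard library. The helper name and the sample timestamp are made up
for illustration, and the helper truncates where the description says HS2
rounds; either way the nanosecond digits are lost.

```python
from datetime import datetime


def to_datetime_lossy(ts):
    # Hypothetical helper: parse 'YYYY-MM-DD HH:MM:SS[.fraction]' into a
    # datetime, keeping only the first six fractional digits (microseconds).
    head, _, frac = ts.partition(".")
    micros = int(frac[:6].ljust(6, "0")) if frac else 0
    parsed = datetime.strptime(head, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(microsecond=micros)


raw = "2021-04-21 12:34:56.123456789"  # nanosecond precision, as a string
converted = to_datetime_lossy(raw)

print(raw)             # 2021-04-21 12:34:56.123456789
print(str(converted))  # 2021-04-21 12:34:56.123456
```

Leaving the value as the string received from the server, which is what
convert_types=False is described as achieving in the commit message, avoids
this loss.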