yaooqinn commented on pull request #32037:
URL: https://github.com/apache/spark/pull/32037#issuecomment-814782208
> I think the proposal itself looks reasonable, so could you add comments
> somewhere (TPCDSBase) about how we map the TPCDS doc's `Datatype`s (e.g.,
> `identifier` and `integer`) into Spark types?
```scala
/**
* Base trait for TPC-DS related tests.
*
* Datatype mapping for TPC-DS and Spark SQL, see more at
* http://www.tpc.org/tpc_documents_current_versions/pdf/tpc-ds_v2.9.0.pdf
*
* |---------------|---------------|
* |    TPC-DS     |   Spark SQL   |
* |---------------|---------------|
* |  Identifier   |      INT      |
* |---------------|---------------|
* |    Integer    |      INT      |
* |---------------|---------------|
* | Decimal(d, f) | Decimal(d, f) |
* |---------------|---------------|
* |    Char(N)    |    Char(N)    |
* |---------------|---------------|
* |  Varchar(N)   |  Varchar(N)   |
* |---------------|---------------|
* |     Date      |     Date      |
* |---------------|---------------|
*
* Remarks:
* The TPC-DS spec states that a benchmark implementer may employ any internal representation or
* SQL datatype that meets the following requirements:
* 1) Identifier means that the column shall be able to hold any key value generated for that
*    column.
* 2) Integer means that the column shall be able to exactly represent integer values (i.e.,
*    values in increments of 1) in the range of at least −2^(n − 1) to 2^(n − 1) − 1,
*    where n is 64.
* 3) Decimal(d, f) means that the column shall be able to represent decimal values up to and
*    including d digits, of which f shall occur to the right of the decimal place; the values
*    can be either represented exactly or interpreted to be in this range.
* 4) Char(N) means that the column shall be able to hold any string of characters of a fixed
*    length of N.
* 5) Varchar(N) means that the column shall be able to hold any string of characters of a
*    variable length with a maximum length of N. Columns defined as "varchar(N)" may
*    optionally be implemented as "char(N)".
* 6) Date means that the column shall be able to express any calendar day between
*    January 1, 1900 and December 31, 2199.
*/
```
Added these comments to the class header.
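
For illustration only, the mapping in the table above could also be written as a small lookup helper. This is a minimal sketch, not code from the PR: `TpcdsTypeMapping` and `toSparkType` are hypothetical names, and upper-casing the resulting type names is an assumption made for readability.

```scala
// Hypothetical helper (not part of Spark or this PR): maps a TPC-DS datatype
// name to the Spark SQL type listed in the mapping table.
object TpcdsTypeMapping {
  // Matches parameterized types such as "Decimal(7, 2)" or "Char(16)".
  private val Parameterized =
    """(?i)\s*(decimal|char|varchar)\s*\(([^)]*)\)\s*""".r

  def toSparkType(tpcdsType: String): String = tpcdsType match {
    case Parameterized(name, args) =>
      // Decimal, Char and Varchar keep their name and parameters.
      val normalizedArgs = args.split(",").map(_.trim).mkString(", ")
      s"${name.toUpperCase}($normalizedArgs)"
    case other =>
      other.trim.toLowerCase match {
        // Both Identifier and Integer map to INT in the table.
        case "identifier" | "integer" => "INT"
        case "date" => "DATE"
        case unknown =>
          throw new IllegalArgumentException(s"Unknown TPC-DS datatype: $unknown")
      }
  }
}
```

With this sketch, `TpcdsTypeMapping.toSparkType("Identifier")` gives `INT` and `TpcdsTypeMapping.toSparkType("Decimal(7,2)")` gives `DECIMAL(7, 2)`.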
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.