maropu commented on a change in pull request #32037:
URL: https://github.com/apache/spark/pull/32037#discussion_r609157680



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/TPCDSBase.scala
##########
@@ -21,6 +21,49 @@ import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.test.SharedSparkSession
 
+
+/**
+ * Base trait for TPC-DS related tests.
+ *
+ * Datatype mapping for TPC-DS and Spark SQL, see more at:
+ *   http://www.tpc.org/tpc_documents_current_versions/pdf/tpc-ds_v2.9.0.pdf
+ *
+ *    |---------------|---------------|
+ *    |    TPC-DS     |  Spark  SQL   |
+ *    |---------------|---------------|
+ *    |  Identifier   |      INT      |
+ *    |---------------|---------------|
+ *    |    Integer    |      INT      |
+ *    |---------------|---------------|
+ *    | Decimal(d, f) | Decimal(d, f) |
+ *    |---------------|---------------|
+ *    |    Char(N)    |    Char(N)    |
+ *    |---------------|---------------|
+ *    |  Varchar(N)   |  Varchar(N)   |
+ *    |---------------|---------------|
+ *    |     Date      |     Date      |
+ *    |---------------|---------------|
+ *
+ *
+ * Remarks:
+ * The TPC-DS spec requires benchmark implementer may employ any internal representation or SQL
+ * datatype that meets the following requirements:
+ * 1. Identifier means that the column shall be able to hold any key value generated for that
+ *    column.
+ * 2. Integer means that the column shall be able to exactly represent integer values (i.e.,
+ *    values in increments of 1) in the range of [-2<sup>63</sup>, 2<sup>63</sup>-1]
+ * 3. Decimal(d, f) means that the column shall be able to represent decimal values up to and
+ *    including d digits,of which f shall occur to the right of the decimal place; the values can be
+ *    either represented exactly or interpreted to be in this range.
+ * 4. Char(N) means that the column shall be able to hold any string of characters of a fixed
+ *    length of N.
+ * 5. Varchar(N) means that the column shall be able to hold any string of characters of a
+ *    variable length with a maximum length of N. Columns defined as "varchar(N)" may optionally
+ *    be implemented as "char(N)".
+ * 6. Date means that the column shall be able to express any calendar day
+ *    between January 1, 1900 and December 31, 2199.
+ *

Review comment:
       nit: remove this blank line.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



