[GitHub] [iceberg] findepi commented on a change in pull request #1611: DOCS: describe type compatibility between Spark and Iceberg

GitBox Wed, 28 Jul 2021 04:08:48 -0700


findepi commented on a change in pull request #1611:
URL: https://github.com/apache/iceberg/pull/1611#discussion_r678201399




##########
File path: site/docs/spark.md
##########
@@ -728,3 +730,82 @@ spark.read.format("iceberg").load("db.table.files").show(truncate = false)
 // Hadoop path table
 spark.read.format("iceberg").load("hdfs://nn:8020/path/to/table#files").show(truncate = false)
 ```
+
+## Type compatibility
+
+Spark and Iceberg support different sets of types. Iceberg converts types automatically, but not for all combinations,
+so it helps to understand how Iceberg converts types before you design the column types of your tables.
+
+### Spark type to Iceberg type on creating table
+
+This table describes how Spark types are converted to Iceberg types. The conversion applies when an Iceberg table is created via Spark without using the Iceberg core API.
+
+| Spark           | Iceberg                 | Notes |
+|-----------------|-------------------------|-------|
+| boolean         | boolean                 |       |
+| integer         | integer                 |       |
+| short           | integer                 |       |
+| byte            | integer                 |       |
+| long            | long                    |       |
+| float           | float                   |       |
+| double          | double                  |       |
+| date            | date                    |       |
+| timestamp       | timestamp with timezone |       |
+| string          | string                  |       |
+| char            | string                  |       |
+| varchar         | string                  |       |
+| binary          | binary                  |       |
+| decimal         | decimal                 |       |
+| struct          | struct                  |       |
+| array           | list                    |       |
+| map             | map                     |       |
+
+The type conversion is asymmetric: this table does not describe which Iceberg types Spark can read from or write to.
+The following sections describe which Iceberg types Spark can read and write.
+
+### Iceberg to Spark on reading from Iceberg table
+
+| Iceberg                    | Spark                   | Note  |
+|----------------------------|-------------------------|-------|
+| boolean                    | boolean                 |       |
+| integer                    | integer                 |       |
+| long                       | long                    |       |
+| float                      | float                   |       |
+| double                     | double                  |       |
+| date                       | date                    |       |
+| time                       | <N/A>                   |       |
+| timestamp with timezone    | timestamp               |       |
+| timestamp without timezone | <N/A>                   |       |
+| string                     | string                  |       |
+| uuid                       | string                  |       |

Review comment:
       Followed up at https://github.com/trinodb/trino/issues/6663 and on the mailing list.




