iamluan commented on code in PR #3098:
URL: https://github.com/apache/iceberg-python/pull/3098#discussion_r2905248418


##########
mkdocs/docs/api.md:
##########
@@ -2039,3 +2039,82 @@ DataFrame()
 | 3 | 6 |
 +---+---+
 ```
+
+## Type mapping
+
+### PyArrow
+
+The Iceberg specification only specifies type mapping for Avro, Parquet, and ORC:
+
+- [Iceberg to Avro](https://iceberg.apache.org/spec/#avro)
+
+- [Iceberg to Parquet](https://iceberg.apache.org/spec/#parquet)
+
+- [Iceberg to ORC](https://iceberg.apache.org/spec/#orc)
+
+The following tables describe the type mappings between PyIceberg and PyArrow. In the tables below, `pa` refers to the `pyarrow` module:
+
+```python
+import pyarrow as pa
+```
+
+#### PyIceberg to PyArrow type mapping
+
+| PyIceberg type class            | PyArrow type                        | Notes                                  |
+|---------------------------------|-------------------------------------|----------------------------------------|
+| `BooleanType`                   | `pa.bool_()`                        |                                        |
+| `IntegerType`                   | `pa.int32()`                        |                                        |
+| `LongType`                      | `pa.int64()`                        |                                        |
+| `FloatType`                     | `pa.float32()`                      |                                        |
+| `DoubleType`                    | `pa.float64()`                      |                                        |
+| `DecimalType(p, s)`             | `pa.decimal128(p, s)`               |                                        |
+| `DateType`                      | `pa.date32()`                       |                                        |
+| `TimeType`                      | `pa.time64("us")`                   |                                        |
+| `TimestampType`                 | `pa.timestamp("us")`                |                                        |
+| `TimestampNanoType`             | `pa.timestamp("ns")`                |                                        |
+| `TimestamptzType`               | `pa.timestamp("us", tz="UTC")`      |                                        |
+| `TimestamptzNanoType`           | `pa.timestamp("ns", tz="UTC")`      |                                        |
+| `StringType`                    | `pa.large_string()`                 |                                        |
+| `UUIDType`                      | `pa.uuid()`                         |                                        |
+| `BinaryType`                    | `pa.large_binary()`                 |                                        |
+| `FixedType(L)`                  | `pa.binary(L)`                      |                                        |
+| `StructType`                    | `pa.struct()`                       |                                        |
+| `ListType(e)`                   | `pa.large_list(e)`                  |                                        |
+| `MapType(k, v)`                 | `pa.map_(k, v)`                     |                                        |
+| `UnknownType`                   | `pa.null()`                         |                                        |
+
+---
+
+#### PyArrow to PyIceberg type mapping
+
+| PyArrow type                       | PyIceberg type class        | Notes                          |
+|------------------------------------|-----------------------------|--------------------------------|
+| `pa.bool_()`                       | `BooleanType`               |                                |
+| `pa.int32()`                       | `IntegerType`               |                                |
+| `pa.int64()`                       | `LongType`                  |                                |
+| `pa.float32()`                     | `FloatType`                 |                                |
+| `pa.float64()`                     | `DoubleType`                |                                |
+| `pa.decimal128(p, s)`              | `DecimalType(p, s)`         |                                |
+| `pa.decimal256(p, s)`              | Unsupported                 |                                |
+| `pa.date32()`                      | `DateType`                  |                                |
+| `pa.date64()`                      | Unsupported                 |                                |
+| `pa.time64("us")`                  | `TimeType`                  |                                |
+| `pa.timestamp("us")`               | `TimestampType`             |                                |
+| `pa.timestamp("ns")`               | `TimestampNanoType`         |                                |
+| `pa.timestamp("us", tz="UTC")`     | `TimestamptzType`           |                                |
+| `pa.timestamp("ns", tz="UTC")`     | `TimestamptzNanoType`       |                                |

Review Comment:
   Thanks for pointing that out. I realize that PyIceberg currently supports only UTC as the time zone for the timestamp type, and that writing `ns` timestamps to tables in format version 3 has not been implemented yet (#1551). I will add notes to the table.
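
To make the UTC restriction concrete, here is a minimal sketch of what the notes would capture. It assumes the four timestamp rows of the table above and is not PyIceberg's conversion code; `iceberg_timestamp_class` and `_TIMESTAMP_ROWS` are hypothetical names used only for illustration.

```python
from typing import Optional

# Hypothetical lookup that mirrors the timestamp rows of the table above.
# Key: (unit, has_time_zone). Value: name of the PyIceberg type class.
_TIMESTAMP_ROWS = {
    ("us", False): "TimestampType",
    ("ns", False): "TimestampNanoType",
    ("us", True): "TimestamptzType",
    ("ns", True): "TimestamptzNanoType",
}


def iceberg_timestamp_class(unit: str, tz: Optional[str]) -> str:
    """Return the PyIceberg class name for a timestamp's unit and time zone."""
    if tz is not None and tz != "UTC":
        # Assumption taken from the comment above: only UTC is supported.
        raise ValueError(f"unsupported time zone: {tz!r}")
    key = (unit, tz is not None)
    if key not in _TIMESTAMP_ROWS:
        # Units other than "us" and "ns" do not appear in the table.
        raise ValueError(f"unsupported timestamp unit: {unit!r}")
    return _TIMESTAMP_ROWS[key]
```

For a PyArrow timestamp type `t`, the two arguments would be `t.unit` and `t.tz`. The `ns` rows describe the type mapping only; writing such columns to a format version 3 table is the part that #1551 tracks.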



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

