chrisqiqiu opened a new issue, #2663:
URL: https://github.com/apache/iceberg-python/issues/2663
### Apache Iceberg version
0.10.0 (latest release)
### Please describe the bug 🐞
My Python version is 3.12. When I use PyIceberg to read a table created by Dremio with the code below, I get a "Cannot promote timestamp to timestamptz" error, but if I create a table from the same dataset in Spark, the table loads correctly.
```python
namespace = "traws"
table_name = "FeedFiles_Created_by_Dremio"

# nessie_catalog is a previously configured Nessie catalog
table = nessie_catalog.load_table(f"{namespace}.{table_name}")
print(table.name())
print(table.schema())

table.scan().to_pandas()
```
```
('traws', 'FeedFiles_Created_by_Dremio')
table {
1: ID: optional long
2: Name: optional string
3: FeedFileImportConfigID: optional int
4: ReceivedDateTime: optional timestamptz
5: IsMoved: optional boolean
6: MoveAttempt: optional int
7: IsProcessed: optional boolean
8: ProcessingAttempt: optional int
9: TotalDataRows: optional int
10: TotalProcessedDataRows: optional int
11: TotalExcludedDataRows: optional int
12: TotalAggregatedRows: optional int
13: IsArchived: optional boolean
14: ArchivalAttempt: optional int
15: IsMovedToTransaction: optional boolean
16: MovedToTransactionAttempt: optional int
17: IsISDEnriched: optional boolean
18: ISDEnrichmentAttempt: optional int
19: IsDDBSEnriched: optional boolean
20: DDBSEnrichmentAttempt: optional int
21: IsCCEnriched: optional boolean
22: CCEnrichmentAttempt: optional int
23: IsCTIEnriched: optional boolean
24: CTIEnrichmentAttempt: optional int
25: HasPendingISDEnrichment: optional boolean
26: HasPendingDDBSEnrichment: optional boolean
27: HasPendingCCEnrichment: optional boolean
28: HasPendingCTIEnrichment: optional boolean
29: FileBusinessDate: optional timestamptz
30: IsUploaded: optional boolean
31: IsValid: optional boolean
32: Status: optional string
33: StatusMessage: optional string
34: LastActionDateTime: optional timestamptz
35: ActedBy: optional int
36: CreatedOn: optional timestamptz
37: IsIPSAttempedAll: optional boolean
38: IsWMAttempedAll: optional boolean
39: IsOutofDateProjTxErrorsClosed: optional boolean
40: SparkUniqueID: optional string
}
```
```
---------------------------------------------------------------------------
ResolveError                              Traceback (most recent call last)
Cell In[3], line 12
      8 print(table.location())
      9 print(table.schema())
---> 12 table.scan().to_pandas()

   1465 def to_pandas(self, **kwargs: Any) -> pd.DataFrame:
   1466     """Read a Pandas DataFrame eagerly from this Iceberg table.
   1467
   1468     Returns:
   1469         pd.DataFrame: Materialized Pandas Dataframe from the Iceberg table
   1470     """
-> 1471     return self.to_arrow().to_pandas(**kwargs)

   1427 """Read an Arrow table eagerly from this DataScan.
   1428
   1429 All rows will be loaded into memory at once.
   (...)
   1432     pa.Table: Materialized Arrow Table from the Iceberg table's DataScan
   1433 """
   1434 from pyiceberg.io.pyarrow import ArrowScan
   1436 return ArrowScan(
   1437     self.table_metadata, self.io, self.projection(),
            self.row_filter, self.case_sensitive, self.limit
-> 1438 ).to_table(self.plan_files())

ResolveError: Cannot promote timestamp to timestamptz
```
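The error appears to be raised by PyIceberg's schema resolution when it tries to promote a file-level `timestamp` (no time zone) to the table schema's `timestamptz`. A minimal sketch that reproduces the check in isolation, assuming `promote` is still exposed from `pyiceberg.schema` in 0.10.0:

```python
from pyiceberg.schema import promote
from pyiceberg.types import TimestampType, TimestamptzType

# Raises ResolveError: Cannot promote timestamp to timestamptz,
# matching the traceback above
promote(TimestampType(), TimestamptzType())
```

If so, the Dremio-written data files likely declare their timestamp columns without UTC adjustment, while the Iceberg table schema declares them as `timestamptz`.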
The tables I created in Dremio and Spark are built from the same source dataset.

In Dremio:

```sql
CREATE TABLE nessie.traws.FeedFiles_Created_by_Dremio AS
select * from nessie.traws.FeedFiles
```

In Spark:

```sql
CREATE TABLE nessie.traws.FeedFiles_Created_by_Spark AS
select * from nessie.traws.FeedFiles
```
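One way to confirm the mismatch, and to read the data in the meantime, is to go through the table's FileIO directly. This is a hedged sketch, not PyIceberg's intended read path: it assumes `table` is the table loaded above, that the naive timestamps are in fact UTC, and it ignores row filters and delete files:

```python
import pyarrow as pa
import pyarrow.parquet as pq

# Collect the data files behind the table
tasks = list(table.scan().plan_files())

# 1) Inspect the physical Parquet schema of one file. If columns such as
#    ReceivedDateTime show `timestamp[us]` without a tz, the files disagree
#    with the table schema's timestamptz fields.
with table.io.new_input(tasks[0].file.file_path).open() as f:
    print(pq.read_schema(f))

# 2) Stopgap: read the files directly and cast naive timestamps to UTC,
#    bypassing PyIceberg's promotion check.
parts = []
for task in tasks:
    with table.io.new_input(task.file.file_path).open() as f:
        parts.append(pq.read_table(f))
raw = pa.concat_tables(parts)

target = pa.schema(
    [
        pa.field(fld.name, pa.timestamp("us", tz="UTC"), fld.nullable)
        if pa.types.is_timestamp(fld.type) and fld.type.tz is None
        else fld
        for fld in raw.schema
    ]
)
df = raw.cast(target).to_pandas()
```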
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time