asheeshgarg opened a new issue, #226:
URL: https://github.com/apache/iceberg-python/issues/226

   ### Feature Request / Improvement
   
   @Fokko
   
   I have a Polars dataframe that I saved to a Parquet file, and I then built a DataFile for it using the Java API:
   DataFile dataFile = DataFiles.builder(table.spec()).withPartition(partitionKey).withInputFile(inputFile).withFormat("parquet").withRecordCount(recordCount).build();
   
   
   When reading the data back using pyiceberg, I get the following error:
   File /usr/lib64/python3.9/functools.py:888, in singledispatch.<locals>.wrapper(*args, **kw)
       884 if not args:
       885     raise TypeError(f'{funcname} requires at least '
       886                     '1 positional argument')
   --> 888 return dispatch(args[0].__class__)(*args, **kw)
   
   File ~/code/iceberg-python/pyiceberg/io/pyarrow.py:695, in _(obj, visitor)
       693 if pa.types.is_nested(obj):
       694     raise TypeError(f"Expected primitive type, got: {type(obj)}")
   --> 695 return visitor.primitive(obj)
   
   File ~/code/iceberg-python/pyiceberg/io/pyarrow.py:808, in _ConvertToIceberg.primitive(self, primitive)
       805     primitive = cast(pa.FixedSizeBinaryType, primitive)
       806     return FixedType(primitive.byte_width)
   --> 808 raise TypeError(f"Unsupported type: {primitive}")
   
   TypeError: Unsupported type: large_string
   
   This is the schema of the table:
   Schema(
       NestedField(field_id=1, name='id', field_type=IntegerType(), required=False),
       NestedField(field_id=2, name='name', field_type=StringType(), required=False),
       NestedField(field_id=3, name='dept', field_type=IntegerType(), required=False),
       schema_id=0,
       identifier_field_ids=[]
   )
   
   
   The same read works fine in Spark. Do I need to use some utility to convert the Polars data types so that they are compatible with PyIceberg?
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]