[GitHub] [arrow] micomahesh1982 opened a new issue, #13125: parquet conversion failed,Bool column has NA values in column boolean__v

GitBox Wed, 11 May 2022 15:17:55 -0700


micomahesh1982 opened a new issue, #13125:
URL: https://github.com/apache/arrow/issues/13125


   I'm using the code below. The input data has a boolean column containing 
both null and non-null values, but the Parquet conversion fails with the error 
"parquet conversion failed,Bool column has NA values in column boolean__v". 
Kindly let me know what the issue could be.
   
   import pandas as pd
   import pyarrow as pa
   import pyarrow.parquet as pq

   for chunk_number, chunk in enumerate(pd.read_csv(**read_csv_args), 1):
       fields = []
       for col, dtypes in sessionSchema.items():
           # nullable=True; when I pass a DataFrame that does in fact have
           # nulls, it appears the schema is ignored
           fields.append(pa.field(col, dtypes, True))
       glue_schema = pa.schema(fields)

       table = pa.Table.from_pandas(chunk, preserve_index=False,
                                    schema=glue_schema)
       if chunk_number == 1:
           schema = table.schema
           # Open a Parquet file for writing
           pq_writer = pq.ParquetWriter(targetKey, schema,
                                        compression='snappy')
       # Write CSV chunk to the parquet file
       pq_writer.write_table(table)
   
   


