On Wed, Jan 22, 2020 at 12:10 PM Rollo Konig-Brock <[email protected]> wrote:
>
> Hey all,
>
> I am unable to write an array (without any nulls) if the schema explicitly 
> sets nullable to false.
>
> A quick example:
>
> ```
> import io
> import pyarrow
>
> schema = pyarrow.schema([
>     pyarrow.field('_strangle', 'string', nullable=False)]
> )
>
> batch_array = pyarrow.array(['abc', 'åbc', None])
>
> output = io.BytesIO()
>
> writer = pyarrow.RecordBatchStreamWriter(output, schema)
>
> batch = pyarrow.RecordBatch.from_arrays(
>     [batch_array], schema.names
> )

You need to pass the schema with the non-nullable field here instead
of just the names

>
> # This fails with pyarrow.lib.ArrowInvalid: Tried to write record batch with 
> different schema
> writer.write_batch(batch)
> ```
>
> Any ideas here? It seems quite strange that I can't find an interface for 
> writing not-nullable arrays.
>
> All the best,
> Rollo

Reply via email to