[
https://issues.apache.org/jira/browse/ARROW-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773135#comment-16773135
]
Antoine Pitrou commented on ARROW-2392:
---------------------------------------
Ideally we would compare schemas for equality at each write call, though we
might not care if only the metadata differs. [~wesmckinn] Any thoughts?
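
A minimal sketch of what such a per-write check could look like, written as a wrapper around an existing writer. The `SchemaCheckingWriter` class is hypothetical and not part of pyarrow. It assumes the schema object exposes `equals(other, check_metadata=...)`, as pyarrow's `Schema.equals` is understood to do. This is not the fix proposed for the library itself, only an illustration of the idea:

```python
class SchemaCheckingWriter:
    """Hypothetical wrapper (not a pyarrow class) that validates each batch.

    `writer` is any object with write_batch() and close(), for example a
    pa.RecordBatchStreamWriter. `schema` is the schema the writer was
    created with.
    """

    def __init__(self, writer, schema):
        self._writer = writer
        self._schema = schema

    def write_batch(self, batch):
        # Compare field names and types only; key-value metadata is ignored
        # (assumption: the schema's equals() accepts check_metadata).
        if not batch.schema.equals(self._schema, check_metadata=False):
            raise ValueError(
                "batch schema does not match writer schema\n"
                f"expected: {self._schema}\n"
                f"got: {batch.schema}"
            )
        self._writer.write_batch(batch)

    def close(self):
        self._writer.close()
```

Wrapping the writer from the report, e.g. `SchemaCheckingWriter(pa.RecordBatchStreamWriter(stream, schema), schema)`, would be expected to raise `ValueError` for the float batch instead of writing it silently, while a batch that differs only in metadata would still be accepted.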
> [Python] pyarrow RecordBatchStreamWriter allows writing batches with
> different schemas
> --------------------------------------------------------------------------------------
>
> Key: ARROW-2392
> URL: https://issues.apache.org/jira/browse/ARROW-2392
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Ernesto Ocampo
> Priority: Minor
> Fix For: 0.13.0
>
>
> A RecordBatchStreamWriter initialised with a given schema will still allow
> writing RecordBatches that have different schemas. Example:
>
> {code:python}
> import pyarrow as pa
>
> schema = pa.schema([pa.field('some_field', pa.int64())])
> stream = pa.BufferOutputStream()
> writer = pa.RecordBatchStreamWriter(stream, schema)
> data = [pa.array([1.234])]
> batch = pa.RecordBatch.from_arrays(data, ['some_field'])
> # batch holds float64 values, so it does not conform to the int64 schema
> assert batch.schema != schema
> writer.write_batch(batch) # no exception raised
> writer.close()
> {code}