Hi Antoine,

Here is a repro for this issue:

import pyarrow

fn = '/tmp/foo'

# Data
data = [
    pyarrow.array(range(1000)),
    pyarrow.array(range(1000))
]
batch = pyarrow.record_batch(data, names=['f0', 'f1'])

# File Prep
writer = pyarrow.ipc.RecordBatchStreamWriter(fn, batch.schema)
writer.write_batch(batch)
writer.close()

# Read
reader = pyarrow.open_stream(fn)
tbl = reader.read_all()

# Rewrite
writer = pyarrow.ipc.RecordBatchStreamWriter(fn, tbl.schema)
batches = tbl.to_batches(max_chunksize=200)
writer.write_table(pyarrow.Table.from_batches(batches))
writer.close()


$ python3 foo.py
Traceback (most recent call last):
  File "foo.py", line 24, in <module>
    writer.write_table(pyarrow.Table.from_batches(batches))
  File "pyarrow/ipc.pxi", line 237, in pyarrow.lib._CRecordBatchWriter.write_table
  File "pyarrow/error.pxi", line 97, in pyarrow.lib.check_status
OSError: [Errno 14] Error writing bytes to file. Detail: [errno 14] Bad address
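
One workaround that might be worth trying is sketched below. It rests on an unconfirmed assumption: that the failure comes from overwriting fn while tbl still refers to data backed by that same file. The sketch writes the new contents to a temporary file and only then swaps it in place of the original. The names rewrite_via_temp and write_fn are hypothetical helpers introduced for illustration; they are not part of PyArrow.

```python
import os
import tempfile


def rewrite_via_temp(path, write_fn):
    """Write new contents to a temporary file, then swap it in for `path`.

    `write_fn` is a hypothetical callback that receives the temporary
    file's path and writes the complete new contents there.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so that os.replace
    # stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        write_fn(tmp_path)
        # The original file is left untouched until this point.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if writing or swapping failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

In the repro above, write_fn would be the place to open the RecordBatchStreamWriter on the temporary path instead of fn, write the table, and close the writer. Whether this actually avoids the Bad address error is untested here.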

Cheers,
Rares


On Mon, Dec 14, 2020 at 12:30 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Hello Rares,
>
> Is there a complete reproducer that we may try out?
>
> Regards
>
> Antoine.
>
>
> On 14/12/2020 at 06:52, Rares Vernica wrote:
> > Hello,
> >
> > As part of a test, I'm reading a record batch from an Arrow file,
> > re-batching the data into smaller batches, and writing the result back to
> > the same file. I'm getting an unexpected Bad address error and I wonder
> > what I am doing wrong.
> >
> > reader = pyarrow.open_stream(fn)
> > tbl = reader.read_all()
> >
> > writer = pyarrow.ipc.RecordBatchStreamWriter(fn, tbl.schema)
> > batches = tbl.to_batches(max_chunksize=200)
> > writer.write_table(pyarrow.Table.from_batches(batches))
> > writer.close()
> >
> > Traceback (most recent call last):
> >   File "tests/foo.py", line 10, in <module>
> >     writer.write_table(pyarrow.Table.from_batches(batches))
> >   File "pyarrow/ipc.pxi", line 237, in pyarrow.lib._CRecordBatchWriter.write_table
> >   File "pyarrow/error.pxi", line 97, in pyarrow.lib.check_status
> > OSError: [Errno 14] Error writing bytes to file. Detail: [errno 14] Bad address
> >
> > Do I need to "close" the reader or open the writer differently?
> >
> > I'm using PyArrow 0.16.0 and Python 3.8.2.
> >
> > Thank you!
> > Rares
> >
>
