I think I fixed this in master. Are you able to build from source to try it out?
I am hopeful that sometime this year my team and I can provide a conda
channel with nightly Arrow builds to help with testing and development.

On Tue, Jan 8, 2019 at 1:49 PM White4, Ryan (STATCAN)
<ryan.whi...@canada.ca> wrote:
>
> Hi,
>
> I get an error when writing a file with no record batches. I came across
> this when implementing a simple way to spill the buffer to disk
> automatically (this is potentially coming in release 0.12?).
>
> I'm using pyarrow 0.11.
> Is there a JIRA related to this, or is there a problem in this simple
> example below?
>
> my_schema = pa.schema([('field0', pa.int32())])
> sink = pa.BufferOutputStream()
> writer = pa.RecordBatchFileWriter(sink, my_schema)
> writer.close()
> buf = sink.getvalue()
>
> reader = pa.open_file(buf)
> print(reader.schema)
> print(reader.num_record_batches)
>
> Traceback...
>   reader = pa.open_file(buf)
>   pyarrow/ipc.py, line 142, in open_file
>     return RecordBatchFileReader(source, footer_offset=footer_offset)
>   pyarrow/ipc.py, line 89, in __init__
>     self._open(source, footer_offset=footer_offset)
>   pyarrow/ipc.pxi, line 352
>   pyarrow/error.pxi, line 81
> pyarrow.lib.ArrowInvalid: File is smaller than indicated metadata size
>
> Thanks,
> Ryan
>
> Ryan Mackenzie White, Ph. D.
>
> Senior Research Analyst - Administrative Data Division, Analytical
> Studies, Methodology and Statistical Infrastructure Field
> Statistics Canada / Government of Canada
> ryan.whi...@canada.ca / Tel: 613-608-0015