Hi,
Oh, we forgot to integrate the saver interface with the Parquet
compression option.
With 0.17.0, you can use the feature with the following code:
--
require "parquet"
table = Arrow::Table.new({"count" => [1, 2, 3]})
Arrow::FileOutputStream.open("test.parquet", false) do |output|
  properties = Parquet::WriterProperties.new
  properties.set_compression(:snappy)
  Parquet::ArrowFileWriter.open(table.schema, output, properties) do |writer|
    chunk_size = 1024
    writer.write_table(table, chunk_size)
  end
end
--
You'll be able to write the following code with the next release:
--
require "parquet"
table = Arrow::Table.new({"count" => [1, 2, 3]})
table.save("test.parquet", compression: :snappy)
--
Thanks,
--
kou
In <[email protected]>
"Snappy Compression with red-parquet Ruby Gem" on Thu, 23 Apr 2020 20:13:25
+0000,
David Lahn <[email protected]> wrote:
> Hi,
>
> Does anyone have any examples of how to output a Parquet file with Snappy
> compression using the Ruby gem?
>
> We have tested trying to set compression to “snappy” on the TableSaver, but
> we get the following:
>
> [compressed-output-stream][new]: NotImplemented: Streaming compression
> unsupported with Snappy (Arrow::Error::NotImplemented)
>
> Example:
>
> Arrow::TableSaver.new(table, 'test.parquet', {compression: 'snappy'}).save
>
> Or are we completely turned around on how to accomplish this?
>
> Dave
>
> David Lahn
> DevOps Lead
> Development
>
> ForwardPMX