Hi,

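Oh, we forgot to integrate the Parquet compression option into the
saver interface. Judging from the [compressed-output-stream] error,
the saver passes "compression" to a compressed output stream instead
of to the Parquet writer.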

With 0.17.0, you can write a Snappy-compressed Parquet file with the
following code:

--
require "parquet"

table = Arrow::Table.new({"count" => [1, 2, 3]})
Arrow::FileOutputStream.open("test.parquet", false) do |output|
  properties = Parquet::WriterProperties.new
  properties.set_compression(:snappy)
  Parquet::ArrowFileWriter.open(table.schema, output, properties) do |writer|
    chunk_size = 1024
    writer.write_table(table, chunk_size)
  end
end
--

You'll be able to write the following code with the next release:

--
require "parquet"

table = Arrow::Table.new({"count" => [1, 2, 3]})
table.save("test.parquet", compression: :snappy)
--


Thanks,
--
kou

In <[email protected]>
  "Snappy Compression with red-parquet Ruby Gem" on Thu, 23 Apr 2020 20:13:25 +0000,
  David Lahn <[email protected]> wrote:

> Hi,
> 
> Does anyone have any examples of how to output a Parquet file with Snappy 
> compression using the Ruby gem?
> 
> We have tested trying to set compression to “snappy” on the TableSaver, but 
> we get the following:
> 
> [compressed-output-stream][new]: NotImplemented: Streaming compression 
> unsupported with Snappy (Arrow::Error::NotImplemented)
> 
> Example:
> 
> Arrow::TableSaver.new(table, 'test.parquet', {compression: 'snappy'}).save
> 
> Or are we completely turned around on how to accomplish this?
> 
> Dave
>     
> David Lahn
> DevOps Lead
> Development
>        
> ForwardPMX 
