Does the scalding-parquet library support reading Snappy-compressed Parquet files?
I am trying to read in Parquet files of the form:
> hadoop jar parquet-tools-1.10.1.jar schema /my/path/part-00000.snappy.parquet
message spark_schema {
  optional fixed_len_byte_array(8) fieldName1 (DECIMAL(18,0));
  optional fixed_len_byte_array(2) fieldName2 (DECIMAL(4,0));
  optional binary fieldName3 (UTF8);
}
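
In case it is relevant, my understanding (from the Parquet format spec, not from anything scalding-parquet-specific) is that a DECIMAL stored as fixed_len_byte_array holds the two's-complement, big-endian unscaled value, so outside of scalding I can decode the raw bytes with a helper like this (just a sketch; decodeDecimal is my own function, not part of the library):

import java.math.{BigDecimal => JBigDecimal, BigInteger}

// Two's-complement, big-endian unscaled value; both decimal fields above have scale 0.
def decodeDecimal(bytes: Array[Byte], scale: Int = 0): JBigDecimal =
  new JBigDecimal(new BigInteger(bytes), scale)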
I am using the following code:
val fields = new Fields("fieldName1", "fieldName2", "fieldName3")

ParquetTupleSource(fields, inputPath)
  .read
  .write(Tsv(outputPath))
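
What I was hoping to do inside my Job, assuming the two decimal fields come through the pipe as raw byte arrays (that assumption may well be wrong, which is part of my question), is decode them with the helper sketched above before writing, e.g.:

ParquetTupleSource(fields, inputPath)
  .read
  // Assumes fieldName1/fieldName2 arrive as Array[Byte]; decodeDecimal is my
  // own helper from above, not a scalding-parquet API.
  .map(('fieldName1, 'fieldName2) -> ('decimal1, 'decimal2)) {
    pair: (Array[Byte], Array[Byte]) =>
      (decodeDecimal(pair._1), decodeDecimal(pair._2))
  }
  .write(Tsv(outputPath))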
The fieldName3 column produces normal output that matches the input strings; however, the fieldName1 and fieldName2 columns produce garbage output.
Does the scalding-parquet library support Snappy-compressed Parquet files? Does it support reading the fixed_len_byte_array type, and if so, how do I specify this in the TypedParquet setting?
Thank you for your help!
Best,
Yuri