Thanks for the reply, Ryan.
Yes, you're right. I tried different schemas to create/read/write. I hit that issue because our data schema is evolving, and we have data constructed with different versions of the schema. I'll make the read/write schemas match.

thanks,
Wei

On Thu, Apr 2, 2015 at 6:52 PM, Ryan Blue <b...@cloudera.com> wrote:
> Hi Wei,
>
> It looks like you are using a writer with one schema to write records
> created with another. That doesn't work because Avro deconstructs
> generic records by position. Is there a way you could change your code
> so that you use the final write schema as the read schema for the file
> where your data comes from? You could also construct the records using
> the final schema. You just have to make sure that schema matches.
>
> What are you trying to do?
>
> rb
>
> On 04/02/2015 05:46 PM, Julien Le Dem wrote:
>> Maybe Ryan or Tom can help
>>
>> On Wed, Mar 18, 2015 at 4:59 PM, Wei Yan <ywsk...@gmail.com> wrote:
>>
>> Hi, devs,
>>
>> I'm new to Parquet. I ran into a ParquetWriter problem and hope
>> someone can help me.
>>
>> I use ParquetWriter to write a GenericRecord to a file, and the
>> schema used to define the ParquetWriter has fewer fields than the
>> GenericRecord has. For example, the schema for the ParquetWriter:
>>
>> {"type":"record","name":"r","fields":[{"name":"f1","type":"double","default":0}]}
>>
>> has only one field, "f1", while the GenericRecord has two fields:
>> {"f2": null, "f1": 1.0}.
>>
>> When I use that ParquetWriter to write the record, I expected it to
>> write only field "f1" and skip "f2". However, I got this exception:
>> "Null-value for required field: f1". It looks like the ParquetWriter
>> went by field position, and tried to match "f2" in the record to
>> "f1" in the schema. Is this by design?
>>
>> I'd appreciate any help.
>>
>> thanks,
>> Wei
>>

> --
> Ryan Blue
> Software Engineer
> Cloudera, Inc.
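[Editor's note: Ryan's point that Avro deconstructs generic records by position can be illustrated with a toy model. The classes below are hypothetical simplifications, not the real Avro `GenericData.Record` or Parquet writer, but they sketch the failure mode under that assumption: the writer walks its own schema by index and pulls values from the record at the same index, so a record built against a different schema can hand the writer a null for a required field.]

```java
import java.util.List;

// Toy sketch of position-based record deconstruction (hypothetical classes).
public class PositionalWriteDemo {

    // A "schema" here is just an ordered list of field names.
    record Schema(List<String> fields) {}

    // A "record" stores values by position, like Avro's GenericData.Record.
    record Rec(Object[] values) {}

    // The writer walks ITS OWN schema by index and reads the value at the
    // same index from the record; field names are never consulted.
    static String write(Schema writeSchema, Rec rec) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < writeSchema.fields().size(); i++) {
            Object v = rec.values()[i];
            if (v == null) {
                throw new IllegalStateException(
                        "Null-value for required field: " + writeSchema.fields().get(i));
            }
            out.append(writeSchema.fields().get(i)).append(" = ").append(v).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Schema writeSchema = new Schema(List.of("f1"));   // writer knows only "f1"

        // Record built against a DIFFERENT schema {f2, f1}: {"f2": null, "f1": 1.0}.
        Rec mismatched = new Rec(new Object[]{null, 1.0});
        try {
            write(writeSchema, mismatched);   // position 0 holds f2's null, not f1
        } catch (IllegalStateException e) {
            System.out.println("ERROR: " + e.getMessage());
        }

        // Fix: build the record against the writer's schema so positions line up.
        Rec matched = new Rec(new Object[]{1.0});          // {"f1": 1.0}
        System.out.print(write(writeSchema, matched));     // prints: f1 = 1.0
    }
}
```

In the mismatched case the writer reads position 0 expecting "f1" but finds the null stored for "f2", which is exactly the "Null-value for required field: f1" error from the thread; building the record with the write schema makes the positions agree.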