Hi Stefán,

Could you please raise a Jira with a sample schema and sample input that reproduce the issue? I will look into it.

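For anyone following along: the symptom in the report quoted below (a shorter value keeping the tail of the previous, longer one) is the pattern you would see if a reused byte buffer were decoded without honoring the length of the current value. That is only a guess at the mechanism, not a diagnosis of Drill's Avro reader. A minimal Java sketch of the effect, where `ReusableText`, `readIgnoringLength`, and `readWithLength` are hypothetical names invented for illustration and are not Drill or Avro APIs:

```java
import java.nio.charset.StandardCharsets;

public class ReusedBufferSketch {

    // Hypothetical reusable holder: the backing array grows when needed
    // but is never shrunk or cleared between values.
    static final class ReusableText {
        byte[] bytes = new byte[0];
        int length = 0; // number of valid bytes for the current value

        void set(String value) {
            byte[] src = value.getBytes(StandardCharsets.UTF_8);
            if (src.length > bytes.length) {
                bytes = new byte[src.length];
            }
            System.arraycopy(src, 0, bytes, 0, src.length);
            length = src.length;
        }
    }

    // Buggy read: decodes the whole backing array, including stale bytes
    // left over from a previous, longer value.
    static String readIgnoringLength(ReusableText t) {
        return new String(t.bytes, StandardCharsets.UTF_8);
    }

    // Correct read: decodes only the valid prefix of the backing array.
    static String readWithLength(ReusableText t) {
        return new String(t.bytes, 0, t.length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ReusableText text = new ReusableText();
        text.set("Invitation KIF KBH"); // 18 bytes
        text.set("Ordinarie pris");     // 14 bytes, last 4 bytes are stale

        System.out.println(readIgnoringLength(text)); // Ordinarie pris KBH
        System.out.println(readWithLength(text));     // Ordinarie pris
    }
}
```

The two values are taken from the reported output; everything else in the sketch is assumed. If the real cause is of this kind, the fix would be to copy only the valid byte range of each value.
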
On Tue, Nov 10, 2015 at 7:55 PM, Stefán Baxter <[email protected]> wrote:

> Hi,
>
> I have an Avro file that supports the following data/schema:
>
> {"field":"some", "classification":{"variant":"Gæst"}}
>
> When I select 10 rows from this file I get:
>
> +---------------------+
> | EXPR$0              |
> +---------------------+
> | Gæst                |
> | Voksen              |
> | Voksen              |
> | Invitation KIF KBH  |
> | Invitation KIF KBH  |
> | Ordinarie pris KBH  |
> | Ordinarie pris KBH  |
> | Biljetter 200 krBH  |
> | Biljetter 200 krBH  |
> | Biljetter 200 krBH  |
> +---------------------+
>
> The bug is that the field values are incorrectly de-serialized and the
> value from the previous row is retained if the subsequent row is shorter.
>
> The SQL query:
>
> "select s.classification.variant variant from dfs.<some> as s limit 10;"
>
> That way the "Ordinarie pris" becomes "Ordinarie pris KBH" because the
> previous row had the value "Invitation KIF KBH".
>
> Regards,
> -Stefán

--
Kamesh.
