Thank you Jason! I will report back as soon as I have tried this.
On Fri, Nov 13, 2015 at 11:56 PM, Jason Altekruse <[email protected]> wrote:

> Stefan,
>
> I took a look at the issue and I think I have a fix for the corruption you
> are seeing. There have been a number of substantial commits to master
> including a refactoring of a number of modules, so I applied this change on
> top of the 1.3 branch for you to build and try out. I would like to add
> some additional test cases, at which point I will open up an official PR
> against master and we will likely be able to pull it back onto the 1.3
> branch for inclusion in the release.
>
> Please try this out to see if there are remaining issues reading your data.
>
> https://github.com/jaltekruse/incubator-drill/tree/4056-avro-corruption-bug
>
> Thanks,
> Jason
>
> On Fri, Nov 13, 2015 at 2:58 PM, Stefán Baxter <[email protected]>
> wrote:
>
> > So,
> >
> > Could someone point me to the appropriate place in the Drill code to start
> > investigating this (We would love to contribute but getting up to speed is
> > a bit much).
> >
> > I realize that there are many good things happening and that v. 1.3 is
> > around the corner but it seems that I incorrectly assumed that data
> > corruption issues would get a higher priority or that I would, at the very
> > least, get someone to confirm such a bug.
> >
> > We are now impeded by this after having moved all our logging from JSON to
> > Avro to avoid the schema related problems we have been running into with
> > the JSON reader (null interpreted like double and failing when a string
> > eventually comes along).
> >
> > - Stefan
> >
> > On Wed, Nov 11, 2015 at 10:14 PM, Stefán Baxter <[email protected]>
> > wrote:
> >
> > > Hi,
> > >
> > > Can someone please verify that this is in fact a bug so I can rule out
> > > our own mistakes?
> > >
> > > We have recently moved all our logging to Avro to compensate for schema
> > > differences in JSON that were causing various problems and our latest
> > > release is now impeded with this.
> > > Alternatively can someone please point me in the right direction if I
> > > was to try to fix this myself.
> > >
> > > Regards,
> > > -Stefán
> > >
> > > On Tue, Nov 10, 2015 at 2:41 PM, Stefán Baxter <[email protected]>
> > > wrote:
> > >
> > >> Thank you Kamesh.
> > >>
> > >> I have created https://issues.apache.org/jira/browse/DRILL-4056 with
> > >> the description.
> > >> I will send you a confidential test file to your private email.
> > >>
> > >> Regards,
> > >> -Stefan
> > >>
> > >> On Tue, Nov 10, 2015 at 2:30 PM, Kamesh <[email protected]> wrote:
> > >>
> > >>> Hi Stefán,
> > >>> Could you please raise a Jira with sample schema and sample input to
> > >>> reproduce it. I will look into this.
> > >>>
> > >>> On Tue, Nov 10, 2015 at 7:55 PM, Stefán Baxter <[email protected]>
> > >>> wrote:
> > >>>
> > >>> > Hi,
> > >>> >
> > >>> > I have an Avro file that supports the following data/schema:
> > >>> >
> > >>> > {"field":"some", "classification":{"variant":"Gæst"}}
> > >>> >
> > >>> > When I select 10 rows from this file I get:
> > >>> >
> > >>> > +---------------------+
> > >>> > | EXPR$0              |
> > >>> > +---------------------+
> > >>> > | Gæst                |
> > >>> > | Voksen              |
> > >>> > | Voksen              |
> > >>> > | Invitation KIF KBH  |
> > >>> > | Invitation KIF KBH  |
> > >>> > | Ordinarie pris KBH  |
> > >>> > | Ordinarie pris KBH  |
> > >>> > | Biljetter 200 krBH  |
> > >>> > | Biljetter 200 krBH  |
> > >>> > | Biljetter 200 krBH  |
> > >>> > +---------------------+
> > >>> >
> > >>> > The bug is that the field values are incorrectly de-serialized and
> > >>> > the value from the previous row is retained if the subsequent row is
> > >>> > shorter.
> > >>> >
> > >>> > The SQL query:
> > >>> >
> > >>> > "select s.classification.variant variant from dfs.<some> as s limit 10;"
> > >>> >
> > >>> > That way the "Ordinarie pris" becomes "Ordinarie pris KBH" because
> > >>> > the previous row had the value "Invitation KIF KBH".
> > >>> >
> > >>> > Regards,
> > >>> > -Stefán
> > >>>
> > >>> --
> > >>> Kamesh.
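
For readers following the thread: the symptom in the original report, where a shorter value keeps the tail of the longer value that preceded it, is what one would expect if a reused byte buffer were decoded to its full capacity rather than to the length of the current value. The sketch here reproduces that pattern in plain Java. It is an illustration under that assumption only, not Drill's Avro reader and not the change on Jason's branch; `ReusableText`, `readBuggy` and `readCorrect` are hypothetical names introduced for the example.

```java
import java.nio.charset.StandardCharsets;

class StaleBufferDemo {
    // Hypothetical reusable holder: the backing array only ever grows, and
    // `length` records how many of its bytes belong to the current value.
    static final class ReusableText {
        byte[] bytes = new byte[0];
        int length = 0;

        void set(String value) {
            byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
            if (encoded.length > bytes.length) {
                bytes = new byte[encoded.length]; // grow, never shrink
            }
            System.arraycopy(encoded, 0, bytes, 0, encoded.length);
            length = encoded.length;
        }
    }

    // Buggy read: decodes the whole backing array and ignores `length`,
    // so bytes left over from a longer previous value leak into the result.
    static String readBuggy(ReusableText t) {
        return new String(t.bytes, StandardCharsets.UTF_8);
    }

    // Correct read: decodes only the bytes that belong to the current value.
    static String readCorrect(ReusableText t) {
        return new String(t.bytes, 0, t.length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] rows = {
            "Gæst", "Voksen", "Invitation KIF KBH",
            "Ordinarie pris", "Biljetter 200 kr"
        };
        ReusableText holder = new ReusableText();
        for (String row : rows) {
            holder.set(row);
            System.out.println(readBuggy(holder) + " | " + readCorrect(holder));
        }
    }
}
```

With these five values the buggy read yields "Ordinarie pris KBH" and "Biljetter 200 krBH" for the last two rows, matching the table in the report, while the length-aware read returns the values unchanged.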
