Thank you Jason!

I will report back as soon as I have tried this.

On Fri, Nov 13, 2015 at 11:56 PM, Jason Altekruse <[email protected]>
wrote:

> Stefan,
>
> I took a look at the issue and I think I have a fix for the corruption you
> are seeing. There have been a number of substantial commits to master
> including a refactoring of a number of modules, so I applied this change on
> top of the 1.3 branch for you to build and try out. I would like to add
> some additional test cases, at which point I will open up an official PR
> against master and we will likely be able to pull it back onto the 1.3
> branch for inclusion in the release.
>
> Please try this out to see if there are remaining issues reading your data.
>
> https://github.com/jaltekruse/incubator-drill/tree/4056-avro-corruption-bug
>
> Thanks,
> Jason
>
>
>
> On Fri, Nov 13, 2015 at 2:58 PM, Stefán Baxter <[email protected]>
> wrote:
>
> > So,
> >
> > Could someone point me to the appropriate place in the Drill code to
> start
> > investigating this (We would love to contribute but getting up to speed
> is
> > a bit much).
> >
> > I realize that there are many good things happening and that v. 1.3 is
> > around the corner but it seems that I incorrectly assumed that data
> > corruption issues would get a higher priority or that I would, at the
> very
> > least, get someone to confirm such a bug.
> >
> > We are now impeded by this after having moved all our logging from JSON
> to
> > Avro to avoid the schema related problems we have been running into with
> > the JSON reader (null interpreted like double and failing when a string
> > eventually comes along).
> >
> > - Stefan
> >
> >
> > On Wed, Nov 11, 2015 at 10:14 PM, Stefán Baxter <
> [email protected]
> > >
> > wrote:
> >
> > > Hi,
> > >
> > > Can someone please verify that this is in fact a bug so I can rule out
> > our
> > > own mistakes?
> > >
> > > We have recently moved all our logging to Avro to compensate for schema
> > > differences in JSON that were causing various problems and our latest
> > > release is now impeded with this.
> > > Alternatively can someone please point me in the right direction if I
> was
> > > to try to fix this myself.
> > >
> > > Regards,
> > >   -Stefán
> > >
> > > On Tue, Nov 10, 2015 at 2:41 PM, Stefán Baxter <
> > [email protected]>
> > > wrote:
> > >
> > >> Thank you Kamesh.
> > >>
> > >> I have created https://issues.apache.org/jira/browse/DRILL-4056 with
> > the
> > >> description.
> > >> I will send you a confidential test file to your private email.
> > >>
> > >> Regards,
> > >>  -Stefan
> > >>
> > >> On Tue, Nov 10, 2015 at 2:30 PM, Kamesh <[email protected]>
> > wrote:
> > >>
> > >>> Hi Stefán,
> > >>>  Could you please raise a Jira with sample schema and sample input to
> > >>> reproduce it. I will look into this.
> > >>>
> > >>> On Tue, Nov 10, 2015 at 7:55 PM, Stefán Baxter <
> > >>> [email protected]>
> > >>> wrote:
> > >>>
> > >>> > Hi,
> > >>> >
> > >>> > I have an Avro file that supports the following data/schema:
> > >>> >
> > >>> > {"field":"some", "classification":{"variant":"Gæst"}}
> > >>> >
> > >>> > When I select 10 rows from this file I get:
> > >>> >
> > >>> > +---------------------+
> > >>> > |       EXPR$0        |
> > >>> > +---------------------+
> > >>> > | Gæst                |
> > >>> > | Voksen              |
> > >>> > | Voksen              |
> > >>> > | Invitation KIF KBH  |
> > >>> > | Invitation KIF KBH  |
> > >>> > | Ordinarie pris KBH  |
> > >>> > | Ordinarie pris KBH  |
> > >>> > | Biljetter 200 krBH  |
> > >>> > | Biljetter 200 krBH  |
> > >>> > | Biljetter 200 krBH  |
> > >>> > +---------------------+
> > >>> >
> > >>> > The bug is that the field values are incorrectly de-serialized and
> > the
> > >>> > value from the previous row is retained if the subsequent row is
> > >>> shorter.
> > >>> >
> > >>> > The sql query:
> > >>> >
> > >>> > "select s.classification.variant variant from dfs.<some> as s limit
> > >>> 10;"
> > >>> >
> > >>> >
> > >>> > That way the "Ordinarie pris" becomes "Ordinarie pris KBH" because
> > the
> > >>> > previous row had the value "Invitation KIF KBH".
> > >>> >
> > >>> > Regards,
> > >>> >   -Stefán
> > >>> >
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Kamesh.
> > >>>
> > >>
> > >>
> > >
> >
>
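
For readers following the thread: the symptom in the original report (a shorter value keeping the tail of the previous, longer value) is what one would expect if a reused byte buffer is read back with a stale length. The sketch below only illustrates that general failure mode under that assumption. It is not Drill's Avro reader code and not the fix on the linked branch; the class `StaleBufferDemo` and its methods are hypothetical names made up for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a stale-length bug on a reused buffer.
// This is NOT Drill code; it only reproduces the reported symptom.
public class StaleBufferDemo {
    // One buffer shared across rows; fixed capacity is enough for this demo
    // (longer values would make System.arraycopy throw).
    private final byte[] buffer = new byte[64];
    private int length = 0;

    // Buggy read: the new bytes overwrite the start of the buffer, but the
    // length never shrinks, so the tail of an earlier, longer value leaks in.
    String readBuggy(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        System.arraycopy(bytes, 0, buffer, 0, bytes.length);
        length = Math.max(length, bytes.length); // bug: stale length kept
        return new String(buffer, 0, length, StandardCharsets.UTF_8);
    }

    // Corrected read: the length is reset to the current value's byte count.
    String readFixed(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        System.arraycopy(bytes, 0, buffer, 0, bytes.length);
        length = bytes.length;
        return new String(buffer, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] rows = {"Invitation KIF KBH", "Ordinarie pris", "Biljetter 200 kr"};

        StaleBufferDemo buggy = new StaleBufferDemo();
        for (String row : rows) {
            // Prints: Invitation KIF KBH, Ordinarie pris KBH, Biljetter 200 krBH
            System.out.println(buggy.readBuggy(row));
        }

        StaleBufferDemo fixed = new StaleBufferDemo();
        for (String row : rows) {
            // Prints each row unchanged
            System.out.println(fixed.readFixed(row));
        }
    }
}
```

With the three values taken from the query output in the report, the buggy path yields exactly the corrupted strings shown there ("Ordinarie pris KBH", "Biljetter 200 krBH"), which is why a stale length on a reused buffer is a plausible reading of the symptom. Whether that is the actual cause in Drill is something only the DRILL-4056 patch can confirm.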
