Makes sense.
We just need to keep in mind not to use Collections.sort for sorting
actual data. As long as we stick to that, we should never hit this bug.
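
If we ever want to enforce that rule rather than just remember it, here is
a minimal sketch of a guard. It assumes a hypothetical helper named
sortMetadata in a hypothetical class BoundedSort; neither exists in Drill
today. The limit is simply the element count quoted below from the bug
report, not a value I have verified independently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, not part of Drill.
class BoundedSort {
    // Element count quoted in this thread as the point past which the
    // TimSort bug can trigger (assumption taken from the bug report).
    static final int TIMSORT_RISK_SIZE = 67_108_864;

    // Sorts small metadata lists in place and refuses anything larger
    // than the quoted limit instead of risking the TimSort failure.
    static <T extends Comparable<? super T>> void sortMetadata(List<T> items) {
        if (items.size() > TIMSORT_RISK_SIZE) {
            throw new IllegalArgumentException(
                "List too large for Collections.sort: " + items.size());
        }
        Collections.sort(items);
    }

    public static void main(String[] args) {
        // Example with a short list of file names, the kind of list we sort today.
        List<String> files = new ArrayList<>(Arrays.asList("b.csv", "a.csv", "c.csv"));
        sortMetadata(files);
        System.out.println(files);
    }
}
```

For the lists we sort today the check never fires, so this only matters as
a tripwire for future callers.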

On Thu, Feb 26, 2015 at 4:28 PM, Steven Phillips <[email protected]>
wrote:

> It looks like we are using the method in 5 different places in Drill. We
> are using it to sort lists of: files, drillbit endpoints, work units,
> operator profiles, and column IDs.
>
> I can't imagine we are ever going to need to sort millions of those. So
> probably no need to worry about this bug.
>
> But we should keep it in mind for any future code that might want to use
> it.
>
> On Thu, Feb 26, 2015 at 1:00 AM, Yash Sharma <[email protected]> wrote:
>
> > As pointed out on the Hadoop mailing list -
> >
> > OpenJDK's java.util.Collections.sort() is broken: the default TimSort
> > implementation can throw ArrayIndexOutOfBoundsException for collections
> > with more than 67108864 elements.
> >
> > I wonder whether we could ever have such a huge collection in Drill and
> > hit this bug?
> > We do have Collections.sort used in multiple places, including
> > DrillTextRecordReader, but do we need to consider a workaround for
> > this?
> >
> > Thoughts?
> >
> > Links:
> > http://envisage-project.eu/timsort-specification-and-verification/
> >
> > https://bugs.openjdk.java.net/browse/JDK-8072909
> >
>
>
>
> --
>  Steven Phillips
>  Software Engineer
>
>  mapr.com
>
