Oh yeah, I think I remember we discussed this a while back ... sorry, I forgot the details. If you know you don't have a graph, did you try setting "spark.kryo.referenceTracking" to false? I'm also confused about how you could hit this with a few million objects. Are you serializing them one at a time, or is there one big container which holds them all?
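For reference, here is a minimal sketch of how that setting might be applied, assuming a Scala driver launched through spark-submit. The object name and app name are placeholders and do not come from this thread; `spark.serializer` and `spark.kryo.referenceTracking` are the standard Spark configuration keys.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver: the object and app names are placeholders.
object KryoNoReferenceTracking {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kryo-reference-tracking-test")
      // Use Kryo instead of the default Java serialization.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Turn off Kryo's reference tracking. Only safe when the data has no
      // cycles; an object reachable twice is simply written twice.
      .set("spark.kryo.referenceTracking", "false")

    // The master URL is expected to be supplied by spark-submit.
    val sc = new SparkContext(conf)
    try {
      // ... run the job that triggers the exception here ...
    } finally {
      sc.stop()
    }
  }
}
```

The same key can also be set in spark-defaults.conf instead of in code, which avoids rebuilding the job just to try it.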
Was there ever any follow up from kryo?

On Wed, May 6, 2015 at 2:29 AM, Tristan Blakers <tris...@blackfrog.org> wrote:

> Hi Imran,
>
> I had tried setting a really huge kryo buffer size (GB), but it didn’t
> make any difference.
>
> In my data sets, objects are no more than 1KB each, and don’t form a
> graph, so I don’t think the buffer size should need to be larger than a few
> MB, except perhaps for reasons of efficiency?
>
> The exception usually occurs in
> “com.esotericsoftware.kryo.util.IdentityObjectIntMap”
> when it is resizing (or a similar operation), implying there are too many
> object references, though it’s hard to see how I could get to 2b references
> from a few million objects...
>
> T
>
> On 6 May 2015 at 00:58, Imran Rashid <iras...@cloudera.com> wrote:
>
>> Are you setting a really large max buffer size for kryo?
>> Was this fixed by https://issues.apache.org/jira/browse/SPARK-6405 ?
>>
>> If not, we should open up another issue to get a better warning in these
>> cases.
>>
>> On Tue, May 5, 2015 at 2:47 AM, shahab <shahab.mok...@gmail.com> wrote:
>>
>>> Thanks Tristan for sharing this. Actually this happens when I am reading
>>> a csv file of 3.5 GB.
>>>
>>> best,
>>> /Shahab
>>>
>>> On Tue, May 5, 2015 at 9:15 AM, Tristan Blakers <tris...@blackfrog.org>
>>> wrote:
>>>
>>>> Hi Shahab,
>>>>
>>>> I’ve seen exceptions very similar to this (it also manifests as a
>>>> negative array size exception), and I believe it’s a real bug in Kryo.
>>>>
>>>> See this thread:
>>>>
>>>> http://mail-archives.us.apache.org/mod_mbox/spark-user/201502.mbox/%3ccag02ijuw3oqbi2t8acb5nlrvxso2xmas1qrqd_4fq1tgvvj...@mail.gmail.com%3E
>>>>
>>>> It manifests in all of the following situations when working with an
>>>> object graph in excess of a few GB: Joins, Broadcasts, and when using the
>>>> hadoop save APIs.
>>>>
>>>> Tristan
>>>>
>>>> On 3 May 2015 at 07:26, Olivier Girardot <ssab...@gmail.com> wrote:
>>>>
>>>>> Can you post your code? Otherwise there's not much we can do.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Olivier.
>>>>>
>>>>> On Sat, 2 May 2015 at 21:15, shahab <shahab.mok...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am using spark-1.2.0 with Kryo serialization, but I get the
>>>>>> following exception.
>>>>>>
>>>>>> java.io.IOException: com.esotericsoftware.kryo.KryoException:
>>>>>> java.lang.IndexOutOfBoundsException: Index: 3448, Size: 1
>>>>>>
>>>>>> I would appreciate it if anyone could tell me how I can resolve this.
>>>>>>
>>>>>> best,
>>>>>> /Shahab