Given your confusion about atoms and binaries below, first read [1] so that you're up to speed on the various Erlang types.
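To make the "binary" part concrete: an Erlang binary printed as <<99,111,...>> is just a sequence of byte values, and when those bytes happen to be printable text you can decode them yourself. Here is a minimal sketch in Python (not Erlang, and not anything taken from riak_kv) that parses the printed form and decodes it as UTF-8. The function name decode_erlang_binary is made up for illustration, and it assumes the binary contains only comma-separated integers, which holds for the two binaries from the logs quoted further down.

```python
def decode_erlang_binary(text):
    """Decode the printed form of an Erlang binary, e.g. "<<104,105>>", to a str.

    Assumption: the binary holds only comma-separated integer byte values
    (no string segments, no size or type specifiers).
    """
    stripped = text.strip()
    if not (stripped.startswith("<<") and stripped.endswith(">>")):
        raise ValueError("expected something of the form <<1,2,3>>")
    inner = stripped[2:-2]
    # Each comma-separated field is one byte value in the range 0..255.
    byte_values = bytes(int(part) for part in inner.split(",") if part.strip())
    return byte_values.decode("utf-8")


if __name__ == "__main__":
    # The two binaries following the r_object atom in the quoted log tuple.
    bucket = decode_erlang_binary("<<99,111,110,118,101,114,115,97,116,105,111,110>>")
    key = decode_erlang_binary("<<100,107,114,100,54,58,104,107,110,120,110,114,115,49>>")
    print("Bucket:", bucket)
    print("Key:", key)
```

Pasting the same binaries into an Erlang shell gives you the equivalent result without any of this; the sketch is only meant to show that there is nothing more to a binary here than a list of bytes.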

Now, looking back at your logs, there are instances of the atom "r_object" at the start of a tuple. This is a specific record that Riak uses internally to store both the metadata and the binary contents of a bucket-key-value item.

A little grepping through the riak_kv repository [2] turns up a reference to this "r_object" atom as part of a record [3]. Records are just tuples with a specific atom at the start. The record specification says that the first field is the bucket and the second is the key (both as binary data). So, going back to your logs, wherever you found "r_object", the second and third items in that tuple are the binaries representing your bucket and your key (the first item is the name of the record, the atom "r_object"). If you copy and paste them into a new Erlang shell, putting a "." afterwards, the shell will print them out as strings, as in [4].

I had a head start because I know the riak_kv repository and how it works, but hopefully this will help.

Yes, your nodes will start to OOM if they don't have enough RAM to fetch these 500-900MB items. That's why you should resolve siblings locally *every* time you read a value from Riak and, once you've made your application-level modifications to the resolved object, write it back. Depending on your Riak client, it will either have a nice resolution system or allow you to write one yourself.

Sam

[1]: http://learnyousomeerlang.com/starting-out-for-real
[2]: https://github.com/basho/riak_kv
[3]: https://github.com/basho/riak_kv/blob/develop/src/riak_object.erl#L44-L51
[4]: https://gist.github.com/lenary/6378560

--
Sam Elliott
Engineer
[email protected]
--

On Thursday, 29 August 2013 at 9:49AM, Simon Effenberg wrote:

> Thanks! But could you explain in detail how you came to these numbers
> from my previous mail?
> I have plenty of them in the crash dump and don't want to put all of
> them herein :)
>
> Cheers
> Simon
>
> On Thu, 29 Aug 2013 08:59:15 -0400
> Sam Elliott <[email protected]> wrote:
>
> > So I found the following:
> >
> > {r_object,<<99,111,110,118,101,114,115,97,116,105,111,110>>,<<100,107,114,100,54,58,104,107,110,120,110,114,115,49>>,
> > ...
> >
> > Which turns into:
> > Bucket: <<"conversation">>
> > Key: <<"dkrd6:hknxnrs1">>
> >
> > Requesting with an R=1 may not give you exactly what you want, as N
> > requests will be made, but only R will be waited-upon.
> >
> > Sam
> >
> > On Thursday, 29 August 2013 at 8:53AM, Simon Effenberg wrote:
> >
> > > Hi Sam,
> > >
> > > thanks for the answer.. see my questions inline..
> > >
> > > On Thu, 29 Aug 2013 08:44:10 -0400
> > > Sam Elliott <[email protected]> wrote:
> > >
> > > > Hey Simon,
> > > >
> > > > If you grep that logfile for "r_object", you'll find a tuple where that
> > > > is the first atom, followed by two binaries. The first is the bucket,
> > > > the second is the key.
> > >
> > > I grepped and found stuff like this:
> > >
> > > 8EF5C098:t7:A8:r_object,H8EF5C1B8,H8EF5C1D0,H8EF5C1E8,H8EF5C1F8,H8EF5C208,A9:undefined
> > >
> > > I'm not sure what you mean with "atom" and "binaries".. can you explain?
> > >
> > > > You should then be able to request that bucket-key combination with a
> > > > content-type of "text/plain" to see a list of its siblings, if the fsm
> > > > doesn't crash again (which may indeed happen, because despite only
> > > > asking for the siblings, the fsm is asked for the whole object).
> > >
> > > Shouldn't it be possible to do the GET with a R=1 on the node with the
> > > big object so that it is not a huge problem?
> > >
> > > Cheers
> > > Simon
> > >
> > > > Sam
> > > >
> > > > On Thursday, 29 August 2013 at 3:18AM, Simon Effenberg wrote:
> > > >
> > > > > And here the log files when it crashes (OOM)..
> > > > >
> > > > > On Thu, 29 Aug 2013 09:01:35 +0200
> > > > > Simon Effenberg <[email protected]> wrote:
> > > > >
> > > > > > At least one of my 18 nodes is crashing soonish after I started it.
> > > > > >
> > > > > > I attached the app.config and vm.args files.
> > > > > >
> > > > > > If it is a big object (or multiple of them) then I have no clue how
> > > > > > to find out what object it is. Our secondary indexes are somehow
> > > > > > b0rked because I can't find a new imported object even if I search
> > > > > > with the exact createdat_int index which is returned by a HEAD/GET
> > > > > > request to the object itself.
> > > > > >
> > > > > > Any help is much appreciated..
> > > > > > Simon
> > > > > >
> > > > > > On Thu, 29 Aug 2013 07:26:31 +0200
> > > > > > Simon Effenberg <[email protected]> wrote:
> > > > > >
> > > > > > > allow_mult is on and I looked into the graphs.. we have had some
> > > > > > > siblings while having big objects but it's "only" a max of 35
> > > > > > > siblings found within one GET request. But I can't say if this
> > > > > > > is "one" GET request having 35 siblings or 35 GET requests each
> > > > > > > having 1 sibling.
> > > > > > >
> > > > > > > Also the question: in the erl_crash.dmp I have almost all data
> > > > > > > at the end (> 500MB) which is a huge multiline but with
> > > > > > > reaaaaaallyyy long lines of hexadecimal numbers like
> > > > > > >
> > > > > > > 36F6465223A223730303531393935227D2C7B226163636F756E74223A22313
> > > > > > >
> > > > > > > can I get therein (crash dump file) any hint about the object
> > > > > > > which caused the crash?
> > > > > > >
> > > > > > > Cheers
> > > > > > > Simon
> > > > > > >
> > > > > > > On Wed, 28 Aug 2013 19:05:23 -0400
> > > > > > > Sam Elliott <[email protected]> wrote:
> > > > > > >
> > > > > > > > Simon,
> > > > > > > >
> > > > > > > > This sounds like it might be some kind of sibling explosion. Do
> > > > > > > > you have allow_mult=true set on any buckets? If so, are you
> > > > > > > > resolving every single time you read the object? What's your
> > > > > > > > regular object size (before you started seeing big objects)?
> > > > > > > >
> > > > > > > > There's some more info in our docs [1] - search for "siblings"
> > > > > > > > for the stat names associated with them that might give you
> > > > > > > > some information.
> > > > > > > >
> > > > > > > > Sam
> > > > > > > >
> > > > > > > > [1]
> > > > > > > > http://docs.basho.com/riak/latest/ops/running/stats-and-monitoring/
> > > > > > > >
> > > > > > > > On Wednesday, 28 August 2013 at 6:50PM, Simon Effenberg wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > we have suddenly (regarding to riak stats) really big objects
> > > > > > > > > (100th percentile of object size) of 400MB to 900MB.
> > > > > > > > >
> > > > > > > > > We have no clue from where or how this could be.. is it
> > > > > > > > > somehow possible to identify them easily?
> > > > > > > > >
> > > > > > > > > Cheers
> > > > > > > > > Simon
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > riak-users mailing list
> > > > > > > > > [email protected]
> > > > > > > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > > > > > >
> > > > > > > --
> > > > > > > Simon Effenberg | Site Ops Engineer | mobile.international GmbH
> > > > > > > Fon: + 49-(0)30-8109 - 7173
> > > > > > > Fax: + 49-(0)30-8109 - 7131
> > > > > > >
> > > > > > > Mail: [email protected]
> > > > > > > Web: www.mobile.de
> > > > > > >
> > > > > > > Marktplatz 1 | 14532 Europarc Dreilinden | Germany
> > > > > > >
> > > > > > > Geschäftsführer: Malte Krüger
> > > > > > > HRB Nr.: 18517 P, Amtsgericht Potsdam
> > > > > > > Sitz der Gesellschaft: Kleinmachnow
> > > > >
> > > > > Attachments:
> > > > > - console.log
> > > > > - crash.log
> > > > > - erlang.log.1
> > > > > - error.log
