Thanks for the explanation! But is it not possible to get this information out of the r_object in the erl_crash.dmp? Maybe I missed it, but the crash information didn't help.
Is it somehow possible to know the size of objects? Maybe stored in the AAE
data? (If yes, how could we get this out of it?)

Cheers
Simon

On Thu, 29 Aug 2013 10:16:03 -0400
Sam Elliott <[email protected]> wrote:

> Given your confusion about atoms and binaries below, firstly read [1], so
> you're up to speed about the various Erlang types.
>
> Now, looking back at your logs, there are instances of the atom "r_object" at
> the start of a tuple. This is a specific record that riak uses internally to
> store both the metadata and the binary contents of a bucket-key-value item.
>
> Now, a little grepping through the riak_kv repository [2], turns out a
> reference to this "r_object" atom as part of a record [3]. Records are just
> tuples with a specific atom at the start. The record specification says that
> the first field is the bucket, and the second is the key (both as binary
> data).
>
> So, going back to your logs, where you found "r_object", the second and third
> items in those tuples are the binaries representing your bucket and your key
> (the first is the name of the record, the atom "r_object").
>
> If you copy and paste them into a new erlang shell, putting a "." afterwards,
> the shell will print them out as strings, like in [4].
>
> I had a headstart because I know the riak_kv repository and how it works, but
> hopefully this will help.
>
> Yes, your nodes will start to OOM if you don't have enough RAM to fetch these
> 500-900MB items.
>
> That's why you should resolve siblings locally *every* time you read a value
> from riak, and once you've done your application-level modifications to the
> resolved object, write it back. Depending on your riak client, it will either
> have a nice resolution system, or allow you to write one yourself.
>
> Sam
>
> [1]: http://learnyousomeerlang.com/starting-out-for-real
> [2]: https://github.com/basho/riak_kv
> [3]: https://github.com/basho/riak_kv/blob/develop/src/riak_object.erl#L44-L51
> [4]: https://gist.github.com/lenary/6378560
>
> --
> Sam Elliott
> Engineer
> [email protected]
> --
>
> On Thursday, 29 August 2013 at 9:49AM, Simon Effenberg wrote:
>
> > Thanks! But could you explain in detail how you came to this numbers
> > from my previous mail? I have plenty of them in the crash dump and
> > don't want to put all of them herein :)
> >
> > Cheers
> > Simon
> >
> > On Thu, 29 Aug 2013 08:59:15 -0400
> > Sam Elliott <[email protected]> wrote:
> >
> > > So I found the following:
> > >
> > > {r_object,<<99,111,110,118,101,114,115,97,116,105,111,110>>,<<100,107,114,100,54,58,104,107,110,120,110,114,115,49>>,
> > > ...
> > >
> > > Which turns into:
> > > Bucket: <<"conversation">>
> > > Key: <<"dkrd6:hknxnrs1">>
> > >
> > > Requesting with an R=1 may not give you exactly what you want, as N
> > > requests will be made, but only R will be waited-upon.
> > >
> > > Sam
> > >
> > > On Thursday, 29 August 2013 at 8:53AM, Simon Effenberg wrote:
> > >
> > > > Hi Sam,
> > > >
> > > > thanks for the answer.. see my questions inline..
> > > >
> > > > On Thu, 29 Aug 2013 08:44:10 -0400
> > > > Sam Elliott <[email protected]> wrote:
> > > >
> > > > > Hey Simon,
> > > > >
> > > > > If you grep that logfile for "r_object", you'll find a tuple where
> > > > > that is the first atom, followed by two binaries. The first is the
> > > > > bucket, the second is the key.
> > > >
> > > > I greped and found stuff like this:
> > > >
> > > > 8EF5C098:t7:A8:r_object,H8EF5C1B8,H8EF5C1D0,H8EF5C1E8,H8EF5C1F8,H8EF5C208,A9:undefined
> > > >
> > > > I'm not sure what you mean with "atom" and "binaries".. can you explain?
> > > >
> > > > > You should then be able to request that bucket-key combination with a
> > > > > content-type of "text/plain" to see a list of its siblings, if the
> > > > > fsm doesn't crash again (which may indeed happen, because despite
> > > > > only asking for the siblings, the fsm is asked for the whole object).
> > > >
> > > > Shouldn't it be possible to do the GET with a R=1 on the node with the
> > > > big object so that it is not a huge problem?
> > > >
> > > > Cheers
> > > > Simon
> > > >
> > > > > Sam
> > > > >
> > > > > On Thursday, 29 August 2013 at 3:18AM, Simon Effenberg wrote:
> > > > >
> > > > > > And here the log files when it crashes (OOM)..
> > > > > >
> > > > > > On Thu, 29 Aug 2013 09:01:35 +0200
> > > > > > Simon Effenberg <[email protected]> wrote:
> > > > > >
> > > > > > > At least one of my 18 nodes is crashing soonish after I started
> > > > > > > it.
> > > > > > >
> > > > > > > I attached the app.config and vm.args files.
> > > > > > >
> > > > > > > If it is a big object (or multiple of them) then I have no clue
> > > > > > > how to find out what object it is. Our secondary indexes are
> > > > > > > somehow b0rked because I can't find a new imported object even if
> > > > > > > I search with the exakt createdat_int index which is returned by
> > > > > > > a HEAD/GET request to the object itself.
> > > > > > >
> > > > > > > Any help is much appreciated..
> > > > > > > Simon
> > > > > > >
> > > > > > > On Thu, 29 Aug 2013 07:26:31 +0200
> > > > > > > Simon Effenberg <[email protected]> wrote:
> > > > > > >
> > > > > > > > allow_multi is on and I looked into the graphs.. we have had
> > > > > > > > some siblings while having big objects but its "only" a max of
> > > > > > > > 35 siblings found within one GET request. But I can't say if
> > > > > > > > this is "one" GET request having 35 siblings or 35 GET requests
> > > > > > > > each having 1 sibling.
> > > > > > > >
> > > > > > > > Also the question: in the erl_crash.dmp I have almost all data
> > > > > > > > at the end (> 500MB) which is a huge multiline but with
> > > > > > > > reaaaaaallyyy long lines of hexadecimal numbers like
> > > > > > > >
> > > > > > > > 36F6465223A223730303531393935227D2C7B226163636F756E74223A22313
> > > > > > > >
> > > > > > > > can I get therein (crash dump file) any hint about the object
> > > > > > > > which caused the crash?
> > > > > > > >
> > > > > > > > Cheers
> > > > > > > > Simon
> > > > > > > >
> > > > > > > > On Wed, 28 Aug 2013 19:05:23 -0400
> > > > > > > > Sam Elliott <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Simon,
> > > > > > > > >
> > > > > > > > > This sounds like it might be some kind of sibling explosion.
> > > > > > > > > Do you have allow_mult=true set on any buckets? If so, are
> > > > > > > > > you resolving every single time you read the object? What's
> > > > > > > > > your regular object size (before you started seeing big
> > > > > > > > > objects)?
> > > > > > > > >
> > > > > > > > > There's some more info in our docs [1] - search for
> > > > > > > > > "siblings" for the stat names associated with them that might
> > > > > > > > > give you some information.
> > > > > > > > >
> > > > > > > > > Sam
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > > http://docs.basho.com/riak/latest/ops/running/stats-and-monitoring/
> > > > > > > > >
> > > > > > > > > On Wednesday, 28 August 2013 at 6:50PM, Simon Effenberg wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > we have suddenly (regarding to riak stats) really big
> > > > > > > > > > objects (100th percentile of object size) of 400MB to
> > > > > > > > > > 900MB.
> > > > > > > > > >
> > > > > > > > > > We have no clue from where or how this could be.. is it
> > > > > > > > > > somehow possible to identify them easily?
> > > > > > > > > >
> > > > > > > > > > Cheers
> > > > > > > > > > Simon
> > > > > > > > > >
> > > > > > > > > > _______________________________________________
> > > > > > > > > > riak-users mailing list
> > > > > > > > > > [email protected]
> > > > > > > > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > > > > > > >
> > > > > > > > --
> > > > > > > > Simon Effenberg | Site Ops Engineer | mobile.international GmbH
> > > > > > > > Fon: + 49-(0)30-8109 - 7173
> > > > > > > > Fax: + 49-(0)30-8109 - 7131
> > > > > > > >
> > > > > > > > Mail: [email protected]
> > > > > > > > Web: www.mobile.de
> > > > > > > >
> > > > > > > > Marktplatz 1 | 14532 Europarc Dreilinden | Germany
> > > > > > > >
> > > > > > > > Geschäftsführer: Malte Krüger
> > > > > > > > HRB Nr.: 18517 P, Amtsgericht Potsdam
> > > > > > > > Sitz der Gesellschaft: Kleinmachnow
> > > > > >
> > > > > > Attachments:
> > > > > > - console.log
> > > > > > - crash.log
> > > > > > - erlang.log.1
> > > > > > - error.log

--
Simon Effenberg | Site Ops Engineer | mobile.international GmbH
Fon: + 49-(0)30-8109 - 7173
Fax: + 49-(0)30-8109 - 7131

Mail: [email protected]
Web: www.mobile.de

Marktplatz 1 | 14532 Europarc Dreilinden | Germany

Geschäftsführer: Malte Krüger
HRB Nr.: 18517 P, Amtsgericht Potsdam
Sitz der Gesellschaft: Kleinmachnow

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
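
Editor's note on the r_object tuple quoted above: the thread suggests pasting the binaries into an Erlang shell to see them as strings. The same conversion can be sketched outside Erlang. What follows is a minimal Python sketch, assuming the binaries are written as comma-separated byte values between << and >> exactly as in the quoted tuple; the helper name decode_erlang_binaries is hypothetical and not part of Riak or Erlang.

```python
import re


def decode_erlang_binaries(term: str) -> list[str]:
    """Decode every <<b1,b2,...>> byte list found in `term` into a string.

    Assumption: each binary is a plain comma-separated list of integers
    in the range 0..255, as in the tuple quoted in the thread.
    """
    decoded = []
    for body in re.findall(r"<<([\d,\s]*)>>", term):
        values = [int(v) for v in body.split(",") if v.strip()]
        decoded.append(bytes(values).decode("utf-8", errors="replace"))
    return decoded


term = (
    "{r_object,<<99,111,110,118,101,114,115,97,116,105,111,110>>,"
    "<<100,107,114,100,54,58,104,107,110,120,110,114,115,49>>,"
)
bucket, key = decode_erlang_binaries(term)
print("Bucket:", bucket)
print("Key:", key)
```

For the tuple from the thread this yields the bucket "conversation" and the key "dkrd6:hknxnrs1", matching the values Sam reported.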
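
Editor's note on the long hexadecimal lines at the end of the crash dump: they may simply be object payload printed as hex, in which case decoding a fragment can hint at which data is involved. This is a minimal Python sketch under two assumptions that the thread does not confirm: that the payload is ASCII-like text, and that a copied fragment may start in the middle of a byte. The helper name decode_hex_fragment is hypothetical. It tries both nibble alignments and keeps the one with more printable bytes.

```python
def decode_hex_fragment(fragment: str) -> str:
    """Decode a hex fragment that may begin mid-byte.

    Both possible alignments (offset 0 and offset 1) are decoded, and the
    one producing more printable ASCII bytes is returned.
    """
    candidates = []
    for offset in (0, 1):
        body = fragment[offset:]
        # Drop a dangling half-byte at the end, if any.
        body = body[: len(body) - len(body) % 2]
        raw = bytes.fromhex(body)
        printable = sum(32 <= b < 127 for b in raw)
        candidates.append((printable, raw))
    best = max(candidates, key=lambda c: c[0])[1]
    return best.decode("ascii", errors="replace")


fragment = "36F6465223A223730303531393935227D2C7B226163636F756E74223A22313"
print(decode_hex_fragment(fragment))
```

Applied to the fragment quoted in the thread, this prints ode":"70051995"},{"account":"1, which looks like part of a JSON document, so the stored values themselves may be recoverable from the dump this way.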
