Oh, I used crash.log. That might make a difference. As for looking at the size of objects, someone might have scripts for interrogating bitcask/leveldb directly, but I'm not sure. I also don't know what data is in the AAE tree, so hopefully someone else on the ML will know.
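In case an Erlang shell isn't handy, here is a rough Python sketch of the decoding step described in the quoted messages below: it looks for tuples that start with r_object and turns the two byte-list binaries that follow into bucket and key strings. It assumes the binaries are printed as comma-separated byte values, as in the log excerpt quoted below. The function names are made up for illustration and nothing here is part of Riak.

```python
import re
from typing import Iterator, Set, Tuple

# Start of an r_object record as printed in the log excerpt from this thread:
#   {r_object,<<99,111,...>>,<<100,107,...>>,
# Assumption: both binaries are printed as comma-separated byte values.
R_OBJECT = re.compile(r"\{r_object,\s*<<([\d,\s]*)>>,\s*<<([\d,\s]*)>>")


def decode_binary(byte_list: str) -> str:
    """Turn a string such as '99,111,110' into the text those bytes spell."""
    values = [int(tok) for tok in byte_list.split(",") if tok.strip()]
    return bytes(values).decode("utf-8", errors="replace")


def find_bucket_keys(text: str) -> Set[Tuple[str, str]]:
    """Return every distinct (bucket, key) pair found in a chunk of log text."""
    return {(decode_binary(b), decode_binary(k)) for b, k in R_OBJECT.findall(text)}


def scan_log(path: str) -> Iterator[Tuple[str, str]]:
    """Yield distinct (bucket, key) pairs from a log file, one line at a time."""
    seen = set()
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            for pair in find_bucket_keys(line):
                if pair not in seen:
                    seen.add(pair)
                    yield pair


# The tuple quoted further down in this thread, used as a self-check.
sample = ("{r_object,<<99,111,110,118,101,114,115,97,116,105,111,110>>,"
          "<<100,107,114,100,54,58,104,107,110,120,110,114,115,49>>,")
print(find_bucket_keys(sample))
# {('conversation', 'dkrd6:hknxnrs1')}
```

Calling scan_log("crash.log") would walk a real file line by line, so the whole log never has to fit in memory. A tuple that is wrapped across several lines would be missed by this sketch.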
Sam

--
Sam Elliott
Engineer
[email protected]
--

On Thursday, 29 August 2013 at 11:21AM, Simon Effenberg wrote:

> Thanks for the explanation! But isn't it possible to get this
> information out of the r_object in the erl_crash.dmp? Maybe I missed
> it, but the crash information didn't help.
>
> Is it somehow possible to find out the size of objects? Maybe it is
> stored in the AAE data? (If yes, how could we get it out?)
>
> Cheers
> Simon
>
> On Thu, 29 Aug 2013 10:16:03 -0400
> Sam Elliott <[email protected]> wrote:
>
> > Given your confusion about atoms and binaries below, first read [1], so
> > you're up to speed on the various Erlang types.
> >
> > Now, looking back at your logs, there are instances of the atom "r_object"
> > at the start of a tuple. This is a specific record that riak uses
> > internally to store both the metadata and the binary contents of a
> > bucket-key-value item.
> >
> > A little grepping through the riak_kv repository [2] turns up a
> > reference to this "r_object" atom as part of a record [3]. Records are just
> > tuples with a specific atom at the start. The record specification says
> > that the first field is the bucket, and the second is the key (both as
> > binary data).
> >
> > So, going back to your logs, where you found "r_object", the second and
> > third items in those tuples are the binaries representing your bucket and
> > your key (the first is the name of the record, the atom "r_object").
> >
> > If you copy and paste them into a new erlang shell, putting a "."
> > afterwards, the shell will print them out as strings, like in [4].
> >
> > I had a head start because I know the riak_kv repository and how it works,
> > but hopefully this will help.
> >
> > Yes, your nodes will start to OOM if you don't have enough RAM to fetch
> > these 500-900MB items.
> >
> > That's why you should resolve siblings locally *every* time you read a
> > value from riak, and once you've done your application-level modifications
> > to the resolved object, write it back. Depending on your riak client, it
> > will either have a nice resolution system, or allow you to write one
> > yourself.
> >
> > Sam
> >
> > [1]: http://learnyousomeerlang.com/starting-out-for-real
> > [2]: https://github.com/basho/riak_kv
> > [3]: https://github.com/basho/riak_kv/blob/develop/src/riak_object.erl#L44-L51
> > [4]: https://gist.github.com/lenary/6378560
> >
> > On Thursday, 29 August 2013 at 9:49AM, Simon Effenberg wrote:
> >
> > > Thanks! But could you explain in detail how you got from these numbers
> > > in my previous mail to the bucket and key? I have plenty of them in the
> > > crash dump and don't want to paste them all here :)
> > >
> > > Cheers
> > > Simon
> > >
> > > On Thu, 29 Aug 2013 08:59:15 -0400
> > > Sam Elliott <[email protected]> wrote:
> > >
> > > > So I found the following:
> > > >
> > > > {r_object,<<99,111,110,118,101,114,115,97,116,105,111,110>>,<<100,107,114,100,54,58,104,107,110,120,110,114,115,49>>,
> > > > ...
> > > >
> > > > Which turns into:
> > > > Bucket: <<"conversation">>
> > > > Key: <<"dkrd6:hknxnrs1">>
> > > >
> > > > Requesting with an R=1 may not give you exactly what you want, as N
> > > > requests will be made, but only R will be waited upon.
> > > >
> > > > Sam
> > > >
> > > > On Thursday, 29 August 2013 at 8:53AM, Simon Effenberg wrote:
> > > >
> > > > > Hi Sam,
> > > > >
> > > > > thanks for the answer.. see my questions inline..
> > > > >
> > > > > On Thu, 29 Aug 2013 08:44:10 -0400
> > > > > Sam Elliott <[email protected]> wrote:
> > > > >
> > > > > > Hey Simon,
> > > > > >
> > > > > > If you grep that logfile for "r_object", you'll find a tuple where
> > > > > > that is the first atom, followed by two binaries. The first is the
> > > > > > bucket, the second is the key.
> > > > >
> > > > > I grepped and found stuff like this:
> > > > >
> > > > > 8EF5C098:t7:A8:r_object,H8EF5C1B8,H8EF5C1D0,H8EF5C1E8,H8EF5C1F8,H8EF5C208,A9:undefined
> > > > >
> > > > > I'm not sure what you mean by "atom" and "binaries".. can you
> > > > > explain?
> > > > >
> > > > > > You should then be able to request that bucket-key combination with
> > > > > > a content-type of "text/plain" to see a list of its siblings, if
> > > > > > the fsm doesn't crash again (which may indeed happen, because
> > > > > > despite only asking for the siblings, the fsm is asked for the
> > > > > > whole object).
> > > > >
> > > > > Shouldn't it be possible to do the GET with R=1 on the node with the
> > > > > big object so that it is not a huge problem?
> > > > >
> > > > > Cheers
> > > > > Simon
> > > > >
> > > > > > Sam
> > > > > >
> > > > > > On Thursday, 29 August 2013 at 3:18AM, Simon Effenberg wrote:
> > > > > >
> > > > > > > And here are the log files from when it crashes (OOM)..
> > > > > > >
> > > > > > > On Thu, 29 Aug 2013 09:01:35 +0200
> > > > > > > Simon Effenberg <[email protected]> wrote:
> > > > > > >
> > > > > > > > At least one of my 18 nodes is crashing soonish after I started
> > > > > > > > it.
> > > > > > > >
> > > > > > > > I attached the app.config and vm.args files.
> > > > > > > >
> > > > > > > > If it is a big object (or several of them) then I have no clue how to
> > > > > > > > find out which object it is. Our secondary indexes are somehow b0rked
> > > > > > > > because I can't find a newly imported object even if I search with the
> > > > > > > > exact createdat_int index which is returned by a HEAD/GET request to
> > > > > > > > the object itself.
> > > > > > > >
> > > > > > > > Any help is much appreciated..
> > > > > > > > Simon
> > > > > > > >
> > > > > > > > On Thu, 29 Aug 2013 07:26:31 +0200
> > > > > > > > Simon Effenberg <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > allow_mult is on and I looked into the graphs.. we have had some
> > > > > > > > > siblings while having big objects but it's "only" a max of 35 siblings
> > > > > > > > > found within one GET request. But I can't say if this is "one" GET
> > > > > > > > > request having 35 siblings or 35 GET requests each having 1 sibling.
> > > > > > > > >
> > > > > > > > > Another question: in the erl_crash.dmp almost all of the data is at the
> > > > > > > > > end (> 500MB), a huge multi-line block with reaaaaaallyyy long
> > > > > > > > > lines of hexadecimal numbers like
> > > > > > > > >
> > > > > > > > > 36F6465223A223730303531393935227D2C7B226163636F756E74223A22313
> > > > > > > > >
> > > > > > > > > Can I get any hint from it (the crash dump file) about the object which
> > > > > > > > > caused the crash?
> > > > > > > > >
> > > > > > > > > Cheers
> > > > > > > > > Simon
> > > > > > > > >
> > > > > > > > > On Wed, 28 Aug 2013 19:05:23 -0400
> > > > > > > > > Sam Elliott <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Simon,
> > > > > > > > > >
> > > > > > > > > > This sounds like it might be some kind of sibling
> > > > > > > > > > explosion. Do you have allow_mult=true set on any buckets?
> > > > > > > > > > If so, are you resolving every single time you read the
> > > > > > > > > > object? What's your regular object size (before you started
> > > > > > > > > > seeing big objects)?
> > > > > > > > > >
> > > > > > > > > > There's some more info in our docs [1] - search for
> > > > > > > > > > "siblings" for the stat names associated with them that
> > > > > > > > > > might give you some information.
> > > > > > > > > >
> > > > > > > > > > Sam
> > > > > > > > > >
> > > > > > > > > > [1] http://docs.basho.com/riak/latest/ops/running/stats-and-monitoring/
> > > > > > > > > >
> > > > > > > > > > On Wednesday, 28 August 2013 at 6:50PM, Simon Effenberg wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > >
> > > > > > > > > > > we suddenly have (according to riak stats) really big
> > > > > > > > > > > objects (100th percentile of object size) of 400MB to 900MB.
> > > > > > > > > > >
> > > > > > > > > > > We have no clue where they come from or how this could
> > > > > > > > > > > happen.. is it somehow possible to identify them easily?
> > > > > > > > > > >
> > > > > > > > > > > Cheers
> > > > > > > > > > > Simon
> > > > > > > > > > >
> > > > > > > > > > > _______________________________________________
> > > > > > > > > > > riak-users mailing list
> > > > > > > > > > > [email protected]
> > > > > > > > > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Simon Effenberg | Site Ops Engineer | mobile.international GmbH
> > > > > > > > > Phone: + 49-(0)30-8109 - 7173
> > > > > > > > > Fax: + 49-(0)30-8109 - 7131
> > > > > > > > >
> > > > > > > > > Mail: [email protected]
> > > > > > > > > Web: www.mobile.de
> > > > > > > > >
> > > > > > > > > Marktplatz 1 | 14532 Europarc Dreilinden | Germany
> > > > > > > > >
> > > > > > > > > Managing Director: Malte Krüger
> > > > > > > > > Commercial register no.: HRB 18517 P, Amtsgericht Potsdam
> > > > > > > > > Registered office: Kleinmachnow
> > > > > > >
> > > > > > > Attachments:
> > > > > > > - console.log
> > > > > > > - crash.log
> > > > > > > - erlang.log.1
> > > > > > > - error.log

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
