Oh, I used crash.log, not erl_crash.dmp. That might make a difference.

As for finding the size of objects: someone might have scripts for 
interrogating bitcask/leveldb directly, but I'm not sure. I also don't know what 
data is in the AAE tree, so hopefully someone else on the ML does.

Sam  

--  
Sam Elliott
Engineer
[email protected]
--


On Thursday, 29 August 2013 at 11:21AM, Simon Effenberg wrote:

> Thanks for the explanation! But isn't it possible to get this
> information out of the r_object in the erl_crash.dmp? Maybe I missed
> something, but the crash information didn't help.
>  
> Is it somehow possible to find out the size of objects? Maybe it is
> stored in the AAE data? (If yes, how could we get it out?)
>  
> Cheers
> Simon
>  
> On Thu, 29 Aug 2013 10:16:03 -0400
> Sam Elliott <[email protected]> wrote:
>  
> > Given your confusion about atoms and binaries below, firstly read [1], so 
> > you're up to speed about the various Erlang types.
> >  
> > Now, looking back at your logs, there are instances of the atom "r_object" 
> > at the start of a tuple. This is a specific record that riak uses 
> > internally to store both the metadata and the binary contents of a 
> > bucket-key-value item.
> >  
> > Now, a little grepping through the riak_kv repository [2] turns up a 
> > reference to this "r_object" atom as part of a record [3]. Records are just 
> > tuples with a specific atom at the start. The record specification says 
> > that the first field is the bucket, and the second is the key (both as 
> > binary data).
> >  
> > So, going back to your logs, where you found "r_object", the second and 
> > third items in those tuples are the binaries representing your bucket and 
> > your key (the first is the name of the record, the atom "r_object").  
> >  
> > If you copy and paste them into a new erlang shell, putting a "." 
> > afterwards, the shell will print them out as strings, like in [4].
> >  
> > I had a headstart because I know the riak_kv repository and how it works, 
> > but hopefully this will help.
> >  
> > Yes, your nodes will start to OOM if you don't have enough RAM to fetch 
> > these 500-900MB items.  
> >  
> > That's why you should resolve siblings locally *every* time you read a 
> > value from riak, and once you've done your application-level modifications 
> > to the resolved object, write it back. Depending on your riak client, it 
> > will either have a nice resolution system, or allow you to write one 
> > yourself.
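
A minimal sketch of that read-resolve-modify-write cycle, in Python. The dict-backed `store`, the function names, the sample message ids, and the set-union merge are all hypothetical stand-ins for illustration, not any Riak client's API. With a real client, the write would also need to carry the causal context (vector clock) returned by the read.

```python
from functools import reduce


def resolve_siblings(siblings, merge):
    """Collapse all sibling values into one by repeatedly applying merge."""
    if not siblings:
        raise ValueError("nothing to resolve")
    return reduce(merge, siblings)


def read_modify_write(store, key, modify, merge):
    """Read every sibling, resolve, apply the change, write one value back."""
    siblings = store[key]                      # a read returns ALL siblings
    resolved = resolve_siblings(siblings, merge)
    updated = modify(resolved)
    store[key] = [updated]                     # the write replaces the siblings
    return updated


# Hypothetical data: three diverged copies of one conversation's message ids.
store = {"dkrd6:hknxnrs1": [{"m1"}, {"m1", "m2"}, {"m3"}]}
result = read_modify_write(
    store,
    "dkrd6:hknxnrs1",
    modify=lambda msgs: msgs | {"m4"},         # the application-level change
    merge=lambda a, b: a | b,                  # resolution strategy: set union
)
```

The point of the sketch is the ordering: resolution happens on every read, before the application change, so the value written back is a single object rather than one more sibling.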
> >  
> > Sam
> >  
> > [1]: http://learnyousomeerlang.com/starting-out-for-real  
> > [2]: https://github.com/basho/riak_kv  
> > [3]: https://github.com/basho/riak_kv/blob/develop/src/riak_object.erl#L44-L51
> > [4]: https://gist.github.com/lenary/6378560
> >  
> > On Thursday, 29 August 2013 at 9:49AM, Simon Effenberg wrote:
> >  
> > > Thanks! But could you explain in detail how you got these values
> > > from my previous mail? I have plenty of them in the crash dump and
> > > don't want to paste all of them here :)
> > >  
> > > Cheers
> > > Simon
> > >  
> > > On Thu, 29 Aug 2013 08:59:15 -0400
> > > Sam Elliott <[email protected]> wrote:
> > >  
> > > > So I found the following:
> > > >  
> > > > {r_object,<<99,111,110,118,101,114,115,97,116,105,111,110>>,<<100,107,114,100,54,58,104,107,110,120,110,114,115,49>>,
> > > >  ...
> > > >  
> > > > Which turns into:
> > > > Bucket: <<"conversation">>
> > > > Key: <<"dkrd6:hknxnrs1">>
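
For anyone without an Erlang shell at hand, the same conversion can be sketched in Python. The function name `decode_erlang_binary` is made up for this example, and it only handles the plain `<<n,n,...>>` byte-list form shown above.

```python
def decode_erlang_binary(term: str) -> str:
    """Turn the printed form of an Erlang binary, e.g. '<<99,111>>', into text."""
    text = term.strip()
    if not (text.startswith("<<") and text.endswith(">>")):
        raise ValueError("expected something like <<99,111,110>>")
    body = text[2:-2].strip()
    if not body:
        return ""
    # Each comma-separated number is one byte of the binary.
    return bytes(int(part) for part in body.split(",")).decode("utf-8")


bucket = decode_erlang_binary("<<99,111,110,118,101,114,115,97,116,105,111,110>>")
key = decode_erlang_binary("<<100,107,114,100,54,58,104,107,110,120,110,114,115,49>>")
print(bucket, key)  # conversation dkrd6:hknxnrs1
```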
> > > >  
> > > > Requesting with R=1 may not give you exactly what you want: N 
> > > > requests will still be made, but only R responses will be waited on.  
> > > >  
> > > > Sam
> > > >  
> > > > On Thursday, 29 August 2013 at 8:53AM, Simon Effenberg wrote:
> > > >  
> > > > > Hi Sam,
> > > > >  
> > > > > Thanks for the answer. See my questions inline.
> > > > >  
> > > > >  
> > > > > On Thu, 29 Aug 2013 08:44:10 -0400
> > > > > Sam Elliott <[email protected]> wrote:
> > > > >  
> > > > > > Hey Simon,  
> > > > > >  
> > > > > > If you grep that logfile for "r_object", you'll find a tuple where 
> > > > > > that is the first atom, followed by two binaries. The first is the 
> > > > > > bucket, the second is the key.  
> > > > >  
> > > > >  
> > > > >  
> > > > > I grepped and found stuff like this:
> > > > >  
> > > > > 8EF5C098:t7:A8:r_object,H8EF5C1B8,H8EF5C1D0,H8EF5C1E8,H8EF5C1F8,H8EF5C208,A9:undefined
> > > > >  
> > > > > I'm not sure what you mean by "atom" and "binaries". Can you 
> > > > > explain?
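
A rough Python sketch for picking that line apart. The layout it assumes (`ADDRESS:tARITY:ELEMENTS`, with `A<length>:<name>` for an atom and `H<address>` for a pointer to another term on the heap, all numbers in hexadecimal) is inferred from this one line and is not a documented contract; the function name is made up. Under that reading, `r_object` is the atom, and the first two `H` pointers are where the bucket and key binaries would be found.

```python
def parse_tuple_line(line):
    """Split one heap line into (address, list of (kind, value) elements)."""
    address, kind, rest = line.split(":", 2)
    if not kind.startswith("t"):
        raise ValueError("not a tuple line")
    arity = int(kind[1:], 16)  # assumed to be hexadecimal
    elements, pos = [], 0
    while pos < len(rest):
        if rest[pos] == "A":  # atom: A<hex length>:<name>
            colon = rest.index(":", pos)
            length = int(rest[pos + 1:colon], 16)
            elements.append(("atom", rest[colon + 1:colon + 1 + length]))
            pos = colon + 1 + length
        else:  # anything else runs up to the next comma
            end = rest.find(",", pos)
            end = len(rest) if end == -1 else end
            field = rest[pos:end]
            kind_name = "heap_ref" if field.startswith("H") else "other"
            elements.append((kind_name, field))
            pos = end
        pos += 1  # skip the separating comma
    if len(elements) != arity:
        raise ValueError("arity mismatch")
    return address, elements


line = "8EF5C098:t7:A8:r_object,H8EF5C1B8,H8EF5C1D0,H8EF5C1E8,H8EF5C1F8,H8EF5C208,A9:undefined"
address, elements = parse_tuple_line(line)
```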
> > > > >  
> > > > > >  
> > > > > > You should then be able to request that bucket-key combination with 
> > > > > > a content-type of "text/plain" to see a list of its siblings, if 
> > > > > > the fsm doesn't crash again (which may indeed happen, because 
> > > > > > despite only asking for the siblings, the fsm is asked for the 
> > > > > > whole object).
> > > > >  
> > > > >  
> > > > >  
> > > > > Shouldn't it be possible to do the GET with R=1 on the node with the
> > > > > big object, so that it is not a huge problem?
> > > > >  
> > > > > Cheers
> > > > > Simon
> > > > >  
> > > > > >  
> > > > > > Sam  
> > > > > >  
> > > > > > On Thursday, 29 August 2013 at 3:18AM, Simon Effenberg wrote:
> > > > > >  
> > > > > > > And here are the log files from when it crashes (OOM).
> > > > > > >  
> > > > > > > On Thu, 29 Aug 2013 09:01:35 +0200
> > > > > > > Simon Effenberg <[email protected]> wrote:
> > > > > > >  
> > > > > > > > At least one of my 18 nodes crashes soon after I start it.
> > > > > > > >  
> > > > > > > > I attached the app.config and vm.args files.
> > > > > > > >  
> > > > > > > > If it is a big object (or several of them), then I have no clue
> > > > > > > > how to find out which object it is. Our secondary indexes are
> > > > > > > > somehow b0rked, because I can't find a newly imported object even
> > > > > > > > if I search with the exact createdat_int index value that is
> > > > > > > > returned by a HEAD/GET request to the object itself.
> > > > > > > >  
> > > > > > > > Any help is much appreciated..
> > > > > > > > Simon
> > > > > > > >  
> > > > > > > > On Thu, 29 Aug 2013 07:26:31 +0200
> > > > > > > > Simon Effenberg <[email protected]> wrote:
> > > > > > > >  
> > > > > > > > > allow_mult is on and I looked into the graphs. We have had
> > > > > > > > > some siblings while having big objects, but it's "only" a max
> > > > > > > > > of 35 siblings found within one GET request. I can't say
> > > > > > > > > whether this is "one" GET request having 35 siblings or 35 GET
> > > > > > > > > requests each having 1 sibling.
> > > > > > > > >  
> > > > > > > > > Another question: in the erl_crash.dmp, almost all of the data
> > > > > > > > > (> 500MB) is at the end, as a huge multi-line block with
> > > > > > > > > really long lines of hexadecimal digits like
> > > > > > > > >  
> > > > > > > > > 36F6465223A223730303531393935227D2C7B226163636F756E74223A22313
> > > > > > > > >  
> > > > > > > > > Can I get any hint from the crash dump file about the object
> > > > > > > > > that caused the crash?
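
Those hex lines look like raw object bytes, so decoding them can at least show what kind of data is involved. A small Python sketch follows; the function name is made up, and the guess that a fragment may begin in the middle of a byte is an assumption based on the sample above, which only decodes to readable text when the first digit is skipped.

```python
import string

PRINTABLE = set(string.printable.encode("ascii"))


def decode_hex_fragment(fragment: str) -> str:
    """Decode a hex fragment that may start in the middle of a byte.

    Tries both nibble alignments and keeps the one with more printable bytes.
    """
    candidates = []
    for offset in (0, 1):
        digits = fragment[offset:]
        digits = digits[: len(digits) - len(digits) % 2]  # drop a dangling nibble
        raw = bytes.fromhex(digits)
        score = sum(byte in PRINTABLE for byte in raw)
        candidates.append((score, raw))
    best = max(candidates, key=lambda item: item[0])[1]
    return best.decode("ascii", errors="replace")


sample = "36F6465223A223730303531393935227D2C7B226163636F756E74223A22313"
decoded = decode_hex_fragment(sample)
print(decoded)  # ode":"70051995"},{"account":"1
```

The decoded text is a piece of JSON, so grepping the decoded dump for a field that identifies a record would be one way to narrow down which object was in memory.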
> > > > > > > > >  
> > > > > > > > > Cheers
> > > > > > > > > Simon
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > On Wed, 28 Aug 2013 19:05:23 -0400
> > > > > > > > > Sam Elliott <[email protected]> wrote:
> > > > > > > > >  
> > > > > > > > > > Simon,
> > > > > > > > > >  
> > > > > > > > > > This sounds like it might be some kind of sibling 
> > > > > > > > > > explosion. Do you have allow_mult=true set on any buckets? 
> > > > > > > > > > If so, are you resolving every single time you read the 
> > > > > > > > > > object? What's your regular object size (before you started 
> > > > > > > > > > seeing big objects)?
> > > > > > > > > >  
> > > > > > > > > > There's some more info in our docs [1] - search for 
> > > > > > > > > > "siblings" for the stat names associated with them that 
> > > > > > > > > > might give you some information.
> > > > > > > > > >  
> > > > > > > > > > Sam
> > > > > > > > > >  
> > > > > > > > > > [1] http://docs.basho.com/riak/latest/ops/running/stats-and-monitoring/
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > > On Wednesday, 28 August 2013 at 6:50PM, Simon Effenberg wrote:
> > > > > > > > > >  
> > > > > > > > > > > Hi,
> > > > > > > > > > >  
> > > > > > > > > > > we suddenly have (according to the riak stats) really big
> > > > > > > > > > > objects (100th percentile of object size) of 400MB to 900MB.
> > > > > > > > > > >  
> > > > > > > > > > > We have no clue where they come from or how this could
> > > > > > > > > > > happen. Is it somehow possible to identify them easily?
> > > > > > > > > > >  
> > > > > > > > > > > Cheers
> > > > > > > > > > > Simon
> > > > > > > > > > >  
> > > > > > > > > > > _______________________________________________
> > > > > > > > > > > riak-users mailing list
> > > > > > > > > > > [email protected]
> > > > > > > > > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > > > > > > > > >  
> > > > > > > > > --  
> > > > > > > > > Simon Effenberg | Site Ops Engineer | mobile.international GmbH
> > > > > > > > > Fon: + 49-(0)30-8109 - 7173
> > > > > > > > > Fax: + 49-(0)30-8109 - 7131
> > > > > > > > >  
> > > > > > > > > Mail: [email protected]
> > > > > > > > > Web: www.mobile.de (http://www.mobile.de)
> > > > > > > > >  
> > > > > > > > > Marktplatz 1 | 14532 Europarc Dreilinden | Germany
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Geschäftsführer: Malte Krüger
> > > > > > > > > HRB Nr.: 18517 P, Amtsgericht Potsdam
> > > > > > > > > Sitz der Gesellschaft: Kleinmachnow  
> > > > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > Attachments:
> > > > > > > - console.log
> > > > > > > - crash.log
> > > > > > > - erlang.log.1
> > > > > > > - error.log
> > > >  
> >  