Re: riak bitcask calculation

2016-07-21 Thread Luke Bakken
Hi Travis,

The memory used by Riak (beam.smp) is for more than key storage, of course.

You can assume that, in your case, RES will climb back to about the same
level after a reboot, given a similar workload in Riak.
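
For reference, here's a minimal sketch of what the Bitcask capacity-planning
calculation estimates for the keydir alone. The ~44.5-byte static per-key
overhead is the figure from the capacity-planning docs; the key count, n_val
and average bucket+key size below are made-up examples:

    // Rough Bitcask keydir sizing sketch; all inputs are hypothetical.
    public class BitcaskKeydirEstimate {
        public static void main(String[] args) {
            double perKeyOverhead = 44.5;     // static keydir overhead per key (approx.)
            long totalKeys = 100_000_000L;    // hypothetical total objects
            int nVal = 3;                     // replicas per object
            double avgBucketKeyBytes = 32.0;  // assumed avg bucket name + key length

            double bytes = nVal * totalKeys * (perKeyOverhead + avgBucketKeyBytes);
            System.out.printf("Estimated keydir RAM: %.1f GiB%n",
                    bytes / (1024.0 * 1024.0 * 1024.0));
        }
    }

Whatever that prints is only the key-storage estimate; beam.smp's RES also
covers the Erlang VM, buffers, and everything else Riak holds in memory.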

--
Luke Bakken
Engineer
lbak...@basho.com

On Mon, Jul 18, 2016 at 8:54 AM, Travis Kirstine
 wrote:
> Yes, the reason I'm concerned is that we projected much lower memory usage
> based on the calculations. We originally provisioned 2x the required memory,
> and it appears that this will not be enough.
>
> Am I correct that the RES memory reported by top for the beam.smp process is
> the memory being used by Riak for "key storage", and that if the server were
> rebooted, the memory would eventually climb back to this level?
>
> Thanks for your help
>
> -----Original Message-----
> From: Luke Bakken [mailto:lbak...@basho.com]
> Sent: July-18-16 11:35 AM
> To: Travis Kirstine 
> Cc: riak-users@lists.basho.com; ac...@jdbarnes.com
> Subject: Re: riak bitcask calculation
>
> Hi Travis -
>
> The calculation provided for Bitcask memory consumption is only a rough
> guideline. Using more memory than the calculation suggests is normal and
> expected with Riak. As you increase load on this cluster, memory use may go
> up further as the operating system manages disk operations and buffers.
>
> Is there a reason you're concerned about this usage?
>
> On Mon, Jul 18, 2016 at 8:28 AM, Travis Kirstine 
>  wrote:
>> Yes from the free command
>>
>> [root@riak1 ~]# free -g
>>               total        used        free      shared  buff/cache   available
>> Mem:             45           9           0           0          36          35
>> Swap:            23           0          22
>>
>> Or from top
>>
>>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>> 24421 riak      20   0 16.391g 8.492g  41956 S  82.7 18.5  10542:10 beam.smp

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: How to best store arbitrarily large Java objects

2016-07-21 Thread Alex Moore
Hi Henning,

Responses inline:

...

> However, depending on the size of the `TreeMap`, the serialization
> output can become rather large, and this limits the usefulness of my
> object. In our tests, dealing with Riak-objects >2MB proved to be
> significantly slower than dealing with objects <200kB.


Yes. We usually recommend keeping objects < 100kB for the best
performance. Riak can usually withstand objects up to 1MB, with the
understanding that everything will be a little slower with larger
objects moving around the system.


> My idea was to use a converter that splits the serialized JSON into
> chunks during _write_, and uses links to point from one chunk to the
> next. During _fetch_ the links would be traversed, the JSON string
> concatenated from chunks, deserialized and the object would be
> returned. Looking at `com.basho.riak.client.api.convert.Converter`, it
> seems this is not going to work.


Link walking was deprecated in Riak 2.0, so I wouldn't do it that way.
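
If you do go the chunking route, you don't need links at all: write the
chunks under derived keys and keep a small manifest object that lists them.
A rough sketch with the Java client (the key scheme and chunk size are made
up, and error handling is omitted):

    import com.basho.riak.client.api.RiakClient;
    import com.basho.riak.client.api.commands.kv.StoreValue;
    import com.basho.riak.client.core.query.Location;
    import com.basho.riak.client.core.query.Namespace;
    import com.basho.riak.client.core.query.RiakObject;
    import com.basho.riak.client.core.util.BinaryValue;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class ChunkedWriter {
        static final int CHUNK_SIZE = 100 * 1024; // ~100kB, per the guidance above

        // Splits serialized JSON into chunks, stores each under a derived key,
        // and returns the chunk keys for a manifest object.
        static List<String> writeChunks(RiakClient client, Namespace ns,
                                        String baseKey, byte[] json) throws Exception {
            List<String> chunkKeys = new ArrayList<>();
            for (int off = 0, i = 0; off < json.length; off += CHUNK_SIZE, i++) {
                String chunkKey = baseKey + ":chunk:" + i; // made-up key scheme
                byte[] part = Arrays.copyOfRange(json, off,
                        Math.min(off + CHUNK_SIZE, json.length));
                RiakObject obj = new RiakObject()
                        .setContentType("application/octet-stream")
                        .setValue(BinaryValue.create(part));
                client.execute(new StoreValue.Builder(obj)
                        .withLocation(new Location(ns, chunkKey))
                        .build());
                chunkKeys.add(chunkKey);
            }
            // Store the manifest (e.g. a JSON list of chunkKeys) at (ns, baseKey);
            // fetching reads the manifest first, then the chunks, then reassembles.
            return chunkKeys;
        }
    }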

> I'm beginning to think that I'll need to remodel my data and use CRDTs
> for individual fields such as the `TreeMap`. Would that be a better
> way?


This sounds like a plausible idea. If you do a lot of possibly conflicting
updates to the tree, then a CRDT map would be the way to go. You could
reuse the key from the main object, and just put it in the new
bucket type/bucket.
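
For example, with the Java client's datatype commands (the "maps" bucket
type is an assumption and has to be created with datatype = map first; the
key and field names are placeholders):

    import com.basho.riak.client.api.RiakClient;
    import com.basho.riak.client.api.commands.datatypes.MapUpdate;
    import com.basho.riak.client.api.commands.datatypes.RegisterUpdate;
    import com.basho.riak.client.api.commands.datatypes.UpdateMap;
    import com.basho.riak.client.core.query.Location;
    import com.basho.riak.client.core.query.Namespace;
    import com.basho.riak.client.core.util.BinaryValue;

    public class TreeMapAsCrdt {
        public static void main(String[] args) throws Exception {
            RiakClient client = RiakClient.newClient("127.0.0.1");
            try {
                // Reuse the main object's key in a map bucket type/bucket.
                Location loc = new Location(
                        new Namespace("maps", "tree_data"), "my-main-object-key");

                // One register per TreeMap entry; concurrent updates to
                // different entries converge instead of conflicting.
                MapUpdate update = new MapUpdate()
                        .update("someTreeKey",
                                new RegisterUpdate(BinaryValue.create("someValue")));
                client.execute(new UpdateMap.Builder(loc, update).build());
            } finally {
                client.shutdown();
            }
        }
    }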

If you don't need to update the tree much, you could also just serialize
the tree into its own object: split the static data from the often-updated
data, and put them in different buckets that share the same key.
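
In code terms, that's just two namespaces sharing one key (bucket names
here are invented):

    import com.basho.riak.client.core.query.Location;
    import com.basho.riak.client.core.query.Namespace;

    public class SplitModel {
        public static void main(String[] args) {
            // Same key, two buckets: each half can be fetched and updated
            // independently, so rewrites only touch the small, hot object.
            String key = "user-42";
            Location staticPart = new Location(
                    new Namespace("default", "profiles_static"), key); // big, rarely rewritten
            Location hotPart = new Location(
                    new Namespace("default", "profiles_hot"), key);    // small, often updated
            System.out.println(staticPart + " / " + hotPart);
        }
    }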

Thanks,
Alex


On Thu, Jul 21, 2016 at 9:36 AM, Henning Verbeek 
wrote:

> I have a Java class, which is being stored in Riak. The class contains
> a `TreeMap` field, amongst other fields. Out of the box, Riak is
> converting the object to/from JSON. Everything works fine.
>
> However, depending on the size of the `TreeMap`, the serialization
> output can become rather large, and this limits the usefulness of my
> object. In our tests, dealing with Riak-objects >2MB proved to be
> significantly slower than dealing with objects <200kB.
>
> So, in order to store/fetch instances of my class with arbitrary
> sizes, but with reliable performance, I believe I need to split the
> output into separate Riak-objects after serialization, and reassemble
> before deserialization.
>
> My idea was to use a converter that splits the serialized JSON into
> chunks during _write_, and uses links to point from one chunk to the
> next. During _fetch_ the links would be traversed, the JSON string
> concatenated from chunks, deserialized and the object would be
> returned. Looking at `com.basho.riak.client.api.convert.Converter`, it
> seems this is not going to work.
>
> I'm beginning to think that I'll need to remodel my data and use CRDTs
> for individual fields such as the `TreeMap`. Would that be a better
> way?
>
> Any other recommendations would be much appreciated.
>
> Thanks,
> Henning
> --
> My other signature is a regular expression.
> http://www.pray4snow.de
>
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


How to best store arbitrarily large Java objects

2016-07-21 Thread Henning Verbeek
I have a Java class, which is being stored in Riak. The class contains
a `TreeMap` field, amongst other fields. Out of the box, Riak is
converting the object to/from JSON. Everything works fine.

However, depending on the size of the `TreeMap`, the serialization
output can become rather large, and this limits the usefulness of my
object. In our tests, dealing with Riak-objects >2MB proved to be
significantly slower than dealing with objects <200kB.

So, in order to store/fetch instances of my class with arbitrary
sizes, but with reliable performance, I believe I need to split the
output into separate Riak-objects after serialization, and reassemble
before deserialization.

My idea was to use a converter that splits the serialized JSON into
chunks during _write_, and uses links to point from one chunk to the
next. During _fetch_ the links would be traversed, the JSON string
concatenated from chunks, deserialized and the object would be
returned. Looking at `com.basho.riak.client.api.convert.Converter`, it
seems this is not going to work.

I'm beginning to think that I'll need to remodel my data and use CRDTs
for individual fields such as the `TreeMap`. Would that be a better
way?

Any other recommendations would be much appreciated.

Thanks,
Henning
-- 
My other signature is a regular expression.
http://www.pray4snow.de

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com