Thanks Kresten! The challenges of fast development with aging docs and
trying to keep up with everything. This indeed is very different and
now makes a lot more sense and is back in line with our original
assumptions (which apparently were wrong back in the pre-1.0 days and
are correct now).
Just one more clarification - so if I have a 5 node cluster, and a pool
of protobuffer client processes which have been balanced in a round
robin manner across all nodes, does it matter if I have a worker
gen_server process <0.100.0> which performs write 1 to node A and then
write 2 to node B (not concurrently) using two different protobuffer
client processes (<0.101.0> and <0.102.0> respectively)? Are there any
caveats to this approach?
Also, we are on 1.2.
Cheers,
Bryan
On 9/3/12 1:37 PM, Kresten Krab Thorup wrote:
That comment is no longer correct since riak now (since 1.0 I believe) ignores
client IDs. See
https://github.com/basho/riak/blob/master/releasenotes/riak-1.0.org#getput-improvements
1. Siblings will only occur if you have allow_mult=true enabled on the bucket.
The following applies only in that case.
2. The way ordering is determined inside riak is by using the vector clock
coming with the write. The only way to get one is reading it from riak. Thus,
to do ordered writes you have to first read an object, modify it, then pass it
back in. If the object was written by someone else in the meantime you'll get
a sibling.
3. Passing in no vclock (a fresh riak_object) now also creates a sibling if
there is an existing object for the given key.
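
A minimal sketch of the read, modify, write cycle from point 2, assuming the
riakc Erlang client is on the code path and Pid is a connection opened with
riakc_pb_socket:start_link/2. The module name rmw_sketch and the function
update/4 are made up for illustration:

```erlang
-module(rmw_sketch).
-export([update/4]).

%% Fetch the object so we hold its vector clock, replace the value,
%% and write that same object back so Riak can order the write.
update(Pid, Bucket, Key, NewValue) ->
    case riakc_pb_socket:get(Pid, Bucket, Key) of
        {ok, Obj0} ->
            %% Obj0 carries the vclock returned by the read.
            Obj1 = riakc_obj:update_value(Obj0, NewValue),
            riakc_pb_socket:put(Pid, Obj1);
        {error, notfound} ->
            %% No existing object: a fresh riakc_obj (no vclock) is fine here.
            riakc_pb_socket:put(Pid, riakc_obj:new(Bucket, Key, NewValue));
        {error, _} = Err ->
            Err
    end.
```

With allow_mult=true, another writer that gets in between the get and the put
still produces a sibling; the sketch only makes sure the write carries the
vclock that was read.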
Kresten
Trifork
On 03/09/2012, at 21.33, "Bryan Hughes" <[email protected]> wrote:
Heh - I found my own answer staring me in the face -
http://wiki.basho.com/Vector-Clocks.html.
"Concurrent writes": If two writes occur simultaneously from clients with
different client IDs but the same vector clock value, Riak will not be able to
determine the correct object to store and the object is given two siblings.
These writes could happen to the same node or different ones.
Cheers,
Bryan
On 9/3/12 10:47 AM, Bryan Hughes wrote:
Thanks for the replies - this is very helpful. Our persistence abstraction
layer already sports a robust process pool as we do support other persistence
solutions (although Riak is our main gun). I just needed to understand the
relationship of the protobuffer client process to the Riak cluster as a whole.
I understand now that the client process binds to an individual node in the
cluster and not the cluster as a whole.
I wasn't sure whether there might be some logic somewhere that handled a type of proxy
(like Joe was referring to) so that each client connects to a single address
and that proxy implements the necessary routing.
Fortunately, I just need to add some round robin and affinity and load
balancing management to our persistence layer. From what I have been reading
(including basic-client.txt in the riak/doc), the key is to ensure the same
client binds to the same connection against the same node for subsequent writes?
For example, if I have 1000 gen_server processes each reading and writing
atomic values to the cluster, and a process uses connection X to node A for a
write of record 100, the next write of record 100 should be on the same
connection to the same node unless that node goes away.
If I am understanding this correctly, for process A writing record 1 to grab a
random connection to a random node and then writing record 2 on a different
connection to either the same node or different node will result in nothing but
siblings?
Thanks again!
Cheers,
Bryan
On 9/2/12 11:39 PM, Mark Phillips wrote:
Hi Bryan,
There have been at least four chunks of code released to handle connection
pooling (in addition to poolboy):
http://wiki.basho.com/Community-Developed-Libraries-and-Projects.html#Client-Libraries-and-Frameworks
(Scroll down to "Erlang".)
These might be worth a look.
Mark
http://twitter.com/pharkmillups
On Sep 3, 2012, at 6:40, Joseph Lambert <[email protected]> wrote:
Hi Bryan,
AFAIK, there is no built-in connection pooling for the Riak Erlang client. Each
connection will only connect with one node and only that node, but since it's
masterless you can connect to any node. You could roll your own connection
pooling mechanism, or use something like Poolboy to handle it for you. Using
Poolboy is convenient because it comes as a dependency of riak_core.
If you use Poolboy, you'll have to modify riakc_pb_socket slightly to account
for the way poolboy initializes connections (add a start_link/1), or create a
simple module to pass the initialization from poolboy to riakc_pb_socket.
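
A minimal sketch of such a pass-through module, assuming poolboy calls the
worker module's start_link/1 with the worker argument list. The module name
riak_pool_worker and the host and port keys are hypothetical; 8087 is assumed
as the protocol buffers port:

```erlang
-module(riak_pool_worker).
-export([start_link/1]).

%% poolboy hands the worker arguments over as a single term; unpack them
%% and delegate to the two-argument start_link of the Riak client.
start_link(Args) ->
    Host = proplists:get_value(host, Args, "127.0.0.1"),
    Port = proplists:get_value(port, Args, 8087),
    riakc_pb_socket:start_link(Host, Port).
```

Each pool started this way still talks to exactly one node, so spreading load
over the cluster would mean one pool per node or a list of hosts in Args.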
- Joe Lambert
On Mon, Sep 3, 2012 at 11:41 AM, Bryan Hughes <[email protected]> wrote:
Hi Guys,
I have a question regarding Riak's protobuffer client gen_server process. I
have a cluster of 5 nodes (machines), each with consecutive IP addresses. Our
application is 100% Erlang and runs on its own machine. The arguments to
riakc_pb_socket:start_link/2 are an Address and a Port (start_link/3 adds the
optional Options). The Address and Port are those of the riak server, but in
the case of a masterless cluster of 5 machines, which address do I use?
In reviewing the code for riakc_pb_socket.erl, the client opens a socket via
gen_tcp to that particular node in the cluster and only that node. This means
that there is a 1 to 1 connection between the riak node and the client. Is
this correct? Maybe I am missing something?
If so, then it looks like I need to implement my own round-robin algorithm across a pool
of protobuffer clients that I am managing, each bound to a different node in the cluster
while testing "aliveness" with ping/2 and an immediate timeout?
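
A rough sketch of that idea, assuming one riakc_pb_socket connection per node
and a short ping timeout as the aliveness test. The module name rr_pool_sketch,
both function names and the 1000 ms timeout are illustrative only:

```erlang
-module(rr_pool_sketch).
-export([connect_all/2, next_alive/1]).

%% Open one protocol buffers connection per host, all on the same port.
connect_all(Hosts, Port) ->
    [begin
         {ok, Pid} = riakc_pb_socket:start_link(Host, Port),
         Pid
     end || Host <- Hosts].

%% Walk the list once in round-robin order. Return the first connection
%% that answers ping, together with the rotated list for the next call.
next_alive(Conns) ->
    next_alive(Conns, length(Conns)).

next_alive(_Conns, 0) ->
    {error, no_nodes_available};
next_alive([Pid | Rest], Remaining) ->
    Rotated = Rest ++ [Pid],
    %% catch covers a connection process that has already exited.
    case catch riakc_pb_socket:ping(Pid, 1000) of
        pong -> {ok, Pid, Rotated};
        _ -> next_alive(Rotated, Remaining - 1)
    end.
```

The caller keeps the rotated list (for example in a gen_server state) and
passes it back in on the next request.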
Cheers,
Bryan
--
Bryan Hughes
Wobblesoft
http://www.wobblesoft.com
"Art is never finished, only abandoned. - Leonardo da Vinci"
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com