Are you transferring using a single thread? If so, I would recommend using
a ThreadPoolExecutor and scheduling each write as a task. You can track
the failures (if any) using either an AtomicInteger or a
concurrent/synchronized list holding the keys that failed.
No matter what else you tune, a single-threaded transfer won't help you
at all. We have done transfers many times, and depending on the size of
the DB table we use either a single thread or a thread pool. Try 8 threads
and see the difference, assuming you have N connections in your Riak
client where N > max thread pool size.
You might want to remove pw=1 when using multi-threading so Riak doesn't
fall too far behind (eLevelDB catching up? whatever that's called);
pw=1 adds more risk than the benefit you gain.
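A rough sketch of the thread pool approach, in case it helps. It is not tied to the Riak client: KeyValueWriter is a made-up interface standing in for whatever performs one write (for example a call like bucket.store(key, value).withoutFetch().execute()), and the class and method names are placeholders too. It uses Java 8 lambdas for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BulkLoadSketch {

    /** Hypothetical stand-in for a single write to the store. */
    interface KeyValueWriter {
        void write(String key, String value) throws Exception;
    }

    /** Outcome of a load: count of successful writes plus the keys that failed. */
    static final class Result {
        final int succeeded;
        final Queue<String> failedKeys;

        Result(int succeeded, Queue<String> failedKeys) {
            this.succeeded = succeeded;
            this.failedKeys = failedKeys;
        }
    }

    /** Schedules one task per record on a fixed pool and waits for all of them. */
    static Result load(Map<String, String> records, int threads, KeyValueWriter writer) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger succeeded = new AtomicInteger();
        Queue<String> failedKeys = new ConcurrentLinkedQueue<>();

        for (Map.Entry<String, String> entry : records.entrySet()) {
            final String key = entry.getKey();
            final String value = entry.getValue();
            pool.execute(() -> {
                try {
                    writer.write(key, value);
                    succeeded.incrementAndGet();
                } catch (Exception e) {
                    // Remember the key so it can be retried or reported later.
                    failedKeys.add(key);
                }
            });
        }

        pool.shutdown(); // accept no new tasks, let the queued ones finish
        try {
            // The one-hour limit is an arbitrary choice for this sketch.
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return new Result(succeeded.get(), failedKeys);
    }

    public static void main(String[] args) {
        Map<String, String> records = new LinkedHashMap<>();
        for (int i = 0; i < 1000; i++) {
            records.put("key-" + i, "value-" + i);
        }
        // Simulated writer: fails for some keys so the failure tracking is exercised.
        Result result = load(records, 8, (key, value) -> {
            if (key.endsWith("7")) {
                throw new IllegalStateException("simulated failure");
            }
        });
        System.out.println("succeeded=" + result.succeeded
                + " failed=" + result.failedKeys.size());
    }
}
```

For something like 50 million records you would not queue every task up front as this does; feeding the pool in batches, or using a bounded queue, keeps memory in check.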
Hope that helps,
Guido.
On 13/02/13 09:44, Bogdan Flueras wrote:
Ok, so I've done something like this:
Bucket bucket = client.createBucket("foo"); // lastWriteWins(true) doesn't work for Protobuf
when I insert I have:
bucket.store(someKey, someValue).withoutFetch().pw(1).execute();
It looks like it's 20% faster than before. Is there anything else I could
tweak?
ing. Bogdan Flueras
On Wed, Feb 13, 2013 at 10:19 AM, Bogdan Flueras <[email protected]> wrote:
Each thread has its own bucket instance (pointing to the same
location) and I don't re-fetch the bucket per insert.
Thank you very much!
ing. Bogdan Flueras
On Wed, Feb 13, 2013 at 10:14 AM, Russell Brown <[email protected]> wrote:
On 13 Feb 2013, at 08:07, Bogdan Flueras <[email protected]> wrote:
> How to set the bucket to last write? Is it in the builder?
Something like:
Bucket b = client.createBucket("my_bucket").lastWriteWins(true).execute();
Also, after you've created the bucket, do you use it from all
threads? You don't re-fetch the bucket per-insert operation,
do you?
But the "withoutFetch()" option is probably going to be the
biggest performance increase, and it is safe if you are only doing
inserts.
Cheers
Russell
> I'll have a look..
> Yes, I use more threads and the bucket is configured to spread the load across all nodes.
>
> Thanks, I'll have a deeper look into the API and let you know about my results.
>
> ing. Bogdan Flueras
>
>
>
> On Wed, Feb 13, 2013 at 10:02 AM, Russell Brown <[email protected]> wrote:
> Hi,
>
> On 13 Feb 2013, at 07:37, Bogdan Flueras <[email protected]> wrote:
>
> > Hello all,
> > I've got a 5 node cluster with Riak 1.2.1, all machines are multicore,
> > with min 4GB RAM.
> >
> > I want to insert something like 50 million records in Riak with the Java client (Protobuf used) with default settings. I've also tried the HTTP protocol with w = 1 but got some problems.
> >
> > However the process is very slow: it doesn't write more than 6GB/hour or approx. 280 KB/second.
> > To have all my data filled in, it would take approx. 2 days!!
> >
> > What can I do to have the data filled into Riak ASAP?
> > How should I configure the cluster? (vm.args/app.config) I don't care so much about consistency at this point.
>
> If you are certain to be only inserting new data, setting your bucket(s) to last write wins will speed things up. Also, are you using multiple threads for the Java client insert? Spreading the load across all five nodes? Are you using the "withoutFetch()" option on the Java client?
>
> Cheers
>
> Russell
>
> >
> > Thank you,
> > ing. Bogdan Flueras
> >
> > _______________________________________________
> > riak-users mailing list
> > [email protected]
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>