I use callback chaining with the Python driver and can confirm that it is
very fast.

You can "chain the chains" together to perform sequential processing. I do
this when retrieving "metadata" and then the referenced "payload", for
example when the metadata has been inverted and the payload is larger than
we want to invert. You can also run multiple "chains of chains"
asynchronously; cascade state through them by employing the userdata of the
future.
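
The metadata-then-payload chain could look roughly like the sketch below. It
is not our production code. The table names, column names and the state dict
are hypothetical, and carrying state through the chain as extra callback
arguments is just one way to do it. The only driver calls it assumes are
session.execute_async() and add_callbacks() on the returned future.

```python
import threading

# Hypothetical schema: table and column names are illustrative only.
METADATA_QUERY = "SELECT payload_id FROM metadata WHERE key = %s"
PAYLOAD_QUERY = "SELECT body FROM payload WHERE id = %s"


def fetch_payload_async(session, key):
    """Start a metadata -> payload chain and return its state dict at once."""
    state = {"key": key, "payload": None, "error": None,
             "done": threading.Event()}
    future = session.execute_async(METADATA_QUERY, (key,))
    # The state travels down the chain as an extra callback argument.
    future.add_callbacks(_on_metadata, _on_error,
                         callback_args=(session, state),
                         errback_args=(state,))
    return state


def _on_metadata(rows, session, state):
    # Runs on the driver's event loop thread, so it must not block:
    # no session.execute() or prepare() in here.
    try:
        row = next(iter(rows), None)
        if row is None:  # no metadata row, so nothing more to fetch
            state["done"].set()
            return
        future = session.execute_async(PAYLOAD_QUERY, (row.payload_id,))
        future.add_callbacks(_on_payload, _on_error,
                             callback_args=(state,),
                             errback_args=(state,))
    except Exception as exc:  # exceptions raised in callbacks are ours to handle
        _on_error(exc, state)


def _on_payload(rows, state):
    # Last link of the chain: store the result and signal completion.
    try:
        row = next(iter(rows), None)
        state["payload"] = row.body if row is not None else None
        state["done"].set()
    except Exception as exc:
        _on_error(exc, state)


def _on_error(exc, state):
    # Record the failure and release anyone waiting on the chain.
    state["error"] = exc
    state["done"].set()
```

The caller can start many chains in a loop, keep the returned state dicts,
and then wait on each state["done"] event (with a timeout) before reading
"payload" or "error".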

We also use multiprocessing for more parallelism, and we distribute work to
multiple multiprocessing instances through a message broker for still more
parallel activity, as well as reliability.

ml

On Fri, Mar 27, 2015 at 4:28 PM, Tyler Hobbs <ty...@datastax.com> wrote:

> Since you're executing queries sequentially, you may want to look into
> using callback chaining to avoid the cross-thread signaling that results in
> the 1ms latencies.  Basically, just use session.execute_async() and attach
> a callback to the returned future that will execute your next query.  The
> callback is executed on the event loop thread.  The main downsides to this
> are that you need to be careful to avoid blocking the event loop thread
> (including executing session.execute() or prepare()) and you need to ensure
> that all exceptions raised in the callback are handled by your application
> code.
>
> On Fri, Mar 27, 2015 at 3:11 PM, Artur Siekielski <a...@vhex.net> wrote:
>
>> I think that in your example Postgres spends most time on waiting for
>> fsync() to complete. On Linux, for a battery-backed raid controller, it's
>> safe to mount ext4 filesystem with "barrier=0" option which improves
>> fsync() performance a lot. I have partitions mounted with this option and I
>> did a test from Python, using psycopg2 driver, and I got the following
>> latencies, in milliseconds:
>> - INSERT without COMMIT: 0.04
>> - INSERT with COMMIT: 0.12
>> - SELECT: 0.05
>> I'm also repeating benchmark runs multiple times (I'm using Python's
>> "timeit" module).
>>
>>
>> On 03/27/2015 07:58 PM, Ben Bromhead wrote:
>>
>>> Latency can be so variable even when testing things locally. I quickly
>>> fired up postgres and did the following with psql:
>>>
>>> ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i));
>>> CREATE TABLE
>>> ben=# \timing
>>> Timing is on.
>>> ben=# INSERT INTO foo VALUES(2, 'yay');
>>> INSERT 0 1
>>> Time: 1.162 ms
>>> ben=# INSERT INTO foo VALUES(3, 'yay');
>>> INSERT 0 1
>>> Time: 1.108 ms
>>>
>>> I then fired up a local copy of Cassandra (2.0.12)
>>>
>>> cqlsh> CREATE KEYSPACE foo WITH replication = { 'class' :
>>> 'SimpleStrategy', 'replication_factor' : 1 };
>>> cqlsh> USE foo;
>>> cqlsh:foo> CREATE TABLE foo(i int PRIMARY KEY, j text);
>>> cqlsh:foo> TRACING ON;
>>> Now tracing requests.
>>> cqlsh:foo> INSERT INTO foo (i, j) VALUES (1, 'yay');
>>>
>>>
>>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
