Re: How to start using incremental repairs?

2016-08-25 Thread Paulo Motta
1. The migration procedure is no longer necessary after CASSANDRA-8004, and
since you never ran repair before it would not make any difference anyway,
so just run repair; by default (CASSANDRA-7250) it will already be
incremental.
2. Incremental repair is not supported with the -pr, -local or -st/-et
options, so you should run incremental repair on all nodes in all DCs
sequentially (be aware that this will probably generate inter-DC traffic);
there is no need to disable autocompaction or stop nodes.
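
A minimal sketch of what that could look like from a shell (the host list
and keyspace name are placeholders, not from this thread); afterwards you
can inspect an SSTable's repair state with sstablemetadata, which prints
the "Repaired at" field:

# run repair (incremental by default since CASSANDRA-7250) on every node,
# one node at a time, across all DCs
for host in 10.0.1.1 10.0.1.2 10.0.2.1 10.0.2.2; do   # placeholder addresses
  nodetool -h "$host" repair my_keyspace   # or run "nodetool repair my_keyspace" locally on each node
done

# then, on a node, check the repair status of a data file (path is an example)
sstablemetadata /var/lib/cassandra/data/my_keyspace/*/*-Data.db | grep "Repaired at"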

2016-08-25 18:27 GMT-03:00 Aleksandr Ivanov :

> I’m new to Cassandra and trying to figure out how to _start_ using
> incremental repairs. I have seen the article about “Migrating to
> incremental repairs”, but since I haven’t used repairs at all before and I
> run Cassandra v3.0.8, maybe not all of the steps mentioned in the DataStax
> article are needed.
> Should I start with a full repair, or can I start by executing “nodetool
> repair -pr my_keyspace” on all nodes without disabling autocompaction or
> stopping nodes?
>
> I have 6 datacenters with 6 nodes in each DC. Is it enough to run
> “nodetool repair -pr my_keyspace” in one DC only, or should it be executed
> on all nodes in _all_ DCs?
>
> I have tried to run “nodetool repair -pr my_keyspace” on all nodes in all
> datacenters sequentially, but I can still see unrepaired SSTables for
> my_keyspace (Repaired at: 0). Is that expected behavior if the data in
> my_keyspace wasn’t modified during repair (no writes, no reads)?
>


How to start using incremental repairs?

2016-08-25 Thread Aleksandr Ivanov
I’m new to Cassandra and trying to figure out how to _start_ using
incremental repairs. I have seen the article about “Migrating to
incremental repairs”, but since I haven’t used repairs at all before and I
run Cassandra v3.0.8, maybe not all of the steps mentioned in the DataStax
article are needed.
Should I start with a full repair, or can I start by executing “nodetool
repair -pr my_keyspace” on all nodes without disabling autocompaction or
stopping nodes?

I have 6 datacenters with 6 nodes in each DC. Is it enough to run
“nodetool repair -pr my_keyspace” in one DC only, or should it be executed
on all nodes in _all_ DCs?

I have tried to run “nodetool repair -pr my_keyspace” on all nodes in all
datacenters sequentially, but I can still see unrepaired SSTables for
my_keyspace (Repaired at: 0). Is that expected behavior if the data in
my_keyspace wasn’t modified during repair (no writes, no reads)?


Re: Flush activity and dropped messages

2016-08-25 Thread Patrick McFadin
This looks like you've run out of disk. What are your hardware specs?
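
For reference, a quick way to sanity-check both the free disk space and the
blocked flushes (the paths are the stock package defaults; adjust to your
layout):

df -h /var/lib/cassandra/data /var/lib/cassandra/commitlog
nodetool tpstats | grep -E "FlushWriter|MemtablePostFlusher"

The tpstats output shows the same Blocked / All Time Blocked counters as the
StatusLogger lines quoted below.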

Patrick

On Thursday, August 25, 2016, Benedict Elliott Smith 
wrote:

> You should update from 2.0 to avoid this behaviour, is the simple answer.
> You are correct that when the commit log gets full the memtables are
> flushed to make room.  2.0 has several interrelated problems here though:
>
> There is a maximum flush queue length property (I cannot recall its name),
> and once there are this many memtables flushing, no more writes can take
> place on the box, whatsoever.  You cannot simply increase this length,
> though, because that shrinks the maximum size of any single memtable (it
> is, iirc, total_memtable_space / (1 + flush_writers + max_queue_length)),
> which worsens write-amplification from compaction.
>
> This is because the memory management for memtables in 2.0 was really
> terrible, and this queue length was used to try to ensure the space
> allocated was not exceeded.
>
> Compounding this, when clearing the commit log 2.0 will flush all
> memtables with data in them regardless of whether it is useful to do so,
> meaning having more tables (that are actively written to) than your max
> queue length will necessarily cause stalls every time you run out of
> commit log space.
>
> In 2.1, none of these concerns apply.
>
>
> On 24 August 2016 at 23:40, Vasileios Vlachos  > wrote:
>
>> Hello,
>>
>>
>>
>>
>>
>> We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We
>> run C* 2.0.17 on Ubuntu 12.04 at the moment.
>>
>>
>>
>>
>> Our C# application often logs errors, which correlate with dropped
>> messages (usually counter mutations) in our Cassandra logs. We think that
>> if a specific mutation stays in the queue for more than 5 seconds,
>> Cassandra drops it. This is also suggested by these lines in system.log:
>>
>> ERROR [ScheduledTasks:1] 2016-08-23 13:29:51,454 MessagingService.java (line 912) MUTATION messages were dropped in last 5000 ms: 317 for internal timeout and 0 for cross node timeout
>> ERROR [ScheduledTasks:1] 2016-08-23 13:29:51,454 MessagingService.java (line 912) COUNTER_MUTATION messages were dropped in last 5000 ms: 6 for internal timeout and 0 for cross node timeout
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 55) Pool Name               Active   Pending   Completed   Blocked   All Time Blocked
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 70) ReadStage                    0         0   245177190         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 70) RequestResponseStage         0         0  3530334509         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) ReadRepairStage              0         0     1549567         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) MutationStage               48      1380  2540965500         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) ReplicateOnWriteStage        0         0   189615571         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) GossipStage                  0         0    20586077         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) CacheCleanupExecutor         0         0           0         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) MigrationStage               0         0         106         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) MemoryMeter                  0         0      303029         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) ValidationExecutor           0         0           0         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) FlushWriter                  1         5      322604         1               8227
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) InternalResponseStage        0         0          35         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,459 StatusLogger.java (line 70) AntiEntropyStage             0         0           0         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,459 StatusLogger.java (line 70) MemtablePostFlusher          1         5      424104         0                  0
>>  INFO [ScheduledTasks:1] 2016-08-23 

Exception while using LIST on Cassandra + PHP

2016-08-25 Thread Enrico Sola
Hello everyone, I'm pretty new to Cassandra and I'm working on PHP-based
projects.
To interface PHP with Cassandra I use the DataStax PHP driver, which has
lately been updated to version 1.2.2, and now I have trouble with the LIST
data type: when I try to use a prepared statement and pass a Collection
object as a parameter I get the error "Invalid value type".
Here's a simple test:

try {
    $cluster = Cassandra::cluster()->build();
    $session = $cluster->connect('test');
    $statement = new Cassandra\SimpleStatement('CREATE TABLE IF NOT EXISTS list_test (name VARCHAR, id UUID, date TIMESTAMP, values LIST<VARCHAR>, PRIMARY KEY((name), id)) WITH CLUSTERING ORDER BY (id ASC);');
    $session->execute($statement);
    $list = new Cassandra\Collection(Cassandra::TYPE_VARCHAR);
    $list->add('foo');
    $list->add('bar');
    $statement = $session->prepare('INSERT INTO list_test (name, id, date, values) VALUES (?, uuid(), toTimestamp(now()), ?) IF NOT EXISTS;');
    $result = $session->execute($statement, new Cassandra\ExecutionOptions(array('arguments' => array('Test', $list))));
    while ($result) {
        foreach ($result as $row) {
            if (!isset($row['[applied]']) || $row['[applied]'] !== true) {
                echo 'FAIL';
            }
        }
        $result = $result->nextPage();
    }
    echo 'OK';
} catch (Exception $ex) {
    echo var_dump($ex);
}

Here's the var_dump result of the Collection object:

object(Cassandra\Collection)#11 (2) { ["type"]=> object(Cassandra\Type\Set)#12 
(1) { ["valueType"]=> object(Cassandra\Type\Scalar)#13 (1) { ["name"]=> 
string(7) "varchar" } } ["values"]=> array(2) { [0]=> string(3) "foo" [1]=> 
string(3) "bar" } } 

And here is the exception thrown:

object(Cassandra\Exception\InvalidArgumentException)#15 (7) { 
["message":protected]=> string(18) "Invalid value type" 
["string":"Exception":private]=> string(0) "" ["code":protected]=> 
int(16777229) ["file":protected]=> string(47) 
"/var/www/lab/Resources/_install/bin/install.php" ["line":protected]=> int(24) 
["trace":"Exception":private]=> array(2) { [0]=> array(6) { ["file"]=> 
string(47) "/var/www/lab/Resources/_install/bin/install.php" ["line"]=> int(24) 
["function"]=> string(7) "execute" ["class"]=> string(24) 
"Cassandra\DefaultSession" ["type"]=> string(2) "->" ["args"]=> array(2) { 
[0]=> object(Cassandra\PreparedStatement)#14 (0) { } [1]=> 
object(Cassandra\ExecutionOptions)#10 (0) { } } } [1]=> array(4) { ["file"]=> 
string(22) "/var/www/lab/index.php" ["line"]=> int(146) ["args"]=> array(1) { 
[0]=> string(47) "/var/www/lab/Resources/_install/bin/install.php" } 
["function"]=> string(12) "include_once" } } ["previous":"Exception":private]=> 
NULL } 

Now my question is: am I doing something wrong, or is this a bug in the
DataStax driver?
To avoid sending another message I also have another question: is it normal
that databases like MySQL or SQLite take less than 1 ms to run a query while
Cassandra takes about ten times longer? (I'm talking about PHP.)

DATABASES BENCHMARK:

MySQL:
TABLE CREATION: 0.0015501976013184 SECONDS.
INSERT: 0.00049901008605957 SECONDS.
SELECT: 7.1525573730469E-6 SECONDS.

CASSANDRA:
TABLE CREATION: 0.080638885498047 SECONDS.
INSERT: 0.038239002227783 SECONDS.
SELECT: 0.025631904602051 SECONDS.

SQLITE 3:
TABLE CREATION: 0.0025551319122314 SECONDS.
INSERT: 0.012134075164795 SECONDS.
SELECT: 7.1048736572266E-5 SECONDS.

My software environment for this test:

PHP 7.0.8
NGINX 1.4.6
Ubuntu 14.04 LTS
MySQL 5.5.50
Cassandra 3.0.8 (DataStax PHP Driver 1.2.2)
SQLite 3

Thanks for help,

Enrico.


Re: Stale value appears after consecutive TRUNCATE

2016-08-25 Thread Yuji Ito
Thank you for testing, Christian

What did you set commitlog_sync to in cassandra.yaml?
I set commitlog_sync to batch (window 2 ms) as below.

commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 2

The problem didn't occur when setting commitlog_sync to periodic (the default).
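
For reference, a sketch of the two cassandra.yaml variants side by side (the
periodic values shown are the stock defaults; a node uses one or the other):

# batch sync, as used in the failing test
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 2

# periodic sync (the default), with which the stale rows did not appear
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000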

regards,
yuji


On Thu, Aug 25, 2016 at 6:11 PM, horschi  wrote:

> (running C* 2.2.7)
>
> On Thu, Aug 25, 2016 at 11:10 AM, horschi  wrote:
>
>> Hi Yuji,
>>
>> I tried your script a couple of times. I did not experience any stale
>> values. (On my Linux laptop)
>>
>> regards,
>> Ch
>>
>> On Mon, Aug 15, 2016 at 7:29 AM, Yuji Ito  wrote:
>>
>>> Hi,
>>>
>>> I can reproduce the problem with the following script.
>>> I got rows which should be truncated.
>>> If truncating is executed only once, the problem doesn't occur.
>>>
>>> The test for multi nodes (replication_factor:3, kill & restart C*
>>> processes in all nodes) can also reproduce it.
>>>
>>> test script:
>>> 
>>>
>>> ip=xxx.xxx.xxx.xxx
>>>
>>> echo "0. prepare a table"
>>> cqlsh $ip -e "drop keyspace testdb;"
>>> cqlsh $ip -e "CREATE KEYSPACE testdb WITH replication = {'class':
>>> 'SimpleStrategy', 'replication_factor': '1'};"
>>> cqlsh $ip -e "CREATE TABLE testdb.testtbl (key int PRIMARY KEY, val
>>> int);"
>>>
>>> echo "1. insert rows"
>>> for key in $(seq 1 10)
>>> do
>>> cqlsh $ip -e "insert into testdb.testtbl (key, val) values($key,
>>> 1000) IF NOT EXISTS;" >> /dev/null 2>&1
>>> done
>>>
>>> echo "2. truncate the table twice"
>>> cqlsh $ip -e "consistency all; truncate table testdb.testtbl"
>>> cqlsh $ip -e "consistency all; truncate table testdb.testtbl"
>>>
>>> echo "3. kill C* process"
>>> ps auxww | grep "CassandraDaemon" | awk '{if ($13 ~ /cassand/) print
>>> $2}' | xargs sudo kill -9
>>>
>>> echo "4. restart C* process"
>>> sudo /etc/init.d/cassandra start
>>> sleep 20
>>>
>>> echo "5. check the table"
>>> cqlsh $ip -e "select * from testdb.testtbl;"
>>>
>>> 
>>>
>>> test result:
>>> 
>>>
>>> 0. prepare a table
>>> 1. insert rows
>>> 2. truncate the table twice
>>> Consistency level set to ALL.
>>> Consistency level set to ALL.
>>> 3. kill C* process
>>> 4. restart C* process
>>> Starting Cassandra: OK
>>> 5. check the table
>>>
>>>  key | val
>>> -+--
>>>5 | 1000
>>>   10 | 1000
>>>1 | 1000
>>>8 | 1000
>>>2 | 1000
>>>4 | 1000
>>>7 | 1000
>>>6 | 1000
>>>9 | 1000
>>>3 | 1000
>>>
>>> (10 rows)
>>>
>>> 
>>>
>>>
>>> Thanks Christian,
>>>
>>> I tried with durable_writes=False.
>>> It failed. I guessed this failure was caused by another problem.
>>> I use SimpleStrategy.
>>> A keyspace using the SimpleStrategy isn't permitted to use
>>> durable_writes=False.
>>>
>>>
>>> Regards,
>>> Yuji
>>>
>>> On Thu, Aug 11, 2016 at 12:41 AM, horschi  wrote:
>>>
 Hi Yuji,

 ok, perhaps you are seeing a different issue than I do.

 Have you tried with durable_writes=False? If the issue is caused by the
 commitlog, then it should work if you disable durable_writes.

 Cheers,
 Christian



 On Tue, Aug 9, 2016 at 3:04 PM, Yuji Ito  wrote:

> Thanks Christian
>
> can you reproduce the behaviour with a single node?
>
> I tried my test with a single node. But I can't.
>
> This behaviour seems to be CQL only, or at least has gotten worse
>> with CQL. I did not experience this with Thrift.
>
> I truncate tables with CQL. I've never tried with Thrift.
>
> I think that my problem can happen when truncating even succeeds.
> That's because I check all records after truncating.
>
> I checked the source code.
> ReplayPosition.segment and position become -1 and 0
> (ReplayPosition.NONE) in discardSSTables() when truncating a table and
> there is no SSTable.
> I guess that ReplayPosition.segment shouldn't be -1 when truncating a
> table in this case.
> replayMutation() can request unexpected replay mutations because of
> this segment's value.
>
> Is there anyone familiar with truncate and replay?
>
> Regards,
> Yuji
>
>
> On Mon, Aug 8, 2016 at 6:36 PM, horschi  wrote:
>
>> Hi Yuji,
>>
>> can you reproduce the behaviour with a single node?
>>
>> The reason I ask is because I probably have the same issue with my
>> automated tests (which run truncate between every test), which run on my
>> local laptop.
>>
>> Maybe around 5 tests randomly fail out of my 1800. I can see that the
>> failed tests sometimes show data from other tests, which I think must be
>> because of a failed truncate. This behaviour seems to be CQL only, or
>> at
>> least has gotten worse with CQL. I did not experience this with Thrift.
>>
>> regards,
>> Christian
>>
>>
>>
>> On Mon, Aug 8, 2016 at 7:34 

Re: Flush activity and dropped messages

2016-08-25 Thread Benedict Elliott Smith
You should update from 2.0 to avoid this behaviour, is the simple answer.
You are correct that when the commit log gets full the memtables are
flushed to make room.  2.0 has several interrelated problems here though:

There is a maximum flush queue length property (I cannot recall its name),
and once there are this many memtables flushing, no more writes can take
place on the box, whatsoever.  You cannot simply increase this length,
though, because that shrinks the maximum size of any single memtable (it
is, iirc, total_memtable_space / (1 + flush_writers + max_queue_length)),
which worsens write-amplification from compaction.
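
As a rough worked example (the numbers are hypothetical, not from this
thread): with 1024 MB of total memtable space, 2 flush writers and a queue
length of 4, each memtable is capped at about

    1024 / (1 + 2 + 4) ≈ 146 MB

and raising the queue length to 8 shrinks that cap to roughly
1024 / (1 + 2 + 8) ≈ 93 MB, i.e. smaller, more frequent flushes and more
compaction work per byte written.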

This is because the memory management for memtables in 2.0 was really
terrible, and this queue length was used to try to ensure the space
allocated was not exceeded.

Compounding this, when clearing the commit log 2.0 will flush all memtables
with data in them regardless of whether it is useful to do so, meaning having
more tables (that are actively written to) than your max queue length will
necessarily cause stalls every time you run out of commit log space.

In 2.1, none of these concerns apply.


On 24 August 2016 at 23:40, Vasileios Vlachos 
wrote:

> Hello,
>
>
>
>
>
> We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We run
> C* 2.0.17 on Ubuntu 12.04 at the moment.
>
>
>
>
> Our C# application often logs errors, which correlate with dropped
> messages (usually counter mutations) in our Cassandra logs. We think that
> if a specific mutation stays in the queue for more than 5 seconds,
> Cassandra drops it. This is also suggested by these lines in system.log:
>
> ERROR [ScheduledTasks:1] 2016-08-23 13:29:51,454 MessagingService.java (line 912) MUTATION messages were dropped in last 5000 ms: 317 for internal timeout and 0 for cross node timeout
> ERROR [ScheduledTasks:1] 2016-08-23 13:29:51,454 MessagingService.java (line 912) COUNTER_MUTATION messages were dropped in last 5000 ms: 6 for internal timeout and 0 for cross node timeout
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 55) Pool Name               Active   Pending   Completed   Blocked   All Time Blocked
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 70) ReadStage                    0         0   245177190         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 70) RequestResponseStage         0         0  3530334509         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) ReadRepairStage              0         0     1549567         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) MutationStage               48      1380  2540965500         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) ReplicateOnWriteStage        0         0   189615571         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) GossipStage                  0         0    20586077         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) CacheCleanupExecutor         0         0           0         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) MigrationStage               0         0         106         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) MemoryMeter                  0         0      303029         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) ValidationExecutor           0         0           0         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) FlushWriter                  1         5      322604         1               8227
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) InternalResponseStage        0         0          35         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,459 StatusLogger.java (line 70) AntiEntropyStage             0         0           0         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,459 StatusLogger.java (line 70) MemtablePostFlusher          1         5      424104         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,459 StatusLogger.java (line 70) MiscStage                    0         0           0         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 13:29:51,460 StatusLogger.java (line 70) PendingRangeCalculator       0         0          37         0                  0
>  INFO [ScheduledTasks:1] 2016-08-23 

Re: How to configure cassandra in a multi cluster mode?

2016-08-25 Thread Carlos Alonso
listen_address is the network address the node will listen on.
seeds is the list of contact points a node will use when bootstrapping to
find a cluster.

To make those four into a single cluster, just start the first node with
itself as a seed and then sequentially bootstrap the others with the first
node as their seed.

Finally, run nodetool status to check that they all see each other.
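
A minimal sketch of what that means in cassandra.yaml for the four nodes
listed in the quoted message below (cluster_name is an assumed placeholder):

# on 192.168.0.61 (the first node, seeding itself)
cluster_name: 'MyCluster'
listen_address: 192.168.0.61
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "192.168.0.61"

# on 192.168.0.62 / .63 / .64: same cluster_name and seeds entry, but with
# listen_address set to each node's own IP; start them one at a time.

Then, on any node:

nodetool status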

Hope it helps

Carlos Alonso | Software Engineer | @calonso 

On 25 August 2016 at 10:19, Alexandr Porunov 
wrote:

> Hello,
>
> I am a little bit confused about Cassandra's configuration.
> There are 2 parameters which I don't understand:
> listen_address
> seeds
>
> I have 4 identical nodes:
> 192.168.0.61 cassandra1
> 192.168.0.62 cassandra2
> 192.168.0.63 cassandra3
> 192.168.0.64 cassandra4
>
> What shall I do to configure those 4 nodes into a single cluster?
>
> Sincerely,
> Alexandr
>
>


Re: Stale value appears after consecutive TRUNCATE

2016-08-25 Thread horschi
(running C* 2.2.7)

On Thu, Aug 25, 2016 at 11:10 AM, horschi  wrote:

> Hi Yuji,
>
> I tried your script a couple of times. I did not experience any stale
> values. (On my Linux laptop)
>
> regards,
> Ch
>
> On Mon, Aug 15, 2016 at 7:29 AM, Yuji Ito  wrote:
>
>> Hi,
>>
>> I can reproduce the problem with the following script.
>> I got rows which should be truncated.
>> If truncating is executed only once, the problem doesn't occur.
>>
>> The test for multi nodes (replication_factor:3, kill & restart C*
>> processes in all nodes) can also reproduce it.
>>
>> test script:
>> 
>>
>> ip=xxx.xxx.xxx.xxx
>>
>> echo "0. prepare a table"
>> cqlsh $ip -e "drop keyspace testdb;"
>> cqlsh $ip -e "CREATE KEYSPACE testdb WITH replication = {'class':
>> 'SimpleStrategy', 'replication_factor': '1'};"
>> cqlsh $ip -e "CREATE TABLE testdb.testtbl (key int PRIMARY KEY, val int);"
>>
>> echo "1. insert rows"
>> for key in $(seq 1 10)
>> do
>> cqlsh $ip -e "insert into testdb.testtbl (key, val) values($key,
>> 1000) IF NOT EXISTS;" >> /dev/null 2>&1
>> done
>>
>> echo "2. truncate the table twice"
>> cqlsh $ip -e "consistency all; truncate table testdb.testtbl"
>> cqlsh $ip -e "consistency all; truncate table testdb.testtbl"
>>
>> echo "3. kill C* process"
>> ps auxww | grep "CassandraDaemon" | awk '{if ($13 ~ /cassand/) print $2}'
>> | xargs sudo kill -9
>>
>> echo "4. restart C* process"
>> sudo /etc/init.d/cassandra start
>> sleep 20
>>
>> echo "5. check the table"
>> cqlsh $ip -e "select * from testdb.testtbl;"
>>
>> 
>>
>> test result:
>> 
>>
>> 0. prepare a table
>> 1. insert rows
>> 2. truncate the table twice
>> Consistency level set to ALL.
>> Consistency level set to ALL.
>> 3. kill C* process
>> 4. restart C* process
>> Starting Cassandra: OK
>> 5. check the table
>>
>>  key | val
>> -+--
>>5 | 1000
>>   10 | 1000
>>1 | 1000
>>8 | 1000
>>2 | 1000
>>4 | 1000
>>7 | 1000
>>6 | 1000
>>9 | 1000
>>3 | 1000
>>
>> (10 rows)
>>
>> 
>>
>>
>> Thanks Christian,
>>
>> I tried with durable_writes=False.
>> It failed. I guessed this failure was caused by another problem.
>> I use SimpleStrategy.
>> A keyspace using the SimpleStrategy isn't permitted to use
>> durable_writes=False.
>>
>>
>> Regards,
>> Yuji
>>
>> On Thu, Aug 11, 2016 at 12:41 AM, horschi  wrote:
>>
>>> Hi Yuji,
>>>
>>> ok, perhaps you are seeing a different issue than I do.
>>>
>>> Have you tried with durable_writes=False? If the issue is caused by the
>>> commitlog, then it should work if you disable durable_writes.
>>>
>>> Cheers,
>>> Christian
>>>
>>>
>>>
>>> On Tue, Aug 9, 2016 at 3:04 PM, Yuji Ito  wrote:
>>>
 Thanks Christian

 can you reproduce the behaviour with a single node?

 I tried my test with a single node. But I can't.

 This behaviour seems to be CQL only, or at least has gotten worse
> with CQL. I did not experience this with Thrift.

 I truncate tables with CQL. I've never tried with Thrift.

 I think that my problem can happen when truncating even succeeds.
 That's because I check all records after truncating.

 I checked the source code.
 ReplayPosition.segment and position become -1 and 0
 (ReplayPosition.NONE) in discardSSTables() when truncating a table and
 there is no SSTable.
 I guess that ReplayPosition.segment shouldn't be -1 when truncating a
 table in this case.
 replayMutation() can request unexpected replay mutations because of
 this segment's value.

 Is there anyone familiar with truncate and replay?

 Regards,
 Yuji


 On Mon, Aug 8, 2016 at 6:36 PM, horschi  wrote:

> Hi Yuji,
>
> can you reproduce the behaviour with a single node?
>
> The reason I ask is because I probably have the same issue with my
> automated tests (which run truncate between every test), which run on my
> local laptop.
>
> Maybe around 5 tests randomly fail out of my 1800. I can see that the
> failed tests sometimes show data from other tests, which I think must be
> because of a failed truncate. This behaviour seems to be CQL only, or
> at
> least has gotten worse with CQL. I did not experience this with Thrift.
>
> regards,
> Christian
>
>
>
> On Mon, Aug 8, 2016 at 7:34 AM, Yuji Ito  wrote:
>
>> Hi all,
>>
>> I have a question about clearing table and commit log replay.
>> After some tables were truncated consecutively, I got some stale
>> values.
>> This problem doesn't occur when I clear keyspaces with DROP (and
>> CREATE).
>>
>> I'm testing the following test with node failure.
>> Some stale values appear at checking phase.
>>
>> Test iteration:
>> 1. initialize tables as below
>> 2. request a lot of 

Re: Stale value appears after consecutive TRUNCATE

2016-08-25 Thread horschi
Hi Yuji,

I tried your script a couple of times. I did not experience any stale
values. (On my Linux laptop)

regards,
Ch

On Mon, Aug 15, 2016 at 7:29 AM, Yuji Ito  wrote:

> Hi,
>
> I can reproduce the problem with the following script.
> I got rows which should be truncated.
> If truncating is executed only once, the problem doesn't occur.
>
> The test for multi nodes (replication_factor:3, kill & restart C*
> processes in all nodes) can also reproduce it.
>
> test script:
> 
>
> ip=xxx.xxx.xxx.xxx
>
> echo "0. prepare a table"
> cqlsh $ip -e "drop keyspace testdb;"
> cqlsh $ip -e "CREATE KEYSPACE testdb WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '1'};"
> cqlsh $ip -e "CREATE TABLE testdb.testtbl (key int PRIMARY KEY, val int);"
>
> echo "1. insert rows"
> for key in $(seq 1 10)
> do
> cqlsh $ip -e "insert into testdb.testtbl (key, val) values($key, 1000)
> IF NOT EXISTS;" >> /dev/null 2>&1
> done
>
> echo "2. truncate the table twice"
> cqlsh $ip -e "consistency all; truncate table testdb.testtbl"
> cqlsh $ip -e "consistency all; truncate table testdb.testtbl"
>
> echo "3. kill C* process"
> ps auxww | grep "CassandraDaemon" | awk '{if ($13 ~ /cassand/) print $2}'
> | xargs sudo kill -9
>
> echo "4. restart C* process"
> sudo /etc/init.d/cassandra start
> sleep 20
>
> echo "5. check the table"
> cqlsh $ip -e "select * from testdb.testtbl;"
>
> 
>
> test result:
> 
>
> 0. prepare a table
> 1. insert rows
> 2. truncate the table twice
> Consistency level set to ALL.
> Consistency level set to ALL.
> 3. kill C* process
> 4. restart C* process
> Starting Cassandra: OK
> 5. check the table
>
>  key | val
> -+--
>5 | 1000
>   10 | 1000
>1 | 1000
>8 | 1000
>2 | 1000
>4 | 1000
>7 | 1000
>6 | 1000
>9 | 1000
>3 | 1000
>
> (10 rows)
>
> 
>
>
> Thanks Christian,
>
> I tried with durable_writes=False.
> It failed. I guessed this failure was caused by another problem.
> I use SimpleStrategy.
> A keyspace using the SimpleStrategy isn't permitted to use
> durable_writes=False.
>
>
> Regards,
> Yuji
>
> On Thu, Aug 11, 2016 at 12:41 AM, horschi  wrote:
>
>> Hi Yuji,
>>
>> ok, perhaps you are seeing a different issue than I do.
>>
>> Have you tried with durable_writes=False? If the issue is caused by the
>> commitlog, then it should work if you disable durable_writes.
>>
>> Cheers,
>> Christian
>>
>>
>>
>> On Tue, Aug 9, 2016 at 3:04 PM, Yuji Ito  wrote:
>>
>>> Thanks Christian
>>>
>>> can you reproduce the behaviour with a single node?
>>>
>>> I tried my test with a single node. But I can't.
>>>
>>> This behaviour seems to be CQL only, or at least has gotten worse
 with CQL. I did not experience this with Thrift.
>>>
>>> I truncate tables with CQL. I've never tried with Thrift.
>>>
>>> I think that my problem can happen when truncating even succeeds.
>>> That's because I check all records after truncating.
>>>
>>> I checked the source code.
>>> ReplayPosition.segment and position become -1 and 0
>>> (ReplayPosition.NONE) in discardSSTables() when truncating a table and
>>> there is no SSTable.
>>> I guess that ReplayPosition.segment shouldn't be -1 when truncating a
>>> table in this case.
>>> replayMutation() can request unexpected replay mutations because of this
>>> segment's value.
>>>
>>> Is there anyone familiar with truncate and replay?
>>>
>>> Regards,
>>> Yuji
>>>
>>>
>>> On Mon, Aug 8, 2016 at 6:36 PM, horschi  wrote:
>>>
 Hi Yuji,

 can you reproduce the behaviour with a single node?

 The reason I ask is because I probably have the same issue with my
 automated tests (which run truncate between every test), which run on my
 local laptop.

 Maybe around 5 tests randomly fail out of my 1800. I can see that the
 failed tests sometimes show data from other tests, which I think must be
 because of a failed truncate. This behaviour seems to be CQL only, or at
 least has gotten worse with CQL. I did not experience this with Thrift.

 regards,
 Christian



 On Mon, Aug 8, 2016 at 7:34 AM, Yuji Ito  wrote:

> Hi all,
>
> I have a question about clearing table and commit log replay.
> After some tables were truncated consecutively, I got some stale
> values.
> This problem doesn't occur when I clear keyspaces with DROP (and
> CREATE).
>
> I'm testing the following test with node failure.
> Some stale values appear at checking phase.
>
> Test iteration:
> 1. initialize tables as below
> 2. request a lot of read/write concurrently
> 3. check all records
> 4. repeat from the beginning
>
> I use C* 2.2.6. There are 3 nodes (replication_factor: 3).
> Each node kills cassandra process at random intervals and restarts it
> immediately.
>
> My initialization:

How to configure cassandra in a multi cluster mode?

2016-08-25 Thread Alexandr Porunov
Hello,

I am a little bit confused about Cassandra's configuration.
There are 2 parameters which I don't understand:
listen_address
seeds

I have 4 identical nodes:
192.168.0.61 cassandra1
192.168.0.62 cassandra2
192.168.0.63 cassandra3
192.168.0.64 cassandra4

What shall I do to configure those 4 nodes into a single cluster?

Sincerely,
Alexandr