CVE-2023-30601: Apache Cassandra: Privilege escalation when enabling FQL/Audit logs

2023-05-29 Thread Marcus Eriksson
Severity: important

Affected versions:

- Apache Cassandra 4.0.0 through 4.0.9
- Apache Cassandra 4.1.0 through 4.1.1

Description:

Privilege escalation when enabling FQL/Audit logs allows a user with JMX access 
to run arbitrary commands as the user running Apache Cassandra.
This issue affects Apache Cassandra 4.0.0 through 4.0.9 and 4.1.0 through 4.1.1.

WORKAROUND
The vulnerability requires nodetool/JMX access to be exploitable; disable 
access for any non-trusted users.
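
For reference, JMX access is usually locked down via cassandra-env.sh; a rough
sketch of the usual knobs (standard JDK/Cassandra settings, file paths are
illustrative, not a full hardening guide):

# keep JMX bound to localhost only (the default), so only local nodetool works
LOCAL_JMX=yes
# if remote JMX is unavoidable, require authentication and restrict users:
# JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=true"
# JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password"
# JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.access.file=/etc/cassandra/jmxremote.access"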

MITIGATION
Upgrade to 4.0.10 or 4.1.2 and leave the new FQL/Audit log configuration 
property allow_nodetool_archive_command set to false.
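
After upgrading, a quick sanity check that the node is on a fixed version and
that the new property has not been enabled (the cassandra.yaml path varies by
install; the advisory's guidance is to leave it at false):

nodetool version
grep allow_nodetool_archive_command /etc/cassandra/cassandra.yaml || echo "allow_nodetool_archive_command not set"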

This issue is being tracked as CASSANDRA-18550.

Credit:

Gal Elbaz at Oligo (finder)

References:

https://cassandra.apache.org/
https://www.cve.org/CVERecord?id=CVE-2023-30601
https://issues.apache.org/jira/browse/CASSANDRA-18550



Re: Failed service startup

2022-12-13 Thread Marcus Eriksson
This looks like https://issues.apache.org/jira/browse/CASSANDRA-17273

iirc you can merge the two files - making sure all ADD and REMOVE records are 
in both files. I think you would need to add 
`ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67417-big-,0,8][3940068469]`
to the data02 transaction log file.

Make sure you back up all involved sstables before trying this
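
Roughly something like this (a sketch only - GNU sed assumed, paths taken from
the error output below; diff the two replicas before starting the node again):

TXN=nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log
D1=/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c
D2=/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c

# back up both replicas of the transaction log (and the involved sstables) first
cp -a "$D1/$TXN" "$D1/$TXN.bak"
cp -a "$D2/$TXN" "$D2/$TXN.bak"

# insert the missing ADD record (copied verbatim from the data01 replica) into
# the data02 replica, in the same position (right after the nb-67416 ADD line)
MISSING=$(grep 'nb-67417-big-' "$D1/$TXN")
sed -i "/nb-67416-big-/a ${MISSING}" "$D2/$TXN"

# both replicas should now be identical
diff "$D1/$TXN" "$D2/$TXN"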

/Marcus


On Mon, Dec 12, 2022 at 02:40:25PM +, Marc Hoppins wrote:
> Hi, all,
> 
> We had a failed HDD on one node. The node was shut down pending repair.  
> There are now 4 other nodes with Cassandra not running and unable to start up 
> due to the following kinds of error.  Is this kind of thing due to the 
> original stopped node?
> 
> ERROR [main] 2022-12-12 14:58:10,838 LogReplicaSet.java:145 - Mismatched line 
> in file 
> nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log: 
> got 
> 'ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67417-big-,0,8][3940068469]'
>  expected 
> 'ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]',
>  giving up
> ERROR [main] 2022-12-12 14:58:10,838 LogFile.java:161 - Failed to read 
> records for transaction log 
> [nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log in 
> /mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c,
>  
> /mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c]
> ERROR [main] 2022-12-12 14:58:10,840 LogTransaction.java:551 - Unexpected 
> disk state: failed to read transaction log 
> [nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log in 
> /mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c,
>  
> /mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c]
> Files and contents follow:
> /mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log
> 
> ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67416-big-,0,8][1963077611]
> 
> ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]
> 
> REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67405-big-,1665045804823,8][1428695358]
> 
> REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67402-big-,1665050002894,8][2407633150]
> COMMIT:[,0,0][2613697770]
> /mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log
> 
> ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67416-big-,0,8][1963077611]
> 
> ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67417-big-,0,8][3940068469]
> ***Does not match  in first replica file
> 
> ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]
> 
> REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67405-big-,1665045804823,8][1428695358]
> 
> REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67402-big-,1665050002894,8][2407633150]
> COMMIT:[,0,0][2613697770]
> 
> ERROR [main] 2022-12-12 14:58:10,841 CassandraDaemon.java:911 - Cannot remove 
> temporary or obsoleted files for hades.prod_md5_sha1 due to a problem with 
> transaction log files. Please check records with problems in the log messages 
> above and fix them. Refer to the 3.0 upgrading instructions in NEWS.txt for a 
> description of transaction log files.
> 
> Sstableutil only returned
> 
> ERROR 15:35:52,217 Mismatched line in file 
> nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log: 
> got 
> 'ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67417-big-,0,8][3940068469]'
>  expected 
> 'ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]',
>  giving up
> ERROR 15:35:52,219 Failed to read records for transaction log 
> [nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log in 
> /mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c,
>  
> /mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c]
> ERROR 15:35:52,220 Unexpected disk state: failed to read transaction log 
> [nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log in 
> /mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c,
>  
> 

CVE-2021-44521: Apache Cassandra: Remote code execution for scripted UDFs

2022-02-11 Thread Marcus Eriksson
Severity: high

Description:

When running Apache Cassandra with the following configuration:

enable_user_defined_functions: true
enable_scripted_user_defined_functions: true
enable_user_defined_functions_threads: false 

it is possible for an attacker to execute arbitrary code on the host. The 
attacker would need to have enough permissions to create user defined functions 
in the cluster to be able to exploit this. Note that this configuration is 
documented as unsafe, and will continue to be considered unsafe after this CVE.

This issue is being tracked as CASSANDRA-17352.

Mitigation:

Set `enable_user_defined_functions_threads: true` (this is the default)
or
3.0 users should upgrade to 3.0.26
3.11 users should upgrade to 3.11.12
4.0 users should upgrade to 4.0.2
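
A quick way to check what a node currently has configured (cassandra.yaml
location varies by install):

grep -E 'enable_(scripted_)?user_defined_functions(_threads)?' /etc/cassandra/cassandra.yaml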

Credit:

This issue was discovered by Omer Kaspi of the JFrog Security vulnerability 
research team.


Re: What is the cons of changing LCS fanout option to 100 or even bigger?

2018-09-18 Thread Marcus Eriksson
The problem would be that for every file you flush, you would recompact all of
L1 - files are flushed to L0, then compacted together with all overlapping
files in L1.
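
For reference, fanout_size is exposed as a per-table LCS option in versions
that include it (check yours); a hypothetical example, where the keyspace,
table and value are placeholders:

cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {'class': 'LeveledCompactionStrategy', 'fanout_size': '20'};"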

On Tue, Sep 18, 2018 at 4:53 AM 健 戴  wrote:

> Hi,
>
> I have one table having 2T data saved in c* each node.
> And if using LCS, the data will have 5 level:
>
>
>- L1: 160M * 10 = 1.6G
>- L2: 1.6G * 10 = 16G
>- L3: 16G * 10 = 160G
>- L4: 160G * 10 = 1.6T
>- L5: 1.6T * 10 = 16T
>
> When I looking into the source code, I found an option: fanout_size.
>
> The default value is 10. What about change this value to 100? Then the
> level will reduce to 3:
>
>- L1: 160M * 100 = 16G
>- L2: 16G * 100 = 1.6T
>- L3: 1.6T * 100 = 160T
>
> Or even could I set this to 1? And all files are in a same level.
> Should it be better then?
> What is the cons of the bigger value of this option?
>
> Thanks for your help.
>
>
> Jian
>


Re: Fresh SSTable files (due to repair?) in a static table (was Re: Drop TTLd rows: upgradesstables -a or scrub?)

2018-09-17 Thread Marcus Eriksson
It could also be https://issues.apache.org/jira/browse/CASSANDRA-2503

On Mon, Sep 17, 2018 at 4:04 PM Jeff Jirsa  wrote:

>
>
> On Sep 17, 2018, at 2:34 AM, Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
> On Tue, Sep 11, 2018 at 8:10 PM Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
>> On Tue, 11 Sep 2018, 19:26 Jeff Jirsa,  wrote:
>>
>>> Repair or read-repair
>>>
>>
>> Could you be more specific please?
>>
>> Why any data would be streamed in if there is no (as far as I can see)
>> possibilities for the nodes to have inconsistency?
>>
>
> Again, given that the tables are not updated anymore from the application
> and we have repaired them successfully multiple times already, how can it
> be that any inconsistency would be found by read-repair or normal repair?
>
> We have seen this on a number of nodes, including SSTables written at the
> time there was guaranteed no repair running.
>
>
> Not obvious to me where the sstable is coming from - you’d have to look in
> the logs. If it’s read repair, it’ll be created during a memtable flush. If
> it’s nodetool repair, it’ll be streamed in. It could also be compaction
> (especially tombstone compaction), in which case it’ll be in the compaction
> logs and it’ll have an sstable ancestor in the metadata.
>
>
>


Re: Index summary redistribution seems to block all compactions

2017-10-25 Thread Marcus Eriksson
Anything in the logs? It *could* be
https://issues.apache.org/jira/browse/CASSANDRA-13873

On Tue, Oct 24, 2017 at 11:18 PM, Sotirios Delimanolis <
sotodel...@yahoo.com.invalid> wrote:

> On a Cassandra 2.2.11 cluster, I noticed estimated compactions
> accumulating on one node. nodetool compactionstats showed the following:
>
> compaction type                keyspace   table         completed   total       unit    progress
> Compaction                     ks1        some_table    204.68 MB   204.98 MB   bytes   99.86%
> Index summary redistribution   null       null          457.72 KB   950 MB      bytes   0.05%
> Compaction                     ks1        some_table    461.61 MB   461.95 MB   bytes   99.93%
> Tombstone Compaction           ks1        some_table    618.34 MB   618.47 MB   bytes   99.98%
> Compaction                     ks1        some_table    378.37 MB   380 MB      bytes   99.57%
> Tombstone Compaction           ks1        some_table    326.51 MB   327.63 MB   bytes   99.66%
> Tombstone Compaction           ks2        other_table   29.38 MB    29.38 MB    bytes   100.00%
> Tombstone Compaction           ks1        some_table    503.4 MB    507.28 MB   bytes   99.24%
> Compaction                     ks1        some_table    353.44 MB   353.47 MB   bytes   99.99%
>
>
> They had been like this for a while (all different tables). A thread dump
> showed all 8 CompactionExecutor threads looking like
>
> "CompactionExecutor:6" #84 daemon prio=1 os_prio=4 tid=0x7f5771172000
> nid=0x7646 waiting on condition [0x7f578847b000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0005fe5656e8> (a
> com.google.common.util.concurrent.AbstractFuture$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.
> java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.
> parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.
> doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.
> acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at com.google.common.util.concurrent.AbstractFuture$
> Sync.get(AbstractFuture.java:285)
> at com.google.common.util.concurrent.AbstractFuture.get(
> AbstractFuture.java:116)
> at org.apache.cassandra.utils.FBUtilities.waitOnFuture(
> FBUtilities.java:390)
> at org.apache.cassandra.db.SystemKeyspace.forceBlockingFlush(
> SystemKeyspace.java:593)
> at org.apache.cassandra.db.SystemKeyspace.finishCompaction(
> SystemKeyspace.java:368)
> at org.apache.cassandra.db.compaction.CompactionTask.
> runMayThrow(CompactionTask.java:205)
> at org.apache.cassandra.utils.WrappedRunnable.run(
> WrappedRunnable.java:28)
> at org.apache.cassandra.db.compaction.CompactionTask.
> executeInternal(CompactionTask.java:74)
> at org.apache.cassandra.db.compaction.AbstractCompactionTask.
> execute(AbstractCompactionTask.java:80)
> at org.apache.cassandra.db.compaction.CompactionManager$
> BackgroundCompactionCandidate.run(CompactionManager.java:257)
> at java.util.concurrent.Executors$RunnableAdapter.
> call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
> A MemtablePostFlush thread was awaiting some flush count down latch
>
> "MemtablePostFlush:1" #30 daemon prio=5 os_prio=0 tid=0x7f57705dac00
> nid=0x75bf waiting on condition [0x7f578a8fb000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x000573da6c90> (a
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.
> java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.
> parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.
> doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.
> acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(
> CountDownLatch.java:231)
> at org.apache.cassandra.db.ColumnFamilyStore$PostFlush.
> call(ColumnFamilyStore.java:1073)
> at org.apache.cassandra.db.ColumnFamilyStore$PostFlush.
> call(ColumnFamilyStore.java:1026)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 

Re: Restoring a table cassandra - compactions

2017-06-01 Thread Marcus Eriksson
This is done to avoid overlap in levels > 0

There is this though: https://issues.apache.org/jira/browse/CASSANDRA-13425

If you are restoring an entire node, starting with an empty data directory,
you should probably stop cassandra, copy the snapshot in, and restart; that
will keep the levels
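
A rough sketch of that restore path (the service name, backup location and
keyspace/table names below are placeholders; the glob assumes a single data
directory for the table):

sudo service cassandra stop
# BACKUP is wherever the snapshot was copied off-node
BACKUP=/backups/my_snapshot/my_ks/my_table
# copy the snapshot contents into the (empty) table data directory
cp -a "$BACKUP"/. /var/lib/cassandra/data/my_ks/my_table-*/
sudo service cassandra start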

On Thu, Jun 1, 2017 at 4:25 PM, Jean Carlo 
wrote:

> Hello.
>
> During the restore of a table using its snapshot and nodetool refresh, I
> could see that cassandra starts to make a lot of compactions (depending on
> the size of the data).
>
> I wanted to know why and I found this in the code of cassandra 2.1.14.
>
> for CASSANDRA-4872
>
> +// force foreign sstables to level 0
> +try
> +{
> +    if (new File(descriptor.filenameFor(Component.STATS)).exists())
> +    {
> +        SSTableMetadata oldMetadata = SSTableMetadata.serializer.deserialize(descriptor);
> +        LeveledManifest.mutateLevel(oldMetadata, descriptor, descriptor.filenameFor(Component.STATS), 0);
> +    }
> +}
> +catch (IOException e)
>
>
> This is very interesting and I wanted to know if this was coded taking
> into account only the case of a migration from STCS to LCS or if for the
> case LCS to LCS this is not pertinent
>
> In my case, I use nodetool refresh not only to restore a table but also to
> make an exact copy of any table LCS. So I think the levels do not need to
> change.
>
> @Marcus Can you be so kind to clarify this for me please ?
>
> Thank you very much in advance
>
> Best regards
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>


Re: dtests jolokia fails to attach

2016-10-06 Thread Marcus Eriksson
It is this: "-XX:+PerfDisableSharedMem" - in your dtest you need to do
"remove_perf_disable_shared_mem(node1)" before starting the node

/Marcus

On Thu, Oct 6, 2016 at 8:30 AM, Benjamin Roth 
wrote:

> Maybe additional information, this is the CS command line for ccm node1:
>
> br   20376  3.2  8.6 2331136 708308 pts/5  Sl   06:10   0:30 java
> -Xloggc:/home/br/.ccm/test/node1/logs/gc.log -ea -XX:+UseThreadPriorities
> -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k
> -XX:StringTableSize=103 -XX:+AlwaysPreTouch -XX:-UseBiasedLocking
> -XX:+UseTLAB -XX:+ResizeTLAB -XX:+UseNUMA -XX:+PerfDisableSharedMem
> -Djava.net.preferIPv4Stack=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=1 -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways -XX:+CMSClassUnloadingEnabled
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
> -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
> -XX:+PrintPromotionFailure -XX:+UseGCLogFileRotation
> -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -Xms500M -Xmx500M -Xmn50M
> -XX:+UseCondCardMark 
> -XX:CompileCommandFile=/home/br/.ccm/test/node1/conf/hotspot_compiler
> -javaagent:/home/br/repos/cassandra/lib/jamm-0.3.0.jar
> -Dcassandra.jmx.local.port=7100 
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password
> -Djava.library.path=/home/br/repos/cassandra/lib/sigar-bin
> -Dcassandra.migration_task_wait_in_seconds=6 -Dcassandra.libjemalloc=/usr/
> lib/x86_64-linux-gnu/libjemalloc.so.1 -Dlogback.configurationFile=logback.xml
> -Dcassandra.logdir=/var/log/cassandra 
> -Dcassandra.storagedir=/home/br/repos/cassandra/data
> -Dcassandra-pidfile=/home/br/.ccm/test/node1/cassandra.pid -cp
> /home/br/.ccm/test/node1/conf:/home/br/repos/cassandra/
> build/classes/main:/home/br/repos/cassandra/build/classes/
> thrift:/home/br/repos/cassandra/lib/HdrHistogram-2.1.9.jar:/home/br/repos/
> cassandra/lib/ST4-4.0.8.jar:/home/br/repos/cassandra/lib/
> airline-0.6.jar:/home/br/repos/cassandra/lib/antlr-
> runtime-3.5.2.jar:/home/br/repos/cassandra/lib/asm-5.0.4.
> jar:/home/br/repos/cassandra/lib/caffeine-2.2.6.jar:/home/
> br/repos/cassandra/lib/cassandra-driver-core-3.0.1-
> shaded.jar:/home/br/repos/cassandra/lib/commons-cli-1.1.
> jar:/home/br/repos/cassandra/lib/commons-codec-1.2.jar:/
> home/br/repos/cassandra/lib/commons-lang3-3.1.jar:/home/
> br/repos/cassandra/lib/commons-math3-3.2.jar:/home/br/repos/cassandra/lib/
> compress-lzf-0.8.4.jar:/home/br/repos/cassandra/lib/
> concurrent-trees-2.4.0.jar:/home/br/repos/cassandra/lib/
> concurrentlinkedhashmap-lru-1.4.jar:/home/br/repos/
> cassandra/lib/disruptor-3.0.1.jar:/home/br/repos/cassandra/
> lib/ecj-4.4.2.jar:/home/br/repos/cassandra/lib/guava-18.
> 0.jar:/home/br/repos/cassandra/lib/high-scale-lib-
> 1.0.6.jar:/home/br/repos/cassandra/lib/hppc-0.5.4.jar:/
> home/br/repos/cassandra/lib/jackson-core-asl-1.9.2.jar:/
> home/br/repos/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/
> home/br/repos/cassandra/lib/jamm-0.3.0.jar:/home/br/repos/
> cassandra/lib/javax.inject.jar:/home/br/repos/cassandra/
> lib/jbcrypt-0.3m.jar:/home/br/repos/cassandra/lib/jcl-over-
> slf4j-1.7.7.jar:/home/br/repos/cassandra/lib/jctools-
> core-1.2.1.jar:/home/br/repos/cassandra/lib/jflex-1.6.0.jar:
> /home/br/repos/cassandra/lib/jna-4.0.0.jar:/home/br/repos/
> cassandra/lib/joda-time-2.4.jar:/home/br/repos/cassandra/
> lib/json-simple-1.1.jar:/home/br/repos/cassandra/lib/
> libthrift-0.9.2.jar:/home/br/repos/cassandra/lib/log4j-
> over-slf4j-1.7.7.jar:/home/br/repos/cassandra/lib/logback-
> classic-1.1.3.jar:/home/br/repos/cassandra/lib/logback-
> core-1.1.3.jar:/home/br/repos/cassandra/lib/lz4-1.3.0.jar:/
> home/br/repos/cassandra/lib/metrics-core-3.1.0.jar:/home/
> br/repos/cassandra/lib/metrics-jvm-3.1.0.jar:/home/br/repos/cassandra/lib/
> metrics-logback-3.1.0.jar:/home/br/repos/cassandra/lib/
> netty-all-4.0.39.Final.jar:/home/br/repos/cassandra/lib/
> ohc-core-0.4.4.jar:/home/br/repos/cassandra/lib/ohc-core-
> j8-0.4.4.jar:/home/br/repos/cassandra/lib/primitive-1.0.
> jar:/home/br/repos/cassandra/lib/reporter-config-base-3.0.
> 0.jar:/home/br/repos/cassandra/lib/reporter-config3-3.0.0.jar:/home/br/
> repos/cassandra/lib/sigar-1.6.4.jar:/home/br/repos/
> cassandra/lib/slf4j-api-1.7.7.jar:/home/br/repos/cassandra/
> lib/snakeyaml-1.11.jar:/home/br/repos/cassandra/lib/snappy-
> java-1.1.1.7.jar:/home/br/repos/cassandra/lib/snowball-
> stemmer-1.3.0.581.1.jar:/home/br/repos/cassandra/lib/stream-
> 2.5.2.jar:/home/br/repos/cassandra/lib/thrift-server-0.
> 3.7.jar:/home/br/repos/cassandra/lib/jsr223/*/*.jar
> -Dcassandra.join_ring=True -Dcassandra.logdir=/home/br/.ccm/test/node1/logs
> -Dcassandra.boot_without_jna=true 

Re: High Heap Memory usage during nodetool repair in Cassandra 3.0.3

2016-06-22 Thread Marcus Eriksson
it could also be CASSANDRA-11412 if you have many sstables and vnodes

On Wed, Jun 22, 2016 at 2:50 PM, Bhuvan Rawal  wrote:

> Thanks for the info Paulo, Robert. I tried further testing with other
> parameters and it was prevalent. We could be hitting either 11739 or 11206, but I'm
> skeptical about 11739 because repair works well in 3.5 and 11739 seems to
> be fixed for 3.7/3.0.7.
>
> We may possibly resolve this by increasing heap size thereby reducing some
> page cache bandwidth before upgrading to higher versions.
>
> On Mon, Jun 20, 2016 at 10:00 PM, Paulo Motta 
> wrote:
>
>> You could also be hitting CASSANDRA-11739, which was fixed on 3.0.7 and
>> could potentially cause OOMs for long-running repairs.
>>
>>
>> 2016-06-20 13:26 GMT-03:00 Robert Stupp :
>>
>>> One possibility might be CASSANDRA-11206 (Support large partitions on
>>> the 3.0 sstable format), which reduces heap usage for other operations
>>> (like repair, compactions) as well.
>>> You can verify that by setting column_index_cache_size_in_kb in c.yaml
>>> to a really high value like 1000 - if you see the same behaviour in 3.7
>>> with that setting, there’s not much you can do except upgrading to 3.7 as
>>> that change went into 3.6 and not into 3.0.x.
>>>
>>> —
>>> Robert Stupp
>>> @snazy
>>>
>>> On 20 Jun 2016, at 18:13, Bhuvan Rawal  wrote:
>>>
>>> Hi All,
>>>
>>> We are running Cassandra 3.0.3 on Production with Max Heap Size of 8GB.
>>> There has been a consistent issue with nodetool repair for a while and
>>> we have tried issuing it with multiple options --pr, --local as well,
>>> sometimes node went down with Out of Memory error and at times nodes did
>>> stopped connecting any connection, even jmx nodetool commands.
>>>
>>> On trying with same data on 3.7 Repair Ran successfully without
>>> encountering any of the above mentioned issues. I then tried increasing
>>> heap to 16GB on 3.0.3 and repair ran successfully.
>>>
>>> I then analyzed memory usage during nodetool repair for 3.0.3(16GB
>>> heap) vs 3.7 (8GB Heap) and 3.0.3 occupied 11-14 GB at all times,
>>> whereas 3.7 spiked between 1-4.5 GB while repair runs. As they ran on
>>> same dataset and unrepaired data with full repair.
>>>
>>> We would like to know if it is a known bug that was fixed post 3.0.3 and
>>> there could be a possible way by which we can run repair on 3.0.3 without
>>> increasing heap size as for all other activities 8GB works for us.
>>>
>>> PFA the visualvm snapshots.
>>>
>>> 
>>> 3.0.3 VisualVM Snapshot, consistent heap usage of greater than 12 GB.
>>>
>>>
>>> 
>>> 3.7 VisualVM Snapshot, 8GB Max Heap and max heap usage till about 5GB.
>>>
>>> Thanks & Regards,
>>> Bhuvan Rawal
>>>
>>>
>>> PS: In case if the snapshots are not visible, they can be viewed from
>>> the following links:
>>> 3.0.3:
>>> https://s31.postimg.org/4e7ifsjaz/Screenshot_from_2016_06_20_21_06_09.png
>>> 3.7:
>>> https://s31.postimg.org/xak32s9m3/Screenshot_from_2016_06_20_21_05_57.png
>>>
>>>
>>>
>>
>


Re: Effectiveness of Scrub Operation vs SSTable previously marked in blacklist

2016-03-23 Thread Marcus Eriksson
yeah that is most likely a bug, could you file a ticket?

On Tue, Mar 22, 2016 at 4:36 AM, Michael Fong <
michael.f...@ruckuswireless.com> wrote:

> Hi, all,
>
>
>
> We recently encountered a scenario under Cassandra 2.0 deployment.
> Cassandra detected a corrupted sstable, and when we attempt to scrub the
> sstable (with all the associated sstables), the corrupted sstable was not
> included in the sstable list. This continues until we restart Cassandra and
> perform scrub again.
>
>
>
> After we traced the Cassandra source code, we are a bit confused with the
> effectiveness of scrubbing and SStable being marked in blacklist in
> Cassandra 2.0+
>
>
>
> It seems from previous version (Cassandra 1.2), the scrub operation would
> operate on a sstable regardless of it being previously marked. However, in
> Cassandra 2.0, the function flows seems changed.
>
>
>
> Here is function flow that we traced in Cassandra 2.0 source code:
>
>
>
> From org.apache.cassandra.db.compaction.CompactionManager
>
> …
> public void performScrub(ColumnFamilyStore cfStore, final boolean skipCorrupted, final boolean checkData) throws InterruptedException, ExecutionException
> {
>     performAllSSTableOperation(cfStore, new AllSSTablesOperation()
>     {
> …
>
> private void performAllSSTableOperation(final ColumnFamilyStore cfs, final AllSSTablesOperation operation) throws InterruptedException, ExecutionException
> {
>     final Iterable sstables = cfs.markAllCompacting();
> …
>
> From org.apache.cassandra.db.ColumnFamilyStore
> …
> public Iterable markAllCompacting()
> {
>     Callable callable = new Callable()
>     {
>         public Iterable call() throws Exception
>         {
>             assert data.getCompacting().isEmpty() : data.getCompacting();
>             Iterable sstables = Lists.newArrayList(*AbstractCompactionStrategy.filterSuspectSSTables(getSSTables())*);
>             if (Iterables.isEmpty(sstables))
>                 return null;
> …
>
>
>
> If this is true, would this flow (marking the corrupted sstable in the blacklist)
> defeat the original purpose of the scrub operation? Thanks in advance!
>
>
>
>
>
> Sincerely,
>
>
>
> Michael Fong
>


Re: DTCS Question

2016-03-19 Thread Marcus Eriksson
On Wed, Mar 16, 2016 at 6:49 PM, Anubhav Kale 
wrote:

> I am using Cassandra 2.1.13 which has all the latest DTCS fixes (it does
> STCS within the DTCS windows). It also introduced a field called
> MAX_WINDOW_SIZE which defaults to one day.
>
>
>
> So in my data folders, I may see SS Tables that span beyond a day
> (generated through old data through repairs or commit logs), but whenever I
> see a message in logs “Compacted Foo” (meaning the SS Table under question
> was definitely a result of compaction), the “Foo” SS Table should never
> have data beyond a day. Is this understanding accurate ?
>
No - not until https://issues.apache.org/jira/browse/CASSANDRA-10496 (read
for explanation)


>
>
> If we have issues with repairs pulling in old data, should MAX_WINDOW_SIZE
> instead be set to a larger value so that we don’t run the risk of too many
> SS Tables lying around and never getting compacted ?
>
No, with CASSANDRA-10280 that old data will get compacted if needed
(assuming you have default settings). If the remote node is correctly date
tiered, the streamed sstable will also be correctly date tiered. Then that
streamed sstable will be put in a time window and if there are enough
sstables in that old window, we do a compaction.

/Marcus


Re: Compaction Filter in Cassandra

2016-03-12 Thread Marcus Eriksson
We don't have anything like that, do you have a specific use case in mind?

Could you create a JIRA ticket and we can discuss there?

/Marcus

On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu  wrote:

> Hello there,
>
> RocksDB has the feature called "Compaction Filter" to allow application to
> modify/delete a key-value during the background compaction.
> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>
> I'm wondering is there a plan/value to add this into C* as well? Or is
> there already a similar thing in C*?
>
> Thanks
>
> --
> Dikang
>
>


Re: Too many sstables with DateTieredCompactionStrategy

2016-02-29 Thread Marcus Eriksson
why do you have 'timestamp_resolution': 'MILLISECONDS'? It should be left
at the default (MICROSECONDS) unless you do "USING TIMESTAMP "-inserts; see
https://issues.apache.org/jira/browse/CASSANDRA-11041
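
For reference, timestamp_resolution is just a compaction suboption, so it can
be changed in place; a sketch with placeholder keyspace/table names, keeping
the other settings from the table below as they are:

cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {'class': 'DateTieredCompactionStrategy', 'timestamp_resolution': 'MICROSECONDS', 'max_sstable_age_days': '365', 'base_time_seconds': '60'};"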

On Mon, Feb 29, 2016 at 2:36 PM, Noorul Islam K M  wrote:

>
> Hi all,
>
> We are using below compaction settings for a table
>
> compaction = {'timestamp_resolution': 'MILLISECONDS',
> 'max_sstable_age_days': '365', 'base_time_seconds': '60', 'class':
> 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
>
> But it is creating too many sstables. Currently number of sstables
> is 4. We have been injecting data for the last three days.
>
> We have set the compactionthroughput to 128 MB/s
>
> $ nodetool getcompactionthroughput
>
> Current compaction throughput: 128 MB/s
>
> But this is not helping.
>
> How can we control the number of sstables in this case?
>
> Thanks and Regards
> Noorul
>


Re: JBOD device space allocation?

2016-02-24 Thread Marcus Eriksson
On Wed, Feb 24, 2016 at 6:28 PM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> Thanks. I didn't pay enough attention to that statement on my initial
> reading of that post (which was where I became aware of the 3.2 behavior in
> the first place.)
>
> Considering that the doc explicitly recommends that the byte ordered
> partitioner not be used, that implies that the 3.2 JBOD behavior should be
> used for all recommended partitioner use cases.
>
> I'm still not clear on when exactly a node would not have "localRanges" -
> in terms of how the user would hit that scenario, or is than merely a
> defensive check for a scenario which cannot normally be encountered? I
> mean, it means that the endpoint is not responsible for any range of
> tokens, but how can that ever be true, or is that simply if the user
> configures the node to own zero tokens? But other than that, is there any
> normal way a user could end up with a node that has no "localRanges"?
>

IIRC it is only defensive now - before
https://issues.apache.org/jira/browse/CASSANDRA-9317 it could be empty
during startup


>
> But even if the node owns no "local" ranges, can't it have replicated data
> from RF=k-1 other nodes? Or does empty localRanges mean than the RF=k-1
> nodes that might have replicated data for this node are all also configured
> to own zero tokens? Seems that way. But is there any reasonable scenario
> under which the user would hit this? I mean, why would the code care either
> way with respect to JBOD strategy for the case where no local data is
> stored?
>

local ranges are all ranges the node should store - if you have 256 vnode
tokens and RF=3, you will have 768 local ranges

/Marcus


>
>
> -- Jack Krupansky
>
> On Wed, Feb 24, 2016 at 2:15 AM, Marcus Eriksson <krum...@gmail.com>
> wrote:
>
>> It is mentioned here btw: http://www.datastax.com/dev/blog/improving-jbod
>>
>> On Wed, Feb 24, 2016 at 8:14 AM, Marcus Eriksson <krum...@gmail.com>
>> wrote:
>>
>>> If you don't use RandomPartitioner/Murmur3Partitioner you will get the
>>> old behavior.
>>>
>>> On Wed, Feb 24, 2016 at 2:47 AM, Jack Krupansky <
>>> jack.krupan...@gmail.com> wrote:
>>>
>>>> I just wanted to confirm whether my understanding of how JBOD allocates
>>>> device space is correct of not...
>>>>
>>>> Pre-3.2:
>>>> On each memtable flush Cassandra will select the directory (device)
>>>> which has the most available space as a percentage of the total available
>>>> space on all of the listed directories/devices. A random weighted value is
>>>> used so it won't always pick the same directory/device with the most space,
>>>> the goal being to balance writes for performance.
>>>>
>>>> As of 3.2:
>>>> The ranges of tokens stored on the local node will be evenly
>>>> distributed among the configured storage devices - even by token range,
>>>> even if that may be uneven by actual partition sizes. The code presumes
>>>> that each of the configured local storage devices has the same capacity.
>>>>
>>>> The relevant change in 3.2 appears to be:
>>>> Make sure tokens don't exist in several data directories
>>>> (CASSANDRA-6696)
>>>>
>>>> The code for the pre-3.2 model is still in 3.x - is there some other
>>>> code path which will cause the pre-3.2 behavior even when runing 3.2 or
>>>> later?
>>>>
>>>> I see this code which seems to allow for at least some cases where the
>>>> pre-3.2 behavior would still be invoked, but I'm not sure what user-level
>>>> cases that might be:
>>>>
> >>>> if (!cfs.getPartitioner().splitter().isPresent() || localRanges.isEmpty())
> >>>>   return Collections.singletonList(new FlushRunnable(lastReplayPosition.get(), txn));
>>>>
>>>> return createFlushRunnables(localRanges, txn);
>>>>
>>>> IOW, if the partitioner does not have a splitter present or the
>>>> localRanges for the node cannot be determined. But... what exactly would a
>>>> user do to cause that?
>>>>
>>>> There is no doc for this stuff - can a committer (or adventurous user!)
>>>> confirm what is actually implemented, both pre and post 3.2? (I already
>>>> pinged docs on this.)
>>>>
>>>> Or if anybody is actually using JBOD, what behavior they are seeing for
>>>> device space utilization.
>>>>
>>>> Thanks!
>>>>
>>>> -- Jack Krupansky
>>>>
>>>
>>>
>>
>


Re: JBOD device space allocation?

2016-02-23 Thread Marcus Eriksson
If you don't use RandomPartitioner/Murmur3Partitioner you will get the old
behavior.

On Wed, Feb 24, 2016 at 2:47 AM, Jack Krupansky 
wrote:

> I just wanted to confirm whether my understanding of how JBOD allocates
> device space is correct of not...
>
> Pre-3.2:
> On each memtable flush Cassandra will select the directory (device) which
> has the most available space as a percentage of the total available space
> on all of the listed directories/devices. A random weighted value is used
> so it won't always pick the same directory/device with the most space, the
> goal being to balance writes for performance.
>
> As of 3.2:
> The ranges of tokens stored on the local node will be evenly distributed
> among the configured storage devices - even by token range, even if that
> may be uneven by actual partition sizes. The code presumes that each of the
> configured local storage devices has the same capacity.
>
> The relevant change in 3.2 appears to be:
> Make sure tokens don't exist in several data directories (CASSANDRA-6696)
>
> The code for the pre-3.2 model is still in 3.x - is there some other code
> path which will cause the pre-3.2 behavior even when runing 3.2 or later?
>
> I see this code which seems to allow for at least some cases where the
> pre-3.2 behavior would still be invoked, but I'm not sure what user-level
> cases that might be:
>
> if (!cfs.getPartitioner().splitter().isPresent() || localRanges.isEmpty())
>   return Collections.singletonList(new FlushRunnable(lastReplayPosition.get(), txn));
>
> return createFlushRunnables(localRanges, txn);
>
> IOW, if the partitioner does not have a splitter present or the
> localRanges for the node cannot be determined. But... what exactly would a
> user do to cause that?
>
> There is no doc for this stuff - can a committer (or adventurous user!)
> confirm what is actually implemented, both pre and post 3.2? (I already
> pinged docs on this.)
>
> Or if anybody is actually using JBOD, what behavior they are seeing for
> device space utilization.
>
> Thanks!
>
> -- Jack Krupansky
>


Re: JBOD device space allocation?

2016-02-23 Thread Marcus Eriksson
It is mentioned here btw: http://www.datastax.com/dev/blog/improving-jbod

On Wed, Feb 24, 2016 at 8:14 AM, Marcus Eriksson <krum...@gmail.com> wrote:

> If you don't use RandomPartitioner/Murmur3Partitioner you will get the old
> behavior.
>
> On Wed, Feb 24, 2016 at 2:47 AM, Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
>> I just wanted to confirm whether my understanding of how JBOD allocates
>> device space is correct of not...
>>
>> Pre-3.2:
>> On each memtable flush Cassandra will select the directory (device) which
>> has the most available space as a percentage of the total available space
>> on all of the listed directories/devices. A random weighted value is used
>> so it won't always pick the same directory/device with the most space, the
>> goal being to balance writes for performance.
>>
>> As of 3.2:
>> The ranges of tokens stored on the local node will be evenly distributed
>> among the configured storage devices - even by token range, even if that
>> may be uneven by actual partition sizes. The code presumes that each of the
>> configured local storage devices has the same capacity.
>>
>> The relevant change in 3.2 appears to be:
>> Make sure tokens don't exist in several data directories (CASSANDRA-6696)
>>
>> The code for the pre-3.2 model is still in 3.x - is there some other code
>> path which will cause the pre-3.2 behavior even when runing 3.2 or later?
>>
>> I see this code which seems to allow for at least some cases where the
>> pre-3.2 behavior would still be invoked, but I'm not sure what user-level
>> cases that might be:
>>
>> if (!cfs.getPartitioner().splitter().isPresent() || localRanges.isEmpty())
>>   return Collections.singletonList(new FlushRunnable(lastReplayPosition.get(), txn));
>>
>> return createFlushRunnables(localRanges, txn);
>>
>> IOW, if the partitioner does not have a splitter present or the
>> localRanges for the node cannot be determined. But... what exactly would a
>> user do to cause that?
>>
>> There is no doc for this stuff - can a committer (or adventurous user!)
>> confirm what is actually implemented, both pre and post 3.2? (I already
>> pinged docs on this.)
>>
>> Or if anybody is actually using JBOD, what behavior they are seeing for
>> device space utilization.
>>
>> Thanks!
>>
>> -- Jack Krupansky
>>
>
>


Re: 3k sstables during a repair incremental !!

2016-02-10 Thread Marcus Eriksson
The reason for this is probably
https://issues.apache.org/jira/browse/CASSANDRA-10831 (which only affects
2.1)

So, if you had problems with incremental repair and LCS before, upgrade to
2.1.13 and try again

/Marcus

On Wed, Feb 10, 2016 at 2:59 PM, horschi  wrote:

> Hi Jean,
>
> we had the same issue, but on SizeTieredCompaction. During repair the
> number of SSTables and pending compactions were exploding.
>
> It not only affected latencies, at some point Cassandra ran out of heap.
>
> After the upgrade to 2.2 things got much better.
>
> regards,
> Christian
>
>
> On Wed, Feb 10, 2016 at 2:46 PM, Jean Carlo 
> wrote:
> > Hi Horschi !!!
> >
> > I have the 2.1.12. But I think it is something related to Level
> compaction
> > strategy. It is impressive that we passed from 6 sstables to 3k sstable.
> > I think this will affect the latency on production because the number of
> > compactions going on
> >
> >
> >
> > Best regards
> >
> > Jean Carlo
> >
> > "The best way to predict the future is to invent it" Alan Kay
> >
> > On Wed, Feb 10, 2016 at 2:37 PM, horschi  wrote:
> >>
> >> Hi Jean,
> >>
> >> which Cassandra version do you use?
> >>
> >> Incremental repair got much better in 2.2 (for us at least).
> >>
> >> kind regards,
> >> Christian
> >>
> >> On Wed, Feb 10, 2016 at 2:33 PM, Jean Carlo 
> >> wrote:
> >> > Hello guys!
> >> >
> >> > I am testing the repair inc in my custer cassandra. I am doing my test
> >> > over
> >> > these tables
> >> >
> >> > CREATE TABLE pns_nonreg_bench.cf3 (
> >> > s text,
> >> > sp int,
> >> > d text,
> >> > dp int,
> >> > m map,
> >> > t timestamp,
> >> > PRIMARY KEY (s, sp, d, dp)
> >> > ) WITH CLUSTERING ORDER BY (sp ASC, d ASC, dp ASC)
> >> >
> >> > AND compaction = {'class':
> >> > 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
> >> > AND compression = {'sstable_compression':
> >> > 'org.apache.cassandra.io.compress.SnappyCompressor'}
> >> >
> >> > CREATE TABLE pns_nonreg_bench.cf1 (
> >> > ise text PRIMARY KEY,
> >> > int_col int,
> >> > text_col text,
> >> > ts_col timestamp,
> >> > uuid_col uuid
> >> > ) WITH bloom_filter_fp_chance = 0.01
> >> >  AND compaction = {'class':
> >> > 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
> >> > AND compression = {'sstable_compression':
> >> > 'org.apache.cassandra.io.compress.SnappyCompressor'}
> >> >
> >> > table cf1
> >> > Space used (live): 665.7 MB
> >> > table cf2
> >> > Space used (live): 697.03 MB
> >> >
> >> > It happens that when I do repair -inc -par on these tables, cf2 got a peak
> >> > of 3k sstables. When the repair finishes, it takes 30 min or more to finish
> >> > all the compactions and return to 6 sstables.
> >> >
> >> > I am a little concern about if this will happen on production. is it
> >> > normal?
> >> >
> >> > Saludos
> >> >
> >> > Jean Carlo
> >> >
> >> > "The best way to predict the future is to invent it" Alan Kay
> >
> >
>


Re: Transitioning to incremental repair

2015-12-02 Thread Marcus Eriksson
Bryan, this should be improved with
https://issues.apache.org/jira/browse/CASSANDRA-10768 - could you try it
out?

On Tue, Dec 1, 2015 at 10:58 PM, Bryan Cheng  wrote:

> Sorry if I misunderstood, but are you asking about the LCS case?
>
> Based on our experience, I would absolutely recommend you continue with
> the migration procedure. Even if the compaction strategy is the same, the
> process of anticompaction is incredibly painful. We observed our test
> cluster running 2.1.11 experiencing a dramatic increase in latency and not
> responding to nodetool queries over JMX while anticompacting the largest
> SSTables. This procedure also took several times longer than a standard
> full repair.
>
> If you absolutely cannot perform the migration procedure, I believe 2.2.x
> contains the changes to automatically set the RepairedAt flags after a full
> repair, so you may be able to do a full repair on 2.2.x and then transition
> directly to incremental without migrating (can someone confirm?)
>


Re: Transitioning to incremental repair

2015-12-01 Thread Marcus Eriksson
Yes, it should now be safe to just run a repair with -inc -par to migrate
to incremental repairs

BUT, if you currently use, for example, the repair service in OpsCenter or
Spotify's Cassandra Reaper, you might still want to migrate the way it is
documented: to migrate to incremental repairs you have to run a full repair,
not many sub-range repairs, and that might not be possible for some users
with a lot of data or with vnodes etc.

I would also wait until
https://issues.apache.org/jira/browse/CASSANDRA-10768 has been committed
and released as it will improve anticompaction performance
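
For the simple path, that boils down to something like this on each node (the
keyspace name and data path are placeholders), plus a spot check that sstables
end up with a non-zero Repaired at:

nodetool repair -par -inc my_keyspace

# spot-check a few sstables afterwards
sstablemetadata /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db | grep "Repaired at"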

/Marcus

On Tue, Dec 1, 2015 at 3:24 PM, Sam Klock  wrote:

> Hi folks,
>
> A question like this was recently asked, but I don't think anyone ever
> supplied an unambiguous answer.  We have a set of clusters currently
> using sequential repair, and we'd like to transition them to
> incremental repair.  According to the documentation, this is a very
> manual (and likely time-consuming) process:
>
>
> http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsRepairNodesMigration.html
>
> Our understanding is that this process is necessary for tables that use
> LCS, as unrepaired tables are compacted using STCS and (without the
> process described in the doc) all tables start in the unrepaired
> state.  The pain of this migration strategy is supposed to be offset by
> the savings in undesired compaction activity.  The docs aren't
> especially clear, but it sounds like this strategy is not needed for
> tables that use STCS.
>
> However, CASSANDRA-8004 (resolved against 2.1.2) appears intended to
> have both the repaired and unrepaired sstable sets use the same
> compaction strategy.  It seems like that obviates the rationale for a
> migration procedure, which is supported by offhand comments on this
> list, e.g.:
>
> https://www.mail-archive.com/user%40cassandra.apache.org/msg40303.html
> https://www.mail-archive.com/user%40cassandra.apache.org/msg44896.html
>
> In other words, it *looks* like the docs are obsolete, and the
> migration process for existing clusters only consists of flipping the
> switch (i.e., adding "-inc" to invocations of "nodetool repair").
>
> Our questions:
>
> 1) Is our understanding of the status quo following 2.1.2 correct?
> Does migrating existing clusters to incremental repair only require
> adding the "-inc" argument, or is a process still required?
>
> 2) If a process is still required, have there been any changes since
> 2.1.2?  Are the docs up-to-date?
>
> 3) If there is no process or if the process has changed, are there
> plans on the DataStax side to update the documentation accordingly?
>
> Thanks,
> SK
>


Re: LTCS Strategy Resulting in multiple SSTables

2015-09-15 Thread Marcus Eriksson
if you are on Cassandra 2.2, it is probably this:
https://issues.apache.org/jira/browse/CASSANDRA-10270

On Tue, Sep 15, 2015 at 4:37 AM, Saladi Naidu  wrote:

> We are using the Leveled Compaction Strategy on a column family. Below
> are cfstats from two nodes in the same cluster: one node has 880 SSTables in L0
> whereas the other node has just 1 SSTable in L0. On the node where there are
> multiple SSTables, all of them are small and were created with the same time stamp.
> We ran compaction, but it did not result in much change; the node remained with a
> huge number of SSTables. Due to this large number of SSTables, read
> performance is being impacted.
>
> In the same cluster, under the same keyspace, we are observing this discrepancy in
> other column families as well. What is going wrong? What is the solution to
> fix this?
>
> ---NODE1---
> Table: category_ranking_dedup
> SSTable count: 1
> SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 2012037
> Space used (total): 2012037
> Space used by snapshots (total): 0
> SSTable Compression Ratio: 0.07677216119569073
> Memtable cell count: 990
> Memtable data size: 32082
> Memtable switch count: 11
> Local read count: 2842
> Local read latency: 3.215 ms
> Local write count: 18309
> Local write latency: 5.008 ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 816
> Compacted partition minimum bytes: 87
> Compacted partition maximum bytes: 25109160
> Compacted partition mean bytes: 22844
> Average live cells per slice (last five minutes): 338.84588318085855
> Maximum live cells per slice (last five minutes): 10002.0
> Average tombstones per slice (last five minutes): 36.53307529908515
> Maximum tombstones per slice (last five minutes): 36895.0
>
> ---NODE2---
> Table: category_ranking_dedup
> SSTable count: 808
> SSTables in each level: [808/4, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 291641980
> Space used (total): 291641980
> Space used by snapshots (total): 0
> SSTable Compression Ratio: 0.1431106696818256
> Memtable cell count: 4365293
> Memtable data size: 3742375
> Memtable switch count: 44
> Local read count: 2061
> Local read latency: 31.983 ms
> Local write count: 30096
> Local write latency: 27.449 ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 54544
> Compacted partition minimum bytes: 87
> Compacted partition maximum bytes: 25109160
> Compacted partition mean bytes: 634491
> Average live cells per slice (last five minutes): 416.1780688985929
> Maximum live cells per slice (last five minutes): 10002.0
> Average tombstones per slice (last five minutes): 45.11547792333818
> Maximum tombstones per slice (last five minutes): 36895.0
>
>
>
>
> Naidu Saladi
>


Re: Incremental repair from the get go

2015-09-04 Thread Marcus Eriksson
Starting up fresh it is totally OK to just start using incremental repairs

On Thu, Sep 3, 2015 at 10:25 PM, Jean-Francois Gosselin <
jfgosse...@gmail.com> wrote:

>
> On fresh install of Cassandra what's the best approach to start using
> incremental repair from the get go (I'm using LCS) ?
>
> Run nodetool repair -inc after inserting a few rows , or we still need to
> follow the migration procedure with sstablerepairedset ?
>
> From the documentation "... If you use the leveled compaction strategy
> and perform an incremental repair for the first time, Cassandra performs
> size-tiering on all SSTables because the repair/unrepaired status is
> unknown. This operation can take a long time. To save time, migrate to
> incremental repair one node at a time. ..."
>
> With almost no data size-tiering should be quick ?  Basically is there a
> short cut to avoid the migration via sstablerepairedset  on a fresh install
> ?
>
> Thanks
>
> JF
>


Re: Garbage collector launched on all nodes at once

2015-06-17 Thread Marcus Eriksson
It is probably this: https://issues.apache.org/jira/browse/CASSANDRA-9549

On Wed, Jun 17, 2015 at 7:37 PM, Michał Łowicki mlowi...@gmail.com wrote:

 Looks like memtable heap size is growing on some nodes rapidly (
 https://www.dropbox.com/s/3brloiy3fqang1r/Screenshot%202015-06-17%2019.21.49.png?dl=0).
 Drops are the places when nodes have been restarted.

 On Wed, Jun 17, 2015 at 6:53 PM, Michał Łowicki mlowi...@gmail.com
 wrote:

 Hi,

 Two datacenters with 6 nodes (2.1.6) each. In each DC garbage collection
 is launched at the same time on each node (See [1] for total GC duration
 per 5 seconds). RF is set to 3. Any ideas?

 [1]
 https://www.dropbox.com/s/bsbyew1jxbe3dgo/Screenshot%202015-06-17%2018.49.48.png?dl=0

 --
 BR,
 Michał Łowicki




 --
 BR,
 Michał Łowicki



Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Marcus Eriksson
nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
sstable level information

and, it is also likely that since you get so many L0 sstables, you will be
doing size tiered compaction in L0 for a while.
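
For reference, a rough way to correlate them (paths are illustrative;
sstablemetadata includes the sstable's level in its output):

for f in /var/lib/cassandra/data/test/test_bits-*/*-Data.db; do
    printf '%s: ' "$f"
    tools/bin/sstablemetadata "$f" | grep -i level
done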

On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com wrote:

 @Marcus I did look and that is where I got the above but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big L0 ->
 L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing, and from nodetool cfstats it looks like test.test_bits has
 SSTable levels of [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in ?

 thanks
 anishek







Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Marcus Eriksson
you need to look at nodetool compactionstats - there is probably a big L0 ->
L1 compaction going on that blocks other compactions from starting

On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing, and from nodetool cfstats it looks like test.test_bits has
 SSTable levels of [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in ?

 thanks
 anishek





Re: RepairException on C* 2.1.3

2015-04-19 Thread Marcus Eriksson
Issue here is that getPosition returns null

I think this was fixed in
https://issues.apache.org/jira/browse/CASSANDRA-8750

On Fri, Apr 17, 2015 at 10:55 PM, Robert Coli rc...@eventbrite.com wrote:

 On Fri, Apr 17, 2015 at 11:40 AM, Mark Greene green...@gmail.com wrote:

 I'm receiving an exception when I run a repair process via: 'nodetool
 repair -par keyspace'


 This JIRA claims fixed in 2.1.3, but I believe I have heard at least one
 other report that it isn't :

 https://issues.apache.org/jira/browse/CASSANDRA-8211

 If I were you, I would :

 a) file a JIRA at http://issues.apache.org
 b) reply to the list telling us the URL of your issue

 =Rob




Re: nodetool cleanup error

2015-03-31 Thread Marcus Eriksson
It should work on 2.0.13. If it fails with that assertion, you should just
retry. If that does not work, and you can reproduce this, please file a
ticket

/Marcus

On Tue, Mar 31, 2015 at 9:33 AM, Amlan Roy amlan@cleartrip.com wrote:

 Hi,

 Thanks for the reply. Since nodetool cleanup is not working even after
 upgrading to 2.0.13, is it recommended to go to an older version (2.0.11
 for example, with 2.0.12 also it did not work). Is there any other way of
 cleaning data from existing nodes after adding a new node.

 Regards,
 Amlan

 On 31-Mar-2015, at 5:00 am, Yuki Morishita mor.y...@gmail.com wrote:

  Looks like the issue is
 https://issues.apache.org/jira/browse/CASSANDRA-9070.
 
  On Mon, Mar 30, 2015 at 6:25 PM, Robert Coli rc...@eventbrite.com
 wrote:
  On Mon, Mar 30, 2015 at 4:21 PM, Amlan Roy amlan@cleartrip.com
 wrote:
 
  Thanks for the reply. I have upgraded to 2.0.13. Now I get the
 following
  error.
 
 
  If cleanup is still excepting for you on 2.0.13 with some sstables you
 have,
  I would strongly consider :
 
  1) file a JIRA (http://issues.apache.org) and attach / offer the
 sstables
  for debugging
  2) let the list know the JIRA id of the ticket
 
  =Rob
 
 
 
 
  --
  Yuki Morishita
  t:yukim (http://twitter.com/yukim)




Re: Stable cassandra build for production usage

2015-03-17 Thread Marcus Eriksson
Do you see the segfault or do you see
https://issues.apache.org/jira/browse/CASSANDRA-8716 ?

On Tue, Mar 17, 2015 at 10:34 AM, Ajay ajay.ga...@gmail.com wrote:

 Hi,

 Now that 2.0.13 is out, I don't see the nodetool cleanup issue
 (https://issues.apache.org/jira/browse/CASSANDRA-8718) fixed yet. The
 bug shows priority Minor. Anybody facing this issue?

 Thanks
 Ajay

 On Thu, Mar 12, 2015 at 11:41 PM, Robert Coli rc...@eventbrite.com
 wrote:

 On Thu, Mar 12, 2015 at 10:50 AM, Ajay ajay.ga...@gmail.com wrote:

 Please suggest what is the best option in this for production deployment
 in EC2 given that we are deploying Cassandra cluster for the 1st time (so
 likely that we add more data centers/nodes and schema changes in the
 initial few months)


 Voting for 2.0.13 is in process. I'd wait for that. But I don't need
 OpsCenter.

 =Rob






Re: C* 2.1.3 - Incremental replacement of compacted SSTables

2015-02-22 Thread Marcus Eriksson
We had some issues with it right before we wanted to release 2.1.3, so we
temporarily(?) disabled it. It *might* get removed entirely in 2.1.4 - if
you have any input, please comment on this ticket:
https://issues.apache.org/jira/browse/CASSANDRA-8833

/Marcus

On Sat, Feb 21, 2015 at 7:29 PM, Mark Greene green...@gmail.com wrote:

 I saw in the NEWS.txt that this has been disabled.

 Does anyone know why that was the case? Is it temporary just for the 2.1.3
 release?

 Thanks,
 Mark Greene



Re: How to deal with too many sstables

2015-02-02 Thread Marcus Eriksson
https://issues.apache.org/jira/browse/CASSANDRA-8635

On Tue, Feb 3, 2015 at 5:47 AM, 曹志富 cao.zh...@gmail.com wrote:

 Just run nodetool repair.

 The nodes which have many sstables are the newest in my cluster. Before adding
 these nodes to my cluster, my cluster had no automatic compaction
 because my cluster is a write-only cluster.

 thanks.

 --
 曹志富
 Mobile: 18611121927
 Email: caozf.zh...@gmail.com
 Weibo: http://weibo.com/boliza/

 2015-02-03 12:16 GMT+08:00 Flavien Charlon flavien.char...@gmail.com:

 Did you run incremental repair? Incremental repair is broken in 2.1 and
 tends to create way too many SSTables.

 On 2 February 2015 at 18:05, 曹志富 cao.zh...@gmail.com wrote:

 Hi, all:
 I have an 18-node C* cluster with Cassandra 2.1.2. Some nodes have
 40,000+ sstables.

 my compaction strategy is STCS.

 Could someone give me some solution to deal with this situation.

 Thanks.
 --
 曹志富
 手机:18611121927
 邮箱:caozf.zh...@gmail.com
 微博:http://weibo.com/boliza/






Re: incremential repairs - again

2015-01-28 Thread Marcus Eriksson
Hi

Unsure what you mean by automatically, but you should use -par -inc when
you repair

And, you should wait until 2.1.3 (which will be out very soon) before doing
this, we have fixed many issues with incremental repairs
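
For reference, the per-node invocation is then simply something like this
(2.1-era nodetool flags, run on each node in turn):

  $ nodetool repair -par -inc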

/Marcus

On Thu, Jan 29, 2015 at 7:44 AM, Roland Etzenhammer 
r.etzenham...@t-online.de wrote:

 Hi,

 a short question about the new incremental repairs again. I am running
 2.1.2 (for testing). Marcus pointed out to me that 2.1.2 should do incremental
 repairs automatically, so I rolled back all the steps taken. I expect that
 routine repair times will decrease when I do not put much new data on the
 cluster.

 But they don't - they are constant at about 1000 minutes per node, so I
 extracted all the Repaired at values with sstablemetadata and I can't see any
 recent date. I put several GB of data into the cluster in 2015 and I run
 nodetool repair -pr on every node regularly.

 Am I still missing something? Or is this one of the issues with 2.1.2
 (CASSANDRA-8316)?

 Thanks for hints,
 Jan





Re: incremental repairs

2015-01-08 Thread Marcus Eriksson
If you are on 2.1.2+ (or using STCS) you don't need those steps (should probably
update the blog post).

Now we keep separate levelings for the repaired/unrepaired data and move
the sstables over after the first incremental repair

But, if you are running 2.1 in production, I would recommend that you wait
until 2.1.3 is out, https://issues.apache.org/jira/browse/CASSANDRA-8316
fixes a bunch of issues with incremental repairs

-pr is sufficient, same rules apply as before, if you run -pr you need to
repair every node
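
If you want to verify the migration afterwards, checking the repaired flag with
sstablemetadata works - e.g. (the data path here is just an example):

  $ sstablemetadata /var/lib/cassandra/data/myks/mytable/*-Data.db | grep 'Repaired at'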

/Marcus

On Thu, Jan 8, 2015 at 9:16 AM, Roland Etzenhammer 
r.etzenham...@t-online.de wrote:

 Hi,

 I am currently trying to migrate my test cluster to incremental repairs.
 These are the steps I'm doing on every node:

 - touch marker
 - nodetool disableautocompaction
 - nodetool repair
 - cassandra stop
 - find all *Data*.db files older then marker
 - invoke sstablerepairedset on those
 - cassandra start

 This is essentially what http://www.datastax.com/dev/
 blog/anticompaction-in-cassandra-2-1 says. After all nodes migrated this
 way, I think I need to run my regular repairs more often and they should be
 faster afterwards. But do I need to run nodetool repair or is nodetool
 repair -pr sufficient?

 And do I need to reenable autocompaction? Or do I need to compact myself?

 Thanks for any input,
 Roland



Re: incremental repairs

2015-01-08 Thread Marcus Eriksson
Yes, you should reenable autocompaction
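
e.g. (keyspace/table names below are made up):

  $ nodetool enableautocompaction mykeyspace mytable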

/Marcus

On Thu, Jan 8, 2015 at 10:33 AM, Roland Etzenhammer 
r.etzenham...@t-online.de wrote:

 Hi Marcus,

 thanks for that quick reply. I did also look at:

 http://www.datastax.com/documentation/cassandra/2.1/
 cassandra/operations/ops_repair_nodes_c.html

 which describes the same process; it's 2.1.x, so I see that 2.1.2+ is not
 covered there. I did upgrade my test cluster to 2.1.2 and with your hint I
 took a look at sstablemetadata on a non-migrated node and there are
 indeed Repaired at entries on some sstables already. So if I got this
 right, in 2.1.2+ there is nothing to do to switch to incremental repairs
 (apart from running the repairs themselves).

 But one thing I see during testing is that there are many sstables, with
 small size:

 - in total there are 5521 sstables on one node
 - 115 sstables are bigger than 1MB
 - 4949 sstables are smaller than 10kB

 I don't know where they came from - I found one piece of information where
 this happened when Cassandra was low on heap, which happened to me while
 running tests (the suggested solution is to trigger compaction via JMX).

 Question for me: I did disable autocompaction on some nodes of our test
 cluster as the blog and docs said. Should/can I reenable autocompaction
 again with incremental repairs?

 Cheers,
 Roland






Re: Compaction Strategy guidance

2014-11-25 Thread Marcus Eriksson
If you are that write-heavy you should definitely go with STCS, LCS
optimizes for reads by doing more compactions
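
For the record, switching a table back is a one-liner from cqlsh (keyspace and
table names below are made up; cqlsh connects to localhost unless told
otherwise):

  $ echo "ALTER TABLE mykeyspace.mytable WITH compaction = {'class': 'SizeTieredCompactionStrategy'};" | cqlsh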

/Marcus

On Tue, Nov 25, 2014 at 11:22 AM, Andrei Ivanov aiva...@iponweb.net wrote:

 Hi Jean-Armel, Nikolai,

 1. Increasing sstable size doesn't work (well, I think, unless we
 overscale - add more nodes than really necessary, which is
 prohibitive for us in a way). Essentially there is no change.  I gave
 up and will go for STCS;-(
 2. We use 2.0.11 as of now
 3. We are running on EC2 c3.8xlarge instances with EBS volumes for data
 (GP SSD)

 Jean-Armel, I believe that what you say about many small instances is
 absolutely true. But it is not good in our case - we write a lot and
 almost never read what we've written. That is, we want to be able to
 read everything, but in reality we hardly read 1%, I think. This
 implies that smaller instances are of no use in terms of read
 performance for us. And generally instances/cpu/ram are more expensive
 than storage. So, we really would like to have instances with large
 storage.

 Andrei.





 On Tue, Nov 25, 2014 at 11:23 AM, Jean-Armel Luce jaluc...@gmail.com
 wrote:
  Hi Andrei, Hi Nicolai,
 
  Which version of C* are you using ?
 
  There are some recommendations about the max storage per node :
 
 http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
 
  For 1.0 we recommend 300-500GB. For 1.2 we are looking to be able to
 handle
  10x
  (3-5TB).
 
  I have the feeling that those recommendations are sensitive to many
  criteria such as:
  - your hardware
  - the compaction strategy
  - ...
 
  It looks like LCS lowers those limitations.
 
  Increasing the size of sstables might help if you have enough CPU and you
  can put more load on your I/O system (@Andrei, I am interested by the
  results of your  experimentation about large sstable files)
 
  From my point of view, there are some usage patterns where it is better
 to
  have many small servers than a few large servers. Probably, it is better
 to
  have many small servers if you need LCS for large tables.
 
  Just my 2 cents.
 
  Jean-Armel
 
  2014-11-24 19:56 GMT+01:00 Robert Coli rc...@eventbrite.com:
 
  On Mon, Nov 24, 2014 at 6:48 AM, Nikolai Grigoriev 
 ngrigor...@gmail.com
  wrote:
 
  One of the obvious recommendations I have received was to run more than
  one instance of C* per host. Makes sense - it will reduce the amount
 of data
  per node and will make better use of the resources.
 
 
  This is usually a Bad Idea to do in production.
 
  =Rob
 
 
 



Re: LCS: sstables grow larger

2014-11-18 Thread Marcus Eriksson
I suspect they are getting size tiered in L0 - if you have too many
sstables in L0, we will do size tiered compaction on sstables in L0 to
improve performance

Use tools/bin/sstablemetadata to get the level for those sstables, if they
are in L0, that is probably the reason.
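
For example (adjust the path to wherever your data actually lives):

  $ tools/bin/sstablemetadata /path/to/data/xxx/xxx/xxx-xxx-jb-15583-Data.db | grep -i level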

/Marcus

On Tue, Nov 18, 2014 at 2:06 PM, Andrei Ivanov aiva...@iponweb.net wrote:

 Dear all,

 I have the following problem:
 - C* 2.0.11
 - LCS with default 160MB
 - Compacted partition maximum bytes: 785939 (for cf/table xxx.xxx)
 - Compacted partition mean bytes: 6750 (for cf/table xxx.xxx)

 I would expect the sstables to be of +- maximum 160MB. Despite this I
 see files like:
 192M Nov 18 13:00 xxx-xxx-jb-15580-Data.db
 or
 631M Nov 18 13:03 xxx-xxx-jb-15583-Data.db

 Am I missing something? What could be the reason? (Actually this is a
 fresh cluster - on an old one I'm seeing 500GB sstables). I'm
 getting really desperate in my attempt to understand what's going on.

 Thanks in advance Andrei.



Re: LCS: sstables grow larger

2014-11-18 Thread Marcus Eriksson
No, they will get compacted into smaller sstables in L1+ eventually (once
you have less than 32 sstables in L0, an ordinary L0 - L1 compaction will
happen)

But, if you consistently get many files in L0 it means that compaction is
not keeping up with your inserts and you should probably expand your
cluster (or consider going back to SizeTieredCompactionStrategy for the
tables that take that many writes)

/Marcus

On Tue, Nov 18, 2014 at 2:49 PM, Andrei Ivanov aiva...@iponweb.net wrote:

 Marcus, thanks a lot! It explains a lot: those huge tables are indeed at L0.

 It seems that they start to appear as a result of some massive
 operations (join, repair, rebuild). What's their fate in the future?
 Will they continue to propagate like this through levels? Is there
 anything that can be done to avoid/solve/prevent this?

 My fears here are around a feeling that those big tables (like in my
 old cluster) will be hardly compactable in the future...

 Sincerely, Andrei.

 On Tue, Nov 18, 2014 at 4:27 PM, Marcus Eriksson krum...@gmail.com
 wrote:
  I suspect they are getting size tiered in L0 - if you have too many
 sstables
  in L0, we will do size tiered compaction on sstables in L0 to improve
  performance
 
  Use tools/bin/sstablemetadata to get the level for those sstables, if
 they
  are in L0, that is probably the reason.
 
  /Marcus
 
  On Tue, Nov 18, 2014 at 2:06 PM, Andrei Ivanov aiva...@iponweb.net
 wrote:
 
  Dear all,
 
  I have the following problem:
  - C* 2.0.11
  - LCS with default 160MB
  - Compacted partition maximum bytes: 785939 (for cf/table xxx.xxx)
  - Compacted partition mean bytes: 6750 (for cf/table xxx.xxx)
 
  I would expect the sstables to be of +- maximum 160MB. Despite this I
  see files like:
  192M Nov 18 13:00 xxx-xxx-jb-15580-Data.db
  or
  631M Nov 18 13:03 xxx-xxx-jb-15583-Data.db
 
  Am I missing something? What could be the reason? (Actually this is a
  fresh cluster - on an old one I'm seeing 500GB sstables). I'm
  getting really desperate in my attempt to understand what's going on.
 
  Thanks in advance Andrei.
 
 



Re: LCS: sstables grow larger

2014-11-18 Thread Marcus Eriksson
you should stick to as small nodes as possible yes :)

There are a few relevant tickets related to bootstrap and LCS:
https://issues.apache.org/jira/browse/CASSANDRA-6621 - startup
with -Dcassandra.disable_stcs_in_l0=true to not do STCS in L0
https://issues.apache.org/jira/browse/CASSANDRA-7460 - (3.0) send source
sstable level when bootstrapping
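
For the first one, the flag is just a JVM option - e.g. a line along these
lines in cassandra-env.sh (a sketch; adjust to however you set JVM options):

  JVM_OPTS="$JVM_OPTS -Dcassandra.disable_stcs_in_l0=true"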

On Tue, Nov 18, 2014 at 3:33 PM, Andrei Ivanov aiva...@iponweb.net wrote:

 OK, got it.

 Actually, my problem is not that we constantly having many files at
 L0. Normally, quite a few of them - that is, nodes are managing to
 compact incoming writes in a timely manner.

 But it looks like when we join a new node, it receives tons of files
 from existing nodes (and they end up at L0, right?) and that seems to
 be where our problems start. In practice, in what I call the old
 cluster, compaction became a problem at ~2TB nodes. (You, know, we are
 trying to save something on HW - we are running on EC2 with EBS
 volumes)

 Do I get it right that we'd better stick to smaller nodes?



 On Tue, Nov 18, 2014 at 5:20 PM, Marcus Eriksson krum...@gmail.com
 wrote:
  No, they will get compacted into smaller sstables in L1+ eventually (once
  you have less than 32 sstables in L0, an ordinary L0 - L1 compaction
 will
  happen)
 
  But, if you consistently get many files in L0 it means that compaction is
  not keeping up with your inserts and you should probably expand your
 cluster
  (or consider going back to SizeTieredCompactionStrategy for the tables
 that
  take that many writes)
 
  /Marcus
 
  On Tue, Nov 18, 2014 at 2:49 PM, Andrei Ivanov aiva...@iponweb.net
 wrote:
 
  Marcus, thanks a lot! It explains a lot those huge tables are indeed at
  L0.
 
  It seems that they start to appear as a result of some massive
  operations (join, repair, rebuild). What's their fate in the future?
  Will they continue to propagate like this through levels? Is there
  anything that can be done to avoid/solve/prevent this?
 
  My fears here are around a feeling that those big tables (like in my
  old cluster) will be hardly compactable in the future...
 
  Sincerely, Andrei.
 
  On Tue, Nov 18, 2014 at 4:27 PM, Marcus Eriksson krum...@gmail.com
  wrote:
   I suspect they are getting size tiered in L0 - if you have too many
   sstables
   in L0, we will do size tiered compaction on sstables in L0 to improve
   performance
  
   Use tools/bin/sstablemetadata to get the level for those sstables, if
   they
   are in L0, that is probably the reason.
  
   /Marcus
  
   On Tue, Nov 18, 2014 at 2:06 PM, Andrei Ivanov aiva...@iponweb.net
   wrote:
  
   Dear all,
  
   I have the following problem:
   - C* 2.0.11
   - LCS with default 160MB
   - Compacted partition maximum bytes: 785939 (for cf/table xxx.xxx)
   - Compacted partition mean bytes: 6750 (for cf/table xxx.xxx)
  
   I would expect the sstables to be of +- maximum 160MB. Despite this I
   see files like:
   192M Nov 18 13:00 xxx-xxx-jb-15580-Data.db
   or
   631M Nov 18 13:03 xxx-xxx-jb-15583-Data.db
  
   Am I missing something? What could be the reason? (Actually this is a
   fresh cluster - on an old one I'm seeing 500GB sstables). I'm
   getting really desperate in my attempt to understand what's going on.
  
   Thanks in advance Andrei.
  
  
 
 



Re: Question on how to run incremental repairs

2014-10-22 Thread Marcus Eriksson
On Wed, Oct 22, 2014 at 2:39 PM, Juho Mäkinen juho.maki...@gmail.com
wrote:

 I'm having problems understanding how incremental repairs are supposed to
 be run.

 If I try to do nodetool repair -inc cassandra will complain that It is
 not possible to mix sequential repair and incremental repairs. However it
 seems that running nodetool repair -inc -par does the job, but I couldn't
 be sure if  this is the correct (and only?) way to run incremental repairs?

 yes, you need to run with -par


 Previously I ran repairs with nodetool repair -pr on each node at a
 time, so that I could minimise the performance hit. I've understood that
 doing a single nodetool repair -inc -par command runs it on all machines
 in the entire cluster, so doesn't that cause a big performance penalty? Can
 I run incremental repairs on one node at a time?


repair still works the same way: you can run it with -pr. And no, repair -inc
-par does not run on all nodes; it repairs all ranges that the node you are
executing it on owns. So, if you have rf = 3 you will need to run repair
(without -pr) on every third node


 If running nodetool repair -inc -par every night in a single node is
 fine, should I still spread them out so that each node takes a turn
 executing this command each night?


use your old schedule, repair works the same way, just that incremental
repair does not include already repaired sstables
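
So e.g. the nightly per-node run can stay exactly as it is, something like
(keyspace name is made up; schedule it from cron or whatever you use today,
one node per night):

  $ nodetool repair -par -inc -pr mykeyspace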


 Last question is a bit deeper: What I've understood is that incremental
 repairs don't do repairs on SSTables which have already been repaired, but
 doesn't this mean that these repaired SSTables can't be checked towards
 missing or incorrect data?


no, if you get a corrupt sstable for example, you will need to run an old
style repair on that node (without -inc).




Re: stream_throughput_outbound_megabits_per_sec

2014-10-17 Thread Marcus Eriksson
On Thu, Oct 16, 2014 at 1:54 AM, Donald Smith 
donald.sm...@audiencescience.com wrote:


  *stream_throughput_outbound_megabits_per_sec*  is the timeout per
 operation on the streaming socket.   The docs recommend not to have it
 too low (because a timeout causes streaming to restart from the beginning).
 But the default 0 never times out.  What's a reasonable value?


no, it is not a timeout, it states how fast sstables are streamed



 Does it stream an entire SSTable in one operation? I doubt it.  How large
 is the object it streams in one operation?  I'm tempted to put the
 timeout at 30 seconds or 1 minute. Is that too low?


unsure what you mean by 'operation' here, but it is one tcp connection,
streaming the whole file (if that's what we want)
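
If the goal is to throttle streaming, that is exactly what the yaml setting
does, and it can also be changed at runtime - e.g. to cap it at 200 megabits
per second per node:

  $ nodetool setstreamthroughput 200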


/Marcus


Re: Disabling compaction

2014-10-10 Thread Marcus Eriksson
what version are you on?

On Thu, Oct 9, 2014 at 10:33 PM, Parag Shah ps...@proofpoint.com wrote:

  Hi all,

   I am trying to disable compaction for a few select tables. Here is
 a definition of one such table:

  CREATE TABLE blob_2014_12_31 (
   blob_id uuid,
   blob_index int,
   blob_chunk blob,
   PRIMARY KEY (blob_id, blob_index)
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='99.0PERCENTILE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'enabled': 'false', 'class': 'SizeTieredCompactionStrategy'}
 AND
   compression={'sstable_compression': 'LZ4Compressor'};

  I have set compaction ‘enabled’ : ‘false’ on the above table.

  However, I do see compactions being run for this node:

  -bash-3.2$ nodetool compactionstats
 pending tasks: 55
    compaction type         keyspace            table    completed        total   unit  progress
         Compaction  ids_high_awslab  blob_2014_11_15  18122816990  35814893020  bytes    50.60%
         Compaction  ids_high_awslab  blob_2014_12_31  18576750966  34242866468  bytes    54.25%
         Compaction  ids_high_awslab  blob_2014_12_15  19213914904  35956698600  bytes    53.44%
 Active compaction remaining time :   0h49m46s

  Can someone tell me why this is happening? Do I need to set the
 compaction threshold to 0 0?

  Regards
 Parag



Re: Disabling compaction

2014-10-10 Thread Marcus Eriksson
this is fixed in 2.0.8; https://issues.apache.org/jira/browse/CASSANDRA-7187
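
Until you can upgrade, a possible stop-gap (please verify the behaviour on
2.0.7 before relying on it) is zeroing the compaction thresholds per table via
nodetool, e.g.:

  $ nodetool setcompactionthreshold ids_high_awslab blob_2014_12_31 0 0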

/Marcus

On Fri, Oct 10, 2014 at 7:11 PM, Parag Shah ps...@proofpoint.com wrote:

  Cassandra Version: 2.0.7

  In my application, I am using Cassandra Java Driver 2.0.2

  Thanks
 Parag

   From: Marcus Eriksson krum...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Thursday, October 9, 2014 at 11:56 PM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: Disabling compaction

   what version are you on?

 On Thu, Oct 9, 2014 at 10:33 PM, Parag Shah ps...@proofpoint.com wrote:

  Hi all,

   I am trying to disable compaction for a few select tables. Here is
 a definition of one such table:

  CREATE TABLE blob_2014_12_31 (
   blob_id uuid,
   blob_index int,
   blob_chunk blob,
   PRIMARY KEY (blob_id, blob_index)
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='99.0PERCENTILE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'enabled': 'false', 'class':
 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};

  I have set compaction ‘enabled’ : ‘false’ on the above table.

  However, I do see compactions being run for this node:

  -bash-3.2$ nodetool compactionstats
 pending tasks: 55
    compaction type         keyspace            table    completed        total   unit  progress
         Compaction  ids_high_awslab  blob_2014_11_15  18122816990  35814893020  bytes    50.60%
         Compaction  ids_high_awslab  blob_2014_12_31  18576750966  34242866468  bytes    54.25%
         Compaction  ids_high_awslab  blob_2014_12_15  19213914904  35956698600  bytes    53.44%
 Active compaction remaining time :   0h49m46s

  Can you someone tell me why this is happening? Do I need to set the
 compaction threshold  to 0 0?

  Regards
  Parag





Re: Would warnings about overlapping SStables explain high pending compactions?

2014-09-25 Thread Marcus Eriksson
Not really

What version are you on? Do you have pending compactions and no ongoing
compactions?

/Marcus

On Wed, Sep 24, 2014 at 11:35 PM, Donald Smith 
donald.sm...@audiencescience.com wrote:

  On one of our nodes we have lots of pending compactions (499). In the
 past we've seen pending compactions go up to 2400 and all the way back down
 again.



 Investigating, I saw warnings such as the following in the logs about
 overlapping SStables and about needing to run “nodetool scrub” on a table.
 Would the overlapping SStables explain the pending compactions?



 WARN [RMI TCP Connection(2)-10.5.50.30] 2014-09-24 09:14:11,207
 LeveledManifest.java (line 154) At level 1,
 SSTableReader(path='/data/data/XYZ/ABC/XYZ-ABC-jb-388233-Data.db')
 [DecoratedKey(-6112875836465333229,
 3366636664393031646263356234663832383264616561666430383739383738),
 DecoratedKey(-4509284829153070912,
 3366336562386339376664376633353635333432636662373739626465393636)]
 overlaps
 SSTableReader(path='/data/data/XYZ/ABC/XYZ-ABC_blob-jb-388150-Data.db')
 [DecoratedKey(-4834684725563291584,
 336633623334363664363632666365303664333936336337343566373838),
 DecoratedKey(-4136919579566299218,
 3366613535646662343235336335633862666530316164323232643765323934)].  This
 could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the fact
 that you have dropped sstables from another node into the data directory.
 Sending back to L0.  If you didn't drop in sstables, and have not yet run
 scrub, you should do so since you may also have rows out-of-order within an
 sstable



 Thanks



 *Donald A. Smith* | Senior Software Engineer
 P: 425.201.3900 x 3866
 C: (206) 819-5965
 F: (646) 443-2333
 dona...@audiencescience.com







Re: Worse perf after Row Caching version 1.2.5:

2014-02-12 Thread Marcus Eriksson
select * from table will not populate row cache, but if the row is
cached, it will be used. You need to use select * from table where X=Y to
populate row cache.

when setting caching = rows_only you disable key cache which might hurt
your performance.
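
i.e. something along these lines (table and key are placeholders; on 1.2 the
valid caching values are 'all', 'keys_only', 'rows_only' and 'none'):

  $ echo "ALTER TABLE TABLE_X WITH caching = 'all';" | cqlsh
  $ echo "SELECT * FROM TABLE_X WHERE key = 'some-key';" | cqlsh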


On Wed, Feb 12, 2014 at 9:05 PM, PARASHAR, BHASKARJYA JAY bp1...@att.com wrote:

  Thanks Jonathan,



 I have the cfstats but our prod team has changed some configs after my
 post and I do not have the cfhistograms  information now.



 No Of nodes: 3

 Ram: 472GB

 Cassandra version: 1.2.5



 I am pasting the cfstats below.



 Regards

 Jay



 CREATE TABLE EnablerCreditReasonInfo (

   key text PRIMARY KEY,

   creditReasonDescription text

 ) WITH COMPACT STORAGE AND

   bloom_filter_fp_chance=0.01 AND

   caching='ROWS_ONLY' AND

   comment='' AND

   dclocal_read_repair_chance=0.00 AND

   gc_grace_seconds=864000 AND

   read_repair_chance=0.10 AND

   replicate_on_write='true' AND

   populate_io_cache_on_flush='false' AND

   compaction={'class': 'SizeTieredCompactionStrategy'} AND

   compression={'sstable_compression': 'SnappyCompressor'};





 CFStats

   Column Family: EnablerCreditReasonInfo

   SSTable count: 3

   Space used (live): 108067

   Space used (total): 108067

   Number of Keys (estimate): 1920

   Memtable Columns Count: 0

   Memtable Data Size: 0

   Memtable Switch Count: 0

   Read Count: 0

   Read Latency: NaN ms.

   Write Count: 0

   Write Latency: NaN ms.

   Pending Tasks: 0

   Bloom Filter False Positives: 0

   Bloom Filter False Ratio: 0.0

   Bloom Filter Space Used: 2232

   Compacted row minimum size: 61

   Compacted row maximum size: 149

   Compacted row mean size: 100



 *From:* Jonathan Lacefield [mailto:jlacefi...@datastax.com]
 *Sent:* Tuesday, February 11, 2014 10:43 AM
 *To:* user@cassandra.apache.org
 *Subject:* Re: Worse perf after Row Caching version 1.2.5:



 Hello,



   Please paste the output of cfhistograms for these tables.  Also, what
 does your environment look like, number of nodes, disk drive configs,
 memory, C* version, etc.



 Thanks,



 Jonathan


   Jonathan Lacefield

 Solutions Architect, DataStax

 (404) 822 3487

 http://www.linkedin.com/in/jlacefield





 http://www.datastax.com/what-we-offer/products-services/training/virtual-training



 On Tue, Feb 11, 2014 at 10:26 AM, PARASHAR, BHASKARJYA JAY bp1...@att.com
 wrote:

 Hi,



 I have two tables and I enabled row caching for both of them using CQL.
 These two CF's are very small with one about 300 rows and other  2000
 rows. The rows themselves are small.

 Cassandra heap: 8gb.

 a.   alter table TABLE_X with caching = 'rows_only';

 b.  alter table TABLE_Y with caching = 'rows_only';

 I also changed row_cache_size_in_mb: 1024 in the Cassandra.yaml file.

 After extensive testing, it seems the performance of Table_X degraded from
 600ms to 750ms and Table_Y gained about 10 ms (from 188ms to 177 ms).

 More Info

 Table X is always queried with Select * from Table_X;  Cfstats in
 Table_X shows Read Latency: NaN ms. I assumed that since we select all the
 rows, the entire table would be cached.

 Table_Y has a secondary index and is queried on that index.





 Would appreciate any input why the performance is worse and how to enable
 row caching for these two tables.



 Thanks

 Jay








Re: Migrate data from acunu to Apache cassandra 1.1.12

2014-02-02 Thread Marcus Eriksson
You need an up to date Cassandra, files with -ic- are for Cassandra 1.2.5+
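
Once you are on a compatible version, the usual invocation is along these lines
(the host is an example; the last path component must be the
keyspace/columnfamily directory holding the sstables):

  $ sstableloader -d 10.0.0.1 /path/to/samplewatchtower/student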

/Marcus


On Mon, Feb 3, 2014 at 8:31 AM, Aravindan T aravinda...@tcs.com wrote:

 Hi,

 There is a necessity where i need to migrate data from acunu cassandra to
 apache cassandra .

 As part of it, the column families snapshots are taken using the nodetool
 command but while loading into the Apache cassandra with the help of
 sstableloader, i get error's like below

  WARN 12:13:25,299 Invalid file 'samplewatchtower-student-ic-5-TOC.txt' in
 data directory /samplewatchtower/student.
 Skipping file samplewatchtower-student-ic-6-Data.db, error opening it: EOF
 after 0 bytes out of 8
  WARN 12:13:25,315 Invalid file 'samplewatchtower-student-ic-6-Summary.db'
 in data directory /samplewatchtower/student.
 Skipping file samplewatchtower-student-ic-5-Data.db, error opening it: EOF
 after 0 bytes out of 8
  WARN 12:13:25,316 Invalid file 'samplewatchtower-student-ic-6-TOC.txt' in
 data directory /samplewatchtower/student.
  WARN 12:13:25,316 Invalid file 'samplewatchtower-student-ic-5-Summary.db'
 in data directory /samplewatchtower/student.
 No sstables to stream.


 Can you please help in how to perform this data migration successfully?


 Aravind

 =-=-=
 Notice: The information contained in this e-mail
 message and/or attachments to it may contain
 confidential or privileged information. If you are
 not the intended recipient, any dissemination, use,
 review, distribution, printing or copying of the
 information contained in this e-mail message
 and/or attachments to it are strictly prohibited. If
 you have received this communication in error,
 please notify us by reply e-mail or telephone and
 immediately and permanently delete the message
 and any attachments. Thank you




Re: Endless loop LCS compaction

2013-12-18 Thread Marcus Eriksson
this has been fixed:

https://issues.apache.org/jira/browse/CASSANDRA-6496


On Wed, Dec 18, 2013 at 2:51 PM, Desimpel, Ignace 
ignace.desim...@nuance.com wrote:

 Hi,
 Would it not be possible that in some rare cases these 'small' files are
 created also and thus resulting in the same endless loop behavior? Like a
 storm on the server make the memtables flushing. When the storm lies down,
 the compaction then would have the same problem?

 Regards,

 Ignace

 -Original Message-
 From: Desimpel, Ignace
 Sent: dinsdag 12 november 2013 09:32
 To: 'Chris Burroughs'
 Subject: RE: Endless loop LCS compaction

 I think that regardless of the size, the code should not go into an endless
 loop.

 -Original Message-
 From: Chris Burroughs [mailto:chris.burrou...@gmail.com]
 Sent: vrijdag 8 november 2013 16:49
 To: user@cassandra.apache.org
 Cc: Desimpel, Ignace
 Subject: Re: Endless loop LCS compaction

 On 11/07/2013 06:48 AM, Desimpel, Ignace wrote:
  Total data size is only 3.5GB. Column family was created with
 SSTableSize : 10 MB

 You may want to try a significantly larger size.

 https://issues.apache.org/jira/browse/CASSANDRA-5727



Re: Cassand is holding too many deleted file descriptors

2013-11-14 Thread Marcus Eriksson
yeah this is known, and we are looking for a fix

https://issues.apache.org/jira/browse/CASSANDRA-6275

if you have a simple way of reproducing, please add a comment
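
A quick way to watch the leak, in case it helps with a repro - count the
deleted entries lsof reports for the Cassandra JVM (the pid lookup is just an
example, use whatever matches your setup):

  $ lsof -p $(pgrep -f CassandraDaemon) | grep -c '(deleted)'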


On Thu, Nov 14, 2013 at 10:53 AM, Murthy Chelankuri kmurt...@gmail.com wrote:

 I see lots of these deleted file descriptors that Cassandra is holding. In my
 case, out of 90K file descriptors, 80.5K are these deleted descriptors.

 Because of this Cassandra is not performing well.

 Can someone please tell me what I am doing wrong?


 lr-x-- 1 root root 64 Nov 14 08:25 10875 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-119-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10876 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-110-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10877 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-133-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10878 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-124-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10879 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-110-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:11 1088 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-110-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10880 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-133-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10881 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-119-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10882 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-124-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10883 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-119-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10884 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-133-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10885 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-110-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10886 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-124-Data.db
 (deleted)
 lr-x-- 1 root root 64 Nov 14 08:25 10887 -
 /var/lib/cassandra/data/tests/sample_data/sample_data_points-jb-119-Data.db
 (deleted)







Re: Migration LCS from 1.2.X to 2.0.x exception

2013-09-25 Thread Marcus Eriksson
this is the issue:
https://issues.apache.org/jira/browse/CASSANDRA-5383

guess it fell between chairs, will poke around


On Tue, Sep 24, 2013 at 4:26 PM, Nate McCall n...@thelastpickle.com wrote:

 What version of 1.2.x?

 Unfortunately, you must go through 1.2.9 first. See
 https://github.com/apache/cassandra/blob/cassandra-2.0.0/NEWS.txt#L19-L24


 On Tue, Sep 24, 2013 at 8:57 AM, Desimpel, Ignace 
 ignace.desim...@nuance.com wrote:

  Tested on WINDOWS : On startup of the 2.0.0 version from 1.2.x files I
 get an error as listed below.

 ** **

 This is due to the code in LeveledManifest::mutateLevel. The method
 already has a comment saying that it is scary …

 On windows, one cannot use the File::rename if the target file name
 already exists. 

 Also, even on Linux, I’m not sure if a rename would actually
 ‘overwrite/implicit-delete’ the content of the target file.

 ** **

 Anyway, adding code (below) before the FileUtils.renameWithConfirm should
 work in both cases (maybe even rename the fromFile to be able to recover…)
 

 File oTo = new File(filename);

 if ( oTo.exists() ) oTo.delete();

 ** **

 ** **

 java.lang.RuntimeException: Failed to rename
 …..xxx\Function-ic-10-Statistics.db-tmp to
 …..xxx\Function-ic-10-Statistics.db

 at
 org.apache.cassandra.io.util.FileUtils.renameWithConfirm(FileUtils.java:136)
 ~[main/:na]

 at
 org.apache.cassandra.io.util.FileUtils.renameWithConfirm(FileUtils.java:125)
 ~[main/:na]

 at
 org.apache.cassandra.db.compaction.LeveledManifest.mutateLevel(LeveledManifest.java:601)
 ~[main/:na]

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:103)
 ~[main/:na]

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:247)
 ~[main/:na]

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:443)
 ~[main/:na]

 ** **

 Regards,

 ** **

 Ignace Desimpel





Re: 1.2.10 - 2.0.1 migration issue

2013-09-25 Thread Marcus Eriksson
this is most likely a bug, filed
https://issues.apache.org/jira/browse/CASSANDRA-6093 and will try to have a
look today.


On Wed, Sep 25, 2013 at 1:48 AM, Christopher Wirt chris.w...@struq.com wrote:

 Hi,

 ** **

 Just had a go at upgrading a node to the latest stable c* 2 release and
 think I ran into some issues with manifest migration.

 ** **

 On initial start up I hit this error as it starts to load the first of my
 CF. 

 ** **

 INFO [main] 2013-09-24 22:56:01,018 LegacyLeveledManifest.java (line 89)
 Migrating manifest for struqrealtime/impressionstorev2

 INFO [main] 2013-09-24 22:56:01,019 LegacyLeveledManifest.java (line 119)
 Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration*
 ***

 ERROR [main] 2013-09-24 22:56:01,030 CassandraDaemon.java (line 459)
 Exception encountered during startup

 FSWriteError in
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:83)**
 **

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:138)
 

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
 

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)
 

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
 

 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
 

 Caused by: java.nio.file.NoSuchFileException:
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 -
 /disk1/cassandra/data/struqrealtime/impressionstorev2/impressionstorev2.json
 

 at
 sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)

 at
 sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)

 at
 sun.nio.fs.UnixFileSystemProvider.createLink(UnixFileSystemProvider.java:474)
 

 at java.nio.file.Files.createLink(Files.java:1037)

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:79)**
 **

 ... 5 more

 ** **

 I had already successfully run a test migration on our dev server. The only
 real difference I can see is the number of data directories defined and the
 amount of data being held.

 ** **

 I’ve run upgradesstables under 1.2.10. I have always been using vnodes and
 CQL3. I recently moved to using LZ4 instead of Snappy..

 ** **

 I tried to startup again and it gave me a slightly different error

 ** **

 INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 89)
 Migrating manifest for struqrealtime/impressionstorev2

 INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 119)
 Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration*
 ***

 ERROR [main] 2013-09-24 22:58:28,222 CassandraDaemon.java (line 459)
 Exception encountered during startup

 java.lang.RuntimeException: Tried to create duplicate hard link to
 /disk3/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/struqrealtime-impressionstorev2-ic-1030-TOC.txt
 

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)**
 **

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:129)
 

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
 

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)
 

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
 

 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
 

 ** **

 Will have a go recreating this tomorrow.

 ** **

 Any insight or guesses at what the issue might be are always welcome.

 ** **

 Thanks,

 Chris



Re: 1.2.10 - 2.0.1 migration issue

2013-09-25 Thread Marcus Eriksson
can't really reproduce, could you update the ticket with a bit more info
about your setup?

do you have multiple .json files in your data dirs?
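
Something like this lists them quickly across your data directories (paths as
in your yaml; the glob is just an example):

  $ find /disk*/cassandra/data -name '*.json'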


On Wed, Sep 25, 2013 at 10:07 AM, Marcus Eriksson krum...@gmail.com wrote:

 this is most likely a bug, filed
 https://issues.apache.org/jira/browse/CASSANDRA-6093 and will try to have
 a look today.


 On Wed, Sep 25, 2013 at 1:48 AM, Christopher Wirt chris.w...@struq.com wrote:

 Hi,

 ** **

 Just had a go at upgrading a node to the latest stable c* 2 release and
 think I ran into some issues with manifest migration.

 ** **

 On initial start up I hit this error as it starts to load the first of my
 CF. 

 ** **

 INFO [main] 2013-09-24 22:56:01,018 LegacyLeveledManifest.java (line 89)
 Migrating manifest for struqrealtime/impressionstorev2

 INFO [main] 2013-09-24 22:56:01,019 LegacyLeveledManifest.java (line 119)
 Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration
 

 ERROR [main] 2013-09-24 22:56:01,030 CassandraDaemon.java (line 459)
 Exception encountered during startup

 FSWriteError in
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:83)*
 ***

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:138)
 

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
 

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)
 

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
 

 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
 

 Caused by: java.nio.file.NoSuchFileException:
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 -
 /disk1/cassandra/data/struqrealtime/impressionstorev2/impressionstorev2.json
 

 at
 sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)***
 *

 at
 sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)

 at
 sun.nio.fs.UnixFileSystemProvider.createLink(UnixFileSystemProvider.java:474)
 

 at java.nio.file.Files.createLink(Files.java:1037)

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:79)*
 ***

 ... 5 more

 ** **

 I had already successful run a test migration on our dev server. Only
 real difference I can see if the number of data directories defined and the
 amount of data being held. 

 ** **

 I’ve run upgradesstables under 1.2.10. I have always been using vnodes
 and CQL3. I recently moved to using LZ4 instead of Snappy..

 ** **

 I tried to startup again and it gave me a slightly different error

 ** **

 INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 89)
 Migrating manifest for struqrealtime/impressionstorev2

 INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 119)
 Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration
 

 ERROR [main] 2013-09-24 22:58:28,222 CassandraDaemon.java (line 459)
 Exception encountered during startup

 java.lang.RuntimeException: Tried to create duplicate hard link to
 /disk3/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/struqrealtime-impressionstorev2-ic-1030-TOC.txt
 

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)*
 ***

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:129)
 

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
 

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)
 

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
 

 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
 

 ** **

 Will have a go recreating this tomorrow.

 ** **

 Any insight or guesses at what the issue might be are always welcome.

 ** **

 Thanks,

 Chris





Re: 1.2.10 - 2.0.1 migration issue

2013-09-25 Thread Marcus Eriksson
you are probably reading trunk NEWS.txt

read the ticket for explanation of what the issue was (it is a proper bug)


On Wed, Sep 25, 2013 at 12:59 PM, Christopher Wirt chris.w...@struq.com wrote:

 Hi Marcus,

 Thanks for having a look at this.

 ** **

 Just noticed this in the NEWS.txt 

 ** **

  For leveled compaction users, 2.0 must be at least started before
  upgrading to 2.1 due to the fact that the old JSON leveled
  manifest is migrated into the sstable metadata files on startup
  in 2.0 and this code is gone from 2.1.

 ** **

 Basically, my fault for skimming over this too quickly. 

 ** **

 We will move from 1.2.10 - 2.0 - 2.1

 ** **

 Thanks,

 Chris

 ** **

 ** **

 *From:* Marcus Eriksson [mailto:krum...@gmail.com]
 *Sent:* 25 September 2013 09:37
 *To:* user@cassandra.apache.org
 *Subject:* Re: 1.2.10 - 2.0.1 migration issue

 ** **

 cant really reproduce, could you update the ticket with a bit more info
 about your setup?

 ** **

 do you have multiple .json files in your data dirs?

 ** **

 On Wed, Sep 25, 2013 at 10:07 AM, Marcus Eriksson krum...@gmail.com
 wrote:

 this is most likely a bug, filed
 https://issues.apache.org/jira/browse/CASSANDRA-6093 and will try to have
 a look today.

 ** **

 On Wed, Sep 25, 2013 at 1:48 AM, Christopher Wirt chris.w...@struq.com
 wrote:

 Hi,

  

 Just had a go at upgrading a node to the latest stable c* 2 release and
 think I ran into some issues with manifest migration.

  

 On initial start up I hit this error as it starts to load the first of my
 CF. 

  

 INFO [main] 2013-09-24 22:56:01,018 LegacyLeveledManifest.java (line 89)
 Migrating manifest for struqrealtime/impressionstorev2

 INFO [main] 2013-09-24 22:56:01,019 LegacyLeveledManifest.java (line 119)
 Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration*
 ***

 ERROR [main] 2013-09-24 22:56:01,030 CassandraDaemon.java (line 459)
 Exception encountered during startup

 FSWriteError in
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:83)**
 **

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:138)
 

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
 

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)
 

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
 

 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
 

 Caused by: java.nio.file.NoSuchFileException:
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 -
 /disk1/cassandra/data/struqrealtime/impressionstorev2/impressionstorev2.json
 

 at
 sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)

 at
 sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)

 at
 sun.nio.fs.UnixFileSystemProvider.createLink(UnixFileSystemProvider.java:474)
 

 at java.nio.file.Files.createLink(Files.java:1037)

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:79)**
 **

 ... 5 more

  

 I had already successful run a test migration on our dev server. Only real
 difference I can see if the number of data directories defined and the
 amount of data being held. 

  

 I’ve run upgradesstables under 1.2.10. I have always been using vnodes and
 CQL3. I recently moved to using LZ4 instead of Snappy..

  

 I tried to startup again and it gave me a slightly different error

  

 INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 89)
 Migrating manifest for struqrealtime/impressionstorev2

 INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 119)
 Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration*
 ***

 ERROR [main] 2013-09-24 22:58:28,222 CassandraDaemon.java (line 459)
 Exception encountered during startup

 java.lang.RuntimeException: Tried to create duplicate hard link to
 /disk3/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/struqrealtime-impressionstorev2-ic-1030-TOC.txt
 

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)**
 **

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:129)
 

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91

Re: 1.2.10 - 2.0.1 migration issue

2013-09-25 Thread Marcus Eriksson
you probably have to remove the old snapshots before trying to restart
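
i.e. with the node still down, something along these lines (the paths are the
ones from your log - double-check them before deleting, one per data
directory):

  $ rm -rf /disk*/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration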


On Wed, Sep 25, 2013 at 3:05 PM, Christopher Wirt chris.w...@struq.com wrote:

 Should also say. I have managed to move one node from 1.2.10 to 2.0.0. I’m
 seeing this error on the machine I tried to migrate earlier to 2.0.1

 ** **

 Thanks

 ** **

 *From:* Christopher Wirt [mailto:chris.w...@struq.com]
 *Sent:* 25 September 2013 14:04
 *To:* 'user@cassandra.apache.org'
 *Subject:* RE: 1.2.10 - 2.0.1 migration issue

 ** **

 Hi Marcus,

 ** **

 I’ve seen your patch. This works with what I’m seeing. The first data
 directory only contained the JSON manifest at that time.

 ** **

 As a workaround I’ve made sure that each of the snapshot directories now
 exist before starting up.

 ** **

 I still end up with the second exception I posted regarding a duplicate
 hard link. Possibly two unrelated exceptions.

 ** **

 After getting this error. Looking at the datadirs

 Data1 contains 

 JSON manifests

 Loads of data files

 Snapshot directory

 Data2 contains

 Just the snapshot directory

 Data3 contains

 Just the snapshot directory

 ** **

 INFO 12:56:22,766 Migrating manifest for struqrealtime/impressionstorev2**
 **

 INFO 12:56:22,767 Snapshotting struqrealtime, impressionstorev2 to
 pre-sstablemetamigration

 ERROR 12:56:22,787 Exception encountered during startup

 java.lang.RuntimeException: Tried to create duplicate hard link to
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)**
 **

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:138)
 

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
 

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:247)
 

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:443)
 

 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:486)
 

 java.lang.RuntimeException: Tried to create duplicate hard link to
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 

 at
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)**
 **

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(LegacyLeveledManifest.java:138)
 

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:91)
 

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:247)
 

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:443)
 

 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:486)
 

 Exception encountered during startup: Tried to create duplicate hard link
 to
 /disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablemetamigration/impressionstorev2.json
 

 ** **

 Thanks,

 ** **

 Chris

 ** **

 ** **

 *From:* Marcus Eriksson [mailto:krum...@gmail.com]
 *Sent:* 25 September 2013 13:11

 *To:* user@cassandra.apache.org
 *Subject:* Re: 1.2.10 - 2.0.1 migration issue

 ** **

 you are probably reading trunk NEWS.txt

 ** **

 read the ticket for explanation of what the issue was (it is a proper bug)
 

 ** **

 On Wed, Sep 25, 2013 at 12:59 PM, Christopher Wirt chris.w...@struq.com
 wrote:

 Hi Marcus,

 Thanks for having a look at this.

  

 Just noticed this in the NEWS.txt 

  

 For *leveled* compaction users, 2.0 must be *atleast* started before

  upgrading to 2.1 due to the fact that the old JSON *leveled*

  manifest is migrated into the *sstable* *metadata* files on startup**
 **

  in 2.0 and this code is gone from 2.1.

  

 Basically, my fault for skimming over this too quickly. 

  

 We will move from 1.2.10 - 2.0 - 2.1

  

 Thanks,

 Chris

  

  

 *From:* Marcus Eriksson [mailto:krum...@gmail.com]
 *Sent:* 25 September 2013 09:37
 *To:* user@cassandra.apache.org
 *Subject:* Re: 1.2.10 - 2.0.1 migration issue

  

 cant really reproduce, could you update the ticket with a bit more info
 about your setup?

  

 do you have multiple .json files in your data dirs?

  

 On Wed, Sep 25, 2013 at 10:07 AM, Marcus Eriksson krum...@gmail.com
 wrote:

 this is most likely a bug, filed
 https://issues.apache.org/jira/browse/CASSANDRA-6093 and will try to have
 a look today.

  

 On Wed, Sep 25, 2013 at 1:48 AM, Christopher Wirt chris.w...@struq.com

Re: manually removing sstable

2013-07-10 Thread Marcus Eriksson
yep that works, you need to remove all components of the sstable though,
not just -Data.db

and, in 2.0 there is this:
https://issues.apache.org/jira/browse/CASSANDRA-5228
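
i.e. stop the node, then remove every component that shares the generation
number, not only the -Data.db file - roughly like this (keyspace, table and
generation number are made up; check the actual file names first):

  $ cd /var/lib/cassandra/data/myks/mycf
  $ ls ./*-jb-1234-*        # eyeball the component list first
  $ rm ./*-jb-1234-*        # Data, Index, Filter, Statistics, TOC, ...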

/Marcus


On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg t...@iconara.net wrote:

 Hi,

 I think I remember reading that if you have sstables that you know contain
 only data whose ttl has expired, it's safe to remove them manually by
 stopping c*, removing the *-Data.db files and then starting up c* again. Is
 this correct?

 we have a cluster where everything is written with a ttl, and sometimes c*
 needs to compact over 100 gb of sstables where we know everything has expired,
 and we'd rather just manually get rid of those.

 T#



Re: old data / tombstones are not deleted after ttl

2013-03-05 Thread Marcus Eriksson
you could consider enabling leveled compaction:
http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra


On Tue, Mar 5, 2013 at 9:46 AM, Matthias Zeilinger 
matthias.zeilin...@bwinparty.com wrote:

 Short question afterwards:

 I have read in the documentation that after a major compaction, minor
 compactions are no longer automatically triggered.
 Does this mean that I have to do the nodetool compact regularly? Or is
 there a way to get back to the automatic minor compactions?

 Thx,

 Br,
 Matthias Zeilinger
 Production Operation – Shared Services

 P: +43 (0) 50 858-31185
 M: +43 (0) 664 85-34459
 E: matthias.zeilin...@bwinparty.com

 bwin.party services (Austria) GmbH
 Marxergasse 1B
 A-1030 Vienna

 www.bwinparty.com


 -Original Message-
 From: Matthias Zeilinger [mailto:matthias.zeilin...@bwinparty.com]
 Sent: Dienstag, 05. März 2013 08:03
 To: user@cassandra.apache.org
 Subject: RE: old data / tombstones are not deleted after ttl

 Yes it was a major compaction.
 I know it´s not a great solution, but I needed something to get rid of the
 old data, because I went out of diskspace.

 Br,
 Matthias Zeilinger
 Production Operation – Shared Services

 P: +43 (0) 50 858-31185
 M: +43 (0) 664 85-34459
 E: matthias.zeilin...@bwinparty.com

 bwin.party services (Austria) GmbH
 Marxergasse 1B
 A-1030 Vienna

 www.bwinparty.com


 -Original Message-
 From: Michal Michalski [mailto:mich...@opera.com]
 Sent: Dienstag, 05. März 2013 07:47
 To: user@cassandra.apache.org
 Subject: Re: old data / tombstones are not deleted after ttl

 Was it a major compaction? I ask because it's definitely a solution that
 had to work, but it's also a solution that - in general - probably no-one
 here would suggest you to use.

 M.

 W dniu 05.03.2013 07:08, Matthias Zeilinger pisze:
  Hi,
 
  I have done a manually compaction over the nodetool and this worked.
  But thx for the explanation, why it wasn´t compacted
 
  Br,
  Matthias Zeilinger
  Production Operation – Shared Services
 
  P: +43 (0) 50 858-31185
  M: +43 (0) 664 85-34459
  E: matthias.zeilin...@bwinparty.com
 
  bwin.party services (Austria) GmbH
  Marxergasse 1B
  A-1030 Vienna
 
  www.bwinparty.com
 
  From: Bryan Talbot [mailto:btal...@aeriagames.com]
  Sent: Montag, 04. März 2013 23:36
  To: user@cassandra.apache.org
  Subject: Re: old data / tombstones are not deleted after ttl
 
  Those older files won't be included in a compaction until there are
 min_compaction_threshold (4) files of that size.  When you get another SS
 table -Data.db file that is about 12-18GB then you'll have 4 and they will
 be compacted together into one new file.  At that time, if there are any
 rows with only tombstones that are all older than gc_grace the row will be
 removed (assuming the row exists exclusively in the 4 input SS tables).
  Columns with data that is more than TTL seconds old will be written with a
 tombstone.  If the row does have column values in SS tables that are not
 being compacted, the row will not be removed.
 
 
  -Bryan
 
  On Sun, Mar 3, 2013 at 11:07 PM, Matthias Zeilinger 
 matthias.zeilin...@bwinparty.commailto:matthias.zeilin...@bwinparty.com
 wrote:
  Hi,
 
  I´m running Cassandra 1.1.5 and have following issue.
 
  I´m using a 10 days TTL on my CF. I can see a lot of tombstones in
 there, but they aren´t deleted after compaction.
 
  I have tried a nodetool –cleanup and also a restart of Cassandra, but
 nothing happened.
 
  total 61G
  drwxr-xr-x  2 cassandra dba  20K Mar  4 06:35 .
  drwxr-xr-x 10 cassandra dba 4.0K Dec 10 13:05 ..
  -rw-r--r--  1 cassandra dba  15M Dec 15 22:04
  whatever-he-1398-CompressionInfo.db
  -rw-r--r--  1 cassandra dba  19G Dec 15 22:04 whatever-he-1398-Data.db
  -rw-r--r--  1 cassandra dba  15M Dec 15 22:04
  whatever-he-1398-Filter.db
  -rw-r--r--  1 cassandra dba 357M Dec 15 22:04
  whatever-he-1398-Index.db
  -rw-r--r--  1 cassandra dba 4.3K Dec 15 22:04
  whatever-he-1398-Statistics.db
  -rw-r--r--  1 cassandra dba 9.5M Feb  6 15:45
  whatever-he-5464-CompressionInfo.db
  -rw-r--r--  1 cassandra dba  12G Feb  6 15:45 whatever-he-5464-Data.db
  -rw-r--r--  1 cassandra dba  48M Feb  6 15:45
  whatever-he-5464-Filter.db
  -rw-r--r--  1 cassandra dba 736M Feb  6 15:45
  whatever-he-5464-Index.db
  -rw-r--r--  1 cassandra dba 4.3K Feb  6 15:45
  whatever-he-5464-Statistics.db
  -rw-r--r--  1 cassandra dba 9.7M Feb 21 19:13
  whatever-he-6829-CompressionInfo.db
  -rw-r--r--  1 cassandra dba  12G Feb 21 19:13 whatever-he-6829-Data.db
  -rw-r--r--  1 cassandra dba  47M Feb 21 19:13
  whatever-he-6829-Filter.db
  -rw-r--r--  1 cassandra dba 792M Feb 21 19:13
  whatever-he-6829-Index.db
  -rw-r--r--  1 cassandra dba 4.3K Feb 21 19:13
  whatever-he-6829-Statistics.db
  -rw-r--r--  1 cassandra dba 3.7M Mar  1 10:46
  whatever-he-7578-CompressionInfo.db
  -rw-r--r--  1 cassandra dba 4.3G Mar  1 10:46 whatever-he-7578-Data.db
  -rw-r--r--  1 cassandra dba  12M Mar  1 10:46
  

Re: how stable is 1.0 these days?

2012-03-02 Thread Marcus Eriksson
beware of https://issues.apache.org/jira/browse/CASSANDRA-3820 though if
you have many keys per node

other than that, yep, it seems solid

/Marcus

On Wed, Feb 29, 2012 at 6:20 PM, Thibaut Britz 
thibaut.br...@trendiction.com wrote:

 Thanks!

 We will test it on our test cluster in the coming weeks and hopefully put
 it into production on our 200 node main cluster. :)

 Thibaut

 On Wed, Feb 29, 2012 at 5:52 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 On Wed, Feb 29, 2012 at 10:35 AM, Thibaut Britz
 thibaut.br...@trendiction.com wrote:
  Any more feedback on larger deployments of 1.0.*?
 
  We are eager to try out the new features in production, but don't want
 to
  run into bugs as on former 0.7 and 0.8 versions.
 
  Thanks,
  Thibaut
 
 
 
  On Tue, Jan 31, 2012 at 6:59 AM, Ben Coverston 
 ben.covers...@datastax.com
  wrote:
 
  I'm not sure what Carlo is referring to, but generally if you have done
  thousands of migrations you can end up in a situation where the
 migrations
  take a long time to replay, and there are some race conditions that
 can be
  problematic in the case where there are thousands of migrations that
 may
  need to be replayed while a node is bootstrapped. If you get into this
  situation it can be fixed by copying migrations from a known good
 schema to
  the node that you are trying to bootstrap.
 
  Generally I would advise against frequent schema updates. Unlike rows
 in
  column families the schema itself is designed to be relatively static.
 
  On Mon, Jan 30, 2012 at 2:14 PM, Jim Newsham jnews...@referentia.com
  wrote:
 
 
  Could you also elaborate for creating/dropping column families?  We're
  currently working on moving to 1.0 and using dynamically created
 tables, so
  I'm very interested in what issues we might encounter.
 
  So far the only thing I've encountered (with 1.0.7 + hector 1.0-2) is
  that dropping a cf may sometimes fail with UnavailableException.  I
 think
  this happens when the cf is busy being compacted.  When I sleep/retry
 within
  a loop it eventually succeeds.
 
  Thanks,
  Jim
 
 
  On 1/26/2012 7:32 AM, Pierre-Yves Ritschard wrote:
 
  Can you elaborate on the composite types instabilities ? is this
  specific to hector as the radim's posts suggests ?
  These one liner answers are quite stressful :)
 
  On Thu, Jan 26, 2012 at 1:28 PM, Carlo Pirescarlopi...@gmail.com
   wrote:
 
  If you need to use composite types and create/drop column families
 on
  the
  fly you must be prepared to instabilities.
 
 
 
 
 
  --
  Ben Coverston
  DataStax -- The Apache Cassandra Company
 
 

 I would call 1.0.7 rock fricken solid. Incredibly stable. It has been
 that way since I updated to 0.8.8  really. TBs of data, billions of
 requests a day, and thanks to JAMM, memtable type auto-tuning, and
 other enhancements I rarely, if ever, find a node in a state where it
 requires a restart. My clusters are beast-ing.

 There are always bugs in software, but coming from a guy who ran
 cassandra 0.6.1, administration on my Cassandra cluster is like a
 vacation now.