Re: cassandra 3.10

2017-05-12 Thread Michael Shuler
On 05/12/2017 01:11 PM, Gopal, Dhruva wrote:
> Since we’re trying to qualify this for production, and 3.11 isn’t
> officially released yet (is it?), we’re planning on using 3.10. The
> concern stems from the build failing with byteman. We’re novices at
> building our own rpms for Cassandra and are concerned that there may
> be other issues with the 3.10 release, so maybe we should hold off
> till 3.11 is released and ready. Any opinions/feedback will help.

Here's the current changelog from 3.10 -> 3.11.0
https://github.com/apache/cassandra/blob/cassandra-3.11/CHANGES.txt#L1-L115

That's a lot of fixed bugs since 3.10, so use that information as you
see fit.

The 3.11.0 release is currently down to making test fixes and/or
tracking down the last few remaining bugs, where test failures turn out
to be due to actual bugs.

The 3.11.X release series will be a long-term maintenance branch, so
fixes will be ongoing in this branch and new releases of 3.11.X will be
made when needed.

If you are evaluating a release for production, I would suggest that
evaluating 3.11.0 now would be a better choice; your byteman issue is
already fixed in-tree there, so there is no hacking around and doing your
own thing. 3.10 won't ever get updated, so if you need fixes for issues
you might find there, the path would be to upgrade to 3.11.X anyway; it
doesn't make sense to me to start on 3.10.

If you have additional RPM package fixes for the cassandra-3.11 branch
and want to get them in for the 3.11.0 release, please include those on:
https://issues.apache.org/jira/browse/CASSANDRA-13433

Want to build RPMs easily and directly from git branches? This is where
the project is working on build infrastructure to streamline releases,
and it can also be used to build packages on your own. See the README on:
https://github.com/apache/cassandra-builds

-- 
Kind regards,
Michael Shuler

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: cassandra 3.10

2017-05-12 Thread Gopal, Dhruva
Hi Anthony –
  The link you shared below is where we initially started. That build fails 
(there is an issue with byteman as indicated by this Jira: 
https://issues.apache.org/jira/browse/CASSANDRA-13316). The source tarball 
already exists (released version), so we decided to skip rebuilding the 
artifacts and just pulled the release tarball from: 
http://apache.mirrors.tds.net/cassandra/3.10/apache-cassandra-3.10-src.tar.gz 
and created an rpm build with that. We did use the original spec and made a few 
minor tweaks to it. Basically, the tweaks were:

-  Apply the patches when doing the build (one was for byteman, the 
other was a change to the Cassandra unit file to make it unkillable by 
oom-killer).

-  Post install/erase actions (systemctl commands – such as reloading 
the daemon, autostarting the service etc).
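
For reference, the unit-file tweak mentioned in the first item can also be done 
as a systemd drop-in instead of patching the unit itself. This is only a sketch 
and assumes a unit named cassandra.service; adjust names and paths to your 
packaging:

    # hypothetical drop-in that lowers Cassandra's OOM score so oom-killer avoids it
    mkdir -p /etc/systemd/system/cassandra.service.d
    printf '[Service]\nOOMScoreAdjust=-1000\n' \
        > /etc/systemd/system/cassandra.service.d/oom.conf
    systemctl daemon-reload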

That’s pretty much it; everything else should be the same as the original spec. 
I’m attaching it for convenience. Since we’re trying to qualify this for 
production, and 3.11 isn’t officially released yet (is it?), we’re planning 
on using 3.10. The concern stems from the build failing with byteman. We’re 
novices at building our own rpms for Cassandra and are concerned that there may 
be other issues with the 3.10 release, so maybe we should hold off till 3.11 is 
released and ready. Any opinions/feedback will help.

Regards,
Dhruva


From: Anthony Grasso 
Date: Thursday, May 11, 2017 at 5:06 PM
To: "Gopal, Dhruva" 
Cc: "user@cassandra.apache.org" 
Subject: Re: cassandra 3.10

Hi Dhruva,

There are definitely some performance improvements to the storage engine in 
Cassandra 3.10 which make it worth the upgrade. Note that Cassandra 3.11 has 
further bug fixes, and it may be worth considering a migration to that version.

Regarding the issue of building a Cassandra 3.10 RPM, it sounds like the team 
has built their own custom spec file? Has the team looked at using the project 
spec file and associated instructions in the Apache Cassandra GitHub mirror?

https://github.com/apache/cassandra/tree/cassandra-3.10/redhat
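
For what it's worth, building from that spec with rpmbuild is roughly along 
these lines; this is only a sketch, the file and path names are illustrative, 
and the README alongside the spec is authoritative (the tarball name may need 
to match the Source0 line in the spec):

    # assumes the 3.10 source tarball and the project's redhat/cassandra.spec
    mkdir -p ~/rpmbuild/SOURCES ~/rpmbuild/SPECS
    cp apache-cassandra-3.10-src.tar.gz ~/rpmbuild/SOURCES/
    cp cassandra/redhat/cassandra.spec ~/rpmbuild/SPECS/
    rpmbuild -ba ~/rpmbuild/SPECS/cassandra.spec
    ls ~/rpmbuild/RPMS/            # built packages land here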

Kind regards,
Anthony


On 11 May 2017 at 14:20, Gopal, Dhruva wrote:
Hi –
  We’re currently on 3.9 and have been told that Cassandra 3.10 is a more 
stable version to be on. We’ve been using the datastax-ddc rpms in our 
production and dev environments (on 3.9) and it appears there is no 3.10 rpm 
version out yet. We tried to build our own rpm (our devops processes use rpms, 
so changing to using tarballs is not easily done) and found that the build 
process fails (due to the byteman-3.0.3 jar), which we managed to patch and 
get working (with rpmbuild). My concerns/questions are these:

-  Is the 3.10 version actually stable enough given that the build 
failed (we obtained the source from this location: 
http://apache.mirrors.tds.net/cassandra/3.10/apache-cassandra-3.10-src.tar.gz 
and used the attached patch file for byteman during the build process)?

-  Are there any other issues with the binaries that we need to be 
aware of (other patches)?

I’m concerned that there may be other issues that we really won’t know about 
since we’re not Cassandra experts, so I’m looking for feedback from this group 
on whether we should just stay with 3.9 or if it’s safe to proceed with this 
approach. I can share the spec file and patch files that we’ve set up for the 
build process, if desired.


Regards,
DHRUVA GOPAL
sr. MANAGER, ENGINEERING
REPORTING, ANALYTICS AND BIG DATA
+1 408.325.2011 WORK
+1 408.219.1094 MOBILE
UNITED STATES
dhruva.go...@aspect.com
aspect.com

This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute this email or its 
attachments.





cassandra-3.10-build.patch

Re: LCS, range tombstones, and eviction

2017-05-12 Thread Blake Eggleston
The start and end points of a range tombstone are basically stored as special 
purpose rows alongside the normal data in an sstable. As part of a read, 
they're reconciled with the data from the other sstables into a single 
partition, just like the other rows. The only difference is that they don't 
contain any 'real' data, and, of course, they prevent 'deleted' data from being 
returned in the read. It's a bit more complicated than that, but that's the 
general idea.
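
To make that concrete, here's a hypothetical example (keyspace, table and 
column names are made up; it assumes 'day' is a clustering column) of a 
statement that produces a single range tombstone covering many rows, rather 
than one tombstone per deleted row:

    # one range delete -> one range tombstone marking the two clustering bounds
    cqlsh -e "DELETE FROM my_ks.events
              WHERE sensor_id = 'abc'
              AND day >= '2017-01-01' AND day <= '2017-03-31';"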


On May 12, 2017 at 6:23:01 AM, Stefano Ortolani (ostef...@gmail.com) wrote:

Thanks a lot Blake, that definitely helps!

I actually found a ticket re range tombstones and how they are accounted for: 
https://issues.apache.org/jira/browse/CASSANDRA-8527

I am wondering now what happens when a node receives a read request. Are the 
range tombstones read before scanning the SStables? More interestingly, given 
that a single partition might be split across different levels, and that some 
range tombstones might be in L0 while all the rest of the data in L1, are all 
the tombstones prefetched from _all_ the involved SStables before doing any 
table scan?

Regards,
Stefano

On Thu, May 11, 2017 at 7:58 PM, Blake Eggleston  wrote:
Hi Stefano,

Based on what I understood reading the docs, if the ratio of garbage 
collectable tombstones exceeds the "tombstone_threshold", C* should start 
compacting and evicting.

If there are no other normal compaction tasks to be run, LCS will attempt to 
compact the sstables it estimates it will be able to drop the most tombstones 
from. It does this by estimating the number of tombstones an sstable has that 
have passed the gc grace period. Whether or not a tombstone will actually be 
evicted is more complicated. Even if a tombstone has passed gc grace, it can't 
be dropped if the data it's deleting still exists in another sstable, otherwise 
the data would appear to return. So, a tombstone won't be dropped if there is 
data for the same partition in other sstables that is older than the tombstone 
being evaluated for eviction.

I am quite puzzled however by what might happen when dealing with range 
tombstones. In that case a single tombstone might actually stand for an 
arbitrary number of normal tombstones. In other words, do range tombstones 
contribute to the "tombstone_threshold"? If so, how?

From what I can tell, each end of the range tombstone is counted as a single 
tombstone. So a range tombstone effectively contributes '2' to the count of 
tombstones for an sstable. I'm not 100% sure, but I haven't seen any sstable 
writing logic that tracks open tombstones and counts covered cells as 
tombstones. So, it's likely that the effect of range tombstones covering many 
rows is underrepresented in the droppable tombstone estimate.

I am also a bit confused by the "tombstone_compaction_interval". If I am 
dealing with a big partition in LCS which is receiving new records every day, 
and a weekly incremental repair job continuously anticompacting the data and 
thus creating SStables, what is the likelihood of the default interval 
(10 days) actually being hit?

It will be hit, but probably only in the repaired data. Once the data is marked 
repaired, it shouldn't be anticompacted again, and should get old enough to 
pass the compaction interval. That shouldn't be an issue though, because you 
should be running repair often enough that data is repaired before it can ever 
get past the gc grace period. Otherwise you'll have other problems. Also, keep 
in mind that tombstone eviction is a part of all compactions, it's just that 
occasionally a compaction is run specifically for that purpose. Finally, you 
probably shouldn't run incremental repair on data that is deleted. There is a 
design flaw in the incremental repair used in pre-4.0 versions of Cassandra that can 
cause consistency issues. It can also cause a *lot* of over streaming, so you 
might want to take a look at how much streaming your cluster is doing with full 
repairs, and incremental repairs. It might actually be more efficient to run 
full repairs.
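
For reference, both of the knobs discussed above are per-table compaction 
subproperties, so they can be adjusted with an ALTER TABLE. This is only a 
sketch, with made-up keyspace/table names and values that are examples rather 
than recommendations; note the ALTER replaces the whole compaction map, so the 
class has to be repeated:

    cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {
        'class': 'LeveledCompactionStrategy',
        'tombstone_threshold': '0.2',
        'tombstone_compaction_interval': '86400'
    };"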

Hope that helps,

Blake

On May 11, 2017 at 7:16:26 AM, Stefano Ortolani (ostef...@gmail.com) wrote:

Hi all,

I am trying to wrap my head around how C* evicts tombstones when using LCS.
Based on what I understood reading the docs, if the ratio of garbage 
collectable tombstones exceeds the "tombstone_threshold", C* should start 
compacting and evicting.

I am quite puzzled however by what might happen when dealing with range 
tombstones. In that case a single tombstone might actually stand for an 
arbitrary number of normal tombstones. In other words, do range tombstones 
contribute to the "tombstone_threshold"? If so, how?

I am also a bit confused by the "tombstone_compaction_interval". If I am 
dealing with a big partition in LCS which is receiving new records every day, 
and a weekly incremental repair job continuously anticompacting the data and 
thus creating SStables, what 

Re: AWS Cassandra backup/Restore tools

2017-05-12 Thread Alexander Dejanovski
Hi,

here are the main techniques that I know of to perform backups for
Cassandra :

   - Tablesnap (https://github.com/JeremyGrosser/tablesnap) : performs
   continuous backups on S3. Comes with tableslurp to restore backups (one
   table at a time only) and tablechop to delete outdated sstables from S3.
   - incremental backup : activate it in the cassandra.yaml file and it
   will hard-link all newly flushed SSTables into each table's backups
   directory. It's up to you to move those files off-node and delete them.
   I don't really like that
   technique since it creates a lot of small sstables that eventually contain
   a lot of outdated data. Upon restore you'll have to wait until compaction
   catches up on compacting all the history (which could take a while and use
   a lot of power). Your backups could also grow indefinitely with this
   technique since there's no compaction, so no purge. You'll have to build
   the restore script/procedure.
   - scheduled snapshots : you perform full snapshots by yourself and move
   them off node. You'll have to build the restore script/procedure.
   - EBS snapshots : probably the easiest way to perform backups if you are
   using M4/R4 instances on AWS.
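
To illustrate the incremental-backup and scheduled-snapshot options above, a 
quick sketch of the pieces involved (keyspace name and snapshot tag are made up):

    # incremental backups: set this in cassandra.yaml; newly flushed SSTables
    # are then hard-linked into each table's backups/ directory
    #   incremental_backups: true
    #
    # scheduled full snapshot of one keyspace, then cleanup once shipped off-node
    nodetool snapshot -t nightly_20170512 my_keyspace
    nodetool clearsnapshot -t nightly_20170512 my_keyspace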


Cheers,

On Thu, May 11, 2017 at 11:01 PM Manikandan Srinivasan <
msriniva...@datastax.com> wrote:

> Blake is correct. OpsCenter 6.0 and up doesn't work with OSS C*. @Nitan:
> We have made some substantial changes to the Opscenter 6.1 backup service,
> specifically when it comes to S3 backups. Having said this, I am not going
> to be sale-sy here. If folks need some help or need more clarity to know
> more about these improvements, please send me an email directly:
> msriniva...@datastax.com
>
> Regards
> Mani
>
> On Thu, May 11, 2017 at 1:54 PM, Nitan Kainth  wrote:
>
>> Also, Opscenter backup/restore does not work for large databases
>>
>> Sent from my iPhone
>>
>> On May 11, 2017, at 3:41 PM, Blake Eggleston 
>> wrote:
>>
>> OpsCenter 6.0 and up don't work with Cassandra.
>>
>> On May 11, 2017 at 12:31:08 PM, cass savy (casss...@gmail.com) wrote:
>>
>> AWS Backup/Restore process/tools for C*/DSE C*:
>>
>> Has anyone used Opscenter 6.1 backup tool to backup/restore data for
>> larger datasets online ?
>>
>> If yes, did you run into issues using that tool to backup/restore data in
>> PROD that caused any performance or any other impact to the cluster?
>>
>> If no, what are other tools that people have used or recommended for
>> backup and restore of Cassandra keyspaces?
>>
>> Please advise.
>>
>>
>>
>
>
> --
> Regards,
>
> Manikandan Srinivasan
>
> Director, Product Management| +1.408.887.3686 |
> manikandan.sriniva...@datastax.com
>
>
> --
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: repair question (-dc option)

2017-05-12 Thread Gopal, Dhruva
We’re making sure all the nodes are up when we run it. I don’t believe we are 
using LOCAL_XXX, and the repair was planned to run only on the local DC 
since that was where the node was down. Do we need to run a full cluster repair?
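
For reference, a sketch of the command form being discussed, run on the node 
that was down once it is back online (the datacenter name is illustrative):

    # repair restricted to one datacenter; on 2.2+/3.x add -full for a
    # full (non-incremental) repair
    nodetool repair -dc DC1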

From: Varun Gupta 
Date: Thursday, May 11, 2017 at 1:33 PM
To: "Gopal, Dhruva" 
Cc: "user@cassandra.apache.org" 
Subject: Re: repair question (-dc option)


If there was no node down during that period, and you are using LOCAL_QUORUM 
read/write, then yes above command works.

On Thu, May 11, 2017 at 11:59 AM, Gopal, Dhruva wrote:
Hi –
  I have a question on running a repair after bringing up a node that was down 
(brought down gracefully) for a few days within a data center. Can we just run 
nodetool repair -dc <dc> on a single node (within that DC, specifically the 
downed node, after it is brought online) and have that entire DC repaired?

Regards,
Dhruva

This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute this email or its 
attachments.



Re: LCS, range tombstones, and eviction

2017-05-12 Thread Stefano Ortolani
Thanks a lot Blake, that definitely helps!

I actually found a ticket re range tombstones and how they are accounted
for: https://issues.apache.org/jira/browse/CASSANDRA-8527

I am wondering now what happens when a node receives a read request. Are
the range tombstones read before scanning the SStables? More interestingly,
given that a single partition might be split across different levels, and
that some range tombstones might be in L0 while all the rest of the data in
L1, are all the tombstones prefetched from _all_ the involved SStables
before doing any table scan?

Regards,
Stefano

On Thu, May 11, 2017 at 7:58 PM, Blake Eggleston 
wrote:

> Hi Stefano,
>
> Based on what I understood reading the docs, if the ratio of garbage
> collectable tombstones exceeds the "tombstone_threshold", C* should start
> compacting and evicting.
>
>
> If there are no other normal compaction tasks to be run, LCS will attempt
> to compact the sstables it estimates it will be able to drop the most
> tombstones from. It does this by estimating the number of tombstones an
> sstable has that have passed the gc grace period. Whether or not a
> tombstone will actually be evicted is more complicated. Even if a tombstone
> has passed gc grace, it can't be dropped if the data it's deleting still
> exists in another sstable, otherwise the data would appear to return. So, a
> tombstone won't be dropped if there is data for the same partition in other
> sstables that is older than the tombstone being evaluated for eviction.
>
> I am quite puzzled however by what might happen when dealing with range
> tombstones. In that case a single tombstone might actually stand for an
> arbitrary number of normal tombstones. In other words, do range tombstones
> contribute to the "tombstone_threshold"? If so, how?
>
>
> From what I can tell, each end of the range tombstone is counted as a
> single tombstone. So a range tombstone effectively contributes '2' to the
> count of tombstones for an sstable. I'm not 100% sure, but I haven't seen
> any sstable writing logic that tracks open tombstones and counts covered
> cells as tombstones. So, it's likely that the effect of range tombstones
> covering many rows is underrepresented in the droppable tombstone estimate.
>
> I am also a bit confused by the "tombstone_compaction_interval". If I am
> dealing with a big partition in LCS which is receiving new records every
> day,
> and a weekly incremental repair job continuously anticompacting the data
> and
> thus creating SStables, what is the likelihood of the default interval
> (10 days) actually being hit?
>
>
> It will be hit, but probably only in the repaired data. Once the data is
> marked repaired, it shouldn't be anticompacted again, and should get old
> enough to pass the compaction interval. That shouldn't be an issue though,
> because you should be running repair often enough that data is repaired
> before it can ever get past the gc grace period. Otherwise you'll have
> other problems. Also, keep in mind that tombstone eviction is a part of all
> compactions, it's just that occasionally a compaction is run specifically
> for that purpose. Finally, you probably shouldn't run incremental repair on
> data that is deleted. There is a design flaw in the incremental repair used
> in pre-4.0 versions of Cassandra that can cause consistency issues. It can also
> cause a *lot* of over streaming, so you might want to take a look at how
> much streaming your cluster is doing with full repairs, and incremental
> repairs. It might actually be more efficient to run full repairs.
>
> Hope that helps,
>
> Blake
>
> On May 11, 2017 at 7:16:26 AM, Stefano Ortolani (ostef...@gmail.com)
> wrote:
>
> Hi all,
>
> I am trying to wrap my head around how C* evicts tombstones when using LCS.
> Based on what I understood reading the docs, if the ratio of garbage
> collectable tombstones exceeds the "tombstone_threshold", C* should start
> compacting and evicting.
>
> I am quite puzzled however by what might happen when dealing with range
> tombstones. In that case a single tombstone might actually stand for an
> arbitrary number of normal tombstones. In other words, do range tombstones
> contribute to the "tombstone_threshold"? If so, how?
>
> I am also a bit confused by the "tombstone_compaction_interval". If I am
> dealing with a big partition in LCS which is receiving new records every
> day,
> and a weekly incremental repair job continuously anticompacting the data
> and
> thus creating SStables, what is the likelihood of the default interval
> (10 days) actually being hit?
>
> Hopefully somebody will be able to shed some lights here!
>
> Thanks in advance!
> Stefano
>
>


Moving SSTable to fix JBOD imbalance

2017-05-12 Thread Axel Colin de Verdiere
Hello !

I'm experiencing a data imbalance issue with one of my nodes within a
3-node C* 2.1.4 cluster. All of them are using JBOD (2 physical disks),
and this particular node seems to have recently done a relatively big
compaction (I'm using STCS), creating a 56 GB SSTable file, which results
in one of the disks being 94% used and the other only 34%. I've looked
around for similar issues, and this was supposed to be fixed in 2.1.3
(CASSANDRA-7386). The DSE docs suggest stopping the node and moving some
SSTables around between the disks to force a better balance, while trying
to make as few moves as possible. Can I just stop the node, move the
56 GB SSTable (so I guess the Summary, TOC, Digest, Statistics,
CompressionInfo, Data, Index and Filter files) and restart the node?
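
For what it's worth, a sketch of the kind of move being described, with the 
node stopped first (paths, keyspace/table names and the generation number are 
all illustrative):

    sudo service cassandra stop
    # move every component of that sstable generation (Data, Index, Filter,
    # Summary, Statistics, CompressionInfo, TOC, Digest) to the same
    # keyspace/table directory on the other disk
    mv /disk1/cassandra/data/my_ks/my_table-<table_id>/my_ks-my_table-ka-1234-* \
       /disk2/cassandra/data/my_ks/my_table-<table_id>/
    sudo service cassandra start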

Thanks a lot for your help,
Best,

Axel


RE: Reg:- CQL SOLR Query Not gives result

2017-05-12 Thread Jacques-Henri Berthemet
While this is indeed a DSE matter, your problem looks related to CJK Lucene 
indexing, and in that context I think your query does not make sense.
(see CJK: https://en.wikipedia.org/wiki/CJK_characters)

If you have properly configured your indexing to handle CJK (it looks like you’re 
searching for Chinese), using wildcards does not make sense: 中 can be 
considered a word, not a letter, so partial matches using wildcards don’t 
apply. Also, the CJK analyzer indexes bi-grams, so you should search for 
pairs of characters.
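
For example (keyspace name and the two-character term are only placeholders), 
searching a character pair instead of a single wildcarded character:

    cqlsh -e "SELECT * FROM my_ks.revall_book_by_title
              WHERE solr_query = 'language:中文';"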

--
Jacques-Henri Berthemet

From: Jonathan Haddad [mailto:j...@jonhaddad.com]
Sent: vendredi 12 mai 2017 04:21
To: @Nandan@ ; user@cassandra.apache.org
Subject: Re: Reg:- CQL SOLR Query Not gives result

This is a question for datastax support, not the Apache mailing list. Folks 
here are more than happy to help with open source Apache Cassandra questions, 
if you've got one.
On Thu, May 11, 2017 at 9:06 PM @Nandan@ wrote:
Hi ,

In my table, I have a few records and have implemented SOLR for partial search, 
but I am not able to retrieve data.

SELECT * from revall_book_by_title where solr_query = 'language:中';
SELECT * from revall_book_by_title where solr_query = 'language:中*';

None of them are working.
Any suggestions.


Re: Cassandra Snapshots and directories

2017-05-12 Thread Daniel Hölbling-Inzko
Hi Varun,
yes you are right - that's the structure that gets created. But if I want
to back up ALL column families at once, this requires a quite complex rsync,
as Vladimir mentioned.
I can't just copy over the /data/keyspace directory as that contains all
the data AND all the snapshots. I really have to go through it
column family by column family, which is annoying.
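
One way to avoid going table by table is an rsync filter that only keeps the 
snapshots/ subdirectories; this is an untested sketch and the paths and host 
names are made up:

    rsync -avm --include='*/' --include='*/snapshots/***' --exclude='*' \
        /var/lib/cassandra/data/my_keyspace/ backuphost:/backups/my_keyspace/
    # -m (--prune-empty-dirs) drops table directories that end up empty
    # because everything except their snapshots/ subdirectory was excluded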

greetings Daniel

On Thu, 11 May 2017 at 22:48 Varun Gupta  wrote:

>
> I did not completely get your question about snapshot files being mixed
> with data files and backup files.
>
> When you call nodetool snapshot, it will create a directory with snapshot
> name if specified or current timestamp at
> /data///backup/. This directory will
> have all sstables, metadata files and schema.cql (if using 3.0.9 or higher).
>
>
> On Thu, May 11, 2017 at 2:37 AM, Daniel Hölbling-Inzko <
> daniel.hoelbling-in...@bitmovin.com> wrote:
>
>> Hi,
>> I am going through this guide to do backup/restore of cassandra data to a
>> new cluster:
>>
>> http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html#task_ds_cmf_11r_gk
>>
>> When creating a snapshot I get the snapshot files mixed in with the
>> normal data files and backup files, so it's all over the place and very
>> hard (especially with lots of tables per keyspace) to transfer ONLY the
>> snapshot.
>> (Mostly since there is a snapshot directory per table.)
>>
>> Am I missing something or is there some arcane shell command that filters
>> out only the snapshots?
>> Because this way it's much easier to just backup the whole data directory.
>>
>> greetings Daniel
>>
>
>