Re: Backup restore with a different name

2016-11-03 Thread kurt Greaves
On 2 November 2016 at 22:10, Jens Rantil wrote:

> I mean "exposing that state for reference while keeping the (corrupt)
> current state in the live cluster".


The following should work:


   1. Create a new table with the same schema but different name (in the
   same or a different keyspace).
   2. Rename all the snapshotted SSTables to match the *new* table name.
   3. Copy SSTables into new table directory.
   4. Run nodetool refresh for the new table, or restart Cassandra.
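
For illustration, steps 2 and 3 could be scripted roughly as below. This is
only a sketch under stated assumptions: it assumes the snapshot files use the
older naming scheme "<keyspace>-<table>-<version>-<generation>-<Component>"
(newer releases may not embed the table name at all, in which case the files
are copied unchanged), and the helper names renamed_sstable and copy_snapshot
are made up for this example. Run it against a copy first.

```python
import shutil
from pathlib import Path
from typing import List


def renamed_sstable(filename: str, keyspace: str, old_table: str, new_table: str) -> str:
    """Return the filename to use in the new table's directory.

    Assumption: older-style names look like
    "<keyspace>-<old_table>-<version>-<generation>-<Component>".
    Files without that prefix are returned unchanged.
    """
    old_prefix = f"{keyspace}-{old_table}-"
    if filename.startswith(old_prefix):
        return f"{keyspace}-{new_table}-" + filename[len(old_prefix):]
    return filename


def copy_snapshot(snapshot_dir: Path, target_dir: Path,
                  keyspace: str, old_table: str, new_table: str) -> List[str]:
    """Copy every regular file in snapshot_dir into target_dir under its new name."""
    copied = []
    for src in sorted(snapshot_dir.iterdir()):
        if not src.is_file():
            continue
        new_name = renamed_sstable(src.name, keyspace, old_table, new_table)
        shutil.copy2(src, target_dir / new_name)
        copied.append(new_name)
    return copied
```

After the copy, step 4 (nodetool refresh or a restart) is still needed for
Cassandra to pick the files up.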


Kurt Greaves
k...@instaclustr.com
www.instaclustr.com


Re: Commercial Support Providers?

2016-11-03 Thread Ben Slater
I can confirm that we do offer support contracts for OSS Apache Cassandra
at Instaclustr (in addition to our managed service). Either drop me an
email directly (signature below) or contact sa...@instaclustr.com and we
would be happy to discuss details.

Cheers
Ben

On Fri, 4 Nov 2016 at 14:02 Max C  wrote:

> Hello -
>
> We’re rolling out a small cluster at my work (2 DCs of 3 nodes each —
> hosted on-premises), and my boss has asked us to look into commercial
> support offerings.
>
> The main thing we’re looking for is a company that we can call day or
> night if/when things go “kaboom” and I can’t figure out what the problem is
> (ex: an upgrade fails unexpectedly, weird error messages in the logs,
> repairs keep failing, etc).  If they offer their own tested, supported,
> patched, version of Cassandra that would be ideal, and certainly management
> tools like OpsCenter are a bonus.
>
> The obvious choice here is DataStax, and we’re definitely talking to
> them.  Are there any other providers which offer this sort of service?
> Maybe Instaclustr?
>
> Thanks.
>
> - Max


Commercial Support Providers?

2016-11-03 Thread Max C
Hello -

We’re rolling out a small cluster at my work (2 DCs of 3 nodes each — hosted 
on-premises), and my boss has asked us to look into commercial support 
offerings.

The main thing we’re looking for is a company that we can call day or night 
if/when things go “kaboom” and I can’t figure out what the problem is (ex: an 
upgrade fails unexpectedly, weird error messages in the logs, repairs keep 
failing, etc).  If they offer their own tested, supported, patched, version of 
Cassandra that would be ideal, and certainly management tools like OpsCenter 
are a bonus.

The obvious choice here is DataStax, and we’re definitely talking to them.  Are 
there any other providers which offer this sort of service?  Maybe Instaclustr?

Thanks.

- Max

Cassandra on Cloud platforms experience

2016-11-03 Thread cass savy
I would like to hear from the community about their experiences or lessons
learnt hosting Cassandra on cloud platforms such as:

1. Google Cloud Platform
2. AWS
3. Azure

1. Which cloud hosting is better, and why?
2. How does C* differ from the vendor-provided NoSQL databases (Bigtable,
Dynamo, Azure Document DB)?
3. Is AWS more mature in its offerings, with Azure getting there or already
there? That is my impression from what I have investigated so far.

4. What drives the choice of one over another: cost, infrastructure,
hardware SKU, availability, scalability, performance, ease of deployment and
maintenance, etc.?

Please let me know your thoughts and suggestions if somebody has done a
deep dive into these 3 cloud platforms for C*.


We use DataStax Cassandra and are exploring new use cases in AWS, and we are
also evaluating it (or running a POC) in Azure/GCP.


RE: Question on Read Repair

2016-11-03 Thread Anubhav Kale
Does it work the same way for writes as well ? If “nodetool status” shows that 
a node is DN, would writes fail right away assuming enough nodes are down to 
fail QUORUM ?

From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com]
Sent: Tuesday, October 11, 2016 1:13 PM
To: user@cassandra.apache.org
Subject: Re: Question on Read Repair

Yes:

https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L286



From: Anubhav Kale
Reply-To: "user@cassandra.apache.org"
Date: Tuesday, October 11, 2016 at 11:45 AM
To: "user@cassandra.apache.org"
Subject: RE: Question on Read Repair

Thank you.

Interesting detail. Does it work the same way for other consistency levels as 
well ?

From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com]
Sent: Tuesday, October 11, 2016 10:29 AM
To: user@cassandra.apache.org
Subject: Re: Question on Read Repair

If the failure detector knows that the node is down, the read won't be 
attempted at all, because the consistency level can't be satisfied, so none 
of the other replicas will be repaired.
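
To make the check concrete, here is a simplified model of it. This is a
sketch, not the actual Cassandra code: the function names are made up for
this example, only the single-DC levels ONE, QUORUM and ALL are covered, and
"live_replicas" stands for whatever the failure detector currently believes.

```python
class UnavailableError(Exception):
    """Raised when too few replicas are believed alive for the request."""


def required_replicas(consistency_level: str, replication_factor: int) -> int:
    """Number of replica responses needed at a (single-DC) consistency level."""
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def assure_sufficient_live_nodes(consistency_level: str,
                                 replication_factor: int,
                                 live_replicas: int) -> int:
    """Fail fast, before contacting any replica, if the level cannot be met."""
    needed = required_replicas(consistency_level, replication_factor)
    if live_replicas < needed:
        raise UnavailableError(
            f"{consistency_level} needs {needed} replicas, only {live_replicas} alive"
        )
    return needed
```

With RF=3 and one replica marked down, CL=ALL fails this check immediately,
while QUORUM (2 of 3) still passes.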


From: Anubhav Kale
Reply-To: "user@cassandra.apache.org"
Date: Tuesday, October 11, 2016 at 10:24 AM
To: "user@cassandra.apache.org"
Subject: Question on Read Repair

Hello,

This is more of a theory / concept question. I set CL=ALL and do a read. Say 
one replica was down, will the rest of the replicas get repaired as part of 
this ? (I am hoping the answer is yes).

Thanks !

CONFIDENTIALITY NOTE: This e-mail and any attachments are confidential and may 
be legally privileged. If you are not the intended recipient, do not disclose, 
copy, distribute, or use this email or any attachments. If you have received 
this in error please let the sender know and then delete the email and all 
attachments.



Re: Improving cassandra documentation

2016-11-03 Thread SmartCat - Scott Hirleman
Totally agree, I just saw DataStax + docs so I figured it was about DSE,
not OSS C* *shrug*.

On Thu, Nov 3, 2016 at 2:24 PM, Justin Cameron 
wrote:

> Maybe a little off-tangent, but there is also a set of open source
> documentation now available on the Apache Cassandra website:
> http://cassandra.apache.org/doc/latest/
>
> You can contribute to them directly via git
>
> On Thu, 3 Nov 2016 at 12:11 SmartCat - Scott Hirleman 
> wrote:
>
>> http://docs.datastax.com/en/landing_page/doc/landing_page/contact.html
>> Looks like it is still just email d...@datastax.com
>>
>> On Thu, Nov 3, 2016 at 9:34 AM, Oleg Krayushkin 
>> wrote:
>>
>> Hi, from time to time I find errors in datastax cassandra docs. Is there
>> a right & easy way to report them?
>>
>> Thanks.
>>
>> --
>>
>> Oleg Krayushkin
>>
>>
>>
>>
>> --
>> *Scott Hirleman*
>> *Head of US Marketing and Sales*
>> www.smartcat.io
>> https://github.com/smartcat-labs 
>>
>> 
>>
> --
>
> Justin Cameron
>
> Senior Software Engineer | Instaclustr
>
>
>
>
> This email has been sent on behalf of Instaclustr Pty Ltd (Australia) and
> Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>
>


-- 
*Scott Hirleman*
*Head of US Marketing and Sales*
www.smartcat.io
https://github.com/smartcat-labs 




Re: Improving cassandra documentation

2016-11-03 Thread Lahiru Gamathige
Hi Oleg,

I highly recommend contributing to the Apache documentation. I think C* needs
a lot more non-DataStax documentation.

Lahiru

On Thu, Nov 3, 2016 at 1:24 PM, Justin Cameron 
wrote:

> Maybe a little off-tangent, but there is also a set of open source
> documentation now available on the Apache Cassandra website:
> http://cassandra.apache.org/doc/latest/
>
> You can contribute to them directly via git
>
> On Thu, 3 Nov 2016 at 12:11 SmartCat - Scott Hirleman 
> wrote:
>
>> http://docs.datastax.com/en/landing_page/doc/landing_page/contact.html
>> Looks like it is still just email d...@datastax.com
>>
>> On Thu, Nov 3, 2016 at 9:34 AM, Oleg Krayushkin 
>> wrote:
>>
>> Hi, from time to time I find errors in datastax cassandra docs. Is there
>> a right & easy way to report them?
>>
>> Thanks.
>>
>> --
>>
>> Oleg Krayushkin
>>
>>
>>
>>
>> --
>> *Scott Hirleman*
>> *Head of US Marketing and Sales*
>> www.smartcat.io
>> https://github.com/smartcat-labs 
>>
>> 
>>
> --
>
> Justin Cameron
>
> Senior Software Engineer | Instaclustr
>
>
>
>
>
>


Re: Improving cassandra documentation

2016-11-03 Thread Justin Cameron
Maybe a little off-tangent, but there is also a set of open source
documentation now available on the Apache Cassandra website:
http://cassandra.apache.org/doc/latest/

You can contribute to them directly via git

On Thu, 3 Nov 2016 at 12:11 SmartCat - Scott Hirleman 
wrote:

> http://docs.datastax.com/en/landing_page/doc/landing_page/contact.html
> Looks like it is still just email d...@datastax.com
>
> On Thu, Nov 3, 2016 at 9:34 AM, Oleg Krayushkin 
> wrote:
>
> Hi, from time to time I find errors in datastax cassandra docs. Is there a
> right & easy way to report them?
>
> Thanks.
>
> --
>
> Oleg Krayushkin
>
>
>
>
> --
> *Scott Hirleman*
> *Head of US Marketing and Sales*
> www.smartcat.io
> https://github.com/smartcat-labs 
>
> 
>
-- 

Justin Cameron

Senior Software Engineer | Instaclustr






Re: Improving cassandra documentation

2016-11-03 Thread SmartCat - Scott Hirleman
http://docs.datastax.com/en/landing_page/doc/landing_page/contact.html
Looks like it is still just email d...@datastax.com

On Thu, Nov 3, 2016 at 9:34 AM, Oleg Krayushkin 
wrote:

> Hi, from time to time I find errors in datastax cassandra docs. Is there a
> right & easy way to report them?
>
> Thanks.
>
> --
>
> Oleg Krayushkin
>



-- 
*Scott Hirleman*
*Head of US Marketing and Sales*
www.smartcat.io
https://github.com/smartcat-labs 




Re: Backup restore with a different name

2016-11-03 Thread Rajath Subramanyam
Hi Jens,

Looks like what you need is an "any point in time" recovery solution. I
suggest that you go back to the snapshot that you issued that was closest
to "20161102" and restore that snapshot using the bulk loader to a new
table called "users_20161102". If you need to recover precisely to a
particular timestamp, you might have to parse every row in the SSTable and
filter out some rows.

Btw, we at Datos IO are working on exactly this solution. We have built
data protection software for scale-out databases called RecoverX. We also
support Cassandra. One of the features that RecoverX supports is a
repair-free recovery/restore that allows you to go back to any point in time.

If you need more information, visit our website datos.io or drop us a note
at i...@datos.io, joe.schwa...@datos.io.

Hope this helps.

Full disclaimer: I am an engineer at Datos.io.

Regards,
Rajath


Rajath Subramanyam


On Wed, Nov 2, 2016 at 3:10 PM, Jens Rantil  wrote:

> Bryan,
>
> On Wed, Nov 2, 2016 at 11:38 AM, Bryan Cheng 
> wrote:
>
>> do you mean restoring the cluster to that state, or just exposing that
>> state for reference while keeping the (corrupt) current state in the live
>> cluster?
>
>
> I mean "exposing that state for reference while keeping the (corrupt)
> current state in the live cluster".
>
> Cheers,
> Jens
>
> --
> Jens Rantil
> Backend engineer
> Tink AB
>
> Email: jens.ran...@tink.se
> Phone: +46 708 84 18 32
> Web: www.tink.se
>
>


Re: Schema not translated completely from Thrift protocol to CQL protocol

2016-11-03 Thread Nitin Pasari
bump.

On Tue, Nov 1, 2016 at 1:16 PM, Nitin Pasari  wrote:

> Hi,
>
> I am trying to move from using pycassa to native protocol in my project
> (which will let us upgrade the version of cassandra). My schema was defined
> using pycassa so it created a Column family using compact storage and it
> has 3 columns which are not part of the composite primary key (I know this
> is not allowed by CQL protocol.)
>
> The version of my current Cassandra cluster is 2.0.17. The schema when I
> do "show schema" using the thrift protocol comes to be:
>
> create column family store
>   with column_type = 'Standard'
>   and comparator = 'CompositeType(org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.LongType),org.apache.cassandra.db.marshal.AsciiType,org.apache.cassandra.db.marshal.AsciiType)'
>   and default_validation_class = 'DoubleType'
>   and key_validation_class = 'AsciiType'
>   and column_metadata = [
>     {column_name : 'something1',
>      validation_class : AsciiType},
>     {column_name : 'something2',
>      validation_class : AsciiType}]
>
>
> But when I check the schema on the native protocol, it is missing
> "column3" and "value" columns. It comes out as follows:
>
> CREATE TABLE store (
>   key ascii,
>   column1 bigint,
>   column2 ascii,
>   something1 ascii,
>   something2 ascii,
>   PRIMARY KEY ((key), column1, column2)) WITH COMPACT STORAGE AND
>   CLUSTERING ORDER BY (column1 DESC, column2 ASC)
>
>
> Now, because of this discrepancy, I cannot transition from pycassa to the
> native protocol on the client side. I haven't been able to find anything to
> overcome this problem and make sure that the native protocol sees the right
> schema. Is there anything you could suggest to fix this? Any help is
> appreciated!
>
> Thanks,
> Nitin
>


Re: Handle Leap Seconds with Cassandra

2016-11-03 Thread Eric Stevens
You're able to set the timestamp of the write in the client application.
If you have a table which is especially sensitive to out of order writes
and want to deal with the repeated second correctly, you could do slewing
at your client application layer and be explicit with the timestamp for
those statements.
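
As a rough illustration of being explicit with timestamps on the client, the
sketch below hands out microsecond timestamps that never go backwards, even
if the wall clock repeats a second. Note this is not true slewing (which
would stretch the extra second out gradually); it is a simpler stand-in. The
class name MonotonicMicros is made up for this example, and it only orders
writes issued by a single client process.

```python
import threading
import time
from typing import Callable


class MonotonicMicros:
    """Microsecond timestamps that never decrease within one process."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock          # returns seconds since the epoch
        self._last = 0               # last timestamp handed out
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            now = int(self._clock() * 1_000_000)
            # If the clock stepped back (e.g. a repeated second), keep
            # counting up from the last value instead of going backwards.
            self._last = max(now, self._last + 1)
            return self._last
```

The returned value could then be supplied with each sensitive statement, for
example through CQL's USING TIMESTAMP clause.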

On Wed, Nov 2, 2016 at 9:08 PM Ben Bromhead  wrote:

> Based on most of what I've said previously, pretty much any way of
> avoiding your leap-second ordering issue is going to be a "hack", and
> there will be some amount of hope involved.
>
> If the updates occur more than 300ms apart and you are confident your
> nodes have clocks that are within 150ms of each other, then I'd close my
> eyes and hope they all leap second at the same time within that 150ms.
>
> If they are less than 300ms apart (I'm guessing you meant less than 300ms),
> then I would look to figure out what the smallest gap is between those two
> updates and make sure your nodes' clocks are close enough in that gap that
> the leap second will occur on all nodes within that gap.
>
> If that's not good enough, you could just halt those scenarios for 2
> seconds over the leap second and then resume them once you've confirmed all
> clocks have skipped.
>
>
> On Wed, 2 Nov 2016 at 18:13 Anuj Wadehra  wrote:
>
> Thanks, Ben, for taking the time to write the detailed reply!
>
> We don't need strict ordering for all operations, but we are looking at
> scenarios where 2 quick updates to the same column of the same row are
> possible. By quick updates, I mean >300 ms. Configuring NTP properly (as
> mentioned in some blogs in your link) should give fair relative accuracy
> between the Cassandra nodes. But a leap second takes the clock back by an
> ENTIRE second (huge), and the probability of an old write overwriting the
> new one increases drastically. So, we want to be proactive.
>
> I agree that you should avoid such scenarios by design (if possible).
>
> Good to know that you guys have setup your own NTP servers as per the
> recommendation. Curious..Do you also do some monitoring around NTP?
>
>
>
> Thanks
> Anuj
>
> On Fri, 28 Oct, 2016 at 12:25 AM, Ben Bromhead wrote:
> If you need guaranteed strict ordering in a distributed system, I would
> not use Cassandra; it does not provide this out of the box. I would
> look to a system that uses Lamport or vector clocks. Based on your
> description of how your system runs at the moment (and how close together
> your updates are), you have either already experienced out-of-order
> updates or there is a real possibility you will in the future.
>
> Sorry to be so dire, but if you do require causal consistency / strict
> ordering, you are not getting it at the moment. Distributed systems theory
> is really tricky, even for people that are "experts" on distributed systems
> over unreliable networks (I would certainly not put myself in that
> category). People have made a very good name for themselves by showing that
> the vast majority of distributed databases have had bugs when it comes to
> their various consistency models and the claims these databases make.
>
> So make sure you really do need guaranteed causal consistency/strict
> ordering or if you can design around it (e.g. using conflict free
> replicated data types) or choose a system that is designed to provide it.
>
> Having said that... here are some hacky things you could do in Cassandra
> to try and get this behaviour, which I in no way endorse doing :)
>
>    - Cassandra counters do leverage a logical clock per shard and you
>    could hack something together with counters and lightweight transactions,
>    but you would want to do your homework on counter accuracy before
>    diving into it... as I don't know if the implementation is safe in the
>    context of your question. Also this would probably require a significant
>    rework of your application plus a significant performance hit. I would
>    invite a counter guru to jump in here...
>
>
>    - You can leverage the fact that timestamps are monotonic if you
>    isolate writes to a single node for a single shard... but you then lose
>    Cassandra's availability guarantees, e.g. a keyspace with an RF of 1 and a
>    CL of > ONE will get monotonic timestamps (if generated on the server
>    side).
>
>
>    - Continuing down the path of isolating writes to a single node for a
>    given shard, you could also isolate writes to the primary replica using
>    your client driver during the leap second (make it a minute either side of
>    the leap), but again you lose out on availability and you are probably
>    already experiencing out-of-order writes given how close your writes and
>    updates are.
>
>
> A note on NTP: NTP is generally fine if you use it to keep the clocks
> synced between the Cassandra nodes. If you are interested in how we have
> implemented NTP at Instaclustr, see our blogpost on 

Re: Issue with Unexpected exception

2016-11-03 Thread Sylvain Lebresne
From the trace, "Connection reset by peer" simply means the client
disconnected, which isn't necessarily a problem or abnormal per se (and if it
is, it sounds more like a client issue than anything else). That said, I'm
not sure why 3.0.8 logs this at INFO now, as that's not really a problem, so
if you can reproduce it on 3.0.9 too, feel free to open a JIRA ticket. In any
case, it's harmless unless you have other symptoms of something not
working.

On Thu, Nov 3, 2016 at 4:59 PM, Oleg Krayushkin 
wrote:

> Hi, about a month ago I asked about my problem here (with subject
> "Error while read after upgrade from 2.2.7 to 3.0.8") and also on
> Stack Overflow.
> Unfortunately, I still haven't found a solution.
>
> It's "Unexpected exception" -- maybe it's a good idea to make an Issue
> with it? ..or is it my mistake somewhere?
>
> Thanks
> --
>
> Oleg Krayushkin
>


Issue with Unexpected exception

2016-11-03 Thread Oleg Krayushkin
Hi, about a month ago I asked about my problem here (with subject
"Error while read after upgrade from 2.2.7 to 3.0.8") and also on
Stack Overflow. Unfortunately, I still haven't found a solution.

It's "Unexpected exception" -- maybe it's a good idea to make an Issue with
it? ..or is it my mistake somewhere?

Thanks
-- 

Oleg Krayushkin


Improving cassandra documentation

2016-11-03 Thread Oleg Krayushkin
Hi, from time to time I find errors in datastax cassandra docs. Is there a
right & easy way to report them?

Thanks.

-- 

Oleg Krayushkin


Re: failing bootstraps with OOM

2016-11-03 Thread Oleksandr Shulgin
On Thu, Nov 3, 2016 at 2:32 PM, Mike Torra  wrote:

> Hi Alex - I do monitor sstable counts and pending compactions, but
> probably not closely enough. In 3/4 regions the cluster is running in, both
> counts are very high - ~30-40k sstables for one particular CF, and on many
> nodes >1k pending compactions.
>

It is generally a good idea to try to keep the number of pending
compactions minimal.  We usually see it is close to zero on every node
during normal operations and less than some tens during maintenance such as
repair.

> I had noticed this before, but I didn't have a good sense of what a "high"
> number for these values was.
>

I would say anything higher than 20 probably requires someone to have a
look and over 1k is very troublesome.
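
If you want to automate that rule of thumb, something like the sketch below
could work. It assumes the output of nodetool compactionstats contains a line
of the form "pending tasks: N" (check this against your version), and the
function names and labels are made up for this example.

```python
import re

# Assumption: compactionstats prints a line such as "pending tasks: 1200".
_PENDING_RE = re.compile(r"^pending tasks:\s*(\d+)", re.MULTILINE)


def parse_pending_tasks(compactionstats_output: str) -> int:
    """Extract the pending compaction count from compactionstats text."""
    match = _PENDING_RE.search(compactionstats_output)
    if match is None:
        raise ValueError("no 'pending tasks' line found")
    return int(match.group(1))


def classify_pending(pending: int) -> str:
    """Apply the thresholds from this thread: >20 look, >1000 trouble."""
    if pending > 1000:
        return "troublesome"
    if pending > 20:
        return "have a look"
    return "ok"
```

The thresholds are the ones mentioned above, not official limits; tune them
to what is normal for your cluster.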

> It makes sense to me why this would cause the issues I've seen. After
> increasing concurrent_compactors and compaction_throughput_mb_per_sec (to
> 8 and 64mb, respectively), I'm starting to see those counts go down
> steadily. Hopefully that will resolve the OOM issues, but it looks like it
> will take a while for compactions to catch up.
>
> Thanks for the suggestions, Alex
>

Welcome. :-)

--
Alex


Re: failing bootstraps with OOM

2016-11-03 Thread Mike Torra
Hi Alex - I do monitor sstable counts and pending compactions, but probably not 
closely enough. In 3/4 regions the cluster is running in, both counts are very 
high - ~30-40k sstables for one particular CF, and on many nodes >1k pending 
compactions. I had noticed this before, but I didn't have a good sense of what 
a "high" number for these values was.

It makes sense to me why this would cause the issues I've seen. After 
increasing concurrent_compactors and compaction_throughput_mb_per_sec (to 8 and 
64mb, respectively), I'm starting to see those counts go down steadily. 
Hopefully that will resolve the OOM issues, but it looks like it will take a 
while for compactions to catch up.

Thanks for the suggestions, Alex

From: Oleksandr Shulgin
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, November 2, 2016 at 1:07 PM
To: "user@cassandra.apache.org"
Subject: Re: failing bootstraps with OOM

On Wed, Nov 2, 2016 at 3:35 PM, Mike Torra 
> wrote:
>
> Hi All -
>
> I am trying to bootstrap a replacement node in a cluster, but it consistently 
> fails to bootstrap because of OOM exceptions. For almost a week I've been 
> going through cycles of bootstrapping, finding errors, then restarting / 
> resuming bootstrap, and I am struggling to move forward. Sometimes the 
> bootstrapping node itself fails, which usually manifests first as very high 
> GC times (sometimes 30s+!), then nodetool commands start to fail with 
> timeouts, then the node will crash with an OOM exception. Other times, a node 
> streaming data to this bootstrapping node will have a similar failure. In 
> either case, when it happens I need to restart the crashed node, then resume 
> the bootstrap.
>
> On top of these issues, when I do need to restart a node it takes a long 
> time 
> (http://stackoverflow.com/questions/40141739/why-does-cassandra-sometimes-take-a-hours-to-start).
> This exacerbates the problem because it takes so long to find out if a 
> change to the cluster helps or if it still fails. I am in the process of 
> upgrading all nodes in the cluster from m4.xlarge to c4.4xlarge, and I am 
> running Cassandra DDC 3.5 on all nodes. The cluster has 26 nodes spread 
> across 4 regions in EC2. Here is some other relevant cluster info (also in 
> stack overflow post):
>
> Cluster Info
>
> Cassandra DDC 3.5
> EC2MultiRegionSnitch
> m4.xlarge, moving to c4.4xlarge
>
> Schema Info
>
> 3 CF's, all 'write once' (ie no updates), 1 week ttl, STCS (default)
> no secondary indexes
>
> I am unsure what to try next. The node that is currently having this 
> bootstrap problem is a pretty beefy box, with 16 cores, 30G of ram, and a 
> 3.2T EBS volume. The slow startup time might be because of the issues with a 
> high number of SSTables that Jeff Jirsa mentioned in a comment on the SO 
> post, but I am at a loss for the OOM issues. I've tried:
>
> Changing from CMS to G1 GC, which seemed to have helped a bit
> Upgrading from 3.5 to 3.9, which did not seem to help
> Upgrading instance types from m4.xlarge to c4.4xlarge, which seems to help, 
> but I'm still having issues
>
> I'd appreciate any suggestions on what else I can try to track down the cause 
> of these OOM exceptions.

Hi,

Do you monitor pending compactions and actual number of SSTable files?

On startup Cassandra needs to touch most of the data files and also seems to 
keep some metadata about every relevant file in memory.  We once got into a 
situation where we ended up with hundreds of thousands of files per node, which 
resulted in OOMs on every other node of the ring, and startup took over 
half an hour (this was on version 2.1).

If you have many more files than you expect, then you should check and adjust 
your concurrent_compactors and compaction_throughput_mb_per_sec settings.  
Increase concurrent_compactors if you're behind (the pending compactions metric 
is a hint) and consider un-throttling compaction until your situation is back 
to normal.

Cheers,
--
Alex



Re: Secondary Index on Boolean column with TTL

2016-11-03 Thread Oleg Krayushkin
Thanks a lot, DuyHai!

2016-10-31 19:53 GMT+03:00 DuyHai Doan :

> Technically TTL should be handled properly. However, be careful of expired
> data turning into tombstones. For the original table, it may be a tombstone
> on a skinny partition but for the 2nd index, it may be a tombstone set on a
> wide partition and you'll start getting into trouble when reading a
> partition with a lot of them
>
> On Mon, Oct 31, 2016 at 5:08 PM, Oleg Krayushkin 
> wrote:
>
>> Hi, DuyHai, thank you.
>>
>> I got the idea of the caveat about too-low cardinality, but I am still
>> wondering about possible trouble from putting a TTL (months) on an indexed
>> column (not a bool; say, 100 different int values).
>>
>> 2016-10-31 16:33 GMT+03:00 DuyHai Doan :
>>
>>> http://www.planetcassandra.org/blog/cassandra-native-secondary-index-deep-dive/
>>>
>>> See section E Caveats which applies to your boolean use-case
>>>
>>> On Mon, Oct 31, 2016 at 2:19 PM, Oleg Krayushkin 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is it a good approach to make a boolean column with TTL and build a
>>>> secondary index on it?
>>>> (For example, I want to get rows which need to be updated after a
>>>> certain time, but I don't want, say, to add a field "update_date" as a
>>>> clustering column or to create another table.)
>>>>
>>>> What kind of trouble could this lead me into?
>>>>
>>>> Thanks in advance for any suggestions.
>>>>
>>>> --
>>>>
>>>> Oleg Krayushkin

>>>
>>>
>>
>>
>> --
>>
>> Oleg Krayushkin
>>
>
>


-- 

Oleg Krayushkin


Re: Rebuilding with vnodes

2016-11-03 Thread Oleksandr Shulgin
On Wed, Nov 2, 2016 at 8:59 PM, Anubhav Kale 
wrote:

> Hello,
>
>
>
> I am trying to rebuild a new Data Center with 50 Nodes, and expect 1 TB /
> node. Nodes are backed by SSDs, and the rebuild is happening from another
> DC in same physical region. This is with 2.1.13.
>
>
>
> I am doing this with stream_throughput=200 MB,
> concurrent_compactors=256, compactionthroughput=0,
>

Hi,

How many CPU cores do you have per node?  In my experience unthrottled
compaction is CPU bound, but I have never tried to raise the concurrency
higher than the number of cores a node has.

How many actual concurrent compactions can you see from nodetool
compactionstats?

--
Alex


Problem with Jython UDF

2016-11-03 Thread Maciej Bryński
Hi,
I have the following problem with a Jython UDF.

1) I'm using the Cassandra 3.9 deb packages on Ubuntu 14.04, running
Oracle Java 1.8.0_101-b13.

2) I added the Jython jar (version 2.7.0) to /usr/share/cassandra/lib.
This makes it possible to create Python functions.

3) I want to test the function.

cqlsh:e> CREATE FUNCTION IF NOT EXISTS test123 (input bigint) CALLED ON
NULL INPUT RETURNS text LANGUAGE python AS 'return "123"';

This worked, but running a SELECT with the UDF raises an exception:
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 1264, in perform_simple_statement
    result = future.result()
  File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", line 3650, in result
    raise self._final_exception
FunctionFailure: Error from server: code=1400 [User Defined Function failure] message="execution of 'e.test123[bigint]' failed: java.security.AccessControlException: access denied: ("java.lang.RuntimePermission" "accessClassInPackage.org.python.jline.console")

4) I tried to modify /etc/java-8-oracle/security/java.policy and added:

grant codeBase "file:/usr/share/cassandra/lib/*" {
    permission java.security.AllPermission;
};

Still no improvement.

Any ideas how to run python UDFs in Cassandra ?

Regards,
-- 
Maciek Bryński