Re: Testing Cassandra connectivity at application startup

2023-08-25 Thread Shaurya Gupta
We don't plan to open a new connection. It should use the same
connection(s) which the application will use.

On Fri, Aug 25, 2023 at 10:59 AM Raphael Mazelier  wrote:

> Mind that a new connection is really costly for C*.
> So at startup it's fine, but not in a liveness or readiness check, imo.
> For the query why not select 1; ?
>
> --
>
> Raphael Mazelier
>
>
> On 25/08/2023 19:38, Shaurya Gupta wrote:
>
> Hi community
>
> We want to validate cassandra connectivity from the application container
> when it starts up and before it reports as healthy to K8s. Is doing
>
>> select * from our_keyspace.table limit 1
>
> fine Or is it an inefficient query and should not be fired on a prod
> cluster ?
>
> Any other suggestions ?
>
> --
> Shaurya Gupta
>
>
>

-- 
Shaurya Gupta


Testing Cassandra connectivity at application startup

2023-08-25 Thread Shaurya Gupta
Hi community

We want to validate Cassandra connectivity from the application container
when it starts up, before it reports as healthy to K8s. Is doing

> select * from our_keyspace.table limit 1

fine, or is it an inefficient query that should not be fired on a prod
cluster ?

Any other suggestions ?

-- 
Shaurya Gupta
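
A lightweight alternative probe, for reference: system.local always holds exactly one row, so reading it exercises the cluster without touching user data (note that a bare "select 1" is not valid CQL). Below is an illustrative Python sketch, not code from the thread; the session object and stub are assumptions standing in for a real driver session.

```python
# Illustrative startup-connectivity probe; "session" stands in for an
# already-built driver session (the same pool the application will use).
HEALTH_QUERY = "SELECT release_version FROM system.local"

def is_cassandra_reachable(session):
    """Return True if a trivial one-row read succeeds on the given session."""
    try:
        rows = list(session.execute(HEALTH_QUERY))
        return len(rows) == 1  # system.local always has exactly one row
    except Exception:
        return False

# Stub session so the sketch runs without a live cluster:
class StubSession:
    def execute(self, query):
        assert "system.local" in query
        return [("4.0.10",)]

print(is_cassandra_reachable(StubSession()))
```

Wired into a K8s startup probe, the container would report healthy only once this returns True.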


Re: Cassandra p95 latencies

2023-08-25 Thread Shaurya Gupta
Thanks everyone.
Updating this thread -
We increased the key cache size from 100 MB to 200 MB, and we believe that
has brought down the latency from 40 ms p95 to 6 ms p95. I think there is
still scope for improvement, as both writes and reads are presently at 6 ms
p95; I would expect writes to be lower. But we are good with 6 ms for now at
least.

On Mon, Aug 14, 2023 at 11:56 AM Elliott Sims via user <
user@cassandra.apache.org> wrote:

> 1.  Check for Nagle/delayed-ack, but probably nodelay is getting set by
> the driver so it shouldn't be a problem.
> 2.  Check for network latency (just regular old ping among hosts, during
> traffic)
> 3.  Check your GC metrics and see if garbage collections line up with
> outliers.  Some tuning can help there, depending on the pattern, but 40ms
> p99 at least would be fairly normal for G1GC.
> 4.  Check actual local write times, and I/O times with iostat.  If you
> have spinning drives 40ms is fairly expected.  It's high but not totally
> unexpected for consumer-grade SSDs.  For enterprise-grade SSDs commit times
> that long would be very unusual.  What are your commitlog_sync settings?
>
> On Mon, Aug 14, 2023 at 8:43 AM Josh McKenzie 
> wrote:
>
>> The queries are rightly designed
>>
>> Data modeling in Cassandra is 100% gray space; there unfortunately is no
>> right or wrong design. You'll need to share basic shapes / contours of your
>> data model for other folks to help you; seemingly innocuous things in a
>> data model can cause unexpected issues w/C*'s storage engine paradigm
>> thanks to the partitioning and data storage happening under the hood.
>>
>> If you were seeing single digit ms on 3.0.X or 3.11.X and 40ms p95 on 4.0
>> I'd immediately look to the DB as being the culprit. For all other cases,
>> you should be seeing single digit ms as queries in C* generally boil down
>> to key/value lookups (partition key) to a list of rows you either point
>> query (key/value #2) or range scan via clustering keys and pull back out.
>>
>> There's also paging to take into consideration (whether you're using it
>> or not, what your page size is) and the data itself (do you have thousands
>> of columns? Multi-MB blobs you're pulling back out? etc). All can play into
>> this.
>>
>> On Fri, Aug 11, 2023, at 3:40 PM, Jeff Jirsa wrote:
>>
>> You’re going to have to help us help you
>>
>> 4.0 is pretty widely deployed. I’m not aware of a perf regression
>>
>> Can you give us a schema (anonymized) and queries and show us a trace ?
>>
>>
>> On Aug 10, 2023, at 10:18 PM, Shaurya Gupta 
>> wrote:
>>
>> 
>> The queries are rightly designed as I already explained. 40 ms is way too
>> high as compared to what I seen with other DBs and many a times with
>> Cassandra 3.x versions.
>> CPU consumed as I mentioned is not high, it is around 20%.
>>
>> On Thu, Aug 10, 2023 at 5:14 PM MyWorld  wrote:
>>
>> Hi,
>> P95 should not be a problem if rightly designed. Levelled compaction
>> strategy further reduces this; however, it consumes some resources. For reads,
>> caching is also helpful.
>> Can you check your cpu iowait, as it could be the reason for the delay?
>>
>> Regards,
>> Ashish
>>
>> On Fri, 11 Aug, 2023, 04:58 Shaurya Gupta, 
>> wrote:
>>
>> Hi community
>>
>> What is the expected P95 latency for Cassandra Read and Write queries
>> executed with Local_Quorum over a table with 3 replicas ? The queries are
>> done using the partition + clustering key and row size in bytes is not too
>> much, maybe 1-2 KB maximum.
>> Assuming CPU is not a crunch ?
>>
>> We observe those to be 40 ms P95 Reads and same for Writes. This looks
>> very high as compared to what we expected. We are using Cassandra 4.0.
>>
>> Any documentation / numbers will be helpful.
>>
>> Thanks
>> --
>> Shaurya Gupta
>>
>>
>>
>> --
>> Shaurya Gupta
>>
>>
>>
> This email, including its contents and any attachment(s), may contain
> confidential and/or proprietary information and is solely for the review
> and use of the intended recipient(s). If you have received this email in
> error, please notify the sender and permanently delete this email, its
> content, and any attachment(s). Any disclosure, copying, or taking of any
> action in reliance on an email received in error is strictly prohibited.
>


-- 
Shaurya Gupta
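
For reference, the key cache change described in this thread is set in cassandra.yaml; a sketch using the Cassandra 4.0 option name, with the value from this thread (not a general recommendation):

```yaml
# cassandra.yaml -- key cache capacity (Cassandra 4.0 option name).
# This thread's change was 100 MB -> 200 MB.
key_cache_size_in_mb: 200
```

The capacity can also be adjusted at runtime with nodetool setcachecapacity, though a yaml change is what persists across restarts.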


Re: Cassandra p95 latencies

2023-08-10 Thread Shaurya Gupta
The queries are rightly designed, as I already explained. 40 ms is way too
high compared to what I have seen with other DBs and, many a time, with
Cassandra 3.x versions.
The CPU consumed, as I mentioned, is not high; it is around 20%.

On Thu, Aug 10, 2023 at 5:14 PM MyWorld  wrote:

> Hi,
> P95 should not be a problem if rightly designed. Levelled compaction
> strategy further reduces this; however, it consumes some resources. For reads,
> caching is also helpful.
> Can you check your cpu iowait, as it could be the reason for the delay?
>
> Regards,
> Ashish
>
> On Fri, 11 Aug, 2023, 04:58 Shaurya Gupta,  wrote:
>
>> Hi community
>>
>> What is the expected P95 latency for Cassandra Read and Write queries
>> executed with Local_Quorum over a table with 3 replicas ? The queries are
>> done using the partition + clustering key and row size in bytes is not too
>> much, maybe 1-2 KB maximum.
>> Assuming CPU is not a crunch ?
>>
>> We observe those to be 40 ms P95 Reads and same for Writes. This looks
>> very high as compared to what we expected. We are using Cassandra 4.0.
>>
>> Any documentation / numbers will be helpful.
>>
>> Thanks
>> --
>> Shaurya Gupta
>>
>>
>>

-- 
Shaurya Gupta


Cassandra p95 latencies

2023-08-10 Thread Shaurya Gupta
Hi community

What is the expected P95 latency for Cassandra read and write queries
executed with LOCAL_QUORUM over a table with 3 replicas? The queries are
done using the partition + clustering key, and the row size is small, maybe
1-2 KB maximum.
Assume CPU is not a bottleneck.

We observe those to be 40 ms P95 Reads and same for Writes. This looks very
high as compared to what we expected. We are using Cassandra 4.0.

Any documentation / numbers will be helpful.

Thanks
-- 
Shaurya Gupta
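
For reference, when a reply in this thread asks for a trace, cqlsh can capture one directly; a sketch with a placeholder query:

```sql
cqlsh> TRACING ON;
cqlsh> SELECT col FROM my_keyspace.my_table WHERE pk = 'k1';  -- placeholder for the slow query
-- cqlsh then prints a step-by-step trace with per-replica elapsed times
cqlsh> TRACING OFF;
```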


native_transport_port_ssl and native_transport_port

2023-07-28 Thread Shaurya Gupta
Hi C* Community

We had a strange observation which we are unable to understand -

Initial config -
native_transport_port_ssl : true
native_transport_port : 9042

With the above config, the application / driver was unable to connect to
different nodes, and whenever even one node went down / was drained, the
DSE driver complained with NoNodeAvailableException.
This is with Cassandra 4.0. Our hunch is that we hit
https://issues.apache.org/jira/browse/CASSANDRA-16999.

Then we changed the config to -
Remove - (native_transport_port_ssl : true)
native_transport_port : 9142

Now the application behaves fine and nodetool drain doesn't cause any
issues.
However, what was strange was that as soon as cassandra.yaml was changed to
the above config, the application / driver stopped showing the errors which
used to appear in DSE trace-level logs while the driver connected to the
9042 port.

How is it possible that the config got applied even before a single
Cassandra node was restarted ?

Thanks
Shaurya

-- 
Shaurya Gupta


Reactive DSE Java Driver seeing high response times from Cassandra

2023-06-02 Thread Shaurya Gupta
Hi

We are seeing high response times on the application side from the Java
Reactive DSE driver.
Cassandra server side metrics show response times of 2 ms but the DSE logs
show a response time (95th percentile) of 40ms.
Following
https://docs.datastax.com/en/developer/java-driver/4.2/manual/core/pooling/
it appears that the connections stay open for some time during a load run
and then drop to 0; open-connections then climbs back to the configured
value (4) + 1 and drops to 0 again.
Why is that? The load run is at 30K RPM.
This happens even when I reduce the heartbeat interval to as low as 250
milliseconds.
The CPU consumed by the container is just 35% of available.

Can someone help with this ?

Thanks
-- 
Shaurya Gupta
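
For reference, the pool-size and heartbeat knobs mentioned above live in the Java driver 4.x HOCON configuration. A sketch with this thread's values; the exact paths should be double-checked against the driver's reference.conf:

```
datastax-java-driver {
  advanced.connection.pool.local.size = 4       # connections per local node
  advanced.heartbeat.interval = 250 milliseconds  # the lowered value tried in this thread
}
```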


Re: Enabling SSL on a live cluster

2021-11-12 Thread Shaurya Gupta
Hi Kiran

I think you are right. 3.x does not have such an option in cassandra.yaml!

Thanks!
Shaurya

On Sat, Nov 13, 2021 at 8:42 AM Kiran mk  wrote:

> Hi Andy,
>
> Internode encryption is not possible without downtime prior to Apache
> Cassandra 4.0, as there is no "optional" option before 4.0 under
> server_encryption_options. If you try to enable it, Cassandra running
> on version 3.x won't start, as the property isn't available.
> "optional" is only available under client_encryption_options for clusters
> running < 4.0.
>
> E.g.,
> Exception encountered during startup: Invalid yaml. Please remove
> properties [optional] from your cassandra.yaml
>
> The below link clearly gives an idea of the fix in 4.0 and states why it's
> not possible to implement internode encryption without downtime
> before 4.0.
>
> https://issues.apache.org/jira/browse/CASSANDRA-10404
>
> By any chance, did you try to enable internode encryption in 3.x
> without downtime and succeed? Can you please confirm.
> Best Regards,
> Kiran.M.K.
>
>
> On Wed, Nov 10, 2021 at 12:04 PM Tolbert, Andy 
> wrote:
> >
> > Hi Shaurya,
> >
> > On Tue, Nov 9, 2021 at 11:57 PM Shaurya Gupta 
> wrote:
> >>
> >> Hi,
> >>
> >> We want to enable node-to-node SSL on a live cluster. Could it be done
> without any down time ?
> >
> >
> > Yup, this is definitely doable for both internode and client
> connections.  You will have to bounce your cassandra nodes, but you should
> be able to achieve this operation without any downtime.  See
> server_encryption_options in cassandra.yaml (
> https://cassandra.apache.org/doc/4.0/cassandra/configuration/cass_yaml_file.html#server_encryption_options
> )
> >
> >>
> >> Would the nodes which have been restarted be able to communicate with
> the nodes which have not yet come up and vice versa ?
> >
> >
> > The idea would be to:
> >
> > 1. Set optional to true in server_encryption_options and bounce the
> cluster safely into it.  As nodes come up, they will first attempt to
> connect to other nodes via ssl, and fallback on the unencrypted
> storage_port.
> > 2. Once you have bounced the entire cluster once, switch optional to
> false and then bounce the cluster again.
> >
> > Before 4.0, a separate port (ssl_storage_port) was used for connecting
> with internode via ssl.  In 4.0, storage_port can be used for both
> unencrypted and encrypted connections, and enable_legacy_ssl_storage_port
> can be used to maintain ssl_storage_port. Once the entire cluster is on 4.0
> you can set this option to false so storage_port is used over
> ssl_storage_port.
> >
> > One important thing to point out is that prior to C* 4.0, Cassandra does
> not hot reload keystore changes, so whenever you update the certificates in
> your keystores (e.g. to avoid your certificates expiring) you would need to
> bounce your cassandra instances. See:
> https://cassandra.apache.org/doc/4.0/cassandra/operating/security.html#ssl-certificate-hot-reloading
> for explanation on how that works.
> >
> > Thanks,
> > Andy
> >
> >>
> >>
> >> Regards
> >> --
> >> Shaurya Gupta
> >>
> >>
>
>
> --
> Best Regards,
> Kiran.M.K.
>


-- 
Shaurya Gupta


Re: Enabling SSL on a live cluster

2021-11-12 Thread Shaurya Gupta
Thanks Andy! It was very helpful.

On Wed, Nov 10, 2021 at 12:04 PM Tolbert, Andy  wrote:

> Hi Shaurya,
>
> On Tue, Nov 9, 2021 at 11:57 PM Shaurya Gupta 
> wrote:
>
>> Hi,
>>
>> We want to enable node-to-node SSL on a live cluster. Could it be done
>> without any down time ?
>>
>
> Yup, this is definitely doable for both internode and client connections.
> You will have to bounce your cassandra nodes, but you should be able to
> achieve this operation without any downtime.  See server_encryption_options
> in cassandra.yaml (
> https://cassandra.apache.org/doc/4.0/cassandra/configuration/cass_yaml_file.html#server_encryption_options
> )
>
>
>> Would the nodes which have been restarted be able to communicate with the
>> nodes which have not yet come up and vice versa ?
>>
>
> The idea would be to:
>
> 1. Set optional to true in server_encryption_options and bounce the
> cluster safely into it.  As nodes come up, they will first attempt to
> connect to other nodes via ssl, and fallback on the unencrypted
> storage_port.
> 2. Once you have bounced the entire cluster once, switch optional to false
> and then bounce the cluster again.
>
> Before 4.0, a separate port (ssl_storage_port) was used for connecting
> with internode via ssl.  In 4.0, storage_port can be used for both
> unencrypted and encrypted connections, and enable_legacy_ssl_storage_port
> can be used to maintain ssl_storage_port. Once the entire cluster is on 4.0
> you can set this option to false so storage_port is used over
> ssl_storage_port.
>
> One important thing to point out is that prior to C* 4.0, Cassandra does
> not hot reload keystore changes, so whenever you update the certificates in
> your keystores (e.g. to avoid your certificates expiring) you would need to
> bounce your cassandra instances. See:
> https://cassandra.apache.org/doc/4.0/cassandra/operating/security.html#ssl-certificate-hot-reloading
> for explanation on how that works.
>
> Thanks,
> Andy
>
>
>>
>> Regards
>> --
>> Shaurya Gupta
>>
>>
>>

-- 
Shaurya Gupta
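
Step 1 of the procedure above corresponds to cassandra.yaml settings along these lines (a sketch; keystore paths and passwords are placeholders):

```yaml
server_encryption_options:
    internode_encryption: all
    optional: true                        # step 1: accept TLS and plaintext during the rolling bounce
    keystore: /path/to/keystore.jks       # placeholder path
    keystore_password: changeit           # placeholder
    truststore: /path/to/truststore.jks   # placeholder path
    truststore_password: changeit         # placeholder
# Step 2: after the whole cluster has been bounced once, set optional: false
# and bounce again.
```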


Enabling SSL on a live cluster

2021-11-09 Thread Shaurya Gupta
Hi,

We want to enable node-to-node SSL on a live cluster. Could it be done
without any down time ?
Would the nodes which have been restarted be able to communicate with the
nodes which have not yet come up and vice versa ?

Regards
-- 
Shaurya Gupta


Re: Adding new DC

2021-07-22 Thread Shaurya Gupta
Thanks Erick!

On Thu, Jul 22, 2021 at 2:29 PM Erick Ramirez 
wrote:

> I wouldn't use either of the approaches you outlined. Neither of them is
> correct.
>
> Follow the procedure documented here instead --
> https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsAddDCToCluster.html.
> Cheers!
>


-- 
Shaurya Gupta


Adding new DC

2021-07-22 Thread Shaurya Gupta
Hi

Which of the below approaches should be followed to add a new DC to an
existing cluster -

Approach 1-

   1. Create a new DC say DC2 of say 3 nodes while booting the first node
   as seed=true and subsequently using it as seed for the remaining two nodes.
   2. Now restart all the 3 nodes while providing the seed list from the
   already running DC, say DC1.
   3. Change of RFs of required keyspaces in DC1 to mention DC2.


Approach 2-

   1. Boot a new node with DC=DC2 and seed=false while providing seed list
   from DC1.
   2. Similarly, boot remaining two nodes while providing DC=DC2 and seed
   list from DC2 (node booted in previous step) & DC1.
   3. Change of RFs of required keyspaces in DC1 to mention DC2.


If someone could point to some already available documentation then that
would be great too.
-- 
Shaurya Gupta
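
For reference, step 3 in both approaches (changing keyspace RFs to mention DC2) is an ALTER KEYSPACE; a sketch with placeholder names:

```sql
-- Placeholder keyspace name; repeat for each keyspace that must replicate to DC2.
ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};
-- Then stream the existing data to the new nodes, e.g.:
--   nodetool rebuild -- DC1    (run on each DC2 node)
```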


Re: Number of DCs in Cassandra

2021-07-15 Thread Shaurya Gupta
Hi Jeff

Thanks for a detailed answer.

We are using Cassandra 3.11.2; I believe it would fall in the new-version
category and, IIUC, should be able to handle the issue with anti-entropy
repair by adjusting the merkle tree depth ?
The cross-DC read repair chance is disabled in our setup, so we should be
good.

Again, in 12x3, if one replica goes down beyond the hint window, when it
> comes up it's getting 35 copies of data, which is going to overwhelm it
> when it streams and compacts.

Thanks for pointing this out. IIUC this would also increase the bootstrap
time when the node comes back up.
Will it also affect the time taken for a replaced node to report UN ? AFAIK
during node replacement a particular token range is streamed from only one
of the replicas, so the time taken for replacing a node should remain the
same ?

On Wed, Jul 14, 2021 at 8:00 PM Jeff Jirsa  wrote:

> Hi,
>
> So, there's two things where you'll see the impact of "lots of datacenters"
>
> On the query side, global quorum queries (and queries with cross-dc
> probabilistic read repair) may touch more DCs and be slower, and
> read-repairs during those queries get more expensive. Your geography
> matters a ton for latency, and your write consistency and network quality
> matters a ton for read repairs. During the read, the coordinator will track
> which replicas are mismatching, and build mutations to make them in sync -
> that buildup will accumulate more data if you're very out of sync.
>
> The other thing you should expect is different behavior during repairs.
> The anti-entropy repairs do pair-wise merkle trees. If you imagine 6, 8, 12
> datacenters of 3 copies each, you've got 18, 24, 36 copies of data, each of
> those holds a merkle tree. The repair coordinator will have a lot more data
> in memory (adjusting the tree depth in newer versions, or using the offheap
> option in 4.0) starts removing the GC pressure on the coordinator in those
> types of topologies. In older versions, using subrange repair and lots of
> smaller ranges will avoid very deep trees and keep memory tolerable. ALSO,
> when you do have a mismatch, you're going to stream a LOT of data. Again,
> in 12x3, if one replica goes down beyond the hint window, when it comes up
> it's getting 35 copies of data, which is going to overwhelm it when it
> streams and compacts. CASSANDRA-3200 helps this in 4.0, and incremental
> repair helps this if you're running incremental repair (again, probably
> after CASSANDRA-9143 in 4.0), but the naive approach can lead to really bad
> surprises.
>
>
>
> On Wed, Jul 14, 2021 at 7:17 AM Shaurya Gupta 
> wrote:
>
>> Hi, Multiple DCs are required to maintain lower latencies for requests
>> across the globe. I agree that it's a lot of redundant copies of data.
>>
>> On Wed, Jul 14, 2021, 7:00 PM Jim Shaw  wrote:
>>
>>> Shaurya:
>>> What's the purpose of having so many data centers ?
>>> RF=3 within a center means you have 3 copies of data.
>>> If you have 3 DCs, that means 9 copies of data.
>>> Think about the space and network bandwidth wasted for that many copies.
>>> BTW, Ours just 2 DCs for regional DR.
>>>
>>> Thanks,
>>> Jim
>>>
>>> On Wed, Jul 14, 2021 at 2:27 AM Shaurya Gupta 
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> Does someone have any suggestions on the maximum number of Data Centers
>>>> which NetworkTopology strategy can have for a keyspace. Not only
>>>> technically but considering performance as well.
>>>> In each Data Center RF is 3.
>>>>
>>>> Thanks!
>>>> --
>>>> Shaurya Gupta
>>>>
>>>>
>>>>

-- 
Shaurya Gupta
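
The copy-count arithmetic in Jeff's reply generalizes simply; a small illustrative sketch (plain arithmetic, not Cassandra code):

```python
# Total replicas for a NetworkTopologyStrategy keyspace with the same RF in
# every DC, and how many other copies a recovered replica can be repaired
# against -- Jeff's "12x3 -> 35 copies" example.
def total_copies(num_dcs: int, rf: int) -> int:
    return num_dcs * rf

def repair_stream_sources(num_dcs: int, rf: int) -> int:
    # Every copy except the recovered replica itself is a potential source.
    return total_copies(num_dcs, rf) - 1

print(total_copies(12, 3), repair_stream_sources(12, 3))  # 36 35
```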


Re: Number of DCs in Cassandra

2021-07-14 Thread Shaurya Gupta
Hi, Multiple DCs are required to maintain lower latencies for requests
across the globe. I agree that it's a lot of redundant copies of data.

On Wed, Jul 14, 2021, 7:00 PM Jim Shaw  wrote:

> Shaurya:
> What's the purpose of having so many data centers ?
> RF=3 within a center means you have 3 copies of data.
> If you have 3 DCs, that means 9 copies of data.
> Think about the space and network bandwidth wasted for that many copies.
> BTW, Ours just 2 DCs for regional DR.
>
> Thanks,
> Jim
>
> On Wed, Jul 14, 2021 at 2:27 AM Shaurya Gupta 
> wrote:
>
>> Hi
>>
>> Does someone have any suggestions on the maximum number of Data Centers
>> which NetworkTopology strategy can have for a keyspace. Not only
>> technically but considering performance as well.
>> In each Data Center RF is 3.
>>
>> Thanks!
>> --
>> Shaurya Gupta
>>
>>
>>


Re: Number of DCs in Cassandra

2021-07-14 Thread Shaurya Gupta
We are planning to go with 5 DCs with RF of 3 in each. All DCs will have
reads and writes. Most queries are done at LOCAL_QUORUM.
A very few Simple and CAS queries (<0.1%) will be done at QUORUM
consistency.

On Wed, Jul 14, 2021 at 12:19 PM manish khandelwal <
manishkhandelwa...@gmail.com> wrote:

> I don't think there is any restriction on the number of data centers, so
> technically you can add as many data centers as you want.
> Performance depends on how you use your cluster. For example, one of your
> data centers could be read-only, or there could be traffic on all the data
> centers.
>
>
> On Wed, Jul 14, 2021 at 12:03 PM Shaurya Gupta 
> wrote:
>
>> Hi
>>
>> Does someone have any suggestions on the maximum number of Data Centers
>> which NetworkTopology strategy can have for a keyspace. Not only
>> technically but considering performance as well.
>> In each Data Center RF is 3.
>>
>> Thanks!
>> --
>> Shaurya Gupta
>>
>>
>>

-- 
Shaurya Gupta


Number of DCs in Cassandra

2021-07-14 Thread Shaurya Gupta
Hi

Does someone have any suggestions on the maximum number of Data Centers
which NetworkTopology strategy can have for a keyspace. Not only
technically but considering performance as well.
In each Data Center RF is 3.

Thanks!
-- 
Shaurya Gupta


UDTs with LWT

2020-07-15 Thread Shaurya Gupta
Hi

I have a few questions -

   1. Is it possible to have an UNFROZEN UDT in a list ?
   2. Is it possible to execute an LWT on the basis of a value contained in
   a list of a UDT ?


e.g. Consider a table -

CREATE TABLE test.registration_form3 (
    student_id int PRIMARY KEY,
    name text,
    permanent_add list<frozen<...>>,
    registration_fees int
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';


cassandra@cqlsh:test> select * from registration_form3;

 student_id | name   | permanent_add                                                                                                                                                                     | registration_fees
------------+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------
        123 | Ashish | [{house_no: 123, name: 'ashish', city: 'patna', pin: 00}, {house_no: 544, name: 'ashish', city: 'california', pin: 2019}, {house_no: 124, name: 'rana', city: 'delhi', pin: 2020}] |              2500

I want to change the above entry to -

cassandra@cqlsh:test> select * from registration_form3;

 student_id | name   | permanent_add                                                                                                                                                                         | registration_fees
------------+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------
        123 | Ashish | [{house_no: 123, name: 'ashish', city: 'patna', pin: 246746}, {house_no: 544, name: 'ashish', city: 'california', pin: 2019}, {house_no: 124, name: 'rana', city: 'delhi', pin: 2020}] |              2500

and this should happen atomically.


Thanks!
-- 
Shaurya Gupta
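
For reference on question 1: a UDT used inside a collection must be frozen, so permanent_add has to be a list of frozen values, and an element can only be replaced as a whole. On question 2, an untested sketch of a conditional element replacement (field values are illustrative; the UDT name was stripped from the archived schema above):

```sql
-- Untested sketch: replace list element 0 only if it still holds the old value.
UPDATE test.registration_form3
  SET permanent_add[0] = {house_no: 123, name: 'ashish', city: 'patna', pin: 246746}
  WHERE student_id = 123
  IF permanent_add[0] = {house_no: 123, name: 'ashish', city: 'patna', pin: 0};
```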


Point in time restore not working when primary key is blob

2019-08-07 Thread Shaurya Gupta
Hi,

I'm trying to do a point in time restore using commit logs.
It seems to work fine if I have the primary key as text, but it does not work
and just restores all the rows if the primary key is blob (which is the
default if I create a keyspace using cassandra-stress).

Is this a known issue?

Thanks
-- 
Shaurya Gupta


UNLOGGED BATCH for same partition key but across tables

2019-07-08 Thread Shaurya Gupta
Hi,

This https://www.datastax.com/dev/blog/row-level-isolation link mentions:
"
For atomicity, the guarantee actually extends across column families
(within the same keyspace): updates for the same partition key are
persisted atomically even for different column families. This is not the
case however for isolation (updates to different column families are not
isolated)
"

Though it is for Cassandra 1.1, is this still true, and is it also
true for unlogged batch operations?

-- 
Shaurya Gupta
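
For concreteness, the shape being asked about (an UNLOGGED BATCH touching the same partition key in two tables of one keyspace) looks like this; keyspace, table, and column names are placeholders:

```sql
BEGIN UNLOGGED BATCH
  UPDATE ks.table_a SET col = 'x' WHERE pk = 'key1';
  UPDATE ks.table_b SET col = 'y' WHERE pk = 'key1';
APPLY BATCH;
```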


Re: Mixing LWT and normal operations for a partition

2019-05-02 Thread Shaurya Gupta
Hi,


1. The below sequence of commands also does not appear to give the expected
output.

Since there is a delete command in the batch after an LWT update using
IF EXISTS, in the final result the row with id = 5 must get deleted.


cassandra@cqlsh> select * from demo.tweets;

 id | body     | latitude | longitude | time                    | user
----+----------+----------+-----------+-------------------------+-------
  5 | old body |  32.6448 |  78.21672 | 2019-01-14 18:30:00.00+ | user5

(1 rows)

cassandra@cqlsh> begin batch update demo.tweets SET body='new body' where id = 5 IF EXISTS; delete from demo.tweets where id = 5 IF EXISTS; apply batch;

 [applied]
-----------
      True

cassandra@cqlsh> select * from demo.tweets;

 id | body     | latitude | longitude | time | user
----+----------+----------+-----------+------+------
  5 | new body |     null |      null | null | null

(1 rows)

cassandra@cqlsh>



2. On the contrary, the below sequence of commands gives the expected output:


cassandra@cqlsh> insert into demo.tweets (id, user, body, time, latitude, longitude) values (5, 'user5', 'old body', '2019-01-15', 32.644800, 78.216721);

cassandra@cqlsh> select * from demo.tweets;

 id | body     | latitude | longitude | time                    | user
----+----------+----------+-----------+-------------------------+-------
  5 | old body |  32.6448 |  78.21672 | 2019-01-14 18:30:00.00+ | user5

(1 rows)

cassandra@cqlsh> delete from demo.tweets where id = 5 IF EXISTS;

 [applied]
-----------
      True

cassandra@cqlsh> select * from demo.tweets;

 id | body | latitude | longitude | time | user
----+------+----------+-----------+------+------

(0 rows)

cassandra@cqlsh> update demo.tweets SET body='new body' where id = 5 IF EXISTS;

 [applied]
-----------
     False

cassandra@cqlsh> select * from demo.tweets;

 id | body | latitude | longitude | time | user
----+------+----------+-----------+------+------

(0 rows)


Thanks

Shaurya




On Fri, May 3, 2019 at 1:02 AM Shaurya Gupta  wrote:

> One suggestion - I think Cassandra community is already having a drive to
> update the documentation. This could be added to CQLSH documentation or
> some other relevant documentation.
>
> On Fri, May 3, 2019 at 12:56 AM Shaurya Gupta 
> wrote:
>
>> Thanks Jeff.
>>
>> On Fri, May 3, 2019 at 12:38 AM Jeff Jirsa  wrote:
>>
>>> No. Don’t mix LWT and normal writes.
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> > On May 2, 2019, at 11:43 AM, Shaurya Gupta 
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> > We are seeing really odd behaviour while try to delete a row which is
>>> simultaneously being updated in a light weight transaction.
>>> > The delete command succeeds and the LWT update fails with timeout
>>> exception but still the next select statement shows that the row still
>>> exists. This occurs ones in many such scenarios.
>>> >
>>> > Is it fine to mix LWT and normal operations for the same partition? Is
>>> it expected to work?
>>> >
>>> > Thanks
>>> > Shaurya
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>
>> --
>> Shaurya Gupta
>>
>>
>>
>
> --
> Shaurya Gupta
>
>
>

-- 
Shaurya Gupta


Re: Mixing LWT and normal operations for a partition

2019-05-02 Thread Shaurya Gupta
One suggestion - I think Cassandra community is already having a drive to
update the documentation. This could be added to CQLSH documentation or
some other relevant documentation.

On Fri, May 3, 2019 at 12:56 AM Shaurya Gupta 
wrote:

> Thanks Jeff.
>
> On Fri, May 3, 2019 at 12:38 AM Jeff Jirsa  wrote:
>
>> No. Don’t mix LWT and normal writes.
>>
>> --
>> Jeff Jirsa
>>
>>
>> > On May 2, 2019, at 11:43 AM, Shaurya Gupta 
>> wrote:
>> >
>> > Hi,
>> >
>> > We are seeing really odd behaviour while try to delete a row which is
>> simultaneously being updated in a light weight transaction.
>> > The delete command succeeds and the LWT update fails with timeout
>> exception but still the next select statement shows that the row still
>> exists. This occurs ones in many such scenarios.
>> >
>> > Is it fine to mix LWT and normal operations for the same partition? Is
>> it expected to work?
>> >
>> > Thanks
>> > Shaurya
>>
>> -------------
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>
> --
> Shaurya Gupta
>
>
>

-- 
Shaurya Gupta


Re: Mixing LWT and normal operations for a partition

2019-05-02 Thread Shaurya Gupta
Thanks Jeff.

On Fri, May 3, 2019 at 12:38 AM Jeff Jirsa  wrote:

> No. Don’t mix LWT and normal writes.
>
> --
> Jeff Jirsa
>
>
> > On May 2, 2019, at 11:43 AM, Shaurya Gupta 
> wrote:
> >
> > Hi,
> >
> > We are seeing really odd behaviour while try to delete a row which is
> simultaneously being updated in a light weight transaction.
> > The delete command succeeds and the LWT update fails with timeout
> exception but still the next select statement shows that the row still
> exists. This occurs ones in many such scenarios.
> >
> > Is it fine to mix LWT and normal operations for the same partition? Is
> it expected to work?
> >
> > Thanks
> > Shaurya
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>

-- 
Shaurya Gupta


Mixing LWT and normal operations for a partition

2019-05-02 Thread Shaurya Gupta
Hi,

We are seeing really odd behaviour while trying to delete a row which is
simultaneously being updated in a lightweight transaction.
The delete command succeeds and the LWT update fails with a timeout
exception, but the next select statement still shows that the row exists.
This occurs once in many such scenarios.

Is it fine to mix LWT and normal operations for the same partition? Is it
expected to work?

Thanks
Shaurya
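
One way to picture why this can misbehave (a deliberate simplification, not Cassandra internals): cells resolve last-write-wins on timestamps, and normal writes are stamped on the client/coordinator path while LWT writes take their timestamp from the Paxos ballot, so the two timelines are not coordinated. An illustrative sketch:

```python
# Toy last-write-wins resolution between two "cells" for the same column.
# Timestamps are made up; the point is only that a normal write stamped
# slightly ahead of an LWT's ballot-derived timestamp shadows its effect.
def resolve(cells):
    return max(cells, key=lambda c: c["ts"])

lwt_delete = {"value": None, "ts": 1000}          # tombstone via the Paxos path
normal_write = {"value": "new body", "ts": 1005}  # client-stamped normal write

print(resolve([lwt_delete, normal_write])["value"])  # new body
```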


Re: Cassandra upsert ordering issue for list type fields

2019-04-22 Thread Shaurya Gupta
Which version of Cassandra are you using?
https://docs.datastax.com/en/cql/3.3/cql/cql_using/useInsertList.html and
many other relevant pages do not mention anything like the CQL syntax you
used above.

--
Shaurya

On Mon, Apr 22, 2019 at 6:51 PM Naman Gupta 
wrote:

> I am facing issue with cassandra ordering in the tables for column types
> of list.
>
> Suppose I have a table as follows...
> FirstName:  of string
> LastName:   of string
>
> now if were to Issue
> update  (FirstName,LastName) add values ("Leonardo","DiCaprio")
> update  (FirstName,LastName) add values ("Brad","Pitt")
> update  (FirstName,LastName) add values ("mathhew","mcconehey")
> update  (FirstName,LastName) add values ("Kate","Beckinsale")
> update  (FirstName,LastName) add values ("Eva","Green")
>
>
> If I use the upserts with some time gap in between I get expected results
> i.e
>
> cqlsh output:->
> Firsname: Leonardo | Brad | Matthew  | Kate| Eva
> LastName: DiCaprio | Pitt | Mcconahey | Beckinsale | Green
>
>
> But if I upsert in a quick burst (imagine a for loop), I get unexpected
> results:
>
> cqlsh
> Firstname: Leonardo | brad| Matthew  | Kate| Eva
> Lastname:  pitt| dicaprio  | Mcconahey | Beckinsale | Green
>
> As you can see above, generally two or so values in a column (here
> lastname) are interchanged. With more data, the disordering becomes more
> frequent beyond 5 upsert queries.
>
> When I flushed the db tables and took an SSTable dump using sstabledump,
> I observed that the ordering reflected in the cqlsh output
> is exactly the order written in the SSTable. This means that in
> example 2 above, for column "firstname" Leonardo was written before Brad,
> and for column "lastname" Pitt was written before DiCaprio.
>
> Now I am confused as to why writes, which should happen one at a time,
> seem to be written in an unordered fashion across columns. Please note
> that a write where the whole pair (firstname,lastname) TOGETHER changes
> its position would still be acceptable, i.e.
>
> cqlsh
> Firstname: brad | leonardo| Matthew  | Kate| Eva
> Lastname:  pitt| dicaprio  | Mcconahey | Beckinsale | Green
>
> ... would have been completely acceptable, since retrieving by index [0]
> would mean Brad Pitt and [1] would mean Leonardo DiCaprio in my
> applications.
> But the same index-based retrieval would fail in case 2, where [0] would
> mean Leonardo Pitt and [1] would mean Brad DiCaprio.
>
> Please help with any insights, I would be really grateful.
>


-- 
Shaurya Gupta
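The interleaving reported above is consistent with how Cassandra stores list columns: each appended element becomes a cell keyed by a timeuuid generated per cell, and elements are returned in timeuuid order. Two statements issued in a quick burst can receive timeuuids whose relative order differs between the two columns. A minimal simulation of that merge rule (the integer timestamps stand in for timeuuids and are hypothetical):

```python
def merge_list_cells(cells):
    """Cassandra returns list elements sorted by each cell's timeuuid;
    here a plain integer stands in for the timeuuid."""
    return [value for _, value in sorted(cells)]

# Two rapid appends: the timeuuid is drawn per cell, so the cells of one
# statement can straddle the cells of a concurrent statement.
firstname_cells = [(100, "Leonardo"), (102, "Brad")]
lastname_cells = [(103, "DiCaprio"), (101, "Pitt")]

print(merge_list_cells(firstname_cells))  # ['Leonardo', 'Brad']
print(merge_list_cells(lastname_cells))   # ['Pitt', 'DiCaprio'] -- index pairing broken
```

With a gap between statements, both cells of the first statement get earlier timeuuids than the second, so the columns stay aligned; in a tight loop that cross-column guarantee disappears. If index alignment matters, keeping the pair together (one list of combined values, or a clustering key per person) avoids the problem.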


Re: [EXTERNAL] Re: Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException:

2019-04-17 Thread Shaurya Gupta
As already mentioned in this thread, ALLOW FILTERING should be avoided in
every scenario.

It seems to work in test setups, but as soon as the data grows to a
certain size (a few MBs), it starts failing miserably and fails almost
always.

Thanks
Shaurya
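The failure mode above follows from what ALLOW FILTERING does: without a partition key, Cassandra must read every row and test the predicate, so cost grows with total data rather than with result size. A toy comparison (an illustrative model in plain Python, not driver code):

```python
# Model a table as partitions keyed by partition key.
table = {f"user{i}": {"city": "SF" if i % 1000 == 0 else "NY"}
         for i in range(100_000)}

def filtered_scan(table, column, value):
    """ALLOW FILTERING: touch every row and keep matches -- O(total rows)."""
    return [key for key, row in table.items() if row[column] == value]

def partition_lookup(table, key):
    """Partition-key query: a single hash lookup, independent of table size."""
    return table.get(key)

matches = filtered_scan(table, "city", "SF")  # scans all 100,000 rows
row = partition_lookup(table, "user42")       # touches one partition
```

This is why the advice below is to design a second table whose partition key is the column being filtered (the query-table pattern): the read then becomes a lookup instead of a scan.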


On Wed, Apr 17, 2019, 6:44 PM Durity, Sean R 
wrote:

> If you are just trying to get a sense of the data, you could try adding a
> limit clause to limit the amount of results and hopefully beat the timeout.
>
> However, ALLOW FILTERING really means "ALLOW ME TO DESTROY MY APPLICATION
> AND CLUSTER." It means the data model does not support the query and will
> not scale -- in this case, not even on one node. Design a new table to
> support the query with a proper partition key (and any clustering keys).
>
>
> Sean Durity
>
>
> -Original Message-
> From: Dinesh Joshi 
> Sent: Wednesday, April 17, 2019 2:39 AM
> To: user@cassandra.apache.org
> Subject: [EXTERNAL] Re: Caused by:
> com.datastax.driver.core.exceptions.ReadTimeoutException:
>
> More info with detailed explanation:
> https://www.instaclustr.com/apache-cassandra-scalability-allow-filtering-partition-keys/
>
> Dinesh
>
> > On Apr 16, 2019, at 11:24 PM, Mahesh Daksha  wrote:
> >
> > Hi,
> >
> > How much data are you trying to read in a single query? Is it large in
> size, or normal text data?
> > Looking at the exception, it seems the node is unable to deliver the data
> within the stipulated time. I have faced a similar issue when the response
> data was huge (some binary data), but it was solved by spreading the data
> across multiple rows.
> >
> > Thanks,
> > Mahesh Daksha
> >
> > On Wed, Apr 17, 2019 at 11:42 AM Krishnanand Khambadkone <
> kkhambadk...@yahoo.com.invalid> wrote:
> > Hi, I have a single-instance Cassandra server. I am trying to execute
> a query with the ALLOW FILTERING option. When I run this same query from
> cqlsh it runs fine, but when I try to execute it through the Java driver it
> throws this exception. I have increased all the timeouts in the
> cassandra.yaml file and also set the read timeout option on the
> SimpleStatement query I am running. Any idea how I can fix this issue?
> > Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException:
> Cassandra timeout during read query at consistency LOCAL_ONE (1 responses
> were required but only 0 replica responded)
> >
> >
>
>
>
>
>
>
>


Re: Apache Cassandra meetup @ Instagram HQ

2019-02-18 Thread Shaurya Gupta
Hi,

This looks very interesting to me. Can I attend this remotely?

Thanks
Shaurya


On Tue, Feb 19, 2019 at 5:37 AM dinesh.jo...@yahoo.com.INVALID
 wrote:

> Hi all,
>
> Apologies for the cross-post. In case you're in the SF Bay Area, Instagram
> is hosting a meetup. Interesting talks on Cassandra Traffic management,
> Cassandra on Kubernetes. See details in the attached link -
>
>
> https://www.eventbrite.com/e/cassandra-traffic-management-at-instagram-cassandra-and-k8s-with-instaclustr-tickets-54986803008
>
> Thanks,
>
> Dinesh
>


-- 
Shaurya Gupta


Re: request_scheduler functionalities for CQL Native Transport

2018-11-28 Thread Shaurya Gupta
Hi,

CASSANDRA-8303 talks about more granular control at the query level. What
we are looking for is throttling based on the number of queries received
for different keyspaces, which is what request_scheduler and
request_scheduler_options provide for clients connecting via Thrift.

Regards

On Wed, Nov 28, 2018 at 2:27 PM dinesh.jo...@yahoo.com.INVALID
 wrote:

> I think what you're looking for might be solved by CASSANDRA-8303.
> However, I am not sure if anybody is working on it. Generally you want to
> create different clusters for users to physically isolate them. What you
> propose has been discussed in the past and it is something that is
> currently unsupported.
>
> Dinesh
>
>
> On Tuesday, November 27, 2018, 11:05:32 PM PST, Shaurya Gupta <
> shaurya.n...@gmail.com> wrote:
>
>
> Hi,
>
> We want to throttle the maximum number of queries on any keyspace for
> clients connecting via the CQL native transport. This option is available
> for clients connecting via Thrift through the request_scheduler property
> in cassandra.yaml.
> Is there a similar option for clients connecting via the CQL native
> transport? If not, is there any plan to add one in the future?
> It is a must-have feature if we want to support multiple teams on a single
> Cassandra cluster, or to prevent one keyspace from interfering with the
> performance of the others.
>
> Regards
> Shaurya Gupta
>
>
>

-- 
Shaurya Gupta
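Since the native transport exposes no server-side equivalent of request_scheduler, one workaround (an application-side sketch, not a Cassandra feature) is to throttle per keyspace in the client with a token bucket before each query is issued:

```python
import time

class KeyspaceThrottle:
    """Token bucket per keyspace: allow at most `rate` queries/second,
    with bursts up to `burst`. acquire() returns False when the keyspace
    has exhausted its budget (caller can queue, shed, or back off)."""

    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.state = {}  # keyspace -> (tokens, last_refill_time)

    def acquire(self, keyspace, now=None):
        now = time.monotonic() if now is None else now
        tokens, last = self.state.get(keyspace, (self.burst, now))
        # Refill proportionally to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.state[keyspace] = (tokens, now)
            return False
        self.state[keyspace] = (tokens - 1, now)
        return True

throttle = KeyspaceThrottle(rate=100, burst=5)
# Five queries in the same instant pass; the sixth is rejected:
results = [throttle.acquire("team_a", now=0.0) for _ in range(6)]
print(results)  # [True, True, True, True, True, False]
```

Each team's session would check its keyspace's bucket before executing, and a rejected query can be delayed rather than dropped. This only protects from within one application, of course; physical isolation via separate clusters, as Dinesh suggests, remains the stronger guarantee.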


request_scheduler functionalities for CQL Native Transport

2018-11-27 Thread Shaurya Gupta
Hi,

We want to throttle the maximum number of queries on any keyspace for
clients connecting via the CQL native transport. This option is available
for clients connecting via Thrift through the request_scheduler property in
cassandra.yaml.
Is there a similar option for clients connecting via the CQL native
transport? If not, is there any plan to add one in the future?
It is a must-have feature if we want to support multiple teams on a single
Cassandra cluster, or to prevent one keyspace from interfering with the
performance of the others.

Regards
Shaurya Gupta