Exact use case for CustomPayloads in v4 protocol version

2020-01-10 Thread Goutham reddy
Hello all,
I was trying to explore custom payloads, which were introduced in protocol
version v4, but I could not figure out their actual use case. Can somebody
shed some light on this? Appreciate your help :)

Thanks and regards,
Goutham
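For context, a v4 custom payload is an opaque string-to-bytes map attached to a request or response frame. Its intended use case is passing out-of-band metadata (tenant IDs, trace IDs, auditing tags) through to a custom QueryHandler on the server side, which is how server-side extensions consume it; vanilla Cassandra itself ignores the payload. A rough sketch of the wire shape from the protocol spec follows (the key name is made up, and this is an illustration, not driver code):

```python
import struct

def encode_custom_payload(payload: dict) -> bytes:
    """Encode a custom payload as the v4 [bytes map]:
    [short n] then n * ([string key][bytes value])."""
    out = struct.pack(">H", len(payload))
    for key, value in payload.items():
        kb = key.encode("utf-8")
        out += struct.pack(">H", len(kb)) + kb        # [string] = short len + utf8
        out += struct.pack(">i", len(value)) + value  # [bytes]  = int len + raw bytes
    return out

# A hypothetical payload a client might attach for a server-side QueryHandler:
encoded = encode_custom_payload({"trace-id": b"abc"})
```

Drivers expose this directly (e.g. the Python driver's `Session.execute` accepts a `custom_payload` argument), so the encoding above is only to show what travels in the frame.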


Re: Is it possible to build multi cloud cluster for Cassandra

2019-09-05 Thread Goutham reddy
Thanks Jon, that explains everything.

On Thu, Sep 5, 2019 at 10:00 AM Jon Haddad  wrote:

> Technically, not a problem.  Use GossipingPropertyFileSnitch to keep
> things simple and you can go across whatever cloud providers you want
> without issue.
>
> The biggest issue you're going to have isn't going to be Cassandra, it's
> having the expertise in the different cloud providers to understand their
> strengths and weaknesses.  You'll want to benchmark every resource, and
> properly sizing your instances to C* is now 2x (or 3x for 3 cloud
> providers) the work.
>
> I recommend using Terraform to make provisioning a bit easier.
>
> On Thu, Sep 5, 2019 at 9:36 AM Goutham reddy 
> wrote:
>
>> Hello,
>> Is it wise and advisable to build multi cloud environment for Cassandra
>> for High Availability.
>> AWS as one datacenter and Azure as another datacenter.
>> If yes are there any challenges involved?
>>
>> Thanks and regards,
>> Goutham.
>>
>
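To make Jon's suggestion concrete, here is a minimal sketch of the snitch and keyspace configuration for a two-cloud cluster (the DC names are assumptions):

```
# cassandra-rackdc.properties on each AWS node
dc=aws-us-west
rack=rack1

# cassandra-rackdc.properties on each Azure node
dc=azure-west
rack=rack1
```

```cql
-- Replicate the keyspace into both cloud "datacenters"
CREATE KEYSPACE app_data WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'aws-us-west': 3,
  'azure-west': 3
};
```

With GossipingPropertyFileSnitch each node advertises its own dc/rack via gossip, so adding a third provider later is just another DC name in the replication map.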


Is it possible to build multi cloud cluster for Cassandra

2019-09-05 Thread Goutham reddy
Hello,
Is it wise and advisable to build a multi-cloud environment for Cassandra
for high availability, with AWS as one datacenter and Azure as another?
If yes, are there any challenges involved?

Thanks and regards,
Goutham.


Re: Insert constant value for all the rows for a given column

2019-02-27 Thread Goutham reddy
The value is constant as of now, but we are expecting many different values
to come in the future.

Secondly, is Presto a distributed system?

Thanks and Regards,
Goutham.

On Wed, Feb 27, 2019 at 5:09 PM Kenneth Brotman
 wrote:

> If you know the value already, why do you need to store it in every row of
> a table?  Seems like something is wrong.  Why do you need to do that, if
> you can share that information?
>
>
>
> *From:* Kenneth Brotman [mailto:kenbrot...@yahoo.com]
> *Sent:* Wednesday, February 27, 2019 5:08 PM
> *To:* 'user@cassandra.apache.org'
> *Subject:* RE: Insert constant value for all the rows for a given column
>
>
>
> Yup
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 5:06 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> Are you talking about the SQL engine Presto? ;)
>
>
>
> On Wed, Feb 27, 2019 at 4:59 PM Kenneth Brotman
>  wrote:
>
> Who are Presto?
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:52 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> Thanks Kenneth, writing Spark application is our last option and we are
> looking out for some hack way to update the column.
>
>
>
> On Wed, Feb 27, 2019 at 4:48 PM Kenneth Brotman
>  wrote:
>
> Ouch!  I was sure I saw it in some material I studied but no.  It looks
> like you have to provide the value before Cassandra, maybe through your
> application or something in the stream before Cassandra, or add it after
> Cassandra or use something like Spark to process it.
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:32 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> True Bharath, there is no option to add a default value in Cassandra for a
> column.
>
> Regards
>
> Goutham Reddy
>
>
>
>
>
> On Wed, Feb 27, 2019 at 4:31 PM kumar bharath 
> wrote:
>
> Kenneth,
>
>
>
> I didn't see any such option  for adding a default value on a column.
>
>
>
> Thanks,
>
> Bharath Kumar B
>
>
>
> On Wed, Feb 27, 2019 at 4:25 PM Kenneth Brotman
>  wrote:
>
> How about using a default value on the column?
>
>
>
> *From:* Kenneth Brotman [mailto:kenbrotman@yahoo.comINVALID
> ]
> *Sent:* Wednesday, February 27, 2019 4:23 PM
> *To:* user@cassandra.apache.org
> *Subject:* RE: Insert constant value for all the rows for a given column
>
>
>
> Good point.
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:11 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> Kenneth,
>
> I believe "static column" applies for one partition key. Correct me if my
> understanding is wrong.
>
>
>
> Regards
>
> Goutham Reddy
>
>
>
>
>
> On Wed, Feb 27, 2019 at 4:08 PM Kenneth Brotman
>  wrote:
>
> Sounds like what’s called a “static column”.
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
>
> *Sent:* Wednesday, February 27, 2019 4:06 PM
> *To:* u...@cassandraapache.org 
> *Subject:* Insert constant value for all the rows for a given column
>
>
>
> Hi,
>
> We have a requirement to add constant value to all the rows for a
> particular column, and we could not find any solution. Can anybody provide
> standard procedure for the problem. Appreciate your help.
>
>
>
> Regards
>
> Goutham Reddy
>
> --
>
> Regards
>
> Goutham Reddy
>
> --
>
> Regards
>
> Goutham Reddy
>
-- 
Regards
Goutham Reddy
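To illustrate why a static column does not solve the table-wide default problem raised in this thread: STATIC makes a value shared by all rows of one partition only. A sketch with assumed table and column names:

```cql
CREATE TABLE orders (
  customer_id text,          -- partition key
  order_id    timeuuid,      -- clustering column
  tier        text STATIC,   -- one shared value per customer_id partition
  total       decimal,
  PRIMARY KEY (customer_id, order_id)
);

-- This sets tier for every row of partition 'c1', but for no other partition:
UPDATE orders SET tier = 'gold' WHERE customer_id = 'c1';
```

A true table-wide constant would still have to be written once per partition, which is why the thread lands on backfilling via the application or Spark.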


Re: Insert constant value for all the rows for a given column

2019-02-27 Thread Goutham reddy
Are you talking about the SQL engine Presto? ;)

On Wed, Feb 27, 2019 at 4:59 PM Kenneth Brotman
 wrote:

> Who are Presto?
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:52 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> Thanks Kenneth, writing Spark application is our last option and we are
> looking out for some hack way to update the column.
>
>
>
> On Wed, Feb 27, 2019 at 4:48 PM Kenneth Brotman
>  wrote:
>
> Ouch!  I was sure I saw it in some material I studied but no.  It looks
> like you have to provide the value before Cassandra, maybe through your
> application or something in the stream before Cassandra, or add it after
> Cassandra or use something like Spark to process it.
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:32 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> True Bharath, there is no option to add a default value in Cassandra for a
> column.
>
> Regards
>
> Goutham Reddy
>
>
>
>
>
> On Wed, Feb 27, 2019 at 4:31 PM kumar bharath 
> wrote:
>
> Kenneth,
>
>
>
> I didn't see any such option  for adding a default value on a column.
>
>
>
> Thanks,
>
> Bharath Kumar B
>
>
>
> On Wed, Feb 27, 2019 at 4:25 PM Kenneth Brotman
>  wrote:
>
> How about using a default value on the column?
>
>
>
> *From:* Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID]
> *Sent:* Wednesday, February 27, 2019 4:23 PM
> *To:* user@cassandra.apache.org
> *Subject:* RE: Insert constant value for all the rows for a given column
>
>
>
> Good point.
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:11 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> Kenneth,
>
> I believe "static column" applies for one partition key. Correct me if my
> understanding is wrong.
>
>
>
> Regards
>
> Goutham Reddy
>
>
>
>
>
> On Wed, Feb 27, 2019 at 4:08 PM Kenneth Brotman
>  wrote:
>
> Sounds like what’s called a “static column”.
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
>
> *Sent:* Wednesday, February 27, 2019 4:06 PM
> *To:* u...@cassandraapache.org 
> *Subject:* Insert constant value for all the rows for a given column
>
>
>
> Hi,
>
> We have a requirement to add constant value to all the rows for a
> particular column, and we could not find any solution. Can anybody provide
> standard procedure for the problem. Appreciate your help.
>
>
>
> Regards
>
> Goutham Reddy
>
> --
>
> Regards
>
> Goutham Reddy
>
-- 
Regards
Goutham Reddy


Re: Insert constant value for all the rows for a given column

2019-02-27 Thread Goutham reddy
Thanks Kenneth, writing Spark application is our last option and we are
looking out for some hack way to update the column.

On Wed, Feb 27, 2019 at 4:48 PM Kenneth Brotman
 wrote:

> Ouch!  I was sure I saw it in some material I studied but no.  It looks
> like you have to provide the value before Cassandra, maybe through your
> application or something in the stream before Cassandra, or add it after
> Cassandra or use something like Spark to process it.
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:32 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> True Bharath, there is no option to add a default value in Cassandra for a
> column.
>
> Regards
>
> Goutham Reddy
>
>
>
>
>
> On Wed, Feb 27, 2019 at 4:31 PM kumar bharath 
> wrote:
>
> Kenneth,
>
>
>
> I didn't see any such option  for adding a default value on a column.
>
>
>
> Thanks,
>
> Bharath Kumar B
>
>
>
> On Wed, Feb 27, 2019 at 4:25 PM Kenneth Brotman
>  wrote:
>
> How about using a default value on the column?
>
>
>
> *From:* Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID]
> *Sent:* Wednesday, February 27, 2019 4:23 PM
> *To:* user@cassandra.apache.org
> *Subject:* RE: Insert constant value for all the rows for a given column
>
>
>
> Good point.
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:11 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Insert constant value for all the rows for a given column
>
>
>
> Kenneth,
>
> I believe "static column" applies for one partition key. Correct me if my
> understanding is wrong.
>
>
>
> Regards
>
> Goutham Reddy
>
>
>
>
>
> On Wed, Feb 27, 2019 at 4:08 PM Kenneth Brotman
>  wrote:
>
> Sounds like what’s called a “static column”.
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
>
> *Sent:* Wednesday, February 27, 2019 4:06 PM
> *To:* u...@cassandraapache.org 
> *Subject:* Insert constant value for all the rows for a given column
>
>
>
> Hi,
>
> We have a requirement to add constant value to all the rows for a
> particular column, and we could not find any solution. Can anybody provide
> standard procedure for the problem. Appreciate your help.
>
>
>
> Regards
>
> Goutham Reddy
>
> --
Regards
Goutham Reddy


Re: Insert constant value for all the rows for a given column

2019-02-27 Thread Goutham reddy
True Bharath, there is no option to add a default value in Cassandra for a
column.
Regards
Goutham Reddy


On Wed, Feb 27, 2019 at 4:31 PM kumar bharath 
wrote:

> Kenneth,
>
> I didn't see any such option  for adding a default value on a column.
>
> Thanks,
> Bharath Kumar B
>
> On Wed, Feb 27, 2019 at 4:25 PM Kenneth Brotman
>  wrote:
>
>> How about using a default value on the column?
>>
>>
>>
>> *From:* Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID]
>> *Sent:* Wednesday, February 27, 2019 4:23 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* RE: Insert constant value for all the rows for a given column
>>
>>
>>
>> Good point.
>>
>>
>>
>> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
>> *Sent:* Wednesday, February 27, 2019 4:11 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Insert constant value for all the rows for a given column
>>
>>
>>
>> Kenneth,
>>
>> I believe "static column" applies for one partition key. Correct me if my
>> understanding is wrong.
>>
>>
>>
>> Regards
>>
>> Goutham Reddy
>>
>>
>>
>>
>>
>> On Wed, Feb 27, 2019 at 4:08 PM Kenneth Brotman
>>  wrote:
>>
>> Sounds like what’s called a “static column”.
>>
>>
>>
>> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
>> *Sent:* Wednesday, February 27, 2019 4:06 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Insert constant value for all the rows for a given column
>>
>>
>>
>> Hi,
>>
>> We have a requirement to add constant value to all the rows for a
>> particular column, and we could not find any solution. Can anybody provide
>> standard procedure for the problem. Appreciate your help.
>>
>>
>>
>> Regards
>>
>> Goutham Reddy
>>
>>


Re: Insert constant value for all the rows for a given column

2019-02-27 Thread Goutham reddy
Kenneth,
I believe "static column" applies for one partition key. Correct me if my
understanding is wrong.

Regards
Goutham Reddy


On Wed, Feb 27, 2019 at 4:08 PM Kenneth Brotman
 wrote:

> Sounds like what’s called a “static column”.
>
>
>
> *From:* Goutham reddy [mailto:goutham.chiru...@gmail.com]
> *Sent:* Wednesday, February 27, 2019 4:06 PM
> *To:* user@cassandra.apache.org
> *Subject:* Insert constant value for all the rows for a given column
>
>
>
> Hi,
>
> We have a requirement to add constant value to all the rows for a
> particular column, and we could not find any solution. Can anybody provide
> standard procedure for the problem. Appreciate your help.
>
>
>
> Regards
>
> Goutham Reddy
>


Insert constant value for all the rows for a given column

2019-02-27 Thread Goutham reddy
Hi,
We have a requirement to add constant value to all the rows for a
particular column, and we could not find any solution. Can anybody provide
standard procedure for the problem. Appreciate your help.

Regards
Goutham Reddy


Re: Partition key with 300K rows can it be queried and distributed using Spark

2019-01-17 Thread Goutham reddy
Thanks Jeff, yes we have 18 columns in total. But my question was: can
Spark retrieve the data by distributing the 300k rows across the Spark nodes?

On Thu, Jan 17, 2019 at 1:30 PM Jeff Jirsa  wrote:

> The reason big rows are painful in Cassandra is that by default, we index
> it every 64kb. With 300k objects, it may or may not have a lot of those
> little index blocks/objects. How big is each row?
>
> If you try to read it and it's very wide, you may see heap pressure / GC.
> If so, you could try changing the column index size from 64k to something
> larger (128k, 256k, etc) - small point reads will be more disk IO, but less
> heap pressure.
>
>
>
> On Thu, Jan 17, 2019 at 12:15 PM Goutham reddy 
> wrote:
>
>> Hi,
>> As each partition key can hold up to 2 Billion rows, even then it is an
>> anti-pattern to have such huge data set for one partition key in our case
>> it is 300k rows only, but when trying to query for one particular key we
>> are getting timeout exception. If I use Spark to get the 300k rows for a
>> particular key does it solve the problem of timeouts and distribute the
>> data across the spark nodes or will it still throw timeout exceptions. Can
>> you please help me with the best practice to retrieve the data for the key
>> with 300k rows. Any help is highly appreciated.
>>
>> Regards
>> Goutham.
>>
> --
Regards
Goutham Reddy
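Jeff's point about the 64 KB column index can be put into rough numbers. The average row size below is an assumption for illustration (the thread never states it):

```python
# Back-of-the-envelope estimate of column-index entries for one wide partition.
# Cassandra writes an index entry roughly every column_index_size_in_kb
# (default 64 KB) of serialized partition data.
ROWS = 300_000
AVG_ROW_BYTES = 2_000            # assumed ~2 KB per row (18 columns)
INDEX_INTERVAL = 64 * 1024       # default column_index_size_in_kb = 64

partition_bytes = ROWS * AVG_ROW_BYTES          # ~600 MB partition
index_entries = partition_bytes // INDEX_INTERVAL

# Reading this partition materializes thousands of index entries on heap,
# which is one source of the GC pressure Jeff describes.
```

Doubling the interval to 128 KB roughly halves the entry count at the cost of more disk I/O per point read, which is exactly the trade-off described above.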


Partition key with 300K rows can it be queried and distributed using Spark

2019-01-17 Thread Goutham reddy
Hi,
Each partition key can hold up to 2 billion cells, but even so it is an
anti-pattern to have such a huge data set under one partition key. In our
case it is only 300k rows, yet when trying to query for one particular key
we are getting timeout exceptions. If I use Spark to get the 300k rows for
a particular key, does it solve the problem of timeouts and distribute the
data across the Spark nodes, or will it still throw timeout exceptions? Can
you please help me with the best practice for retrieving the data for a key
with 300k rows. Any help is highly appreciated.

Regards
Goutham.


Re: Deployment

2019-01-11 Thread Goutham reddy
In the cloud world, applications and servers are deployed independently.
The application submits its read or write requests to any Cassandra node,
and Cassandra takes care of routing them internally to the nodes that own
the data. Yes, it's better to have both on the same cloud for better
performance. Hope you got some insight from the above.

Regards,
Goutham.

On Fri, Jan 11, 2019 at 7:24 PM amit sehas  wrote:

> I am new to Cassandra, i am wondering how the Cassandra applications are
> deployed in the cloud. Does Cassandra have a client server architecture and
> the application is deployed as a 3rd tier that sends over queries to the
> clients, which then submit them to the Cassandra servers?  Or does the
> application submit the request directly to any of the Cassandra server
> which then decides where the query will be routed to, and then gathers the
> response and returns that to the application.
>
> Does the application accessing the data get deployed on the same nodes in
> the cloud as the Cassandra cluster itself? Or on separate nodes?  Are there
> any best practices available in this regard?
>
> thanks
>
-- 
Regards
Goutham Reddy
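The coordinator behaviour described above can be sketched with a toy token ring. This is a deliberately simplified illustration (MD5 instead of Cassandra's Murmur3Partitioner, a single replica, made-up node addresses), only to show that any node can accept a request and map the partition key onto an owner:

```python
import bisect
import hashlib

def token(key: str) -> int:
    # stand-in for Murmur3: first 8 bytes of MD5 as an unsigned int
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class Ring:
    def __init__(self, nodes, tokens_per_node=8):
        # each node claims several tokens on the ring (vnodes)
        self.ring = sorted(
            (token(f"{node}-{i}"), node)
            for node in nodes
            for i in range(tokens_per_node)
        )
        self.tokens = [t for t, _ in self.ring]

    def replica_for(self, partition_key: str) -> str:
        # any coordinator does this: walk clockwise to the first token
        # >= hash(key), wrapping around the ring
        idx = bisect.bisect_left(self.tokens, token(partition_key))
        return self.ring[idx % len(self.ring)][1]

ring = Ring(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
owner = ring.replica_for("user:42")   # the node the coordinator forwards to
```

In a real cluster a token-aware driver computes this mapping client-side and sends the request straight to a replica, skipping the extra coordinator hop.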


Re: [EXTERNAL] Re: Good way of configuring Apache spark with Apache Cassandra

2019-01-09 Thread Goutham reddy
Thanks Sean. But what if I want to run both Spark and Elasticsearch
alongside Cassandra, each as a separate datacenter? Does that cause any
overhead?

On Wed, Jan 9, 2019 at 7:28 AM Durity, Sean R 
wrote:

> I think you could consider option C: Create a (new) analytics DC in
> Cassandra and run your spark nodes there. Then you can address the scaling
> just on that DC. You can also use less vnodes, only replicate certain
> keyspaces, etc. in order to perform the analytics more efficiently.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Dor Laor 
> *Sent:* Friday, January 04, 2019 4:21 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Good way of configuring Apache spark with
> Apache Cassandra
>
>
>
> I strongly recommend option B, separate clusters. Reasons:
>
>  - Networking of node-node is negligible compared to networking within the
> node
>
>  - Different scaling considerations
>
>Your workload may require 10 Spark nodes and 20 database nodes, so why
> bundle them?
>
>This ratio may also change over time as your application evolves and
> amount of data changes.
>
>  - Isolation - If Spark has a spike in cpu/IO utilization, you wouldn't
> want it to affect Cassandra and the opposite.
>
>If you isolate it with cgroups, you may have too much idle time when
> the above doesn't happen.
>
>
>
>
>
> On Fri, Jan 4, 2019 at 12:47 PM Goutham reddy 
> wrote:
>
> Hi,
>
> We have requirement of heavy data lifting and analytics requirement and
> decided to go with Apache Spark. In the process we have come up with two
> patterns
>
> a. Apache Spark and Apache Cassandra co-located and shared on same nodes.
>
> b. Apache Spark on one independent cluster and Apache Cassandra as one
> independent cluster.
>
>
>
> Need good pattern how to use the analytic engine for Cassandra. Thanks in
> advance.
>
>
>
> Regards
>
> Goutham.
>
>
-- 
Regards
Goutham Reddy
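Sean's analytics-DC approach extends naturally to separate Spark and Elasticsearch DCs; the overhead is mostly the extra replication traffic for whatever you choose to mirror there. A sketch with assumed DC and keyspace names:

```cql
-- Only the keyspaces the analytics jobs need are replicated into the
-- analytics DC; everything else stays in the main DC.
ALTER KEYSPACE analytics_data WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'main-dc':  3,
  'spark-dc': 2
};
```

Each extra DC adds write fan-out for the replicated keyspaces, so the overhead scales with how much data you mirror rather than with the number of DCs per se.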


Re: Good way of configuring Apache spark with Apache Cassandra

2019-01-04 Thread Goutham reddy
Thanks Jonathan, I believe we have to reconsider the way our analytics are
performed.

On Fri, Jan 4, 2019 at 1:46 PM Jonathan Haddad  wrote:

> If you absolutely have to use Cassandra as the source of your data, I
> agree with Dor.
>
> That being said, if you're going to be doing a lot of analytics, I
> recommend using something other than Cassandra with Spark.  The performance
> isn't particularly wonderful and you'll likely get anywhere from 10-50x
> improvement from putting the data in an analytics friendly format (parquet)
> and on a block / blob store (DFS or S3) instead.
>
> On Fri, Jan 4, 2019 at 1:43 PM Goutham reddy 
> wrote:
>
>> Thank you very much Dor for the detailed information, yes that should be
>> the primary reason why we have to isolate from Cassandra.
>>
>> Thanks and Regards,
>> Goutham Reddy
>>
>>
>> On Fri, Jan 4, 2019 at 1:29 PM Dor Laor  wrote:
>>
>>> I strongly recommend option B, separate clusters. Reasons:
>>>  - Networking of node-node is negligible compared to networking within
>>> the node
>>>  - Different scaling considerations
>>>Your workload may require 10 Spark nodes and 20 database nodes, so
>>> why bundle them?
>>>This ratio may also change over time as your application evolves and
>>> amount of data changes.
>>>  - Isolation - If Spark has a spike in cpu/IO utilization, you wouldn't
>>> want it to affect Cassandra and the opposite.
>>>If you isolate it with cgroups, you may have too much idle time when
>>> the above doesn't happen.
>>>
>>>
>>> On Fri, Jan 4, 2019 at 12:47 PM Goutham reddy <
>>> goutham.chiru...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> We have requirement of heavy data lifting and analytics requirement and
>>>> decided to go with Apache Spark. In the process we have come up with two
>>>> patterns
>>>> a. Apache Spark and Apache Cassandra co-located and shared on same
>>>> nodes.
>>>> b. Apache Spark on one independent cluster and Apache Cassandra as one
>>>> independent cluster.
>>>>
>>>> Need good pattern how to use the analytic engine for Cassandra. Thanks
>>>> in advance.
>>>>
>>>> Regards
>>>> Goutham.
>>>>
>>>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>
-- 
Regards
Goutham Reddy


Re: Good way of configuring Apache spark with Apache Cassandra

2019-01-04 Thread Goutham reddy
Thank you very much Dor for the detailed information. Yes, that should be
the primary reason why we have to isolate it from Cassandra.

Thanks and Regards,
Goutham Reddy


On Fri, Jan 4, 2019 at 1:29 PM Dor Laor  wrote:

> I strongly recommend option B, separate clusters. Reasons:
>  - Networking of node-node is negligible compared to networking within the
> node
>  - Different scaling considerations
>Your workload may require 10 Spark nodes and 20 database nodes, so why
> bundle them?
>This ratio may also change over time as your application evolves and
> amount of data changes.
>  - Isolation - If Spark has a spike in cpu/IO utilization, you wouldn't
> want it to affect Cassandra and the opposite.
>If you isolate it with cgroups, you may have too much idle time when
> the above doesn't happen.
>
>
> On Fri, Jan 4, 2019 at 12:47 PM Goutham reddy 
> wrote:
>
>> Hi,
>> We have requirement of heavy data lifting and analytics requirement and
>> decided to go with Apache Spark. In the process we have come up with two
>> patterns
>> a. Apache Spark and Apache Cassandra co-located and shared on same nodes.
>> b. Apache Spark on one independent cluster and Apache Cassandra as one
>> independent cluster.
>>
>> Need good pattern how to use the analytic engine for Cassandra. Thanks in
>> advance.
>>
>> Regards
>> Goutham.
>>
>


Good way of configuring Apache spark with Apache Cassandra

2019-01-04 Thread Goutham reddy
Hi,
We have a requirement for heavy data lifting and analytics and have
decided to go with Apache Spark. In the process we have come up with two
patterns:
a. Apache Spark and Apache Cassandra co-located and shared on same nodes.
b. Apache Spark on one independent cluster and Apache Cassandra as one
independent cluster.

We need a good pattern for using the analytics engine with Cassandra.
Thanks in advance.

Regards
Goutham.


Re: Is Apache Cassandra supports Data at rest

2018-11-13 Thread Goutham reddy
Thanks Jeff for the clarification :)

On Mon, Nov 12, 2018 at 10:44 PM Jeff Jirsa  wrote:

> If you mean encryption at rest: no, it’s not currently supported. It’ll
> eventually be implemented in
> https://issues.apache.org/jira/browse/CASSANDRA-9633 , but that ticket is
> currently unassigned and there’s no ETA.
>
>
> --
> Jeff Jirsa
>
>
> On Nov 12, 2018, at 10:21 PM, Goutham reddy 
> wrote:
>
> Hi,
> Does Apache Cassandra supports data at rest, because datastax Cassandra
> supports it. Can anybody help me.
>
> Thanks and Regards,
> Goutham.
> --
> Regards
> Goutham Reddy
>
> --
Regards
Goutham Reddy


Is Apache Cassandra supports Data at rest

2018-11-12 Thread Goutham reddy
Hi,
Does Apache Cassandra support encryption of data at rest? DataStax
Cassandra supports it. Can anybody help me?

Thanks and Regards,
Goutham.
-- 
Regards
Goutham Reddy


Re: Re: Re: How to set num tokens on live node

2018-11-02 Thread Goutham reddy
Onmstester,
Thanks a ton, will try to execute the same. I will look into that thread.

Thanks and Regards,
Goutham Reddy Aenugu.

On Fri, Nov 2, 2018 at 1:36 AM onmstester onmstester
 wrote:

> I think that is not possible.
> If currently both DC's are in use, you should remove one of them (gently,
> by changing replication config), then change num_tokens in removed dc, add
> it again with changing replication config, and finally do the same for the
> other dc.
>
> P.S A while ago, there was a thread in this forum, discussing that
> num_tokens 256 is not a good default in Cassandra and should use a smaller
> number like 4,8 or 16, i recommend you to read it through, maybe the whole
> migration (from 8 to 256) became unnecessary
>
> Sent using Zoho Mail <https://www.zoho.com/mail/>
>
>
> ======== Forwarded message 
> From : Goutham reddy 
> To : 
> Date : Fri, 02 Nov 2018 11:52:53 +0330
> Subject : Re: Re: How to set num tokens on live node
>  Forwarded message 
>
> Onmstester,
> Thanks for the reply, but for both the DC’s I need to change my num_token
> value from 8 to 256. So that is the challenge I am facing. Any comments.
>
> Thanks and Regards,
> Goutham
>
> On Fri, Nov 2, 2018 at 1:08 AM onmstester onmstester <
> onmstes...@zoho.com.invalid> wrote:
>
> --
> Regards
> Goutham Reddy
>
>
> IMHO, the best option with two datacenters is to config replication
> strategy to stream data from dc with wrong num_token to correct one, and
> then a repair on each node would move your data to the other dc
>
> Sent using Zoho Mail <https://www.zoho.com/mail/>
>
>
>  Forwarded message 
> From : Goutham reddy 
> To : 
> Date : Fri, 02 Nov 2018 10:46:10 +0330
> Subject : Re: How to set num tokens on live node
>  Forwarded message 
>
> Elliott,
> Thanks Elliott, how about if we have two Datacenters, any comments?
>
> Thanks and Regards,
> Goutham.
>
> On Thu, Nov 1, 2018 at 5:40 PM Elliott Sims  wrote:
>
> --
> Regards
> Goutham Reddy
>
> As far as I know, it's not possible to change it live.  You have to create
> a new "datacenter" with new hosts using the new num_tokens value, then
> switch everything to use the new DC and tear down the old.
>
> On Thu, Nov 1, 2018 at 6:16 PM Goutham reddy 
> wrote:
>
> Hi team,
> Can someone help me out I don’t find anywhere how to change the numtokens
> on a running nodes. Any help is appreciated
>
> Thanks and Regards,
> Goutham.
>
>
> --
> Regards
> Goutham Reddy
>
>
>
>
> --
Regards
Goutham Reddy
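The datacenter-swap procedure suggested in this thread can be outlined as follows (keyspace and DC names are examples, not taken from the thread):

```
# 1. Stand up a new DC whose nodes carry the desired token count.
#    cassandra.yaml on each new node:
num_tokens: 256

# 2. Extend replication to the new DC:
#    ALTER KEYSPACE ks WITH replication =
#      {'class': 'NetworkTopologyStrategy', 'dc_old': 3, 'dc_new': 3};

# 3. Stream existing data onto each new node:
#    nodetool rebuild -- dc_old

# 4. Repoint clients at dc_new, drop dc_old from the replication map,
#    then run `nodetool decommission` on each old node.

# Repeat the same sequence for the second DC.
```

This avoids changing num_tokens on live nodes, which is not supported.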


Re: Re: How to set num tokens on live node

2018-11-02 Thread Goutham reddy
Onmstester,
Thanks for the reply, but for both DCs I need to change my num_tokens
value from 8 to 256, so that is the challenge I am facing. Any comments?

Thanks and Regards,
Goutham

On Fri, Nov 2, 2018 at 1:08 AM onmstester onmstester
 wrote:

> IMHO, the best option with two datacenters is to config replication
> strategy to stream data from dc with wrong num_token to correct one, and
> then a repair on each node would move your data to the other dc
>
> Sent using Zoho Mail <https://www.zoho.com/mail/>
>
>
>  Forwarded message ====
> From : Goutham reddy 
> To : 
> Date : Fri, 02 Nov 2018 10:46:10 +0330
> Subject : Re: How to set num tokens on live node
>  Forwarded message 
>
> Elliott,
> Thanks Elliott, how about if we have two Datacenters, any comments?
>
> Thanks and Regards,
> Goutham.
>
> On Thu, Nov 1, 2018 at 5:40 PM Elliott Sims  wrote:
>
> --
> Regards
> Goutham Reddy
>
> As far as I know, it's not possible to change it live.  You have to create
> a new "datacenter" with new hosts using the new num_tokens value, then
> switch everything to use the new DC and tear down the old.
>
> On Thu, Nov 1, 2018 at 6:16 PM Goutham reddy 
> wrote:
>
> Hi team,
> Can someone help me out I don’t find anywhere how to change the numtokens
> on a running nodes. Any help is appreciated
>
> Thanks and Regards,
> Goutham.
>
>
> --
> Regards
> Goutham Reddy
>
>
>
> --
Regards
Goutham Reddy


Re: How to set num tokens on live node

2018-11-02 Thread Goutham reddy
Elliott,
Thanks Elliott, how about if we have two Datacenters, any comments?

Thanks and Regards,
Goutham.

On Thu, Nov 1, 2018 at 5:40 PM Elliott Sims  wrote:

> As far as I know, it's not possible to change it live.  You have to create
> a new "datacenter" with new hosts using the new num_tokens value, then
> switch everything to use the new DC and tear down the old.
>
> On Thu, Nov 1, 2018 at 6:16 PM Goutham reddy 
> wrote:
>
>> Hi team,
>> Can someone help me out I don’t find anywhere how to change the numtokens
>> on a running nodes. Any help is appreciated
>>
>> Thanks and Regards,
>> Goutham.
>> --
>> Regards
>> Goutham Reddy
>>
> --
Regards
Goutham Reddy


How to set num tokens on live node

2018-11-01 Thread Goutham reddy
Hi team,
Can someone help me out? I cannot find anywhere how to change num_tokens
on a running node. Any help is appreciated.

Thanks and Regards,
Goutham.
-- 
Regards
Goutham Reddy


Re: Advantage over Cassandra in Kubernetes

2018-10-13 Thread Goutham reddy
Thanks Ben for the detailed insight. By the way, we are planning to set up
a Kubernetes Cassandra cluster in development, and we wanted to know what
possible problems we may face if we go with Kubernetes.

On Thu, Oct 11, 2018 at 4:06 PM Ben Bromhead  wrote:

> This is a fairly high-level question which could end up going quite deep,
> but below is a quick summary off the top of my head.
>
> You can get a few advantages when running Cassandra in Kubernetes,
> particularly:
>
>- Easy discovery and network connectivity with other services running
>on K8s
>- Reproducible, repeatable operations and deployments
>- A cloud-independent approach to container orchestration, that is
>supported by all major cloud providers.
>- Easy backups, deployments, scaling etc via statefulsets or an
>operator (see https://github.com/instaclustr/cassandra-operator).
>
> There are also some rough edges with running Cassandra on Kubernetes:
>
>- Failure domain placement with statefulsets is still challenging
>(v1.12 goes a long way to fixing this -
>
> https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/
>)
>- Getting resource constraints correct and working out scheduling in
>constrained environments can be maddening.
>- Only a few, small deployments (that I'm aware of) are running
>Cassandra in Kubernetes in production. So you will be breaking new ground
>and encounter problems that haven't been solved before.
>- The Cassandra examples in the official Kubernetes documentation is
>not something you want to take into production.
>
> Cheers
>
> Ben
>
> On Thu, Oct 11, 2018 at 6:50 PM Goutham reddy 
> wrote:
>
>> Hi,
>> We are in the process of setting up a Cassandra cluster with high
>> availability. The debate is whether to install Cassandra in a Kubernetes
>> cluster. Can someone shed some light on what advantages I would get from
>> creating a Cassandra cluster inside a Kubernetes cluster? Any comments are
>> highly appreciated :)
>>
>> Thanks and Regards,
>> Goutham Reddy Aenugu.
>> --
>> Regards
>> Goutham Reddy
>>
> --
> Ben Bromhead
> CTO | Instaclustr <https://www.instaclustr.com/>
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>
-- 
Regards
Goutham Reddy


Advantage over Cassandra in Kubernetes

2018-10-11 Thread Goutham reddy
Hi,
We are in the process of setting up a Cassandra cluster with high
availability. The debate is whether to install Cassandra in a Kubernetes
cluster. Can someone shed some light on what advantages I would get from
creating a Cassandra cluster inside a Kubernetes cluster? Any comments are
highly appreciated :)

Thanks and Regards,
Goutham Reddy Aenugu.
-- 
Regards
Goutham Reddy


Re: Timeout for only one keyspace in cluster

2018-07-21 Thread Goutham reddy
Hi,
Since the table has a single partition key, try updating by the partition
key only instead of passing the other columns, and try setting the
consistency level to ONE.

Cheers,
Goutham.

On Fri, Jul 20, 2018 at 6:57 AM learner dba 
wrote:

> Anybody has any ideas about this? This is happening in production and we
> really need to fix it.
>
> On Thursday, July 19, 2018, 10:41:59 AM CDT, learner dba
>  wrote:
>
>
> Our foreignid is a unique identifier and we did check for wide partitions;
> cfhistograms shows all partitions are evenly sized:
>
> Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
>                            (micros)      (micros)         (bytes)
> 50%             0.00          29.52          0.00            1916          12
> 75%             0.00          42.51          0.00            2299          12
> 95%             0.00          61.21          0.00            2759          14
> 98%             0.00          73.46          0.00            2759          17
> 99%             0.00          88.15          0.00            2759          17
> Min             0.00           9.89          0.00             150           2
> Max             0.00          88.15          0.00         7007506       42510
>
> Anything else that we can check?
>
> On Wednesday, July 18, 2018, 10:44:29 PM CDT, wxn...@zjqunshuo.com <
> wxn...@zjqunshuo.com> wrote:
>
>
> Your partition key is foreignid. You may have a large partition. Why not
> use foreignid+timebucket as partition key?
>
>
> *From:* learner dba 
> *Date:* 2018-07-19 01:48
> *To:* User cassandra.apache.org 
> *Subject:* Timeout for only one keyspace in cluster
> Hi,
>
> We have a cluster with multiple keyspaces. All queries are performing well,
> but write operations on a few tables in one specific keyspace get write
> timeouts. The table has a counter column, and the counter update query
> always times out. Any idea?
>
> CREATE TABLE x.y (
>
> foreignid uuid,
>
> timebucket text,
>
> key text,
>
> timevalue int,
>
> value counter,
>
> PRIMARY KEY (foreignid, timebucket, key, timevalue)
>
> ) WITH CLUSTERING ORDER BY (timebucket ASC, key ASC, timevalue ASC)
>
> AND bloom_filter_fp_chance = 0.01
>
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>
> AND comment = ''
>
> AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32', 'min_threshold': '4'}
>
> AND compression = {'chunk_length_in_kb': '64', 'class':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>
> AND crc_check_chance = 1.0
>
> AND dclocal_read_repair_chance = 0.1
>
> AND default_time_to_live = 0
>
> AND gc_grace_seconds = 864000
>
> AND max_index_interval = 2048
>
> AND memtable_flush_period_in_ms = 0
>
> AND min_index_interval = 128
>
> AND read_repair_chance = 0.0
>
> AND speculative_retry = '99PERCENTILE';
>
> Query and Error:
>
> UPDATE x.y SET value = value + 1 where foreignid = ? AND timebucket = ? AND 
> key = ? AND timevalue = ?, err = {s:\"gocql: no response 
> received from cassandra within timeout period
>
>
> I verified CL=local_serial
>
> We had been working on this issue for many days; any help will be much 
> appreciated.
>
>
>
> --
Regards
Goutham Reddy


Re: which driver to use with cassandra 3

2018-07-21 Thread Goutham reddy
Hi,
Consider overriding the default Java driver provided by Spring Boot if you
are using DataStax clusters with any of the 3.x DataStax drivers. I agree
with Patrick: always have one keyspace specified per application; that way
you get domain-driven applications and less overhead by avoiding switching
between keyspaces.

Cheers,
Goutham

On Fri, Jul 20, 2018 at 10:10 AM Patrick McFadin  wrote:

> Vitaliy,
>
> The DataStax Java driver is very actively maintained by a good size team
> and a lot of great community contributors. It's version 3.x compatible and
> even has some 4.x features starting to creep in. Support for virtual tables
> (https://issues.apache.org/jira/browse/CASSANDRA-7622)  was just merged
> as an example. Even the largest DataStax customers have a mix of enterprise
> + OSS and we want to support them either way. Giving developers the most
> consistent experience is part of that goal.
>
> As for spring-data-cassandra, it does pull the latest driver as a part of
> its own build, so you will already have it in your classpath. Spring adds
> some auto-magic that you should be aware of. The part you mentioned about
> schema management is one to be careful with. If you use it in dev, it's not
> a huge problem, but if it gets out to prod you could potentially have A LOT
> of concurrent schema changes happening, which can lead to bad things.
> Also, some of the Spring API features such as findAll() can expose typical
> C* anti-patterns such as "ALLOW FILTERING". Just be aware of which feature
> does what. And finally, another potential production problem is
> that if you use a lot of keyspaces, Spring will instantiate a new Driver
> Session object per keyspace which can lead to a lot of redundant connection
> to the database. From the driver, a better way is to specify a keyspace per
> query.
>
> As you are using spring-data-cassandra, please share your experiences if
> you can. There are a lot of developers that would benefit from some
> real-world stories.
>
> Patrick
>
>
> On Fri, Jul 20, 2018 at 4:54 AM Vitaliy Semochkin 
> wrote:
>
>> Thank you very much Duy Hai Doan!
>> I have relatively simple demands, and since Spring uses the DataStax
>> driver I can always fall back to it, though I would prefer to use Spring
>> to do the bootstrapping and resource management for me.
>> On Fri, Jul 20, 2018 at 4:51 PM DuyHai Doan  wrote:
>> >
>> > Spring Data Cassandra is so-so... It has fewer features (at least at the
>> time I looked at it) than the default Java driver
>> >
>> > For driver, right now most of people are using Datastax's ones
>> >
>> > On Fri, Jul 20, 2018 at 3:36 PM, Vitaliy Semochkin <
>> vitaliy...@gmail.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> Which driver to use with cassandra 3
>> >>
>> >> the one that is provided by datastax, netflix or something else.
>> >>
>> >> Spring uses driver from datastax, though is it a reliable solution for
>> >> a long term project, having in mind that datastax and cassandra
>> >> parted?
>> >>
>> >> Regards,
>> >> Vitaliy
>> >>
>> >> -----
>> >> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> >> For additional commands, e-mail: user-h...@cassandra.apache.org
>> >>
>> >
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>> --
Regards
Goutham Reddy
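Patrick's last point, specifying the keyspace per query instead of holding one Session per keyspace, can be sketched roughly as below. The keyspace, table, and column names are invented for illustration; only the shape of the query text matters, not any particular driver API.

```python
# A rough sketch of "specify a keyspace per query": instead of one driver
# Session per keyspace (Spring's default behavior as described above), fully
# qualify the table in the statement text so a single Session can serve
# every keyspace. All names below are made up.

def qualified_select(keyspace: str, table: str, key_column: str) -> str:
    # Build a keyspace-qualified statement; one Session can prepare these
    # for any number of keyspaces without reconnecting.
    return f"SELECT * FROM {keyspace}.{table} WHERE {key_column} = ?"

# One hypothetical Session could prepare all of these:
statements = {
    ks: qualified_select(ks, "orders_by_id", "order_id")
    for ks in ("billing", "shipping")
}
print(statements["billing"])
# SELECT * FROM billing.orders_by_id WHERE order_id = ?
```

The design choice: the statement text carries the keyspace, so connection count stays constant as the number of keyspaces grows.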


Re: How to query 100 primary keys at once

2018-07-05 Thread Goutham reddy
Thank you Jeff for the solution.

On Thu, Jul 5, 2018 at 12:01 PM Jeff Jirsa  wrote:

> Either of those solutions are fine, you just need to consider
> throttling/limiting the number of concurrent queries (either in the
> application, or on the server side) to avoid timeouts.
>
>
>
> On Thu, Jul 5, 2018 at 11:16 AM, Goutham reddy  > wrote:
>
>> Hi users,
>> Querying multiple primary keys can be achieved using the IN operator, but
>> it puts load on only a single node, which in turn causes read timeout
>> issues. Calling each primary key asynchronously is also not the right
>> choice for a big partition key.
>> Can anyone suggest the best practice for querying this, and if Cassandra
>> is not the right fit for this type of query, which NoSQL DB solves the
>> problem? Any help is highly appreciated.
>>
>> Thanks,
>> Goutham,
>> T- Mobile.
>> --
>> Regards
>> Goutham Reddy
>>
>
> --
Regards
Goutham Reddy
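The throttling Jeff suggests can be sketched roughly as follows. Here `fetch_by_key` is a hypothetical stand-in for a real single-partition `session.execute(...)` call, so the sketch runs without a cluster; the pattern of bounded concurrency is the point, not the driver API.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_by_key(key):
    # Hypothetical stand-in for session.execute("SELECT ... WHERE pk = ?", key);
    # it just echoes the key so the sketch is self-contained.
    return {"pk": key}

def fetch_many(keys, max_in_flight=8):
    # One single-partition query per key, with the pool size acting as the
    # cap on concurrent requests, instead of one big IN query that lands the
    # whole read on a single coordinator.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(fetch_by_key, keys))

rows = fetch_many(range(100))
print(len(rows))  # 100
```

Each query now goes to the node owning that key, and the cap keeps the client from flooding the cluster and triggering timeouts.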


How to query 100 primary keys at once

2018-07-05 Thread Goutham reddy
Hi users,
Querying multiple primary keys can be achieved using the IN operator, but it
puts load on only a single node, which in turn causes read timeout issues.
Calling each primary key asynchronously is also not the right choice for a
big partition key.
Can anyone suggest the best practice for querying this, and if Cassandra is
not the right fit for this type of query, which NoSQL DB solves the problem?
Any help is highly appreciated.

Thanks,
Goutham,
T- Mobile.
-- 
Regards
Goutham Reddy


How does the Token function work in Cassandra

2018-05-21 Thread Goutham reddy
I would like to know how the Token function works in Cassandra, and in what
scenario it is best used. Secondly, can a range query be performed with the
token function on a composite primary key? Any help is highly appreciated.

Thanks and Regards,
Goutham Reddy Aenugu.
-- 
Regards
Goutham Reddy
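As a sketch of the usual answer: the default Murmur3 partitioner hashes the full partition key (composite or not) into a signed 64-bit token, and the one range comparison CQL allows on a partition key is via token(). A classic use is paging a full scan by token sub-ranges, shown roughly below; the query text and column names are illustrative only.

```python
# The Murmur3Partitioner maps every partition key onto a signed 64-bit ring.
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1

def split_token_ring(n):
    """Split the full token range into n contiguous (start, end] sub-ranges,
    the way a paged full-table scan walks a table with
    WHERE token(pk) > ? AND token(pk) <= ?."""
    width = (MAX_TOKEN - MIN_TOKEN) // n
    bounds = [MIN_TOKEN + i * width for i in range(n)] + [MAX_TOKEN]
    return list(zip(bounds[:-1], bounds[1:]))

for start, end in split_token_ring(4):
    # Illustrative CQL; token() takes the whole (possibly composite)
    # partition key, e.g. token(pk1, pk2) for PRIMARY KEY ((pk1, pk2), ...).
    print(f"... WHERE token(pk1, pk2) > {start} AND token(pk1, pk2) <= {end}")
```

Note that a token range query returns rows in token order, not key order, since the hash scrambles key ordering.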


Re: Listening to Cassandra on IPv4 and IPv6 at the same time

2018-03-23 Thread Goutham reddy
Sudheer,
Seems interesting. Can you please elaborate on what an FQDN (fully
qualified domain name) is, and where to remove the AAAA mapping? Appreciate
your help.

Thanks and Regards,
Goutham

On Fri, Mar 23, 2018 at 2:34 PM sudheer k <sudheer.hdp...@gmail.com> wrote:

> I found a solution for this. As Cassandra can't bind to two addresses at a
> time, according to the comments in the cassandra.yaml file, we removed the
> AAAA (IPv6) mapping for the FQDN and kept only the A (IPv4) mapping. So the
> FQDN always resolves to IPv4, and we can use the FQDN in the application
> configuration when talking to Cassandra.
>
> Hope this helps someone who uses FQDN in the application config.
>
> Regards
> Sudheer
>
> On Fri, Mar 23, 2018 at 2:26 AM sudheer k <sudheer.hdp...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Can we listen to Cassandra on IPV4 and IPV6 at the same time? When I
>> refer to some documents on the internet, it says I can only bind to one
>> address at a point in time.
>>
>> In our application we talk to Cassandra via an FQDN, and the application
>> gets either IPv4 or IPv6 when connecting to Cassandra. So Cassandra would
>> have to bind to both IPv4 and IPv6 at the same time.
>>
>> If there is any configuration related to this change, can anyone please
>> let me know?
>>
>> --
>> Regards
>> Sudheer
>>
> --
> --
> Regards
> Sudheer
>
-- 
Regards
Goutham Reddy
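The effect described above, keeping only the A (IPv4) record for the FQDN, can be imitated client-side with a plain DNS lookup restricted to IPv4. This is just an illustration of the resolution behavior, not Cassandra driver code.

```python
import socket

def resolve_ipv4_only(host, port=9042):
    # Restricting getaddrinfo to AF_INET mirrors what removing the AAAA
    # record achieves in DNS: the name only ever resolves to IPv4 addresses.
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

print(resolve_ipv4_only("localhost"))  # e.g. ['127.0.0.1']
```

With no AAAA record present, even an unrestricted lookup returns only IPv4 addresses, which is why the application then always reaches Cassandra over IPv4.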


Re: Migration of keyspace to another new cluster

2018-03-14 Thread Goutham reddy
Nate,
Thank you very much for the reply; I was trying to implement the same
approach. I will post an update once it is implemented in the
above-mentioned fashion.

Thanks and Regards,
Goutham Reddy Aenugu.

Regards
Goutham Reddy

On Tue, Mar 13, 2018 at 6:40 PM, Nate McCall <n...@thelastpickle.com> wrote:

>
> Hi,
>> We got a requirement to migrate only one keyspace data from one cluster
>> to other cluster. And we no longer need the old cluster anymore. Can you
>> suggest what are the best possible ways we can achieve it.
>>
>> Regards
>> Goutham Reddy
>>
>
>
> Temporarily treat the new cluster as a new datacenter for the current
> cluster and follow the process for adding a datacenter for that keyspace.
> When complete remove the old datacenter/cluster similarly.
>
>


Re: Fast Writes to Cassandra Failing Through Python Script

2018-03-13 Thread Goutham reddy
Faraz,
Can you share your code snippet showing how you are trying to save the
entity objects into Cassandra?

Thanks and Regards,
Goutham Reddy Aenugu.

Regards
Goutham Reddy

On Tue, Mar 13, 2018 at 3:42 PM, Faraz Mateen <fmat...@an10.io> wrote:

> Hi everyone,
>
> I seem to have hit a problem in which writing to cassandra through a
> python script fails and also occasionally causes cassandra node to crash.
> Here are the details of my problem.
>
> I have a python based streaming application that reads data from kafka at
> a high rate and pushes it to cassandra through datastax's cassandra driver
> for python. My cassandra setup consists of 3 nodes and a replication factor
> of 2. Problem is that my python application crashes after writing ~12000
> records with the following error:
>
> Exception: Error from server: code=1100 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 0 
> responses." info={'received_responses':
>  0, 'consistency': 'LOCAL_ONE', 'required_responses': 1}
>
> Sometimes the  python application crashes with this traceback:
>
> cassandra.OperationTimedOut: errors={'10.128.1.1': 'Client request timeout. 
> See Session.execute[_async](timeout)'}, last_host=10.128.1.1
>
> With the error above, one of the cassandra node crashes as well. When I
> look at cassandra system logs (/var/log/cassandra/system.log), I see the
> following exception:
>
> https://gist.github.com/farazmateen/e7aa5749f963ad2293f8be0c
> a1ccdc22/e3fd274af32c20eb9f534849a31734dcd33745b4
>
> According to the suggestion in post linked below, I have set my JVM Heap
> size to 8GB but the problem still persists.:
> https://dzone.com/articles/diagnosing-and-fixing-cassandra-timeouts
>
> *Cluster:*
>
>- Cassandra version 3.9
>- 3 nodes, with 8 cores and 30GB of RAM each.
>- Keyspace has a replication factor of 2.
>- Write consistency is LOCAL_ONE
>- MAX HEAP SIZE is set to 8GB.
>
> Any help will be greatly appreciated.
>
> --
> Faraz
>


Migration of keyspace to another new cluster

2018-03-13 Thread Goutham reddy
Hi,
We got a requirement to migrate only one keyspace data from one cluster to
other cluster. And we no longer need the old cluster anymore. Can you
suggest what are the best possible ways we can achieve it.

Regards
Goutham Reddy


Re: Cassandra DevCenter

2018-03-10 Thread Goutham reddy
Get the JARs from the Cassandra lib folder and put them on your build path,
or use a Maven project with a pom.xml to download them directly from the
repository.

Thanks and Regards,
Goutham Reddy Aenugu.

On Sat, Mar 10, 2018 at 9:30 AM Philippe de Rochambeau <phi...@free.fr>
wrote:

> Hello,
> has anyone tried running CQL queries from a Java program using the jars
> provided with DevCenter?
> Many thanks.
> Philippe
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
> --
Regards
Goutham Reddy


Re: Batch too large exception

2018-03-07 Thread Goutham reddy
Mkadek,
Sorry for the late reply. Thanks for the insight: I was unknowingly using
batch inserts (Spring Data Cassandra) through repository.save, where I was
inserting a list of objects in one go, and Cassandra was treating it as a
batch insert, aborting because of the batch size with a write timeout
exception. I have now changed the logic to insert each partition separately,
so the partition keys are distributed across all the coordinators in the
cluster; in a batch, the whole set of inserts is redirected to one
coordinator node. I hope this helps somebody avoid the mistake of inserting
a list of objects as one batch.

http://christopher-batey.blogspot.com/2015/02/cassandra-anti-pattern-misuse-of.html?m=1

The post above explains clearly how to perform high-volume writes into Cassandra.

Thanks and Regards,
Goutham Reddy Aenugu.

On Wed, Feb 28, 2018 at 5:05 AM Marek Kadek -T (mkadek - CONSOL PARTNERS
LTD at Cisco) <mka...@cisco.com> wrote:

> Hi,
>
>
>
> Are you writing the batch to same partition? If not, there is a much
> stricter limit (I think 50Kb).
>
> Check https://docs.datastax.com/en/cql/3.3/cql/cql_using/useBatch.html ,
> and followups.
>
>
>
> *From: *Goutham reddy <goutham.chiru...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Tuesday, February 27, 2018 at 9:55 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Batch too large exception
>
>
>
> Hi,
>
> I have been getting a "batch too large" exception when performing writes
> from a client application. My insert size is 5 MB, so I have to split the
> 10 insert objects rather than inserting them in one go. It saves some
> inserts and then closes after some uncertain amount of time. It is a
> wide-column table; we have 113 columns. Can anyone kindly suggest what is
> going wrong with my execution? Appreciate your help.
>
>
> Regards
>
> Goutham Reddy
>
>
>
-- 
Regards
Goutham Reddy
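The fix described above, keeping statements for different partitions out of the same batch, is mostly bookkeeping on the client side and can be sketched as below. The rows and column names are invented for illustration; issuing the actual statements is left to the driver.

```python
from collections import defaultdict

def group_by_partition(rows, pk_field="pk"):
    """Group rows by partition key. Statements within one partition can
    safely share an (unlogged) batch; statements for different partitions
    should be issued individually, since a multi-partition batch funnels
    everything through one coordinator and trips the batch size limits."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[pk_field]].append(row)
    return dict(groups)

# Invented sample rows: 10 inserts spread over 3 partitions.
rows = [{"pk": i % 3, "col": i} for i in range(10)]
batches = group_by_partition(rows)
print({pk: len(g) for pk, g in batches.items()})  # {0: 4, 1: 3, 2: 3}
```

Each group can then be written with its own statement (or single-partition batch), so every coordinator handles only the partitions it owns replicas for.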


Re: Cassandra at Instagram with Dikang Gu interview by Jeff Carpenter

2018-03-06 Thread Goutham reddy
It's an interesting conversation. For more details about the pluggable
storage engine, here are the links.

Blog:
https://thenewstack.io/instagram-supercharges-cassandra-pluggable-rocksdb-storage-engine/

JIRA:
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-13475


On Tue, Mar 6, 2018 at 9:01 AM Kenneth Brotman <kenbrot...@yahoo.com.invalid>
wrote:

> Just released on DataStax Distributed Data Show, DiKang Gu of Instagram
> interviewed by author Jeff Carpenter.
>
> Found it really interesting:  Shadow clustering, migrating from 2.2 to
> 3.0, using the Rocks DB as a pluggable storage engine for Cassandra
>
>
> https://academy.datastax.com/content/distributed-data-show-episode-37-cassandra-instagram-dikang-gu
>
>
>
> Kenneth Brotman
>
-- 
Regards
Goutham Reddy


Batch too large exception

2018-02-27 Thread Goutham reddy
Hi,
I have been getting a "batch too large" exception when performing writes
from a client application. My insert size is 5 MB, so I have to split the 10
insert objects rather than inserting them in one go. It saves some inserts
and then closes after some uncertain amount of time. It is a wide-column
table; we have 113 columns. Can anyone kindly suggest what is going wrong
with my execution? Appreciate your help.

Regards
Goutham Reddy