Re: nodetool cleanup - compaction remaining time

2018-09-05 Thread Jeff Jirsa
Probably worth a JIRA (especially if you can repro in 3.0 or higher, since
2.1 is critical fixes only)

On Wed, Sep 5, 2018 at 10:46 PM Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:

> Hello,
>
>
>
> is it a known issue / limitation that cleanup compactions aren’t counted
> in the compaction remaining time?
>
>
>
> nodetool compactionstats -H
>
> pending tasks: 1
>    compaction type   keyspace   table   completed   total     unit    progress
>            Cleanup   XXX        YYY     908.16 GB   1.13 TB   bytes   78.63%
> Active compaction remaining time :   0h00m00s
>
>
>
>
>
> This is with 2.1.18.
>
>
>
>
>
> Thanks,
>
> Thomas
>
>
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
>


nodetool cleanup - compaction remaining time

2018-09-05 Thread Steinmaurer, Thomas
Hello,

is it a known issue / limitation that cleanup compactions aren't counted in the 
compaction remaining time?

nodetool compactionstats -H

pending tasks: 1
   compaction type   keyspace   table   completed   total     unit    progress
           Cleanup   XXX        YYY     908.16 GB   1.13 TB   bytes   78.63%
Active compaction remaining time :   0h00m00s


This is with 2.1.18.


Thanks,
Thomas
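
For a rough idea of what a non-zero estimate could look like, here is a minimal Python sketch that derives a remaining time from the figures in the output above. It is not how nodetool computes the value. It assumes the -H sizes are binary multiples (1 GB = 1024^3 bytes) and that the node is throttled at 16 MB/s, which you should replace with your own compaction throughput setting. The names to_bytes and remaining_time are made up for this illustration.

```python
# Binary multiples are an assumption about how -H formats sizes.
UNITS = {"bytes": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def to_bytes(text):
    """Convert a human-readable size such as '908.16 GB' to bytes."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def remaining_time(completed, total, throughput_mb_per_s):
    """Estimate the time left, formatted like nodetool's XhYYmZZs."""
    remaining = to_bytes(total) - to_bytes(completed)
    seconds = int(remaining / (throughput_mb_per_s * 1024**2))
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"


# Figures from the cleanup row above; 16 MB/s is an assumed throttle.
print(remaining_time("908.16 GB", "1.13 TB", 16))
```

The estimate is only as good as the assumed throughput; an unthrottled or I/O-bound node will behave differently.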



Best way to migrate data from DSE to Apache C*

2018-09-05 Thread Pranay akula
Hi all,

What would be the best way to migrate from DSE to Apache C*?

1.) Is it OK to add an Apache C* DC to the DSE cluster and, once the data is
replicated and repaired, decommission the DSE DC?

2.) Does sstableloader work for migrating data from DSE to Apache C*?

3.) Do we need to write an app or a Spark job to read data from DSE and write
it to the Apache cluster?



Thanks
Pranay


It's time: testing for 4.0

2018-09-05 Thread Jeff Jirsa
Cassandra Community,

For those of you who don't follow the dev@ list, the Cassandra developers
have frozen trunk for 4.0 (
https://lists.apache.org/thread.html/18c76129a4fe6785a51dad7500e04ee13a407a7f7ac5c8f9a3d83c87@%3Cdev.cassandra.apache.org%3E
and
https://lists.apache.org/thread.html/494c3ced9e83ceeb53fa127e44eec6e2588a01b769896b25867fd59f@%3Cdev.cassandra.apache.org%3E
)
- no new features will go into trunk until we're confident enough to cut a
beta.

There are currently 303 new bug-fixes and features waiting to be tested. A
number of developers (and companies that employ them) have already started
testing, but there are a LOT of changes, so I'm asking for your help: if you
run Cassandra in production, and you have an opportunity to try out the
pre-release-candidate trunk, PLEASE give it a shot.

Here are some quick suggestions:
- Please DON'T use it on production, mission-critical workloads. There are
a LOT of changes, and some of them may cause data correctness or consistency
issues, or availability problems (crashes, etc.). We expect to find them
before release, but we don't want YOU to find them on projects you care
about. Please don't test this in prod unless you really know what you're
doing.
- Please DO use it with mock / simulated / shadow workloads that match your
environment. For example, the project doesn't have access to large scale
AWS/GCP/Azure clusters or every operating system or every version of Java -
we test as much as we can, but if you care about using the ec2 snitch or
plan on running on Windows, please take a few minutes and spin up a
cluster.
- Please DO try upgrading from the version you're currently running. If
it's not easy to do, please open a JIRA. Remember to read NEWS.txt.
- Please DO try out java11 if you're able.
- Please DO tell us about performance regressions in your workloads - this
is especially valuable if you have before/after metrics showing a change in
behavior.
- Please DO share any problems you find. Please open a JIRA and add the
label "4.0-pre-rc-bugs". We'll have new labels for alpha/beta/RCs.
- Please DO communicate on the dev@ list if you see something you suspect
may be a bug, but you're not sure.
- Please DON'T reply-all to this email with any bug you find.
- Please DON'T use this as an opportunity to ask for new features or
enhancements you want to land in 4.0.

If you're feeling especially adventurous, try out some of the new features
- things like streaming entire sstables for bootstrap (
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L799-L809
) probably speeds up bootstrapping on clusters using LCS by a meaningful
amount, and there's a whole new audit log (
https://github.com/apache/cassandra/blame/trunk/conf/cassandra.yaml#L1218-L1227
) and a new "group commitlog" for folks currently using the especially
durable batch commitlog (
https://github.com/apache/cassandra/blame/trunk/conf/cassandra.yaml#L380-L382
) . There's also a new ability to run multiple instances on different ports
using the same IP - if you're running in containers, you probably want to
try this out (
https://github.com/apache/cassandra/blame/trunk/conf/cassandra.yaml#L433 )

If you write software that interacts with Cassandra (e.g., you run a metrics
collection SaaS company), now would be a great time to think about making
sure the APIs you rely on still exist. For some, you may find there are
newer, easier access methods - we've added a whole new easily-queryable CQL
interface to some of the metrics you likely care about ( see, for example,
https://issues.apache.org/jira/browse/CASSANDRA-14523 ).

If you have questions, please ask!
- Jeff


Re:

2018-09-05 Thread Andrew Baker
Hi Shyam,

  Those are big questions! The book *Cassandra: The Definitive Guide* is a
good place to start; it will walk you through a little bit of each of those
questions. It should be a challenging project. Look around at
http://cassandra.apache.org/, and DataStax has some good tutorials and
videos too, as I recall.

-Andrew

On Wed, Sep 5, 2018 at 6:19 AM sha p  wrote:

> Hi all ,
> Me new to Cassandra , i was asked to migrate data from Oracle to Cassandra.
> Please help me giving your valuable guidance.
> 1) Can it be done using open source Cassandra.
> 2) Where should I start data model from?
> 3) I should use java, what kind of  jar/libs/tools I need use ?
> 4) How I decide the size of cluster , please provide some sample
> guidelines.
> 5) this should be in production , so what kind of things i should take
> care for better support or debugging tomorrow?
> 6) Please provide some good books /links which can help me in this task.
>
>
> Thanks in advance.
> Highly appreciated your every amal help.
>
> Regards,
> Shyam
>


Re: Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread Jeff Jirsa
It very much depends on your application. You'll PROBABLY want to double
write for some period of time -  start writes to both Cassandra and Oracle,
and then ensure they're both in sync. Once you're sure they're both in
sync, move your reads from Oracle to Cassandra.
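
As a rough illustration of the double-write approach described above, here is a minimal Python sketch. It uses plain dictionaries as stand-ins for Oracle and Cassandra so that it runs without any database driver; the DualWriter class and its method names are hypothetical, and a real migration would also have to handle partial write failures, retries, and deletes.

```python
class DualWriter:
    """Write to both stores; read from the old one until cut-over."""

    def __init__(self, old_store, new_store):
        self.old_store = old_store      # stand-in for Oracle
        self.new_store = new_store      # stand-in for Cassandra
        self.read_from_new = False

    def write(self, key, value):
        # Every write goes to both systems during the migration window.
        self.old_store[key] = value
        self.new_store[key] = value

    def read(self, key):
        store = self.new_store if self.read_from_new else self.old_store
        return store.get(key)

    def mismatched_keys(self):
        # Keys whose values differ, or that exist on one side only.
        keys = set(self.old_store) | set(self.new_store)
        return sorted(k for k in keys
                      if self.old_store.get(k) != self.new_store.get(k))

    def cut_over(self):
        # Move reads only once both sides agree.
        diff = self.mismatched_keys()
        if diff:
            raise RuntimeError(f"stores not in sync: {diff}")
        self.read_from_new = True


oracle = {"user:1": "alice"}            # data that predates the dual writes
cassandra = {}
writer = DualWriter(oracle, cassandra)
writer.write("user:2", "bob")
print(writer.mismatched_keys())         # keys still to backfill
cassandra["user:1"] = oracle["user:1"]  # backfill the historical row
writer.cut_over()
print(writer.read("user:1"))
```

The point of the sync check before cut-over is that rows written before dual writes began still need a one-off backfill; dual writes alone only cover new traffic.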



On Wed, Sep 5, 2018 at 8:58 PM sha p  wrote:

> Hi all,
> Sir how should I keep track of the data which is moved to Cassandra , what
> are the best strategies available?
>
> Regards,
> Shyam
>
> On Wed, 5 Sep 2018, 18:51 sha p,  wrote:
>
>>
>> Hi all ,
>>> Me new to Cassandra , i was asked to migrate data from Oracle to
>>> Cassandra.
>>> Please help me giving your valuable guidance.
>>> 1) Can it be done using open source Cassandra.
>>> 2) Where should I start data model from?
>>> 3) I should use java, what kind of  jar/libs/tools I need use ?
>>> 4) How I decide the size of cluster , please provide some sample
>>> guidelines.
>>> 5) this should be in production , so what kind of things i should take
>>> care for better support or debugging tomorrow?
>>> 6) Please provide some good books /links which can help me in this task.
>>>
>>>
>>> Thanks in advance.
>>> Highly appreciated your every amal help.
>>>
>>> Regards,
>>> Shyam
>>>
>>


Re: Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread sha p
Hi all,
Sir how should I keep track of the data which is moved to Cassandra , what
are the best strategies available?

Regards,
Shyam

On Wed, 5 Sep 2018, 18:51 sha p,  wrote:

>
> Hi all ,
>> Me new to Cassandra , i was asked to migrate data from Oracle to
>> Cassandra.
>> Please help me giving your valuable guidance.
>> 1) Can it be done using open source Cassandra.
>> 2) Where should I start data model from?
>> 3) I should use java, what kind of  jar/libs/tools I need use ?
>> 4) How I decide the size of cluster , please provide some sample
>> guidelines.
>> 5) this should be in production , so what kind of things i should take
>> care for better support or debugging tomorrow?
>> 6) Please provide some good books /links which can help me in this task.
>>
>>
>> Thanks in advance.
>> Highly appreciated your every amal help.
>>
>> Regards,
>> Shyam
>>
>


Re: [EXTERNAL] Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread sha p
Thank you all very much.
The migration is due to Oracle not scaling as expected.

Sure, I will post my questions and doubts from time to time for your guidance.


Thank you.
Shyam

On Wed, 5 Sep 2018, 21:28 Rahul Singh,  wrote:

> Look here for some “migration” or data modeling articles.
>
> *https://anant.github.io/awesome-cassandra/*
> 
>
> Rahul Singh
> Chief Executive Officer
> m 202.905.2818
>
> Anant Corporation
> 1010 Wisconsin Ave NW, Suite 250
> Washington, D.C. 20007
>
> We build and manage digital business technology platforms.
> On Sep 5, 2018, 10:47 AM -0500, Jeff Jirsa , wrote:
>
> All of  Sean's points are good, a few more:
> - Apache Cassandra (free, open source, official) is usually sufficient.
> DSE may be faster, but really it's about whether or not you're willing to
> pay for support. If you're trying to stop paying Oracle, I suspect you'd
> probably not want to start paying someone else - try the free version
> first, and you can look for proprietary options after that.
> - http://shop.oreilly.com/product/0636920043041.do is relatively recent
> and mostly pretty good
> - Ask a lot of questions, use this list, but try things out first so
> people have a way to point you in the right direction.
>
>
>
> On Wed, Sep 5, 2018 at 7:58 AM Durity, Sean R 
> wrote:
>
>> 3 starting points:
>>
>> -  DO NOT migrate your tables as they are in Oracle to
>> Cassandra. In most cases, you need a different model for Cassandra
>>
>> -  DO take the (free) DataStax Academy courses to learn much
>> more about Cassandra as you dive in. It is a systematic and bite-size
>> approach to learning all things Cassandra (and eventually, DataStax
>> Enterprise, should you go that way). However, open source Cassandra is fine
>> as a data platform. DSE gives you more options for data models, better
>> administration and monitoring tools, support, etc. It all depends on what
>> you need/want to build/can afford
>>
>> -  Cluster sizing depends on your goals for the data platform.
>> Do you need lots of storage, lots of throughput, high availability, low
>> latency, workload separation, etc.? A couple guidelines – use at least 3
>> nodes per data center (DC) and at least 2 DCs for availability. Use SSDs
>> for storage and keep node size 3 TB or less for reasonable administration.
>> If six nodes are too many – you probably don’t need Cassandra. If you can
>> define what you need your data platform to deliver, then you can start a
>> sizing discussion. The good thing is, you can always scale (as long as the
>> data model is good).
>>
>>
>>
>>
>>
>> Sean Durity
>>
>>
>>
>> *From:* sha p 
>> *Sent:* Wednesday, September 05, 2018 9:21 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* [EXTERNAL] Regarding migrating data from Oracle to
>> Cassandra.migrate data from Oracle to Cassandra.
>>
>>
>>
>>
>>
>> Hi all ,
>>
>> Me new to Cassandra , i was asked to migrate data from Oracle to
>> Cassandra.
>>
>> Please help me giving your valuable guidance.
>>
>> 1) Can it be done using open source Cassandra.
>>
>> 2) Where should I start data model from?
>>
>> 3) I should use java, what kind of  jar/libs/tools I need use ?
>>
>> 4) How I decide the size of cluster , please provide some sample
>> guidelines.
>>
>> 5) this should be in production , so what kind of things i should take
>> care for better support or debugging tomorrow?
>>
>> 6) Please provide some good books /links which can help me in this task.
>>
>>
>>
>>
>>
>> Thanks in advance.
>>
>> Highly appreciated your every amal help.
>>
>>
>>
>> Regards,
>>
>> Shyam
>>
>>
>> --
>>
>> The information in this Internet Email is confidential and may be legally
>> privileged. It is intended solely for the addressee. Access to this Email
>> by anyone else is unauthorized. If you are not the intended recipient, any
>> disclosure, copying, distribution or any action taken or omitted to be
>> taken in reliance on it, is prohibited and may be unlawful. When addressed
>> to our clients any opinions or advice contained in this Email are subject
>> to the terms and conditions expressed in any applicable governing The Home
>> Depot terms of business or client engagement letter. The Home Depot
>> disclaims all responsibility and liability for the accuracy and content of
>> this attachment and for any damages or losses arising from any
>> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
>> items of a destructive nature, which may be contained in this attachment
>> and shall not be liable for direct, indirect, consequential or special
>> damages in connection with this e-mail message or its attachment.
>>
>


Re: [EXTERNAL] Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread Rahul Singh
Look here for some “migration” or data modeling articles.

https://anant.github.io/awesome-cassandra/

Rahul Singh
Chief Executive Officer
m 202.905.2818

Anant Corporation
1010 Wisconsin Ave NW, Suite 250
Washington, D.C. 20007

We build and manage digital business technology platforms.
On Sep 5, 2018, 10:47 AM -0500, Jeff Jirsa , wrote:
> All of  Sean's points are good, a few more:
> - Apache Cassandra (free, open source, official) is usually sufficient. DSE 
> may be faster, but really it's about whether or not you're willing to pay for 
> support. If you're trying to stop paying Oracle, I suspect you'd probably not 
> want to start paying someone else - try the free version first, and you can 
> look for proprietary options after that.
> - http://shop.oreilly.com/product/0636920043041.do is relatively recent and 
> mostly pretty good
> - Ask a lot of questions, use this list, but try things out first so people 
> have a way to point you in the right direction.
>
>
>
> > On Wed, Sep 5, 2018 at 7:58 AM Durity, Sean R  
> > wrote:
> > > 3 starting points:
> > > -  DO NOT migrate your tables as they are in Oracle to Cassandra. 
> > > In most cases, you need a different model for Cassandra
> > > -  DO take the (free) DataStax Academy courses to learn much more 
> > > about Cassandra as you dive in. It is a systematic and bite-size approach 
> > > to learning all things Cassandra (and eventually, DataStax Enterprise, 
> > > should you go that way). However, open source Cassandra is fine as a data 
> > > platform. DSE gives you more options for data models, better 
> > > administration and monitoring tools, support, etc. It all depends on what 
> > > you need/want to build/can afford
> > > -  Cluster sizing depends on your goals for the data platform. Do 
> > > you need lots of storage, lots of throughput, high availability, low 
> > > latency, workload separation, etc.? A couple guidelines – use at least 3 
> > > nodes per data center (DC) and at least 2 DCs for availability. Use SSDs 
> > > for storage and keep node size 3 TB or less for reasonable 
> > > administration.  If six nodes are too many – you probably don’t need 
> > > Cassandra. If you can define what you need your data platform to deliver, 
> > > then you can start a sizing discussion. The good thing is, you can always 
> > > scale (as long as the data model is good).
> > >
> > >
> > > Sean Durity
> > >
> > > From: sha p 
> > > Sent: Wednesday, September 05, 2018 9:21 AM
> > > To: user@cassandra.apache.org
> > > Subject: [EXTERNAL] Regarding migrating data from Oracle to 
> > > Cassandra.migrate data from Oracle to Cassandra.
> > >
> > >
> > > > Hi all ,
> > > > Me new to Cassandra , i was asked to migrate data from Oracle to 
> > > > Cassandra.
> > > > Please help me giving your valuable guidance.
> > > > 1) Can it be done using open source Cassandra.
> > > > 2) Where should I start data model from?
> > > > 3) I should use java, what kind of  jar/libs/tools I need use ?
> > > > 4) How I decide the size of cluster , please provide some sample 
> > > > guidelines.
> > > > 5) this should be in production , so what kind of things i should take 
> > > > care for better support or debugging tomorrow?
> > > > 6) Please provide some good books /links which can help me in this task.
> > > >
> > > >
> > > > Thanks in advance.
> > > > Highly appreciated your every amal help.
> > > >
> > > > Regards,
> > > > Shyam
> > >
> > >


Re: [EXTERNAL] Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread Rahul Singh
The biggest issue you’ll have is that “migration” from a relational database to 
Cassandra is not one-to-one. The schemas will have to change.

DSE has other built-in technology that is a little more useful, such as Spark / 
Spark SQL / Solr, which helps meet the needs that Oracle was previously 
providing.


Rahul Singh
Chief Executive Officer
m 202.905.2818

Anant Corporation
1010 Wisconsin Ave NW, Suite 250
Washington, D.C. 20007

We build and manage digital business technology platforms.
On Sep 5, 2018, 10:47 AM -0500, Jeff Jirsa , wrote:
> All of  Sean's points are good, a few more:
> - Apache Cassandra (free, open source, official) is usually sufficient. DSE 
> may be faster, but really it's about whether or not you're willing to pay for 
> support. If you're trying to stop paying Oracle, I suspect you'd probably not 
> want to start paying someone else - try the free version first, and you can 
> look for proprietary options after that.
> - http://shop.oreilly.com/product/0636920043041.do is relatively recent and 
> mostly pretty good
> - Ask a lot of questions, use this list, but try things out first so people 
> have a way to point you in the right direction.
>
>
>
> > On Wed, Sep 5, 2018 at 7:58 AM Durity, Sean R  
> > wrote:
> > > 3 starting points:
> > > -  DO NOT migrate your tables as they are in Oracle to Cassandra. 
> > > In most cases, you need a different model for Cassandra
> > > -  DO take the (free) DataStax Academy courses to learn much more 
> > > about Cassandra as you dive in. It is a systematic and bite-size approach 
> > > to learning all things Cassandra (and eventually, DataStax Enterprise, 
> > > should you go that way). However, open source Cassandra is fine as a data 
> > > platform. DSE gives you more options for data models, better 
> > > administration and monitoring tools, support, etc. It all depends on what 
> > > you need/want to build/can afford
> > > -  Cluster sizing depends on your goals for the data platform. Do 
> > > you need lots of storage, lots of throughput, high availability, low 
> > > latency, workload separation, etc.? A couple guidelines – use at least 3 
> > > nodes per data center (DC) and at least 2 DCs for availability. Use SSDs 
> > > for storage and keep node size 3 TB or less for reasonable 
> > > administration.  If six nodes are too many – you probably don’t need 
> > > Cassandra. If you can define what you need your data platform to deliver, 
> > > then you can start a sizing discussion. The good thing is, you can always 
> > > scale (as long as the data model is good).
> > >
> > >
> > > Sean Durity
> > >
> > > From: sha p 
> > > Sent: Wednesday, September 05, 2018 9:21 AM
> > > To: user@cassandra.apache.org
> > > Subject: [EXTERNAL] Regarding migrating data from Oracle to 
> > > Cassandra.migrate data from Oracle to Cassandra.
> > >
> > >
> > > > Hi all ,
> > > > Me new to Cassandra , i was asked to migrate data from Oracle to 
> > > > Cassandra.
> > > > Please help me giving your valuable guidance.
> > > > 1) Can it be done using open source Cassandra.
> > > > 2) Where should I start data model from?
> > > > 3) I should use java, what kind of  jar/libs/tools I need use ?
> > > > 4) How I decide the size of cluster , please provide some sample 
> > > > guidelines.
> > > > 5) this should be in production , so what kind of things i should take 
> > > > care for better support or debugging tomorrow?
> > > > 6) Please provide some good books /links which can help me in this task.
> > > >
> > > >
> > > > Thanks in advance.
> > > > Highly appreciated your every amal help.
> > > >
> > > > Regards,
> > > > Shyam
> > >
> > >


Re: [EXTERNAL] Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread Jeff Jirsa
All of Sean's points are good; a few more:
- Apache Cassandra (free, open source, official) is usually sufficient. DSE
may be faster, but really it's about whether or not you're willing to pay
for support. If you're trying to stop paying Oracle, I suspect you'd
probably not want to start paying someone else - try the free version
first, and you can look for proprietary options after that.
- http://shop.oreilly.com/product/0636920043041.do is relatively recent and
mostly pretty good
- Ask a lot of questions, use this list, but try things out first so people
have a way to point you in the right direction.



On Wed, Sep 5, 2018 at 7:58 AM Durity, Sean R 
wrote:

> 3 starting points:
>
> -  DO NOT migrate your tables as they are in Oracle to Cassandra.
> In most cases, you need a different model for Cassandra
>
> -  DO take the (free) DataStax Academy courses to learn much more
> about Cassandra as you dive in. It is a systematic and bite-size approach
> to learning all things Cassandra (and eventually, DataStax Enterprise,
> should you go that way). However, open source Cassandra is fine as a data
> platform. DSE gives you more options for data models, better administration
> and monitoring tools, support, etc. It all depends on what you need/want to
> build/can afford
>
> -  Cluster sizing depends on your goals for the data platform. Do
> you need lots of storage, lots of throughput, high availability, low
> latency, workload separation, etc.? A couple guidelines – use at least 3
> nodes per data center (DC) and at least 2 DCs for availability. Use SSDs
> for storage and keep node size 3 TB or less for reasonable administration.
> If six nodes are too many – you probably don’t need Cassandra. If you can
> define what you need your data platform to deliver, then you can start a
> sizing discussion. The good thing is, you can always scale (as long as the
> data model is good).
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* sha p 
> *Sent:* Wednesday, September 05, 2018 9:21 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Regarding migrating data from Oracle to
> Cassandra.migrate data from Oracle to Cassandra.
>
>
>
>
>
> Hi all ,
>
> Me new to Cassandra , i was asked to migrate data from Oracle to Cassandra.
>
> Please help me giving your valuable guidance.
>
> 1) Can it be done using open source Cassandra.
>
> 2) Where should I start data model from?
>
> 3) I should use java, what kind of  jar/libs/tools I need use ?
>
> 4) How I decide the size of cluster , please provide some sample
> guidelines.
>
> 5) this should be in production , so what kind of things i should take
> care for better support or debugging tomorrow?
>
> 6) Please provide some good books /links which can help me in this task.
>
>
>
>
>
> Thanks in advance.
>
> Highly appreciated your every amal help.
>
>
>
> Regards,
>
> Shyam
>
>
>


RE: [EXTERNAL] Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread Durity, Sean R
3 starting points:

-  DO NOT migrate your tables as they are in Oracle to Cassandra. In 
most cases, you need a different model for Cassandra

-  DO take the (free) DataStax Academy courses to learn much more about 
Cassandra as you dive in. It is a systematic and bite-size approach to learning 
all things Cassandra (and eventually, DataStax Enterprise, should you go that 
way). However, open source Cassandra is fine as a data platform. DSE gives you 
more options for data models, better administration and monitoring tools, 
support, etc. It all depends on what you need/want to build/can afford

-  Cluster sizing depends on your goals for the data platform. Do you 
need lots of storage, lots of throughput, high availability, low latency, 
workload separation, etc.? A couple guidelines – use at least 3 nodes per data 
center (DC) and at least 2 DCs for availability. Use SSDs for storage and keep 
node size 3 TB or less for reasonable administration.  If six nodes are too 
many – you probably don’t need Cassandra. If you can define what you need your 
data platform to deliver, then you can start a sizing discussion. The good 
thing is, you can always scale (as long as the data model is good).
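
To make the sizing guidelines above concrete, here is a minimal Python sketch that turns them into arithmetic: at least 3 nodes per DC, at least 2 DCs, and no more than 3 TB per node. The replication factor of 3 per DC is an assumption that is not stated above, and the sketch ignores compaction headroom, throughput, and latency goals, so treat the result as a floor for discussion and not as a recommendation. The function names are made up for this illustration.

```python
import math


def nodes_per_dc(data_tb, replication_factor=3, max_node_tb=3.0, min_nodes=3):
    """Nodes needed in one DC to hold data_tb of unique data."""
    # Each DC stores replication_factor copies (assumed RF, see lead-in).
    replicated_tb = data_tb * replication_factor
    return max(min_nodes, math.ceil(replicated_tb / max_node_tb))


def cluster_size(data_tb, dcs=2, **kwargs):
    """Total nodes across all DCs, each DC holding a full replica set."""
    return dcs * nodes_per_dc(data_tb, **kwargs)


for tb in (1, 10, 40):
    print(tb, "TB ->", cluster_size(tb), "nodes")
```

With small data sets the minimum of 3 nodes per DC dominates, which is where the six-node floor mentioned above comes from.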


Sean Durity

From: sha p 
Sent: Wednesday, September 05, 2018 9:21 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Regarding migrating data from Oracle to Cassandra.migrate 
data from Oracle to Cassandra.


Hi all ,
Me new to Cassandra , i was asked to migrate data from Oracle to Cassandra.
Please help me giving your valuable guidance.
1) Can it be done using open source Cassandra.
2) Where should I start data model from?
3) I should use java, what kind of  jar/libs/tools I need use ?
4) How I decide the size of cluster , please provide some sample guidelines.
5) this should be in production , so what kind of things i should take care for 
better support or debugging tomorrow?
6) Please provide some good books /links which can help me in this task.


Thanks in advance.
Highly appreciated your every amal help.

Regards,
Shyam





Regarding migrating data from Oracle to Cassandra.migrate data from Oracle to Cassandra.

2018-09-05 Thread sha p
> Hi all,
> I am new to Cassandra, and I was asked to migrate data from Oracle to
> Cassandra. Please help me with your valuable guidance.
> 1) Can it be done using open source Cassandra?
> 2) Where should I start with the data model?
> 3) I should use Java; what kind of jars/libs/tools do I need to use?
> 4) How do I decide the size of the cluster? Please provide some sample
> guidelines.
> 5) This should be in production, so what should I take care of for better
> support or debugging later?
> 6) Please provide some good books/links which can help me with this task.
>
>
> Thanks in advance.
> I would highly appreciate any small help.
>
> Regards,
> Shyam
>


[no subject]

2018-09-05 Thread sha p
Hi all,
I am new to Cassandra and have been asked to migrate data from Oracle to
Cassandra.
Please help me with your valuable guidance.
1) Can it be done using open source Cassandra?
2) Where should I start with the data model?
3) I am supposed to use Java; what kind of jars/libs/tools do I need?
4) How do I decide the size of the cluster? Please provide some sample guidelines.
5) This will go into production, so what should I take care of now to make
support and debugging easier later?
6) Please suggest some good books/links that can help me with this task.


Thanks in advance.
I highly appreciate every small bit of help.

Regards,
Shyam


RE: Data Corruption due to multiple Cassandra 2.1 processes?

2018-09-05 Thread Steinmaurer, Thomas
Kurt,

I cloned the original ticket. The new one is: 
https://issues.apache.org/jira/browse/CASSANDRA-14691

I can’t change the Assignee or unassign it.

Thanks,
Thomas

From: kurt greaves 
Sent: Tuesday, 14 August 2018 04:53
To: User 
Subject: Re: Data Corruption due to multiple Cassandra 2.1 processes?

New ticket for backporting, referencing the existing.

On Mon., 13 Aug. 2018, 22:50 Steinmaurer, Thomas 
<thomas.steinmau...@dynatrace.com> wrote:
Thanks Kurt.

What is the proper workflow here to get this accepted? Create a new ticket 
dedicated for the backport referencing 11540 or re-open 11540?

Thanks for your help.

Thomas

From: kurt greaves <k...@instaclustr.com>
Sent: Monday, 13 August 2018 13:24
To: User <user@cassandra.apache.org>
Subject: Re: Data Corruption due to multiple Cassandra 2.1 processes?

Yeah, that's not ideal and could lead to problems. I think corruption is only 
likely if compactions occur, but it seems data loss is possible, not to 
mention all sorts of other nasties that could occur when running two C* 
instances at once. Seems to me that 11540 should have gone to 2.1 in the first 
place, but it just got missed. Very simple patch, so I think a backport should 
be accepted.

On 7 August 2018 at 15:57, Steinmaurer, Thomas 
<thomas.steinmau...@dynatrace.com> wrote:
Hello,

with 2.1, in case a second Cassandra process/instance is started on a host (by 
accident), may this result in some sort of corruption, although Cassandra will 
exit at some point in time due to not being able to bind TCP ports already in 
use?

What we have seen in this scenario is something like this:

ERROR [main] 2018-08-05 21:10:24,046 CassandraDaemon.java:120 - Error starting 
local jmx server:
java.rmi.server.ExportException: Port already in use: 7199; nested exception is:
java.net.BindException: Address already in use (Bind failed)
…

But then continuing with stuff like opening system and even user tables:

INFO  [main] 2018-08-05 21:10:24,060 CacheService.java:110 - Initializing key 
cache with capacity of 100 MBs.
INFO  [main] 2018-08-05 21:10:24,067 CacheService.java:132 - Initializing row 
cache with capacity of 0 MBs
INFO  [main] 2018-08-05 21:10:24,073 CacheService.java:149 - Initializing 
counter cache with capacity of 50 MBs
INFO  [main] 2018-08-05 21:10:24,074 CacheService.java:160 - Scheduling counter 
cache save to every 7200 seconds (going to save all keys).
INFO  [main] 2018-08-05 21:10:24,161 ColumnFamilyStore.java:365 - Initializing 
system.sstable_activity
INFO  [SSTableBatchOpen:2] 2018-08-05 21:10:24,692 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-165
 (2023 bytes)
INFO  [SSTableBatchOpen:3] 2018-08-05 21:10:24,692 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-167
 (2336 bytes)
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,692 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-166
 (2686 bytes)
INFO  [main] 2018-08-05 21:10:24,755 ColumnFamilyStore.java:365 - Initializing 
system.hints
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,758 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/hints-2666e20573ef38b390fefecf96e8f0c7/system-hints-ka-377
 (46210621 bytes)
INFO  [main] 2018-08-05 21:10:24,766 ColumnFamilyStore.java:365 - Initializing 
system.compaction_history
INFO  [SSTableBatchOpen:1] 2018-08-05 21:10:24,768 SSTableReader.java:475 - 
Opening 
/var/opt/xxx-managed/cassandra/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-129
 (91269 bytes)
…

Replaying commit logs:

…
INFO  [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:267 - Replaying 
/var/opt/dynatrace-managed/cassandra/commitlog/CommitLog-4-1533133668366.log
INFO  [main] 2018-08-05 21:10:25,896 CommitLogReplayer.java:270 - Replaying 
/var/opt/dynatrace-managed/cassandra/commitlog/CommitLog-4-1533133668366.log 
(CL version 4, messaging version 8)
…

Even writing memtables already (below just pasted system tables, but also user 
tables):

…
INFO  [MemtableFlushWriter:4] 2018-08-05 21:11:52,524 Memtable.java:347 - 
Writing 
Memtable-size_estimates@1941663179(2.655MiB
 serialized bytes, 325710 ops, 2%/0% of on/off-heap limit)
INFO  [MemtableFlushWriter:3] 2018-08-05 21:11:52,552 Memtable.java:347 - 
Writing 
Memtable-peer_events@1474667699(0.199KiB
 serialized bytes, 4 ops, 0%/0% of on/off-heap limit)
…

Until it comes to a point where it can’t bind ports like the storage port 7000:

ERROR [main] 2018-08-05 21:11:54,350 CassandraDaemon.java:395 - Fatal 
configuration error

Re: High IO and poor read performance on 3.11.2 cassandra cluster

2018-09-05 Thread Alexander Dejanovski
Don't forget to run "nodetool upgradesstables -a" after you run the ALTER
statement so that all SSTables get rewritten with the new compression
settings.
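
For illustration, the two steps can be spelled out as follows. This is only a
minimal sketch, assuming the ks.xyz table and the LZ4 compressor class from
the schema quoted further down in this thread, and a chunk length of 16 KB as
suggested by CPC. The helper name compression_change_commands is hypothetical;
it only builds the command strings and does not talk to a cluster.

```python
def compression_change_commands(keyspace, table, chunk_kb,
                                compressor="org.apache.cassandra.io.compress.LZ4Compressor"):
    """Build the ALTER statement and the follow-up nodetool command as strings."""
    # Step 1: change the compression settings of the table (run via cqlsh).
    cql = (
        f"ALTER TABLE {keyspace}.{table} WITH compression = "
        f"{{'class': '{compressor}', 'chunk_length_in_kb': '{chunk_kb}'}};"
    )
    # Step 2: rewrite all existing SSTables so they pick up the new settings.
    nodetool = f"nodetool upgradesstables -a {keyspace} {table}"
    return cql, nodetool


cql, nodetool = compression_change_commands("ks", "xyz", 16)
print(cql)
print(nodetool)
```

The nodetool command has to be run on every node, since each node rewrites
only its own SSTables.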

Since you have a lot of tables in your cluster, be aware that lowering the
chunk length will grow the offheap memory usage of Cassandra.
You can find more information here:
http://thelastpickle.com/blog/2018/08/08/compression_performance.html
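
To get a feel for the off-heap growth, here is a rough back-of-the-envelope
sketch. It assumes that Cassandra keeps one 8-byte offset per compressed chunk
in off-heap memory, and it uses a hypothetical 1 TiB of uncompressed data per
node; both numbers are assumptions for illustration, not measurements from
this cluster.

```python
def chunk_offset_bytes(uncompressed_bytes, chunk_kb, bytes_per_offset=8):
    """Estimate off-heap bytes used by compression chunk offsets.

    bytes_per_offset=8 is an assumption (one long per chunk).
    """
    chunk_bytes = chunk_kb * 1024
    # Ceiling division: a partially filled last chunk still needs an offset.
    chunks = -(-uncompressed_bytes // chunk_bytes)
    return chunks * bytes_per_offset


data = 2**40  # hypothetical: 1 TiB of uncompressed data on the node
for kb in (64, 16, 8):
    mib = chunk_offset_bytes(data, kb) / 2**20
    print(f"chunk_length_in_kb={kb}: about {mib:.0f} MiB of chunk offsets")
```

Under these assumptions the offset memory scales inversely with the chunk
length, so going from 64 KB to 16 KB chunks quadruples it.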

You should also check your readahead settings, as they may be set too high:
sudo blockdev --report
The default is usually 256 (in 512-byte sectors), but Cassandra generally
favors low readahead values to get more IOPS instead of more throughput (and
readahead is usually not that useful for Cassandra). A conservative setting
is 64 (you can go down to 8 and see how Cassandra performs then).
Do note that changing the readahead settings requires restarting Cassandra,
as it is only read once by the JVM during startup.
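
To put the readahead numbers into perspective, here is a small sketch that
converts them into sizes. It relies on blockdev reporting readahead (the RA
column) in 512-byte sectors; the device name in the comment is a placeholder.

```python
SECTOR_BYTES = 512  # blockdev expresses readahead in 512-byte sectors


def readahead_kib(sectors):
    """Convert a readahead value given in 512-byte sectors to KiB."""
    return sectors * SECTOR_BYTES / 1024


for sectors in (256, 64, 8):
    print(f"RA {sectors:>3} sectors = {readahead_kib(sectors):g} KiB")

# A value can be applied with, for example (placeholder device name):
#   sudo blockdev --setra 64 /dev/sdX
```

So the usual default of 256 corresponds to 128 KiB read ahead per request,
while 64 and 8 correspond to 32 KiB and 4 KiB.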

Cheers,

On Wed, Sep 5, 2018 at 7:27 AM CPC  wrote:

> Could you decrease chunk_length_in_kb to 16 or 8 and repeat the test?
>
> On Wed, Sep 5, 2018, 5:51 AM wxn...@zjqunshuo.com 
> wrote:
>
>> How large are your rows? You may be running into the wide-row read problem.
>>
>> -Simon
>>
>> *From:* Laxmikant Upadhyay 
>> *Date:* 2018-09-05 01:01
>> *To:* user 
>> *Subject:* High IO and poor read performance on 3.11.2 cassandra cluster
>>
>> We have a 3-node Cassandra cluster (3.11.2) in a single DC.
>>
>> We have written 450 million records to a table with LCS. The write
>> latency is fine. After the writes we perform read and update operations.
>>
>> When we run read+update operations on 1 million newly inserted records
>> (on top of the 450 million records), the read latency and IO usage are
>> under control. However, when we perform read+update on 1 million old
>> records that are part of the 450 million records, we observe high read
>> latency (performance drops by a factor of 4 compared to the first case).
>> We have not observed major GC pauses.
>>
>> *system information:*
>> *cpu core :*  24
>> *disk type:* SSD. We are using RAID with the deadline scheduler
>> *disk space:*
>> df -h :
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/sdb1       1.9T  393G  1.5T  22% /var/lib/cassandra
>> *memory:*
>> free -g
>>        total  used  free  shared  buff/cache  available
>> Mem:      62    30     0       0          32         31
>> Swap:      8     0     8
>>
>> ==
>>
>> *schema*
>>
>> desc table ks.xyz;
>>
>> CREATE TABLE ks.xyz (
>>     key text,
>>     column1 text,
>>     value text,
>>     PRIMARY KEY (key, column1)
>> ) WITH COMPACT STORAGE
>>     AND CLUSTERING ORDER BY (column1 ASC)
>>     AND bloom_filter_fp_chance = 0.1
>>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>     AND comment = ''
>>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
>>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>     AND crc_check_chance = 1.0
>>     AND dclocal_read_repair_chance = 0.0
>>     AND default_time_to_live = 0
>>     AND gc_grace_seconds = 864000
>>     AND max_index_interval = 2048
>>     AND memtable_flush_period_in_ms = 0
>>     AND min_index_interval = 128
>>     AND read_repair_chance = 0.0
>>     AND speculative_retry = '99PERCENTILE';
>>
>> ==
>> Below is a snippet of system stats taken while the read operations were
>> running:
>>
>> *iotop -o*: Observation: disk reads go up to 5.5 G/s
>>
>> Total DISK READ :   *3.86 G/s* | Total DISK WRITE :1252.88 K/s
>> Actual DISK READ:  * 3.92 G/s* | Actual DISK WRITE:   0.00 B/s
>>   TID  PRIO  USER DISK READ  DISK WRITE  SWAPIN IO>COMMAND
>> 10715 be/4 cassandr  375.89 M/s   99.79 K/s  0.00 % 29.15 % java
>> -Dorg.xerial.snappy.tempdir=/var/tmp
>> -Dja~etrics-core-3.1.0.jar:/usr/share/cassandra/lib/
>> 10714 be/4 cassandr  358.56 M/s  107.18 K/s  0.00 % 27.06 % java
>> -Dorg.xerial.snappy.tempdir=/var/tmp
>> -Dja~etrics-core-3.1.0.jar:/usr/share/cassandra/lib/
>> 10712 be/4 cassandr  351.86 M/s  147.83 K/s  0.00 % 25.02 % java
>> -Dorg.xerial.snappy.tempdir=/var/tmp
>> -Dja~etrics-core-3.1.0.jar:/usr/share/cassandra/lib/
>> 10718 be/4 cassandr  359.82 M/s  110.87 K/s  0.00 % 24.49 % java
>> -Dorg.xerial.snappy.tempdir=/var/tmp
>> -Dja~etrics-core-3.1.0.jar:/usr/share/cassandra/lib/
>> 10711 be/4 cassandr  333.03 M/s  125.66 K/s  0.00 % 23.37 % java
>> -Dorg.xerial.snappy.tempdir=/var/tmp
>> -Dja~etrics-core-3.1.0.jar:/usr/share/cassandra/lib/
>> 10716 be/4 cassandr  330.80 M/s  103.48 K/s  0.00 % 23.02 % java
>> -Dorg.xerial.snappy.tempdir=/var/tmp
>> -Dja~etrics-core-3.1.0.jar:/usr/share/cassandra/lib/
>> 10717 be/4 cassandr  319.49 M/s  118.27 K/s  0.00 % 22.11 % java
>> -Dorg.xerial.snappy.tempdir=/var/tmp
>>