Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Jeff Jirsa
Originally we would make tables based on keyspace name / table name pairs, 
which was fine unless you dropped a table and recreated it, which could happen 
while one node was offline / split network / gc pause. The recreation scenario 
could allow data to be resurrected after a drop. 

So we augmented that (years and years ago) to have a uuid identifier for the 
table, so now we can differentiate between table creations - if you drop a 
table and recreate it, the new table has a different id.

However, if you issue a create table on two instances at the same time, neither 
thinks the table exists, so each generates its own cfid and two ids get created. 
Schema eventually gets stored inside Cassandra, so last write wins, and the 
first ID seen gets stomped by the second. The race typically manifests as one 
instance throwing errors about cfid not found, or a data directory that doesn’t 
match the cfid in the schema (so a restart creates an empty data directory), or 
similar situations.
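
The race described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the `Node` class, its in-memory `schema` and `dataDirs` maps, and the `ks.events` table name are hypothetical stand-ins, not Cassandra internals. It only shows how two uncoordinated existence checks produce two different ids, and how last-write-wins can leave a node whose data directory no longer matches the id in the schema.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class SchemaRaceSketch {

    /** Hypothetical stand-in for one node: its local schema view and its data directories. */
    static final class Node {
        final Map<String, UUID> schema = new HashMap<>();   // table name -> table id
        final Map<UUID, String> dataDirs = new HashMap<>(); // table id -> directory name

        /** CREATE TABLE IF NOT EXISTS, decided from local state only (no coordination). */
        UUID createIfNotExists(String table) {
            UUID id = schema.computeIfAbsent(table, t -> UUID.randomUUID());
            dataDirs.putIfAbsent(id, table + "-" + id);
            return id;
        }

        /** Applies a schema entry learned from another node; the later write wins. */
        void applyRemote(String table, UUID id) {
            schema.put(table, id);
        }

        /** True when the id in the schema has a matching data directory on this node. */
        boolean schemaMatchesDisk(String table) {
            return dataDirs.containsKey(schema.get(table));
        }
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();

        // Both nodes receive the same statement before either has heard of the table.
        UUID idA = a.createIfNotExists("ks.events");
        UUID idB = b.createIfNotExists("ks.events");
        System.out.println("ids differ: " + !idA.equals(idB));

        // Schema propagates; assume b's write is the later one and stomps a's id.
        a.applyRemote("ks.events", idB);
        System.out.println("node a schema matches its data directory: "
                + a.schemaMatchesDisk("ks.events"));
    }
}
```

In this sketch node a ends up with a directory created for its own id while its schema points at the id from node b, which mirrors the mismatch described above.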

The actual plumbing to use strong consistency (actually do paxos or some other 
election to make sure exactly one id wins) is planned, likely for 4.0, but 
doesn’t exist in any released version now.

So again, don’t programmatically create tables if there’s a race possible. It 
may work fine most of the time, but there’s a risk of ugly failure.
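
For the narrower case of several threads inside one application, a minimal sketch of funnelling the statement through a single guarded call might look like the following. The `SchemaOnce` class and the `createTable` callback are hypothetical names introduced for illustration; the callback stands in for whatever code actually sends the DDL. This only removes the race between threads of one JVM. Separate processes would still race and need to be coordinated outside the application, for example by running schema changes from a single place.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SchemaOnce {
    private final Runnable createTable; // hypothetical: wraps the real DDL call
    private boolean done = false;

    public SchemaOnce(Runnable createTable) {
        this.createTable = createTable;
    }

    /**
     * Runs the DDL at most once per JVM. Concurrent callers wait on the
     * monitor until the first call has finished, then return without
     * issuing the statement again. If the DDL throws, a later call retries.
     */
    public synchronized void ensureCreated() {
        if (!done) {
            createTable.run();
            done = true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger ddlCalls = new AtomicInteger();
        SchemaOnce once = new SchemaOnce(ddlCalls::incrementAndGet);

        // Eight threads all ask for the table; only one DDL call is made.
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(once::ensureCreated);
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println("DDL executions: " + ddlCalls.get());
    }
}
```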

-- 
Jeff Jirsa




Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Kant Kodali
May I know why? 



Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Jeff Jirsa
Yes it causes issues


-- 
Jeff Jirsa




Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Kant Kodali
Schema changes I assume you guys are talking about different create table or 
alter table statements. What if multiple threads issue same exact create table 
if not exists statement? Will that cause issues?



Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Carlos Rolo
Don't do that. Worst case you might get different schemas in flight and no
agreement on your cluster.  If you are already doing that, check "nodetool
describecluster" after you do that.

Like Jeff said, it is likely to cause problems.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com



Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Jeff Jirsa
It’s not LWT. Don’t do programmatic schema changes that can race, it’s likely 
to cause problems


-- 
Jeff Jirsa



-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Kant Kodali
Hi All,

What happens if multiple processes send a "create table if not exists"
statement to Cassandra? Will there be any data corruption or any other issues
if I send "create table if not exists" requests often?

I don't see any entry in the system.paxos table, so is it fair to say "IF NOT
EXISTS" doesn't automatically imply LWT?

Thanks!


Re: Migrate from Windows to Linux

2018-01-27 Thread Alaa Zubaidi (PDF)
Thanks Alain, yes, I saw some of these, but I was hoping that there are new
experiments.
This is useful, now I need to find new HW to test these options.
Thanks for your help..
Alaa


-- 

Alaa Zubaidi
PDF Solutions, Inc.
333 West San Carlos Street, Suite 1000
San Jose, CA 95110  USA
Tel: 408-283-5639
fax: 408-938-6479
email: alaa.zuba...@pdf.com



Re: Migrate from Windows to Linux

2018-01-27 Thread Alain RODRIGUEZ
Hello Alaa,

Over time, people who tried seem to have failed to do a live migration
from Windows to Linux (and probably the other way around). It appears to be
something unsupported:

http://grokbase.com/t/cassandra/user/13anrd7qv1/mixed-linux-windows-cluster-in-cassandra-1-2
http://grokbase.com/t/cassandra/user/115vy2hy4w/mixing-different-os-in-a-cassandra-cluster

Those discussions are a bit old. Yet if this incompatibility is still true,
you can probably:

1 - Stop the cluster and use sstableloader to load data into the new Linux
cluster

or

2 - Fork writes to the new cluster, run sstableloader, then switch
clients, apparently doable without downtime

These two solutions were discussed here:
http://grokbase.com/t/cassandra/user/125rhxydwb/migrating-from-a-windows-cluster-to-a-linux-cluster

3 - Try to run a mixed cluster in a testing environment, check the Cassandra
code, and possibly open a ticket or offer a patch if it does not work. If doing
so, add a new data center with Linux rather than mixing nodes within the same
rack or data center. It's safer and more efficient, even though in this
specific case it might fail.

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2018-01-26 20:49 GMT+00:00 Alaa Zubaidi (PDF) :

> Hi,
> What is the best way to migrate my Cassandra 2.x cluster from Windows to
> Linux?
> - Can I mix Windows and linux nodes in the same cluster? bring linux node
> one by one, and shutdown windows nodes one by one?
> - Can I use multiple racks feature? how would this work?
> - Other ideas?
>
> Regards.
> Alaa
>
> *This message may contain confidential and privileged information. If it
> has been sent to you in error, please reply to advise the sender of the
> error and then immediately permanently delete it and all attachments to it
> from your systems. If you are not the intended recipient, do not read,
> copy, disclose or otherwise use this message or any attachments to it. The
> sender disclaims any liability for such unauthorized use. PLEASE NOTE that
> all incoming e-mails sent to PDF e-mail accounts will be archived and may
> be scanned by us and/or by external service providers to detect and prevent
> threats to our systems, investigate illegal or inappropriate behavior,
> and/or eliminate unsolicited promotional e-mails (“spam”). If you have any
> concerns about this process, please contact us at *
> *legal.departm...@pdf.com* *.*


Re: Slow paging query on Cassandra.

2018-01-27 Thread Avi Kivity
Does the last_update_date constraint filter out a lot of rows? In that 
case the server may be reading a large number of rows, only to throw 
them away since they get filtered out.



If you apply the filter on the client side, you shouldn't see timeouts 
(but overall the process will be slower since you have to transfer more 
data).
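
A minimal sketch of what filtering on the client side could look like is shown below. Everything except the `last_update_date` column name is an assumption made for illustration: the `Item` class, the `olderThan` helper, and the direction of the comparison are hypothetical, and the in-memory list stands in for the rows returned by a paged query that no longer restricts on `last_update_date`.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClientSideFilter {

    /** Hypothetical row shape; only the last_update_date column comes from the thread. */
    static final class Item {
        final String id;
        final Instant lastUpdateDate;

        Item(String id, Instant lastUpdateDate) {
            this.id = id;
            this.lastUpdateDate = lastUpdateDate;
        }
    }

    /**
     * Keeps only rows last updated strictly before the cutoff.
     * The server returns every row; the comparison happens here instead.
     */
    static List<Item> olderThan(Iterable<Item> rows, Instant cutoff) {
        List<Item> kept = new ArrayList<>();
        for (Item row : rows) {
            if (row.lastUpdateDate.isBefore(cutoff)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Instant cutoff = Instant.parse("2018-01-01T00:00:00Z");
        List<Item> page = Arrays.asList(
                new Item("a", Instant.parse("2017-06-01T00:00:00Z")),
                new Item("b", Instant.parse("2018-01-15T00:00:00Z")),
                new Item("c", Instant.parse("2017-12-31T23:59:59Z")));

        for (Item item : olderThan(page, cutoff)) {
            System.out.println(item.id);
        }
    }
}
```

With the real driver, the `Iterable` passed in would be the paged result of the unrestricted query, so more rows cross the network but the server does not discard rows after reading them.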



btw, from the logs it looks like the client is multi-threaded, there are 
different token ranges in the same time period.



On 01/26/2018 10:39 PM, Juan Manuel Alonso wrote:

Hi guys,

I'm having some trouble while using paged queries on Cassandra's Java 
driver (version 3.3.2). I'm using Cassandra 3.11.0.


I have to fetch a page of data from the DB, then make some trivial 
changes, and then update these rows.


A simplified version of the code I'm running would be:

    int rowCounter = 0;
    Statement selectQuery = QueryBuilder.select()...setFetchSize(pageSize)...;
    ResultSet result = cassandraSession.execute(selectQuery);

    List<MyClass> mappedResults = new ArrayList<>();
    for (Row row : result) {
        rowCounter++;
        mappedResults.add(map(row));
        if (rowCounter % pageSize == 0) {
            // A full page has been read: modify it and write it back.
            List<MyClass> resultsToUpdate = modifyData(mappedResults);
            for (MyClass resultToUpdate : resultsToUpdate) {
                Statement updateQuery = QueryBuilder.update(KEYSPACE, tableName)...;
                cassandraSession.execute(updateQuery);
            }
            // Start the next page with an empty batch so rows are not updated twice.
            mappedResults.clear();
            // Sleep for a few seconds to let the DB... breathe
            TimeUnit.SECONDS.sleep(sleepSeconds);
        }
    }

I'm using consistency level ONE on both select and update queries, the 
value of sleepSeconds is 5 and the pageSize is 47.


My problem is that I have to use very small page sizes, otherwise 
queries start to time out on Cassandra.


There is only one thread running this long update process, but when I 
check Cassandra's debug.log, it looks like this:


...
DEBUG [ScheduledTasks:1] 2018-01-26 12:43:09,221 
MonitoringTask.java:173 - 55 operations were slow in the last 5001 msecs:
token(id) > 1940709131428868672 AND token(id) <= 1976881771356013545 
LIMIT 47>, time 1232 msec - slow timeout 500 msec/cross-node
token(id) > -31240603717813337 AND token(id) <= 93066413544676618 
LIMIT 47>, time 672 msec - slow timeout 500 msec/cross-node
token(id) > -2746601914911102981 AND token(id) <= -2679503374406295369 
LIMIT 47>, time 722 msec - slow timeout 500 msec/cross-node
token(id) > 8697901506577253630 AND token(id) <= 8756251242481074941 
LIMIT 47>, time 1737 msec - slow timeout 500 msec/cross-node
token(id) > -2566441277217350674 AND token(id) <= -2410488306633473620 
LIMIT 47>, time 997 msec - slow timeout 500 msec/cross-node
token(id) > 5186947162422827855 AND token(id) <= 5251256039266177164 
LIMIT 47>, time 1619 msec - slow timeout 500 msec/cross-node
token(id) > 523415566358416448 AND token(id) <= 558165594730430519 
LIMIT 47>, time 793 msec - slow timeout 500 msec/cross-node
token(id) > -6313110054894254305 AND token(id) <= -614970167889875 
LIMIT 47>, time 510 msec - slow timeout 500 msec/cross-node
token(id) > 133117363640100699 AND token(id) <= 326755086351479456 
LIMIT 47>, time 594 msec - slow timeout 500 msec/cross-node
token(id) > -5773756298752768296 AND token(id) <= -5672224259310839216 
LIMIT 47>, time 631 msec - slow timeout 500 msec/cross-node
token(id) > 9138868762246577790 AND token(id) <= 9184809921750217730 
LIMIT 47>, time 1680 msec - slow timeout 500 msec/cross-node
token(id) > 1481347618188085389 AND token(id) <= 1529429375374220120 
LIMIT 47>, time 1337 msec - slow timeout 500 msec/cross-node
token(id) > -3179570044050246190 AND token(id) <= -2975237200717735765 
LIMIT 47>, time 773 msec - slow timeout 500 msec/cross-node
token(id) > -1992364373944487162 AND token(id) <= -1754930707218513982 
LIMIT 47>, time 793 msec - slow timeout 500 msec/cross-node
token(id) > 7461256765584395144 AND token(id) <= 7513523865647503158 
LIMIT 47>, time 1569 msec - slow timeout 500 msec/cross-node
token(id) > 2199511646454841639 AND token(id) <= 2235092311035533306 
LIMIT 47>, time 1157 msec - slow timeout 500 msec/cross-node
token(id) > 5981009014177068366 AND token(id) <= 6193847522724693984 
LIMIT 47>, time 1549 msec - slow timeout 500 msec/cross-node
token(id) > 6587824379305518475 AND token(id) <= 6941621185441223079 
LIMIT 47>, time 1491 msec - slow timeout 500 msec/cross-node
token(id) > -2888016351766682341 AND token(id) <= -2832466742668731344 
LIMIT 47>, time 642 msec - slow timeout 500 msec/cross-node
token(id) > 4599678137499867302 AND token(id) <= 4681791682494977137 
LIMIT 47>, time 1222 msec - slow timeout 500 msec/cross-node
token(id) > 4097891947569113599 AND token(id) <= 4205652216148641874 
LIMIT 47>, time 1421 msec - slow timeout 500 msec/cross-node
token(id) >