That question comes up so often that it prompted me to write a blog post about it last month -- https://academy.datastax.com/support-blog/counting-keys-might-well-be-counting-stars
On Mon, Nov 13, 2017 at 6:55 PM, Hareesh Veduraj <hareesh.vedu...@gmail.com> wrote:

> Hi Team,
>
> I have a new requirement, where I need to copy all the rows from one
> table to another table in Cassandra, where the second table contains one
> extra column.
>
> I have written a Python script which reads each row and inserts it. But the
> problem is that in the stage environment the select count query is
> returning a different result each time I execute it, so the row count is
> varying.
>
> The base table contains 6 lakh rows. The stage cluster has 5 instances
> with a replication factor of 3.
>
> I'm able to successfully run the query in the dev cluster with the same
> data, where there are 3 instances and the replication factor is 3.
>
> Can anyone help me out with the best way to achieve this? Also, what is
> the reason for the variation in the select count query?
>
> Below is the Python script snippet:
>
>     cql = 'select key, column1, column2, column3, column4, value from table1 limit 1000000'
>     insert_table2_query = session.prepare('INSERT INTO table2 (key, column1, column2, column3, column4, column5, value) VALUES (?, ?, ?, ?, ?, ?, ?)')
>
>     def insert_into_table2():
>         counter = 1
>         for row in session.execute(cql):
>             try:
>                 val1 = row.column4
>                 val2 = row.value
>                 if val1 is None or len(val1) == 0:
>                     val1 = ""
>                 session.execute(insert_table2_query, (row.key, row.column1, row.column2, row.column3, val1, val2, val2))
>                 print("processed row count : " + str(counter))
>                 counter = counter + 1
>             except Exception as e:
>                 counter = counter + 1
>                 print('failed to insert the row')
>
> Appreciating any help.
>
> Thanks,
> Hareesh.A.V
> Mob: +7022643519
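For the copy itself, something along the lines below tends to hold up better than a single `limit 1000000` select: let the driver page through the table and keep the row-shaping logic in a small pure function. This is only a sketch, not your exact schema -- the `Row` namedtuple stands in for a driver row, `copy_table` assumes a real cassandra-driver `Session`, and `column5_default` is a hypothetical placeholder since the original post never says where column5's value comes from (the script inserts `row.value` twice, which may or may not be intended):

```python
from collections import namedtuple

# Stand-in for a row returned by the driver's SELECT on table1
# (the real driver returns named-tuple-like rows with these fields).
Row = namedtuple("Row", "key column1 column2 column3 column4 value")

INSERT_CQL = ("INSERT INTO table2 "
              "(key, column1, column2, column3, column4, column5, value) "
              "VALUES (?, ?, ?, ?, ?, ?, ?)")


def transform(row, column5_default=""):
    """Map a table1 row onto the seven bind values table2 expects.

    A NULL column4 comes back from the driver as None; normalise it to an
    empty string before re-inserting. column5 is the new column -- filling
    it with a default here is an assumption, since the original post does
    not say what should go in it.
    """
    val4 = row.column4 if row.column4 else ""
    return (row.key, row.column1, row.column2, row.column3,
            val4, column5_default, row.value)


def copy_table(session, fetch_size=500):
    """Stream table1 into table2 (sketch; assumes a cassandra-driver Session).

    Setting default_fetch_size makes the driver page through the result set
    in chunks instead of relying on a huge LIMIT, so the copy does not hold
    600k rows' worth of state on the coordinator or in the client.
    """
    session.default_fetch_size = fetch_size
    insert = session.prepare(INSERT_CQL)
    copied = 0
    for row in session.execute(
            "SELECT key, column1, column2, column3, column4, value "
            "FROM table1"):
        session.execute(insert, transform(row))
        copied += 1
    return copied
```

Keeping the transform pure also makes it trivial to unit-test the NULL handling before pointing the script at the stage cluster.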