That question comes up so often that it prompted me to write a blog
post about it last month --
https://academy.datastax.com/support-blog/counting-keys-might-well-be-counting-stars
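
For what it's worth, a quick way to check whether replica consistency is the culprit is to run the count at a higher consistency level in cqlsh (the keyspace/table names below are placeholders):

```sql
-- In a cqlsh session; CONSISTENCY is a cqlsh command, not CQL itself
CONSISTENCY ALL;
SELECT count(*) FROM my_keyspace.table1;
```

If the number stabilizes at ALL but wanders at ONE, the replicas simply haven't converged yet and a repair would help.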

On Mon, Nov 13, 2017 at 6:55 PM, Hareesh Veduraj <hareesh.vedu...@gmail.com>
wrote:

> Hi Team,
>
> I have a new requirement where I need to copy all the rows from one
> table to another table in Cassandra, where the second table contains one
> extra column.
>
> I have written a Python script which reads each row and inserts it. But the
> problem is that in the stage environment the select count query returns a
> different result each time I execute it, so the row count varies.
>
> The base table contains 6 lakh (600,000) rows. The stage cluster has 5
> instances with a replication factor of 3.
>
> I'm able to run the query successfully in the dev cluster with the same
> data, where there are 3 instances and the replication factor is 3.
>
> Can anyone help me out with the best way to achieve this? Also, what is
> the reason for the variation in the select count query?
>
> Below is the Python script snippet:
>
> cql = 'select key, column1, column2, column3, column4, value from table1
> limit 1000000'
> insert_table2_query = session.prepare('INSERT INTO table2 (key, column1,
> column2, column3, column4, column5, value) VALUES (?, ?, ?, ?, ?, ?, ?)')
>
> def insert_into_table2():
>     counter = 1
>     for row in session.execute(cql):
>         try:
>             val1 = row.column4
>             val2 = row.value
>             # Check for None before calling len(), otherwise len(None)
>             # raises a TypeError
>             if val1 is None or len(val1) == 0:
>                 val1 = ""
>             session.execute(insert_table2_query, (row.key, row.column1,
>                 row.column2, row.column3, val1, val2, val2))
>             print("processed row count : " + str(counter))
>             counter = counter + 1
>         except Exception as e:
>             counter = counter + 1
>             print("failed to insert the row: " + str(e))
>
>
> Appreciating any help.
>
> Thanks,
> Hareesh.A.V
> Mob: +7022643519
>
>
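
On the copy itself: the per-row transform is the only fiddly part, and it is worth pulling out and testing on its own. A minimal sketch (column names follow the snippet above; note that the None check must come before len(), since len(None) raises a TypeError -- and, as in the original script, the new column5 simply mirrors value, which you may want to change):

```python
def transform(row):
    """Map a table1 row (key, column1..column4, value) onto the seven
    bind values table2 expects. column5 mirrors value here, matching
    the original script; adjust if the extra column needs other data."""
    key, c1, c2, c3, c4, value = row
    # Test for None first: calling len(None) would raise TypeError.
    if c4 is None or len(c4) == 0:
        c4 = ""
    return (key, c1, c2, c3, c4, value, value)

# Example: a row whose column4 is missing
print(transform(("k1", "a", "b", "c", None, "v")))
# -> ('k1', 'a', 'b', 'c', '', 'v', 'v')
```

Keeping the transform pure like this means you can verify the null handling without a live cluster, and the driver loop only has to call session.execute with the returned tuple.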
