I’ve never been a big fan of the “COPY” statement.

My preference for stuff like this (though I am definitely in the minority I 
think!) — particularly for the amount of data you’re talking about — is to use 
the open source tool “cassandradump” — which is similar to mysqldump but for 
cassandra.

Original:
https://github.com/gianlucaborello/cassandradump 
<https://github.com/gianlucaborello/cassandradump>

My version which changes a number of things to make it a little easier to work 
with (and more mysqldump-like):
https://github.com/terwilliger42/cassandradump/commits/develop-tw42 
<https://github.com/terwilliger42/cassandradump/commits/develop-tw42>

- Updated for Python 3 (only works on Python 3; tested on Python 3.5) 
<https://github.com/terwilliger42/cassandradump/commit/3aab25d74eb373fdf279d8eec89df7d57b3e07df>
- Added progress bar
- Added default reading/writing to stdin/stdout
- Added ability to prompt for password
- --protocol-version no longer required for C* 2.x+
- Added capability to export/import into a different keyspace
- Split commandline args into two subparsers "import" and "export"
- Added --truncate option
- Added shorter CLI params, similar to mysqldump (-h for --host, -u for 
--username, -p for --password, etc)

- Max

> On Oct 25, 2018, at 5:07 am, Philip Ó Condúin <philipocond...@gmail.com> 
> wrote:
> 
> Hi Alain,
> 
> That is exactly what I did yesterday in the end.  I ran the selects and 
> output the results to a file, I ran some greps on that file to leave myself 
> with just the data rows removing any white space and headers.
> I then copied this data into a notepad on my local machine and saved it as a 
> csv.  Luckily the results of the selects were delimited by pipe "|" so I 
> imported the csv into a spreadsheet and was able to separate the values into 
> columns.
> 
> From here I was able to build up the insert statements and now have 4K insert 
> statements as a backup.
> 
> Thanks a lot for your reply.
> 
> Kind regards,
> Phil
> 
> On Thu, 25 Oct 2018 at 11:59, Alain RODRIGUEZ <arodr...@gmail.com 
> <mailto:arodr...@gmail.com>> wrote:
> 
> Does anyone have any ideas of what I can do to generate inserts based on 
> primary key numbers in an excel spreadsheet?
> 
> A quick thought:
> 
> What about using a column of the spreadsheet to actually store the SELECT 
> result and generate the INSERT statement (and I would probably do the DELETE 
> too) corresponding to each row using the power of the spreadsheet to write 
> the query once and have it for all the partitions with the proper values?
> 
> The spreadsheet would then be your backup somehow.
> 
> We are a bit far from any Cassandra advice, but that's my first thought on 
> your problem, use the spreadsheet :).
> Another option is probably to SELECT these rows and INSERT them into some 
> other Cassandra table (same cluster or not). Here you would have to code it I 
> think (client app of any kind)
> This might not a good fit, but just in case, you might want to check at the 
> 'COPY' statement: 
> https://stackoverflow.com/questions/21363046/how-to-select-data-from-a-table-and-insert-into-another-table
>  
> <https://stackoverflow.com/questions/21363046/how-to-select-data-from-a-table-and-insert-into-another-table>
> I'm not too sure what suits you the best.
> 
> C*heers,
> -----------------------
> Alain Rodriguez - al...@thelastpickle.com <mailto:al...@thelastpickle.com>
> France / Spain
> 
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com <http://www.thelastpickle.com/>
> Le mer. 24 oct. 2018 à 12:46, Philip Ó Condúin <philipocond...@gmail.com 
> <mailto:philipocond...@gmail.com>> a écrit :
> Hi All,
> 
> I have a problem that I'm trying to work out and can't find anything online 
> that may help me.
> 
> I have been asked to delete 4K records from a Column Family that has a total 
> of 1.8 million rows.  I have been given an excel spreadsheet with a list of 
> the 4K PRIMARY KEY numbers to be deleted.  Great, the delete will be easy 
> anyway.
> 
> But before I delete them I want to take a backup of what I'm deleting before 
> I do, so that if the customer comes along and says they got the wrong numbers 
> then I can quickly restore one or all of them.
> I have been trying to figure out how I can generate inserts from a select but 
> it looks like this is not possible.
> 
> I'm using centos and Cassandra 2.11
> 
> Does anyone have any ideas of what I can do to generate inserts based on 
> primary key numbers in an excel spreadsheet?
> 
> Kind Regards,
> Phil
> 
> 
> 
> 
> -- 
> Regards,
> Phil

Reply via email to