Re: copy from one table to another

2018-04-24 Thread Kyrylo Lebediev
Thank you, Rahul!


Re: copy from one table to another

2018-04-21 Thread Rahul Singh
That’s correct.


Re: copy from one table to another

2018-04-21 Thread Kyrylo Lebediev
You mean that the correct table UUID should be specified as the suffix of the
directory name?
For example:

Table:

cqlsh> select id from system_schema.tables where keyspace_name='test' and table_name='usr';

 id
--------------------------------------
 ea2f6da0-f931-11e7-8224-43ca70555242

Directory name:
./data/test/usr-ea2f6da0f93111e7822443ca70555242

Correct?

Regards,
Kyrill
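
The mapping shown in the example above (the table id with its dashes removed, appended to the table name) can be sketched in shell. This is only an illustration: table_dir_name is a hypothetical helper, not part of Cassandra or nodetool, and it assumes the id has already been read from system_schema.tables as in the query above.

```shell
# Hypothetical helper (not a Cassandra tool): build the on-disk directory
# name for a table from its name and the id stored in system_schema.tables.
table_dir_name() {
    table_name=$1
    table_id=$2
    # The directory suffix is the table id with the dashes removed.
    printf '%s-%s\n' "$table_name" "$(printf '%s' "$table_id" | tr -d '-')"
}

# Values taken from the example in this message.
table_dir_name usr ea2f6da0-f931-11e7-8224-43ca70555242
# prints usr-ea2f6da0f93111e7822443ca70555242
```

Comparing this value with the directory that actually exists under the keyspace's data directory is a quick way to check that the files are going into the directory of the newly created table.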



Re: copy from one table to another

2018-04-19 Thread Rahul Singh
Each table has a different GUID. A hard link may work as long as the GUID in 
the SSTable directory name is the same as that of the newly created table in 
the system schema.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation


Re: copy from one table to another

2018-04-19 Thread Kyrylo Lebediev
The table is too large to be copied quickly or efficiently, so I'd like to take 
advantage of the immutability of SSTables.

My idea is to:

1) create a new empty table (NewTable) with the same structure as the existing 
one (OldTable)
2) at some point run 'nodetool snapshot -t ttt  OldTable' simultaneously on all 
nodes -- this will capture a point-in-time state of OldTable
3) on each node run:
   for each file in the OldTable ttt snapshot directory:

 ln 
//OldTable-/snapshots/ttt/_OldTable_xx 
.//Newtable/_NewTable_x

 then:
 nodetool refresh  NewTable

4) nodetool repair NewTable
5) use OldTable and NewTable independently (read/write)

Are there any issues with using hard links (ln) instead of copying (cp) in this 
case?

Thanks,
Kyrill
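
The per-file loop in step 3 above can be sketched as a small shell function. This is an illustration under stated assumptions, not a tested procedure: link_snapshot is a hypothetical helper, both directory paths are supplied by the caller, and file names are kept unchanged (whether SSTable files need to be renamed depends on the file naming scheme of the Cassandra version in use). Hard links only work within a single filesystem, so the snapshot and the new table's directory have to be on the same volume.

```shell
# Hypothetical helper: hard-link every regular file from a snapshot
# directory into the new table's data directory. Both paths come from the
# caller; nothing here assumes a particular Cassandra data layout.
link_snapshot() {
    snapshot_dir=$1
    target_dir=$2
    for f in "$snapshot_dir"/*; do
        # Skip subdirectories, and the unexpanded pattern if the dir is empty.
        [ -f "$f" ] || continue
        # Assumption: the file name is reused as-is in the target directory.
        ln "$f" "$target_dir/$(basename "$f")" || return 1
    done
}

# Usage (paths are placeholders to be filled in per node):
#   link_snapshot /path/to/OldTable/snapshots/ttt /path/to/NewTable
```

The function stops at the first failed ln, so a partially linked directory is reported instead of being silently passed on to nodetool refresh. Snapshot directories may also contain files that are not SSTable components; this sketch does not filter them out.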




Re: copy from one table to another

2018-04-17 Thread Rahul Singh
1. Make a new table with the same schema.
For each node:
2. Shut down the node.
3. Copy the data from the source SSTable directory to the new SSTable directory.

This will do what you want.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation



Re: copy from one table to another

2018-04-16 Thread Kyrylo Lebediev
Thanks, Ali.
I just need to copy a large table in production without actually copying the 
data, by using hard links. After this, both tables should be usable 
independently (read/write). Is this a supported approach or not?

Regards,
Kyrill


From: Kyrylo Lebediev <kyrylo_lebed...@epam.com>
Sent: 04/16/2018 10:37 AM
Reply-To: user@cassandra.apache.org
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: copy from one table to another

Any issues if we:

1) create a new empty table with the same structure as the old one
2) create hardlinks ("ln without -s"): 
.../-/--* ---> 
.../-/--*
3) run nodetool refresh -- newkeyspacename newtable

and then query/modify both tables independently/simultaneously?

In theory, as SSTables are immutable, this should work, but could there be some 
hidden issues?

Regards,
Kyrill



From: Dmitry Saprykin <saprykin.dmi...@gmail.com>
Sent: Sunday, April 8, 2018 7:33:03 PM
To: user@cassandra.apache.org
Subject: Re: copy from one table to another

You can create hard links to ALL SSTables of the old table in the new table and 
then delete the part of the data you do not need in the new one.

On Sun, Apr 8, 2018 at 10:20 AM, Nitan Kainth <nitankai...@gmail.com> wrote:
If it is for testing and you don’t need any specific data, just copy a set of 
SSTables with all files of that sequence, move them to the target table's 
directory, and rename them.

Then restart the target node or run nodetool refresh.

Sent from my iPhone

On Apr 8, 2018, at 4:15 AM, onmstester onmstester <onmstes...@zoho.com> wrote:

Is there any way to copy some part of a table to another table in Cassandra? A 
large amount of data needs to be copied, so I don't want to fetch the data to 
the client and stream it back to Cassandra using CQL.

Sent using Zoho Mail





Re: copy from one table to another

2018-04-16 Thread Ali Hubail
If you want to copy a portion of the data to another table, you can also 
use the SSTable CQL writer. It is more of an advanced feature and can be 
tricky, but it is doable.
Once you have written the new SSTables, you can use sstableloader to 
stream the new data into the new table.
Check this out:
https://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated

I recently used this to clean up 500 GB worth of SSTable data in order to 
purge tombstones that were mistakenly generated by the client.
Obviously this is not as fast as hard links + refresh, but it's much faster 
and more efficient than using CQL to copy data across the tables.
Take advantage of CQLSSTableWriter.builder().sorted() if you can, and 
utilize writetime if you have to.

Ali Hubail










Re: copy from one table to another

2018-04-16 Thread Kyrylo Lebediev
Any issues if we:


1) create a new empty table with the same structure as the old one

2) create hardlinks ("ln without -s"): 
.../-/--* ---> 
.../-/--*

3) run nodetool refresh -- newkeyspacename newtable


and then query/modify both tables independently/simultaneously?


In theory, as SSTables are immutable, this should work, but could there be some 
hidden issues?
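
For illustration, step 2 could be scripted roughly as below. This is only a sketch under stated assumptions: the function name hardlink_sstables and both directory arguments are hypothetical, and the source and target directories are assumed to sit on the same filesystem, because hard links cannot cross filesystems. The files are linked under their existing names; whether they also need renaming depends on the SSTable file naming scheme of the Cassandra version in use.

```python
import os
from pathlib import Path


def hardlink_sstables(src_dir, dst_dir):
    """Hard-link every regular file in src_dir into dst_dir.

    Returns the list of newly created paths. No data is copied: each
    new entry points at the same inode as the source file.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    linked = []
    for f in sorted(src.iterdir()):
        if not f.is_file():
            continue  # skip sub-directories such as snapshots/ or backups/
        target = dst / f.name
        os.link(f, target)  # raises OSError if src and dst are on different filesystems
        linked.append(target)
    return linked
```

Since a hard link is just a second directory entry for the same inode, deleting the file from one directory (for example when one table compacts it away) leaves the other directory's entry readable.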


Regards,

Kyrill


From: Dmitry Saprykin <saprykin.dmi...@gmail.com>
Sent: Sunday, April 8, 2018 7:33:03 PM
To: user@cassandra.apache.org
Subject: Re: copy from one table to another

You can copy hardlinks to ALL SSTables from old to new table and then delete 
part of data you do not need in a new one.

On Sun, Apr 8, 2018 at 10:20 AM, Nitan Kainth <nitankai...@gmail.com> wrote:
If it for testing and you don’t need any specific data, just copy a set of 
sstables with all files of that sequence and move to target tables directory 
and rename it.

Restart target node or run nodetool refresh

Sent from my iPhone

On Apr 8, 2018, at 4:15 AM, onmstester onmstester <onmstes...@zoho.com> wrote:

Is there any way to copy some part of a table to another table in cassandra? A 
large amount of data should be copied so i don't want to fetch data to client 
and stream it back to cassandra using cql.







Re: copy from one table to another

2018-04-08 Thread Christophe Schmitz
If you need this kind of logic, you might want to consider using Spark.
It's often used for data migration.
You could load your list of partition keys into a Spark RDD, then
use joinWithCassandraTable, and write the result back to your destination
table.
Just before the join, you could use repartitionByCassandraReplica on your
RDD to get better data locality.
This documentation can be helpful:
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/2_loading.md#performing-efficient-joins-with-cassandra-tables-since-12

Hope it helps

Cheers,
Christophe

On 9 April 2018 at 13:09, onmstester onmstester wrote:

> Thank you all
> I need something like this:
> insert into table test2 select * from test1 where
> partition_key='SOME_KEYS';
> The problem with copying sstable is that original table contains some
> billions of records and i only want some hundred millions of records from
> the table, so after copy/pasting big sstables in so many nodes i should
> wait for a deletion that would take so long to response:
> delete from test2 where partition_key != 'SOME_KEYS'
>
>
>
> On Mon, 09 Apr 2018 06:14:02 +0430 Dmitry Saprykin wrote:
>
> IMHO The best step by step description of what you need to do is here
>
> https://issues.apache.org/jira/browse/CASSANDRA-1585?focusedCommentId=13488959&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13488959
>
> The only difference is that you need to copy data from one table only. I
> did it for a whole keyspace.
>
>
>
>
> On Sun, Apr 8, 2018 at 3:06 PM Jean Carlo 
> wrote:
>
> You can use the same procedure to restore a table from snapshot from
> datastax webpage
>
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html
> Just two modifications.
>
> after step 5, modify the name of the sstables to add the name of the table
> you want to copy to.
>
> and in step 6, copy the sstables to the right directory corresponding
> to the table you want to copy to.
>
> Be sure you have a snapshot of the source table, and ignore step 4 of
> course
>
>
> Saludos
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
> On Sun, Apr 8, 2018 at 6:33 PM, Dmitry Saprykin wrote:
>
> You can copy hardlinks to ALL SSTables from old to new table and then
> delete part of data you do not need in a new one.
>
> On Sun, Apr 8, 2018 at 10:20 AM, Nitan Kainth 
> wrote:
>
> If it for testing and you don’t need any specific data, just copy a set of
> sstables with all files of that sequence and move to target tables
> directory and rename it.
>
> Restart target node or run nodetool refresh
>
> Sent from my iPhone
>
> On Apr 8, 2018, at 4:15 AM, onmstester onmstester 
> wrote:
>
> Is there any way to copy some part of a table to another table in
> cassandra? A large amount of data should be copied so i don't want to fetch
> data to client and stream it back to cassandra using cql.
>
>
>
>
>


-- 

*Christophe Schmitz - **VP Consulting*

AU: +61 4 03751980 / FR: +33 7 82022899

   




Re: copy from one table to another

2018-04-08 Thread onmstester onmstester
Thank you all.

I need something like this:

insert into table test2 select * from test1 where partition_key='SOME_KEYS';

The problem with copying SSTables is that the original table contains billions 
of records and I only want a few hundred million of them, so after copying the 
big SSTables on so many nodes I would have to wait for a deletion that would 
take very long to complete:

delete from test2 where partition_key != 'SOME_KEYS'


On Mon, 09 Apr 2018 06:14:02 +0430 Dmitry Saprykin <saprykin.dmi...@gmail.com> wrote:

IMHO The best step by step description of what you need to do is here

https://issues.apache.org/jira/browse/CASSANDRA-1585?focusedCommentId=13488959&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13488959

The only difference is that you need to copy data from one table only. I did it 
for a whole keyspace.

On Sun, Apr 8, 2018 at 3:06 PM Jean Carlo <jean.jeancar...@gmail.com> wrote:

You can use the same procedure to restore a table from snapshot from datastax 
webpage

https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html

Just two modifications.

after step 5, modify the name of the sstables to add the name of the table you 
want to copy to.

and in step 6, copy the sstables to the right directory corresponding to the 
table you want to copy to.

Be sure you have a snapshot of the source table, and ignore step 4 of course

Saludos

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay

On Sun, Apr 8, 2018 at 6:33 PM, Dmitry Saprykin <saprykin.dmi...@gmail.com> wrote:

You can copy hardlinks to ALL SSTables from old to new table and then delete 
part of data you do not need in a new one.

On Sun, Apr 8, 2018 at 10:20 AM, Nitan Kainth <nitankai...@gmail.com> wrote:

If it for testing and you don’t need any specific data, just copy a set of 
sstables with all files of that sequence and move to target tables directory 
and rename it.

Restart target node or run nodetool refresh

Sent from my iPhone

On Apr 8, 2018, at 4:15 AM, onmstester onmstester <onmstes...@zoho.com> wrote:

Is there any way to copy some part of a table to another table in cassandra? A 
large amount of data should be copied so i don't want to fetch data to client 
and stream it back to cassandra using cql.


Re: copy from one table to another

2018-04-08 Thread Dmitry Saprykin
IMHO The best step by step description of what you need to do is here

https://issues.apache.org/jira/browse/CASSANDRA-1585?focusedCommentId=13488959&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13488959

The only difference is that you need to copy data from one table only. I
did it for a whole keyspace.




On Sun, Apr 8, 2018 at 3:06 PM Jean Carlo wrote:

> You can use the same procedure to restore a table from snapshot from
> datastax webpage
>
>
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html
>
> Just two modifications.
>
> after step 5, modify the name of the sstables to add the name of the table
> you want to copy to.
>
> and in step 6, copy the sstables to the right directory corresponding
> to the table you want to copy to.
>
>
> Be sure you have a snapshot of the source table, and ignore step 4 of
> course
>
>
> Saludos
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
> On Sun, Apr 8, 2018 at 6:33 PM, Dmitry Saprykin wrote:
>
>> You can copy hardlinks to ALL SSTables from old to new table and then
>> delete part of data you do not need in a new one.
>>
>> On Sun, Apr 8, 2018 at 10:20 AM, Nitan Kainth 
>> wrote:
>>
>>> If it for testing and you don’t need any specific data, just copy a set
>>> of sstables with all files of that sequence and move to target tables
>>> directory and rename it.
>>>
>>> Restart target node or run nodetool refresh
>>>
>>> Sent from my iPhone
>>>
>>> On Apr 8, 2018, at 4:15 AM, onmstester onmstester 
>>> wrote:
>>>
>>> Is there any way to copy some part of a table to another table in
>>> cassandra? A large amount of data should be copied so i don't want to fetch
>>> data to client and stream it back to cassandra using cql.
>>>
>>>
>>>
>>>
>>
>


Re: copy from one table to another

2018-04-08 Thread Jean Carlo
You can use the same procedure as for restoring a table from a snapshot,
described on the DataStax webpage:

https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html

Just two modifications.

After step 5, rename the SSTables so that they carry the name of the table
you want to copy to.

In step 6, copy the SSTables to the directory that corresponds to
the table you want to copy to.


Be sure you have a snapshot of the source table, and ignore step 4 of
course.
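
The renaming in the first modification can be sketched as follows, assuming the older file naming scheme in which every SSTable file name starts with the keyspace and table name (for example test-usr-ka-1-Data.db). The helper name retarget_name and the sample names are made up for illustration; file names that do not embed the table name are returned unchanged.

```python
def retarget_name(filename, keyspace, old_table, new_table):
    """Swap the table component of an old-style SSTable file name.

    Assumes names of the form <keyspace>-<table>-<rest>; anything that
    does not start with that prefix is returned as-is.
    """
    old_prefix = f"{keyspace}-{old_table}-"
    if not filename.startswith(old_prefix):
        return filename  # name does not embed keyspace/table: leave untouched
    return f"{keyspace}-{new_table}-" + filename[len(old_prefix):]
```

The same helper can be applied to every component file of an SSTable (Data, Index, Filter and so on) so that the whole set stays consistent.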


Saludos

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay

On Sun, Apr 8, 2018 at 6:33 PM, Dmitry Saprykin wrote:

> You can copy hardlinks to ALL SSTables from old to new table and then
> delete part of data you do not need in a new one.
>
> On Sun, Apr 8, 2018 at 10:20 AM, Nitan Kainth 
> wrote:
>
>> If it for testing and you don’t need any specific data, just copy a set
>> of sstables with all files of that sequence and move to target tables
>> directory and rename it.
>>
>> Restart target node or run nodetool refresh
>>
>> Sent from my iPhone
>>
>> On Apr 8, 2018, at 4:15 AM, onmstester onmstester 
>> wrote:
>>
>> Is there any way to copy some part of a table to another table in
>> cassandra? A large amount of data should be copied so i don't want to fetch
>> data to client and stream it back to cassandra using cql.
>>
>>
>>
>>
>


Re: copy from one table to another

2018-04-08 Thread Dmitry Saprykin
You can create hard links to ALL SSTables of the old table in the new table's
directory and then delete the part of the data you do not need in the new one.

On Sun, Apr 8, 2018 at 10:20 AM, Nitan Kainth wrote:

> If it for testing and you don’t need any specific data, just copy a set of
> sstables with all files of that sequence and move to target tables
> directory and rename it.
>
> Restart target node or run nodetool refresh
>
> Sent from my iPhone
>
> On Apr 8, 2018, at 4:15 AM, onmstester onmstester 
> wrote:
>
> Is there any way to copy some part of a table to another table in
> cassandra? A large amount of data should be copied so i don't want to fetch
> data to client and stream it back to cassandra using cql.
>
>
>
>


Re: copy from one table to another

2018-04-08 Thread Nitan Kainth
If it is for testing and you don’t need any specific data, just copy a set of 
SSTables, with all files of that sequence, move them to the target table's 
directory, and rename them.

Restart target node or run nodetool refresh 
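
To make "all files of that sequence" concrete: one SSTable is stored as several component files (Data, Index, Filter and so on) that share a common name prefix and differ only in the last hyphen-separated field. The sketch below groups a directory listing by that prefix so that a whole SSTable can be copied together. The function name group_by_sstable is an illustrative assumption, not something taken from the thread or from Cassandra itself.

```python
from collections import defaultdict
from pathlib import Path


def group_by_sstable(directory):
    """Group SSTable component files by their shared name prefix.

    Returns a dict mapping each prefix (everything before the last '-')
    to the sorted list of file names that belong to that SSTable.
    """
    groups = defaultdict(list)
    for f in sorted(Path(directory).iterdir()):
        if not f.is_file() or "-" not in f.name:
            continue  # ignore sub-directories and unrelated files
        prefix, _component = f.name.rsplit("-", 1)
        groups[prefix].append(f.name)
    return dict(groups)
```

Copying every file listed under one key keeps the Data file together with its Index, Filter and other companions.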

Sent from my iPhone

> On Apr 8, 2018, at 4:15 AM, onmstester onmstester wrote:
> 
> Is there any way to copy some part of a table to another table in cassandra? 
> A large amount of data should be copied so i don't want to fetch data to 
> client and stream it back to cassandra using cql.
> 
> 
> 
> 


copy from one table to another

2018-04-08 Thread onmstester onmstester
Is there any way to copy part of a table to another table in Cassandra? A 
large amount of data has to be copied, so I don't want to fetch the data to the 
client and stream it back to Cassandra using CQL.



Sent using Zoho Mail