Re: read repair with consistency one

2018-04-21 Thread Ben Slater
I haven't checked the code to make sure this is still the case, but last
time I checked:
- For any read, if an inconsistency between replicas is detected, that
inconsistency will be repaired. This obviously wouldn’t apply with CL=ONE,
because you’re not reading multiple replicas to find inconsistencies.
- If read_repair_chance or dclocal_read_repair_chance is >0, then extra
replicas are checked as part of the query for the percentage of queries
specified by the chance setting. Again, if inconsistencies are found, they
are repaired. I expect this mechanism would still apply for CL=ONE.
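For reference, the chance-based mechanism is a per-table setting; a sketch of enabling it from cqlsh (keyspace/table names are placeholders, option names as in Cassandra 3.x CQL):

```sql
-- Roughly 10% of reads against this table will also check extra replicas
-- (including remote DCs), compare digests in the background, and repair
-- any mismatch found.
ALTER TABLE myks.mytable
    WITH read_repair_chance = 0.1
    AND dclocal_read_repair_chance = 0.0;
```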


Cheers
Ben

On Sat, 21 Apr 2018 at 22:20 Grzegorz Pietrusza 
wrote:

> I haven't asked about "regular" repairs. I just wanted to know how read
> repair behaves in my configuration (or is it doing anything at all).
>
> 2018-04-21 14:04 GMT+02:00 Rahul Singh :
>
>> Read repairs are one anti-entropy measure. Continuous repair is another.
>> If you do repairs via Reaper or your own method, it will resolve your
>> discrepancies.
>>
>> On Apr 21, 2018, 3:16 AM -0400, Grzegorz Pietrusza ,
>> wrote:
>>
>> Hi all
>>
>> I'm a bit confused with how read repair works in my case, which is:
>> - multiple DCs with RF 1 (NetworkTopologyStrategy)
>> - reads with consistency ONE
>>
>>
>> Article #1 says that read repair in fact runs RF reads for some
>> percentage of the requests. Let's say I have read_repair_chance = 0.1. Does
>> it mean that 10% of requests will be read in all DCs (digest) and processed
>> in the background?
>>
>> On the other hand, article #2 says that for consistency ONE read repair is
>> not performed. Does it mean that in my case read repair does not work at
>> all? Is there any way to enable read repair across DCs and stay with
>> consistency ONE for reads?
>>
>>
>> #1 https://www.datastax.com/dev/blog/common-mistakes-and-misconceptions
>> #2
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesReadRepair.html
>>
>> Regards
>> Grzegorz
>>
>>
> --


*Ben Slater*

*Chief Product Officer*

Read our latest technical blog posts here.

This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
and Instaclustr Inc (USA).

This email and any attachments may contain confidential and legally
privileged information.  If you are not the intended recipient, do not copy
or disclose its content, but please reply to this email immediately and
highlight the error to the sender and then immediately delete the message.


Re: Cassandra doesn't insert all rows

2018-04-21 Thread dinesh.jo...@yahoo.com.INVALID
Soheil,
As Jeff mentioned, you need to provide more information. There are no known
issues I can think of that would cause such behavior. It would be great if
you could provide us with a reduced test case so we can try to reproduce this
behavior, or at least help you debug the issue. Could you detail the
version of Cassandra, the number of nodes, the keyspace definition, the RF/CL,
and perhaps a bit of the client code that does the writes? Did you get any
errors on the client or server side? These details would help us help you
further.
Thanks,
Dinesh 

On Saturday, April 21, 2018, 11:06:12 AM PDT, Soheil Pourbafrani 
 wrote:  
 
I consume data from Kafka and insert it into a Cassandra cluster using the Java 
API. The table has 4 key columns, including a millisecond-based timestamp. But 
when executing the code, it only inserts 120 to 190 rows and ignores the rest 
of the incoming data!
What could be the cause of the problem? Bad insert logic, where the key fields 
overwrite data? Improper cluster configuration?

Re: Cassandra doesn't insert all rows

2018-04-21 Thread Jeff Jirsa
Impossible to guess with that info, but maybe one of:

- “Wrong” consistency level for reads or writes
- Incorrect primary key definition (you’re overwriting data you don’t realize 
you’re overwriting)

Less likely:
- Broken cluster where hosts are flapping and you’re missing data on read
- Using a version of Cassandra with bugs in short read protection
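The primary-key possibility is easy to demonstrate: with a toy schema (names made up here), two inserts sharing the full primary key upsert into a single row:

```sql
CREATE TABLE myks.events (
    sensor_id int,
    ts timestamp,
    value double,
    PRIMARY KEY (sensor_id, ts)
);

-- Same (sensor_id, ts) in both statements: the second INSERT silently
-- overwrites the first, so only one row remains. If many Kafka records
-- share a key, the row count stays far below the record count.
INSERT INTO myks.events (sensor_id, ts, value) VALUES (1, '2018-04-21 10:00:00+0000', 1.5);
INSERT INTO myks.events (sensor_id, ts, value) VALUES (1, '2018-04-21 10:00:00+0000', 2.5);

SELECT count(*) FROM myks.events WHERE sensor_id = 1;  -- 1, not 2
```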

-- 
Jeff Jirsa


> On Apr 21, 2018, at 2:05 PM, Soheil Pourbafrani  wrote:
> 
> I consume data from Kafka and insert it into a Cassandra cluster using the Java 
> API. The table has 4 key columns, including a millisecond-based timestamp. But 
> when executing the code, it only inserts 120 to 190 rows and ignores the rest 
> of the incoming data!
> 
> What could be the cause of the problem? Bad insert logic, where the key fields 
> overwrite data? Improper cluster configuration?

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Cassandra doesn't insert all rows

2018-04-21 Thread Soheil Pourbafrani
I consume data from Kafka and insert it into a Cassandra cluster using the Java
API. The table has 4 key columns, including a millisecond-based timestamp. But
when executing the code, it only inserts 120 to 190 rows and ignores the rest
of the incoming data!

What could be the cause of the problem? Bad insert logic, where the key fields
overwrite data? Improper cluster configuration?


Re: How to configure Cassandra to NOT use SSLv2?

2018-04-21 Thread Lou DeGenaro
3.0.9

On Fri, Apr 20, 2018 at 10:26 PM, Michael Shuler 
wrote:

> On 04/20/2018 08:46 AM, Lou DeGenaro wrote:
> > Could you be more specific?  What does one specify exactly to assure
> > SSLv2 is not used for both client-server and server-server
> > communications?  Example yaml statements would be wonderful.
>
> The defaults in cassandra.yaml have only TLS specified in the current
> branch HEADs. I'm pretty sure SSLv2/3 removal was a post-POODLE commit.
> It's possible you may be on something older - what version are we
> talking about?
>
> --
> Michael
>

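As Michael says, the current defaults only offer TLS, but the protocol can also be pinned explicitly in cassandra.yaml; a hedged sketch for a 3.0-era config (option names as in the shipped 3.0 defaults; paths and passwords are placeholders — verify against your own version's yaml):

```yaml
server_encryption_options:
    internode_encryption: all
    keystore: conf/.keystore
    keystore_password: changeme
    truststore: conf/.truststore
    truststore_password: changeme
    protocol: TLS            # SSLv2/SSLv3 are not offered

client_encryption_options:
    enabled: true
    keystore: conf/.keystore
    keystore_password: changeme
    protocol: TLS
```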

Re: read repair with consistency one

2018-04-21 Thread Grzegorz Pietrusza
I haven't asked about "regular" repairs. I just wanted to know how read
repair behaves in my configuration (or is it doing anything at all).

2018-04-21 14:04 GMT+02:00 Rahul Singh :

> Read repairs are one anti-entropy measure. Continuous repair is another.
> If you do repairs via Reaper or your own method, it will resolve your
> discrepancies.
>
> On Apr 21, 2018, 3:16 AM -0400, Grzegorz Pietrusza ,
> wrote:
>
> Hi all
>
> I'm a bit confused with how read repair works in my case, which is:
> - multiple DCs with RF 1 (NetworkTopologyStrategy)
> - reads with consistency ONE
>
>
> Article #1 says that read repair in fact runs RF reads for some
> percentage of the requests. Let's say I have read_repair_chance = 0.1. Does
> it mean that 10% of requests will be read in all DCs (digest) and processed
> in the background?
>
> On the other hand, article #2 says that for consistency ONE read repair is
> not performed. Does it mean that in my case read repair does not work at
> all? Is there any way to enable read repair across DCs and stay with
> consistency ONE for reads?
>
>
> #1 https://www.datastax.com/dev/blog/common-mistakes-and-misconceptions
> #2 https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesReadRepair.html
>
> Regards
> Grzegorz
>
>


Re: read repair with consistency one

2018-04-21 Thread Rahul Singh
Read repairs are one anti-entropy measure. Continuous repair is another. If 
you do repairs via Reaper or your own method, it will resolve your discrepancies.

On Apr 21, 2018, 3:16 AM -0400, Grzegorz Pietrusza , 
wrote:
> Hi all
>
> I'm a bit confused with how read repair works in my case, which is:
> - multiple DCs with RF 1 (NetworkTopologyStrategy)
> - reads with consistency ONE
>
>
> Article #1 says that read repair in fact runs RF reads for some percentage 
> of the requests. Let's say I have read_repair_chance = 0.1. Does it mean that 
> 10% of requests will be read in all DCs (digest) and processed in the 
> background?
>
> On the other hand, article #2 says that for consistency ONE read repair is not 
> performed. Does it mean that in my case read repair does not work at all? Is 
> there any way to enable read repair across DCs and stay with consistency ONE 
> for reads?
>
>
> #1 https://www.datastax.com/dev/blog/common-mistakes-and-misconceptions
> #2 
> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesReadRepair.html
>
> Regards
> Grzegorz


Re: copy from one table to another

2018-04-21 Thread Rahul Singh
That’s correct.
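Concretely, the directory suffix is just the table id with the hyphens stripped; a quick shell check using the id from the quoted example:

```shell
# Table id as reported by system_schema.tables
id="ea2f6da0-f931-11e7-8224-43ca70555242"
# Directory suffix = id without hyphens
suffix=$(printf '%s' "$id" | tr -d '-')
echo "usr-${suffix}"   # → usr-ea2f6da0f93111e7822443ca70555242
```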

On Apr 21, 2018, 5:05 AM -0400, Kyrylo Lebediev , 
wrote:
> You mean that the correct table UUID should be specified as the suffix in the 
> directory name?
> For example:
>
> Table:
>
> cqlsh> select id from system_schema.tables where keyspace_name='test' and 
> table_name='usr';
>
>  id
> --
>  ea2f6da0-f931-11e7-8224-43ca70555242
>
>
> Directory name:
> ./data/test/usr-ea2f6da0f93111e7822443ca70555242
>
> Correct?
>
> Regards,
> Kyrill
> From: Rahul Singh 
> Sent: Thursday, April 19, 2018 10:53:11 PM
> To: user@cassandra.apache.org
> Subject: Re: copy from one table to another
>
> Each table has a different GUID — doing a hard link may work as long as the 
> sstable dir’s GUID is the same as the newly created table’s in the system schema.
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Apr 19, 2018, 10:41 AM -0500, Kyrylo Lebediev , 
> wrote:
> > The table is too large to be copied quickly/efficiently, so I'd like to 
> > leverage the immutability property of SSTables.
> >
> > My idea is to:
> > 1) create new empty table (NewTable) with the same structure as existing 
> > one (OldTable)
> > 2) at some time run simultaneous 'nodetool snapshot -t ttt  
> > OldTable' on all nodes -- this will create point in time state of OldTable
> > 3) on each node run:
> >        for each file in OldTable ttt snapshot directory:
> >  ln 
> > //OldTable-/snapshots/ttt/_OldTable_xx 
> > .//Newtable/_NewTable_x
> >  then:
> >  nodetool refresh  NewTable
> > 4) nodetool repair NewTable
> > 5) Use OldTable and NewTable independently (Read/Write)
> >
> > Are there any issues with using hardlinks (ln) instead of copying (cp) in 
> > this case?
> >
> > Thanks,
> > Kyrill
> >
> > From: Rahul Singh 
> > Sent: Wednesday, April 18, 2018 2:07:17 AM
> > To: user@cassandra.apache.org
> > Subject: Re: copy from one table to another
> >
> > 1. Make a new table with the same schema.
> > For each node
> > 2. Shutdown node
> > 3. Copy data from Source sstable dir to new sstable dir.
> >
> > This will do what you want.
> >
> > --
> > Rahul Singh
> > rahul.si...@anant.us
> >
> > Anant Corporation
> >
> > On Apr 16, 2018, 4:21 PM -0500, Kyrylo Lebediev , 
> > wrote:
> > > Thanks,  Ali.
> > > I just need to copy a large table in production without actual copying by 
> > > using hardlinks. After this both tables should be used independently 
> > > (RW). Is this a supported way or not?
> > >
> > > Regards,
> > > Kyrill
> > > From: Ali Hubail 
> > > Sent: Monday, April 16, 2018 6:51:51 PM
> > > To: user@cassandra.apache.org
> > > Subject: Re: copy from one table to another
> > >
> > > If you want to copy a portion of the data to another table, you can also 
> > > use the SSTable CQL writer. It is more of an advanced feature and can be 
> > > tricky, but doable.
> > > Once you write the new sstables, you can then use sstableloader to 
> > > stream the new data into the new table.
> > > Check this out:
> > > https://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated
> > >
> > > I have recently used this to clean up 500 GB worth of sstable data in 
> > > order to purge tombstones that were mistakenly generated by the client.
> > > Obviously this is not as fast as hardlinks + refresh, but it's much 
> > > faster and more efficient than using CQL to copy data across the tables.
> > > Take advantage of CQLSSTableWriter.builder.sorted() if you can, and 
> > > utilize writetime if you have to.
> > >
> > > Ali Hubail
> > >
> > > Confidentiality warning: This message and any attachments are intended 
> > > only for the persons to whom this message is addressed, are confidential, 
> > > and may be privileged. If you are not the intended recipient, you are 
> > > hereby notified that any review, retransmission, conversion to hard copy, 
> > > copying, modification, circulation or other use of this message and any 
> > > attachments is strictly prohibited. If you receive this message in error, 
> > > please notify the sender immediately by return email, and delete this 
> > > message and any attachments from your system. Petrolink International 
> > > Limited its subsidiaries, holding companies and affiliates disclaims all 
> > > responsibility from and accepts no liability whatsoever for the 
> > > consequences of any unauthorized person acting, or refraining from 
> > > acting, on any information contained in this message. For security 
> > > purposes, staff training, to assist in resolving complaints and to 
> > > improve our customer service, email communications may be monitored and 
> > > telephone calls may be recorded.
> > >
> > >
> > > Kyrylo Lebediev wrote on 04/16/2018 10:37 AM:

Re: copy from one table to another

2018-04-21 Thread Kyrylo Lebediev
You mean that the correct table UUID should be specified as the suffix in the 
directory name?
For example:


Table:


cqlsh> select id from system_schema.tables where keyspace_name='test' and 
table_name='usr';

 id
--
 ea2f6da0-f931-11e7-8224-43ca70555242


Directory name:
./data/test/usr-ea2f6da0f93111e7822443ca70555242


Correct?


Regards,

Kyrill


From: Rahul Singh 
Sent: Thursday, April 19, 2018 10:53:11 PM
To: user@cassandra.apache.org
Subject: Re: copy from one table to another

Each table has a different GUID — doing a hard link may work as long as the 
sstable dir’s GUID is the same as the newly created table’s in the system schema.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Apr 19, 2018, 10:41 AM -0500, Kyrylo Lebediev , 
wrote:

The table is too large to be copied quickly/efficiently, so I'd like to leverage 
the immutability property of SSTables.

My idea is to:

1) create new empty table (NewTable) with the same structure as existing one 
(OldTable)
2) at some time run simultaneous 'nodetool snapshot -t ttt  OldTable' 
on all nodes -- this will create point in time state of OldTable

3) on each node run:
   for each file in OldTable ttt snapshot directory:

 ln 
//OldTable-/snapshots/ttt/_OldTable_xx 
.//Newtable/_NewTable_x

 then:
 nodetool refresh  NewTable

4) nodetool repair NewTable
5) Use OldTable and NewTable independently (Read/Write)
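The hard-link step above can be sketched with throwaway directories to show what ln does here (real paths live under the Cassandra data directory, e.g. /var/lib/cassandra/data/<keyspace>/<table>-<id>/; the nodetool steps are left as comments because they only apply on a live node):

```shell
# Stand-ins for OldTable's snapshot dir and NewTable's data dir.
snap_dir=$(mktemp -d)
new_dir=$(mktemp -d)

# A fake sstable component sitting in the snapshot.
echo "sstable bytes" > "$snap_dir/mc-1-big-Data.db"

# Hard-link every snapshot file into the new table's directory:
# same inode, no data copied, and the immutable sstable is now
# reachable from both tables.
for f in "$snap_dir"/*; do
    ln "$f" "$new_dir/$(basename "$f")"
done

ls "$new_dir"   # → mc-1-big-Data.db
# On the real node, afterwards:
#   nodetool refresh <keyspace> NewTable
#   nodetool repair <keyspace> NewTable
```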


Are there any issues with using hardlinks (ln) instead of copying (cp) in this 
case?


Thanks,

Kyrill



From: Rahul Singh 
Sent: Wednesday, April 18, 2018 2:07:17 AM
To: user@cassandra.apache.org
Subject: Re: copy from one table to another

1. Make a new table with the same schema.
For each node
2. Shutdown node
3. Copy data from Source sstable dir to new sstable dir.

This will do what you want.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Apr 16, 2018, 4:21 PM -0500, Kyrylo Lebediev , 
wrote:
Thanks, Ali.
I just need to copy a large table in production without actual copying, by using 
hardlinks. After this, both tables should be used independently (RW). Is this a 
supported way or not?

Regards,
Kyrill

From: Ali Hubail 
Sent: Monday, April 16, 2018 6:51:51 PM
To: user@cassandra.apache.org
Subject: Re: copy from one table to another

If you want to copy a portion of the data to another table, you can also use 
the SSTable CQL writer. It is more of an advanced feature and can be tricky, but 
doable.
Once you write the new sstables, you can then use sstableloader to stream 
the new data into the new table.
Check this out:
https://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated

I have recently used this to clean up 500 GB worth of sstable data in order to 
purge tombstones that were mistakenly generated by the client.
Obviously this is not as fast as hardlinks + refresh, but it's much faster and 
more efficient than using CQL to copy data across the tables.
Take advantage of CQLSSTableWriter.builder.sorted() if you can, and utilize 
writetime if you have to.
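The streaming step mentioned above is a plain sstableloader invocation pointed at the directory holding the newly written sstables (hosts and path are placeholders; the last two path components must be the keyspace and table names):

```shell
sstableloader -d 10.0.0.1,10.0.0.2 /tmp/bulkload/myks/mytable
```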

Ali Hubail



Kyrylo Lebediev wrote on 04/16/2018 10:37 AM (To: user@cassandra.apache.org, Subject: Re: copy from one table to another):

Any issues if we:

1) create a new empty table with the same structure as the old one
2) create hard links (ln without -s):
.../-/--* ---> 
.../-/--*
3) run nodetool refresh -- newkeyspacename newtable

and then query/modify both tables independently/simultaneously?

In theory, as SSTables are immutable, this should work, but could there be some 
hidden issues?

Regards,
Kyrill


read repair with consistency one

2018-04-21 Thread Grzegorz Pietrusza
Hi all

I'm a bit confused with how read repair works in my case, which is:
- multiple DCs with RF 1 (NetworkTopologyStrategy)
- reads with consistency ONE


Article #1 says that read repair in fact runs RF reads for some percentage
of the requests. Let's say I have read_repair_chance = 0.1. Does it mean
that 10% of requests will be read in all DCs (digest) and processed in the
background?

On the other hand, article #2 says that for consistency ONE read repair is
not performed. Does it mean that in my case read repair does not work at
all? Is there any way to enable read repair across DCs and stay with
consistency ONE for reads?


#1 https://www.datastax.com/dev/blog/common-mistakes-and-misconceptions
#2
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesReadRepair.html

Regards
Grzegorz
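As a side note to the question above, the chance settings actually in effect for a table can be read back from the schema tables (keyspace/table names are placeholders; column names as in Cassandra 3.x):

```sql
SELECT read_repair_chance, dclocal_read_repair_chance
FROM system_schema.tables
WHERE keyspace_name = 'myks' AND table_name = 'mytable';
```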