Re: Flakey Dtests

2017-11-16 Thread Michael Kjellman
Quick update re: dtests and off-heap memtables:

I’ve filed CASSANDRA-14056 (Many dtests fail with ConfigurationException: 
offheap_objects are not available in 3.0 when OFFHEAP_MEMTABLES=“true”)

Looks like we’re gonna need to do some work to test this configuration and 
right now it’s pretty broken...
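
For anyone picking this up, here is a minimal sketch of the kind of guard a test harness could apply before writing the option into cassandra.yaml. The helper name and the version tuple format are hypothetical and not taken from cassandra-dtest; the only thing taken from the ticket is that offheap_objects is rejected on 3.0. Falling back to heap_buffers is an assumption.

```python
# Hypothetical helper, not part of cassandra-dtest.
def memtable_allocation_type(version, offheap_requested, fallback="heap_buffers"):
    """Pick a memtable_allocation_type for a test cluster.

    version is a tuple such as (3, 0, 15).
    Assumption: only the 3.0 line rejects offheap_objects, as the
    ConfigurationException in CASSANDRA-14056 says.
    """
    if not offheap_requested:
        return fallback
    if tuple(version[:2]) == (3, 0):
        # offheap_objects would fail startup here, so do not request it.
        return fallback
    return "offheap_objects"


print(memtable_allocation_type((3, 0, 15), offheap_requested=True))
print(memtable_allocation_type((4, 0, 0), offheap_requested=True))
```

Whether such tests should fall back silently or be skipped on 3.0 is a separate decision for the ticket.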

Do we have any volunteers to fix the broken Materialized Views and CDC DTests?

best,
kjellman


> On Nov 15, 2017, at 5:59 PM, Michael Kjellman  
> wrote:
> 
> yes - true - some are flaky, but almost all of the ones i filed fail 100% 
> of the time. i look forward to triaging just the remaining flaky ones 
> (hopefully - without powers combined - by the end of this month!!)
> 
> appreciate everyone’s help - no matter how small... i already personally did 
> a few “fun” random-python-class-is-missing-return-after-method stuff. 
> 
> we’ve wanted this for a while and now is our time to actually execute and 
> make good on our previous dev list promises. 
> 
> best,
> kjellman
> 
>> On Nov 15, 2017, at 5:45 PM, Jeff Jirsa  wrote:
>> 
>> In lieu of a weekly wrap-up, here's a pre-Thanksgiving call for help.
>> 
>> If you haven't been paying attention to JIRA, you likely didn't notice that
>> Josh went through and triaged/categorized a bunch of issues by adding
>> components, and Michael took the time to open a bunch of JIRAs for failing
>> tests.
>> 
>> How many is a bunch? Something like 35 or so just for tests currently
>> failing on trunk.  If you're a regular contributor, you already know that
>> dtests are flakey - it'd be great if a few of us can go through and fix a
>> few. Even incremental improvements are improvements. Here's an easy search
>> to find them:
>> 
>> https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true=project+%3D+CASSANDRA+AND+component+%3D+Testing+ORDER+BY+updated+DESC%2C+priority+DESC%2C+created+ASC=hide
>> 
>> If you're a new contributor, fixing tests is often a good way to learn a
>> new part of the codebase. Many of these are dtests, which live in a
>> different repo ( https://github.com/apache/cassandra-dtest ) and are in
>> python, but have no fear, the repo has instructions for setting up and
>> running dtests (
>> https://github.com/apache/cassandra-dtest/blob/master/INSTALL.md )
>> 
>> Normal contribution workflow applies: self-assign the ticket if you want to
>> work on it, click on 'start progress' to indicate that you're working on
>> it, mark it 'patch available' when you've uploaded code to be reviewed (in
>> a github branch, or as a standalone patch file attached to the JIRA). If
>> you have questions, feel free to email the dev list (that's what it's here
>> for).
>> 
>> Many thanks will be given,
>> - Jeff



Re: custom validation before replication

2017-11-16 Thread Jeff Jirsa
What you're wanting really isn't supported/available, but you could
probably extend cassandra to do this with some work.

Doing this at replication time is the wrong point, though -  you want to do
it before the mutation is applied locally, so triggers are still the
closest to the right point as it exists now.

If you let it apply locally and then try to stop replication, you'll have
to also fight:
- Commitlog replay
- Read repair / consistency levels
- Anti-entropy repair
- Hints
etc





On Thu, Nov 16, 2017 at 1:36 PM, Abdelkrim Fitouri 
wrote:

> ok, please find below an example:
>
> Let's suppose that I have a Cassandra cluster of 4 nodes / one DC /
> replication factor = 4, so in this architecture I have one full copy of the
> data on each node.
>
> Imagine now that one node has been hacked and the attacker somehow has full
> access to a cqlsh session. If data is changed on that node, it will be
> changed on the three others, am I right?
>
> Imagine now that I am able to know (using cryptographic means) whether a
> column was modified by my API ( => normal way) or not ( => suspicious way),
> and I want to execute this check function just before any replication of a
> keyspace, so that the other replicas are not affected. Otherwise a rollback
> will not be easy and the integrity of the whole system will be compromised.
> The check could, for example, kill the local Cassandra service ...
>
> Hope that my question is more clear now.
>
> Many thanks for any help.
>
> 2017-11-16 21:59 GMT+01:00 Nate McCall :
>
> > On Fri, Nov 17, 2017 at 9:11 AM, Abdelkrim Fitouri 
> > wrote:
> > > Trigger does not resolve my problem because it is not a format validation
> > > issue but an integrity constraint ...
> > >
> > > My purpose is to check data integrity before replication, by returning an
> > > error and killing the service, so i am killing the node that is supposed to
> > > replicate data after a write action ...
> >
> > I'm a little confused. Can you provide some specific examples of your
> > requirements?
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>
>
> --
>
> Cordialement / Best Regards.
>
> *Abdelkarim FITOURI*
>
> LPIC/CEH/ITIL
>
> System And Security Engineer
>


Re: custom validation before replication

2017-11-16 Thread Jon Haddad
Looks like you’ve got this thread going on the user & dev ML.  This list is the 
dev one, and is meant for discussion of the Cassandra project.  Would everyone 
mind replying to the thread of the same name on the user list instead?

> On Nov 16, 2017, at 1:36 PM, Abdelkrim Fitouri  wrote:
> 
> ok, please find below an example:
> 
> Let's suppose that I have a Cassandra cluster of 4 nodes / one DC /
> replication factor = 4, so in this architecture I have one full copy of the
> data on each node.
> 
> Imagine now that one node has been hacked and the attacker somehow has full
> access to a cqlsh session. If data is changed on that node, it will be
> changed on the three others, am I right?
> 
> Imagine now that I am able to know (using cryptographic means) whether a
> column was modified by my API ( => normal way) or not ( => suspicious way),
> and I want to execute this check function just before any replication of a
> keyspace, so that the other replicas are not affected. Otherwise a rollback
> will not be easy and the integrity of the whole system will be compromised.
> The check could, for example, kill the local Cassandra service ...
> 
> Hope that my question is more clear now.
> 
> Many thanks for any help.
> 
> 2017-11-16 21:59 GMT+01:00 Nate McCall :
> 
>> On Fri, Nov 17, 2017 at 9:11 AM, Abdelkrim Fitouri 
>> wrote:
>>> Trigger does not resolve my problem because it is not a format validation
>>> issue but an integrity constraint ...
>>> 
>>> My purpose is to check data integrity before replication, by returning an
>>> error and killing the service, so i am killing the node that is supposed to
>>> replicate data after a write action ...
>> 
>> I'm a little confused. Can you provide some specific examples of your
>> requirements?
>> 
>> 
>> 
> 
> 
> -- 
> 
> Cordialement / Best Regards.
> 
> *Abdelkarim FITOURI*
> 
> LPIC/CEH/ITIL
> 
> System And Security Engineer





Re: custom validation before replication

2017-11-16 Thread Abdelkrim Fitouri
ok, please find below an example:

Let's suppose that I have a Cassandra cluster of 4 nodes / one DC /
replication factor = 4, so in this architecture I have one full copy of the
data on each node.

Imagine now that one node has been hacked and the attacker somehow has full
access to a cqlsh session. If data is changed on that node, it will be
changed on the three others, am I right?

Imagine now that I am able to know (using cryptographic means) whether a
column was modified by my API ( => normal way) or not ( => suspicious way),
and I want to execute this check function just before any replication of a
keyspace, so that the other replicas are not affected. Otherwise a rollback
will not be easy and the integrity of the whole system will be compromised.
The check could, for example, kill the local Cassandra service ...

Hope that my question is more clear now.

Many thanks for any help.

2017-11-16 21:59 GMT+01:00 Nate McCall :

> On Fri, Nov 17, 2017 at 9:11 AM, Abdelkrim Fitouri 
> wrote:
> > Trigger does not resolve my problem because it is not a format validation
> > issue but an integrity constraint ...
> >
> > My purpose is to check data integrity before replication, by returning an
> > error and killing the service, so i am killing the node that is supposed to
> > replicate data after a write action ...
>
> I'm a little confused. Can you provide some specific examples of your
> requirements?
>
>
>


-- 

Cordialement / Best Regards.

*Abdelkarim FITOURI*

LPIC/CEH/ITIL

System And Security Engineer


RE: custom validation before replication

2017-11-16 Thread Abdelkrim Fitouri
Trigger does not resolve my problem because it is not a format validation
issue but an integrity constraint ...

My purpose is to check data integrity before replication, by returning an
error and killing the service, so i am killing the node that is supposed to
replicate data after a write action ...

Does that seem possible?

Many thanks.

On 16 Nov 2017 at 18:53, "Jacques-Henri Berthemet" <
jacques-henri.berthe...@genesys.com> wrote:

Hi,

You can't prevent the replication because if you manage to return a failure
the other node will keep trying to send the data. What would be more
relevant is to prevent the modification in the first place. You could try
to implement a custom trigger and load it in Cassandra:
http://cassandra.apache.org/doc/latest/cql/triggers.html
https://github.com/apache/cassandra/tree/cassandra-3.11/examples/triggers

In your trigger implementation, you'll need to validate the data and throw
an exception if it does not meet your security settings. However, I don't
think you'll have access to the current username/role at this level.

It may be simpler for you to work with regular authentication and roles:
http://cassandra.apache.org/doc/latest/cql/security.html

Regards,
--
Jacques-Henri Berthemet

-Original Message-
From: Abdelkrim Fitouri [mailto:abdou@gmail.com]
Sent: Thursday, 16 November 2017 18:31
To: dev@cassandra.apache.org
Subject: custom validation before replication

Hi,

I have some security constraints on a project, and I need to validate or
invalidate changes made on a keyspace, via CQL or by other means, before
replication.

For example, in the case of a multi-node cluster with replication, if data is
changed locally using cqlsh, the data will be replicated (that is the normal
way Cassandra works).

Is there a possibility to call a custom validation function just before
data replication?

Many thanks for any help.

--

Best Regards.



Re: custom validation before replication

2017-11-16 Thread Jeff Jirsa
Going to hate myself for this, but check out the trigger interface.

https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/triggers/ITrigger.java

Pay attention to the note that says the API is in beta and subject to
change. It's had that note for many years, which is an indication of how
infrequently anyone uses triggers.


On Thu, Nov 16, 2017 at 9:31 AM, Abdelkrim Fitouri 
wrote:

> Hi,
>
> I have some security constraints on a project, and I need to validate or
> invalidate changes made on a keyspace, via CQL or by other means, before
> replication.
>
> For example, in the case of a multi-node cluster with replication, if data is
> changed locally using cqlsh, the data will be replicated (that is the normal
> way Cassandra works).
>
> Is there a possibility to call a custom validation function just before
> data replication?
>
> Many thanks for any help.
>
> --
>
> Best Regards.
>


RE: custom validation before replication

2017-11-16 Thread Jacques-Henri Berthemet
Hi,

You can't prevent the replication because if you manage to return a failure the 
other node will keep trying to send the data. What would be more relevant is to 
prevent the modification in the first place. You could try to implement a 
custom trigger and load it in Cassandra:
http://cassandra.apache.org/doc/latest/cql/triggers.html
https://github.com/apache/cassandra/tree/cassandra-3.11/examples/triggers

In your trigger implementation, you'll need to validate the data and throw an 
exception if it does not meet your security settings. However, I don't think 
you'll have access to the current username/role at this level.
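
As an illustration of the "validate the data and throw an exception" step, here is a minimal sketch of the check itself. It is written in Python only to show the logic; a real Cassandra trigger is a Java class implementing ITrigger, and none of the names below come from Cassandra. It assumes the API signs each value with an HMAC over the row key and value, and that the validator shares the secret.

```python
import hashlib
import hmac


class IntegrityError(Exception):
    """Raised when a write does not carry a valid tag from the API."""


def sign(secret, row_key, value):
    # Length-prefix the key so (key, value) pairs cannot be confused.
    message = len(row_key).to_bytes(4, "big") + row_key + value
    return hmac.new(secret, message, hashlib.sha256).digest()


def validate_write(secret, row_key, value, tag):
    """Return the value if the tag matches, otherwise raise."""
    expected = sign(secret, row_key, value)
    # compare_digest avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, tag):
        raise IntegrityError("write to %r was not signed by the API" % row_key)
    return value


secret = b"hypothetical-shared-secret"
tag = sign(secret, b"someId", b"payload")
validate_write(secret, b"someId", b"payload", tag)       # accepted
try:
    validate_write(secret, b"someId", b"tampered", tag)  # rejected
except IntegrityError as err:
    print("rejected:", err)
```

The point of rejecting at this stage is that the modification never gets applied, so there is nothing to replicate.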

It may be simpler for you to work with regular authentication and roles:
http://cassandra.apache.org/doc/latest/cql/security.html

Regards,
--
Jacques-Henri Berthemet

-Original Message-
From: Abdelkrim Fitouri [mailto:abdou@gmail.com] 
Sent: Thursday, 16 November 2017 18:31
To: dev@cassandra.apache.org
Subject: custom validation before replication

Hi,

I have some security constraints on a project, and I need to validate or
invalidate changes made on a keyspace, via CQL or by other means, before
replication.

For example, in the case of a multi-node cluster with replication, if data is
changed locally using cqlsh, the data will be replicated (that is the normal
way Cassandra works).

Is there a possibility to call a custom validation function just before
data replication?

Many thanks for any help.

-- 

Best Regards.




Re: Possibly cassandra 3.0.9 bug?

2017-11-16 Thread Jeff Jirsa
1) There are a LOT of bugs in cassandra-3.0.9. Some of them are really bad,
you should definitely consider upgrading to 3.0.15.

2) The GC profile changed between 2.0 and 3.0. It may be that you're
generating a bit more garbage and causing GC pauses (especially likely
since you're using thrift), or it could be that you're hitting some other
bug.

3) It's also possible there are bad rows involved, or some row that's very
out of sync. You'll get read timeouts if (for example) you generate a read
repair mutation larger than your max mutation size.
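
To make point 3 concrete, here is a rough sketch of the size check. It assumes the stock settings, where commitlog_segment_size_in_mb is 32 and max_mutation_size_in_kb, when unset, is half a commitlog segment. The row sizes are made up for illustration, and summing payload bytes ignores serialization overhead, so treat the result as a lower bound.

```python
def max_mutation_size_bytes(commitlog_segment_size_mb=32, max_mutation_size_kb=None):
    """Effective mutation size limit in bytes."""
    if max_mutation_size_kb is not None:
        return max_mutation_size_kb * 1024
    # Unset: half of one commitlog segment.
    return commitlog_segment_size_mb * 1024 * 1024 // 2


def exceeds_limit(row_sizes_bytes, limit_bytes):
    """True if the combined row payload is larger than the limit."""
    return sum(row_sizes_bytes) > limit_bytes


limit = max_mutation_size_bytes()
row = 800 * 1024  # hypothetical 800 KiB JSON value per clustering row
print(limit)
print(exceeds_limit([row] * 20, limit))
print(exceeds_limit([row] * 24, limit))
```

With these made-up numbers a few extra rows are enough to cross the limit, which is the kind of cliff a "works with N, hangs with N+1" report can point to.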


On Wed, Nov 15, 2017 at 2:04 PM, Alex Circus 
wrote:

> Hi,
>
> *In short:*
> I use cassandra 3.0.9 in a cluster of 6 nodes.
> 1. I create a keyspace called test:
> CREATE KEYSPACE test WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
> 2. I create a table called test:
>
> CREATE TABLE test.test (
>   test_id bigint,
>   test_value text,
>   PRIMARY KEY (test_id)
> );
>
> 3. I insert test_id=23 and test_value=some very large string/html (like
> 406088 chars utf8).
>
> 4. I query for test_id=35 and I get a timeout (even with cqlsh
> --request-timeout=3600)...
>
> 5. If I run the above on an existing cassandra cluster with cassa 2.0 the
> select returns instantly. The Java heap size is 8GB and in JMX I see max
> 4GB used of these 8 GB in the new cluster.
>
>
> *Detailed:*
>
> The above was just a test. The real scenario is:
>
> I migrated some tables from an old cassa (2.0) cluster with 9 nodes into
> another with 6 nodes and with cassa 3.0.9 and there were a lot of
> problems.
>
> I have a table like this:
>
> CREATE TABLE table (
>   id text,
>   ts text,
>   score decimal,
>   type text,
>   values text,
>   PRIMARY KEY (id, ts)
> ) WITH CLUSTERING ORDER BY (ts DESC)
>
> and the following query (which returns instantly):
>
> SELECT * FROM keyspace.table WHERE id='someId' AND ts IN 
> ('2017-10-15','2017-10-16','2017-10-17','2017-10-18','2017-10-19','2017-10-20','2017-10-21','2017-10-22','2017-10-23','2017-10-24','2017-10-25','2017-10-26','2017-10-27','2017-10-28','2017-10-29','2017-10-30','2017-10-31','2017-11-01','2017-11-02','2017-11-03','2017-11-04','2017-11-05','2017-11-06');
>
> *If I add another day in the IN clause, the response never comes (even
> after 10 minutes!!!):*
>
> SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
> ('2017-10-15','2017-10-16','2017-10-17','2017-10-18',
> '2017-10-19','2017-10-20','2017-10-21','2017-10-22',
> '2017-10-23','2017-10-24','2017-10-25','2017-10-26',
> '2017-10-27','2017-10-28','2017-10-29','2017-10-30',
> '2017-10-31','2017-11-01','2017-11-02','2017-11-03',
> '2017-11-04','2017-11-05','2017-11-06', *'2017-11-07'*);
>
> *The 'values' column may have large json data. *
>
> I managed to trace one of the timeouts by looking into the system_traces
> keyspace. Please look at the attached image and see that the last step took
> 10 minutes!!!
>
> I think there is some size limit somewhere, because if I have 23 params in
> *the IN clause* it works (under 1 second), but with 24 or more it fails. The
> rows are the same size (same json size on all). On node2 of those 6 it
> works with 24 params. On node1 and node3 it does not. The other nodes I
> haven't checked yet.
>
> I saw no conclusive logs except this one from cassa's debug.log (at the
> moment of the timeout or very close to it):
>
> *DEBUG [Thrift:2608] 2017-11-15 13:48:05,611 ReadCallback.java:126 - Timed
> out; received 0 of 1 responses*
>
> I think this problem has the same root cause as the one from the test
> (large html text) and it is related to some memory limit by code somewhere.
>
>
> Thank you,
>
> Alex.
> [image: screenshot.png]
>
>
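
One way to narrow down the IN-clause behaviour quoted above is to split the list into smaller reads and see which days trigger the timeout. The sketch below only builds the statements; it assumes the `%s` placeholder style and leaves execution to whatever client is in use. The chunk size of 8 is arbitrary, and `keyspace.table` is the placeholder name from the original report. This does not address the root cause, it only keeps each read small.

```python
from datetime import date, timedelta


def day_range(first, last):
    """Return ISO date strings for every day from first to last, inclusive."""
    span = (last - first).days
    return [(first + timedelta(days=n)).isoformat() for n in range(span + 1)]


def build_queries(row_id, days, chunk_size):
    """Split one large IN list into several (query, params) pairs."""
    queries = []
    for start in range(0, len(days), chunk_size):
        chunk = days[start:start + chunk_size]
        placeholders = ", ".join(["%s"] * len(chunk))
        cql = ("SELECT * FROM keyspace.table "
               "WHERE id = %s AND ts IN (" + placeholders + ")")
        queries.append((cql, [row_id] + chunk))
    return queries


# The 24 days from the failing query in the report.
days = day_range(date(2017, 10, 15), date(2017, 11, 7))
for cql, params in build_queries("someId", days, chunk_size=8):
    print(len(params) - 1, "days:", cql)
```

If a single chunk still times out, shrinking the chunk size further should isolate the row or rows involved.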


custom validation before replication

2017-11-16 Thread Abdelkrim Fitouri
Hi,

I have some security constraints on a project, and I need to validate or
invalidate changes made on a keyspace, via CQL or by other means, before
replication.

For example, in the case of a multi-node cluster with replication, if data is
changed locally using cqlsh, the data will be replicated (that is the normal
way Cassandra works).

Is there a possibility to call a custom validation function just before
data replication?

Many thanks for any help.

-- 

Best Regards.


Re: Possibly cassandra 3.0.9 bug?

2017-11-16 Thread Alex Circus
Thanks,

Here is a link with the image:
https://imgur.com/a/KpJMh

On Thu, Nov 16, 2017 at 11:22 AM Pavel Drankov  wrote:

> You also can use any image uploading service like https://imgur.com/
>
> Best wishes,
> Pavel
>
> On 16 November 2017 at 11:21, Murukesh Mohanan  >
> wrote:
>
> > Hi Alex,
> >
> > It's still not visible... I don't think the mailing list supports image
> > attachments. Maybe you can create an issue on JIRA with the attachments?
> >
> > Thanks,
> > Muru
> > On Thu, 16 Nov 2017 at 17:00 Alex Circus 
> > wrote:
> >
> > > Hi Pavel,
> > >
> > > I'm attaching it again. I use gmail app from browser. Please check now.
> > >
> > > Thanks,
> > > Alex.
> > >
> > > On Thu, Nov 16, 2017 at 9:17 AM, Pavel Drankov 
> > > wrote:
> > >
> > >> Hi Alex,
> > >>
> > >> I don't see any attached image. Can you please send it one more time?
> > >>
> > >> Best wishes,
> > >> Pavel
> > >>
> > >> On 16 November 2017 at 01:04, Alex Circus  >
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > *On short:*
> > >> > I use cassandra 3.0.9 in a cluster of 6 nodes.
> > >> > 1. I create a keyspace called test:
> > >> > CREATE KEYSPACE business WITH replication = {'class':
> > >> > 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes =
> > true;
> > >> > 2. I create table called test:
> > >> >
> > >> > CREATE TABLE test.test (
> > >> >
> > >> > test_id bigint,
> > >> >
> > >> > test_value text
> > >> >
> > >> > PRIMARY KEY (test_id)
> > >> >
> > >> > )
> > >> >
> > >> > 3. I insert test_id=23 and test_value=some very large string/html
> > (like
> > >> > 406088 chars utf8).
> > >> >
> > >> > 4. I query for test_id=35 and I get timeout (even with clqsh
> > >> > --request-timeout=3600)...
> > >> >
> > >> > 5. If I run the above on an existing cassandra cluster with cassa
> 2.0
> > >> the
> > >> > select returns instantlyThe Java heap size is 8GB and in JMX I
> see
> > >> max
> > >> > 4GB used of these 8 GB in the new cluster
> > >> >
> > >> >
> > >> > *Detailed:*
> > >> >
> > >> > The above was just a test. The real scenario is:
> > >> >
> > >> > I migrated some tables from an old cassa (2.0) cluster with 9 nodes
> > into
> > >> > another with 6 nodes and with cassa 3.0.9 and there was a lot of
> > >> > problems
> > >> >
> > >> > I have a table like this:
> > >> >
> > >> > CREATE TABLE table (
> > >> >   id text,
> > >> >   ts text,
> > >> >   score decimal,
> > >> >   type text,
> > >> >   values text,
> > >> >   PRIMARY KEY (id, ts)
> > >> > ) WITH CLUSTERING ORDER BY (ts DESC)
> > >> >
> > >> > and the following query (which returns instantly):
> > >> >
> > >> > SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
> > >> > ('2017-10-15','2017-10-16','2017-10-17','2017-10-18',
> > >> > '2017-10-19','2017-10-20','2017-10-21','2017-10-22',
> > >> > '2017-10-23','2017-10-24','2017-10-25','2017-10-26',
> > >> > '2017-10-27','2017-10-28','2017-10-29','2017-10-30',
> > >> > '2017-10-31','2017-11-01','2017-11-02','2017-11-03',
> > >> > '2017-11-04','2017-11-05','2017-11-06');
> > >> >
> > >> > *If I add another day in the IN clause, the response never comes
> (even
> > >> > after 10 minutes!!!):*
> > >> >
> > >> > SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
> > >> > ('2017-10-15','2017-10-16','2017-10-17','2017-10-18',
> > >> > '2017-10-19','2017-10-20','2017-10-21','2017-10-22',
> > >> > '2017-10-23','2017-10-24','2017-10-25','2017-10-26',
> > >> > '2017-10-27','2017-10-28','2017-10-29','2017-10-30',
> > >> > '2017-10-31','2017-11-01','2017-11-02','2017-11-03',
> > >> > '2017-11-04','2017-11-05','2017-11-06', *'2017-11-07'*);
> > >> >
> > >> > *The 'values' column may have large json data. *
> > >> >
> > >> > I managed to trace one of the timeouts by looking into system_trace
> > >> > keyspace. Please look into the attached image and see the last
> process
> > >> took
> > >> > 10 minutes!!!
> > >> >
> > >> > I think there is some size limit somewhere because in* the IN clause
> > *if
> > >> > I have 23 params it works(under 1 second), but with more(1+) it
> fails.
> > >> The
> > >> > rows are the same size (same json size on all). In node2 of those 6
> it
> > >> > works with 24 params. In node1 and node3 no. The other nodes I
> haven't
> > >> > checked yet.
> > >> >
> > >> > I saw no concluding logs except this one from cassa's debug.log (in
> > the
> > >> > moment of the timeout or very close to that):
> > >> >
> > >> > *DEBUG [Thrift:2608] 2017-11-15 13:48:05,611 ReadCallback.java:126 -
> > >> Timed
> > >> > out; received 0 of 1 responses*
> > >> >
> > >> > I think this problem has the same root cause as the one from the
> test
> > >> > (large html text) and it is related to some memory limit by code
> > >> somewhere.
> > >> >
> > >> >
> > >> > Thank you,
> > >> >
> > >> > Alex.
> > >> > [image: screenshot.png]
> > >> >
> > >> >
> > >>
> > >
> > >
> > > 

Re: Possibly cassandra 3.0.9 bug?

2017-11-16 Thread Pavel Drankov
You also can use any image uploading service like https://imgur.com/

Best wishes,
Pavel

On 16 November 2017 at 11:21, Murukesh Mohanan 
wrote:

> Hi Alex,
>
> It's still not visible... I don't think the mailing list supports image
> attachments. Maybe you can create an issue on JIRA with the attachments?
>
> Thanks,
> Muru
> On Thu, 16 Nov 2017 at 17:00 Alex Circus 
> wrote:
>
> > Hi Pavel,
> >
> > I'm attaching it again. I use gmail app from browser. Please check now.
> >
> > Thanks,
> > Alex.
> >
> > On Thu, Nov 16, 2017 at 9:17 AM, Pavel Drankov 
> > wrote:
> >
> >> Hi Alex,
> >>
> >> I don't see any attached image. Can you please send it one more time?
> >>
> >> Best wishes,
> >> Pavel
> >>
> >> On 16 November 2017 at 01:04, Alex Circus 
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > *On short:*
> >> > I use cassandra 3.0.9 in a cluster of 6 nodes.
> >> > 1. I create a keyspace called test:
> >> > CREATE KEYSPACE business WITH replication = {'class':
> >> > 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes =
> true;
> >> > 2. I create table called test:
> >> >
> >> > CREATE TABLE test.test (
> >> >
> >> > test_id bigint,
> >> >
> >> > test_value text
> >> >
> >> > PRIMARY KEY (test_id)
> >> >
> >> > )
> >> >
> >> > 3. I insert test_id=23 and test_value=some very large string/html
> (like
> >> > 406088 chars utf8).
> >> >
> >> > 4. I query for test_id=35 and I get timeout (even with clqsh
> >> > --request-timeout=3600)...
> >> >
> >> > 5. If I run the above on an existing cassandra cluster with cassa 2.0
> >> the
> >> > select returns instantlyThe Java heap size is 8GB and in JMX I see
> >> max
> >> > 4GB used of these 8 GB in the new cluster
> >> >
> >> >
> >> > *Detailed:*
> >> >
> >> > The above was just a test. The real scenario is:
> >> >
> >> > I migrated some tables from an old cassa (2.0) cluster with 9 nodes
> into
> >> > another with 6 nodes and with cassa 3.0.9 and there was a lot of
> >> > problems
> >> >
> >> > I have a table like this:
> >> >
> >> > CREATE TABLE table (
> >> >   id text,
> >> >   ts text,
> >> >   score decimal,
> >> >   type text,
> >> >   values text,
> >> >   PRIMARY KEY (id, ts)
> >> > ) WITH CLUSTERING ORDER BY (ts DESC)
> >> >
> >> > and the following query (which returns instantly):
> >> >
> >> > SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
> >> > ('2017-10-15','2017-10-16','2017-10-17','2017-10-18',
> >> > '2017-10-19','2017-10-20','2017-10-21','2017-10-22',
> >> > '2017-10-23','2017-10-24','2017-10-25','2017-10-26',
> >> > '2017-10-27','2017-10-28','2017-10-29','2017-10-30',
> >> > '2017-10-31','2017-11-01','2017-11-02','2017-11-03',
> >> > '2017-11-04','2017-11-05','2017-11-06');
> >> >
> >> > *If I add another day in the IN clause, the response never comes (even
> >> > after 10 minutes!!!):*
> >> >
> >> > SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
> >> > ('2017-10-15','2017-10-16','2017-10-17','2017-10-18',
> >> > '2017-10-19','2017-10-20','2017-10-21','2017-10-22',
> >> > '2017-10-23','2017-10-24','2017-10-25','2017-10-26',
> >> > '2017-10-27','2017-10-28','2017-10-29','2017-10-30',
> >> > '2017-10-31','2017-11-01','2017-11-02','2017-11-03',
> >> > '2017-11-04','2017-11-05','2017-11-06', *'2017-11-07'*);
> >> >
> >> > *The 'values' column may have large json data. *
> >> >
> >> > I managed to trace one of the timeouts by looking into system_trace
> >> > keyspace. Please look into the attached image and see the last process
> >> took
> >> > 10 minutes!!!
> >> >
> >> > I think there is some size limit somewhere because in* the IN clause
> *if
> >> > I have 23 params it works(under 1 second), but with more(1+) it fails.
> >> The
> >> > rows are the same size (same json size on all). In node2 of those 6 it
> >> > works with 24 params. In node1 and node3 no. The other nodes I haven't
> >> > checked yet.
> >> >
> >> > I saw no concluding logs except this one from cassa's debug.log (in
> the
> >> > moment of the timeout or very close to that):
> >> >
> >> > *DEBUG [Thrift:2608] 2017-11-15 13:48:05,611 ReadCallback.java:126 -
> >> Timed
> >> > out; received 0 of 1 responses*
> >> >
> >> > I think this problem has the same root cause as the one from the test
> >> > (large html text) and it is related to some memory limit by code
> >> somewhere.
> >> >
> >> >
> >> > Thank you,
> >> >
> >> > Alex.
> >> > [image: screenshot.png]
> >> >
> >> >
> >>
> >
> >
>
> --
>
> Murukesh Mohanan,
> Yahoo! Japan
>


Re: Possibly cassandra 3.0.9 bug?

2017-11-16 Thread Murukesh Mohanan
Hi Alex,

It's still not visible... I don't think the mailing list supports image
attachments. Maybe you can create an issue on JIRA with the attachments?

Thanks,
Muru
On Thu, 16 Nov 2017 at 17:00 Alex Circus  wrote:

> Hi Pavel,
>
> I'm attaching it again. I use gmail app from browser. Please check now.
>
> Thanks,
> Alex.
>
> On Thu, Nov 16, 2017 at 9:17 AM, Pavel Drankov 
> wrote:
>
>> Hi Alex,
>>
>> I don't see any attached image. Can you please send it one more time?
>>
>> Best wishes,
>> Pavel
>>
>> On 16 November 2017 at 01:04, Alex Circus 
>> wrote:
>>
>> > Hi,
>> >
>> > *On short:*
>> > I use cassandra 3.0.9 in a cluster of 6 nodes.
>> > 1. I create a keyspace called test:
>> > CREATE KEYSPACE business WITH replication = {'class':
>> > 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
>> > 2. I create table called test:
>> >
>> > CREATE TABLE test.test (
>> >
>> > test_id bigint,
>> >
>> > test_value text
>> >
>> > PRIMARY KEY (test_id)
>> >
>> > )
>> >
>> > 3. I insert test_id=23 and test_value=some very large string/html (like
>> > 406088 chars utf8).
>> >
>> > 4. I query for test_id=35 and I get timeout (even with clqsh
>> > --request-timeout=3600)...
>> >
>> > 5. If I run the above on an existing cassandra cluster with cassa 2.0
>> the
>> > select returns instantlyThe Java heap size is 8GB and in JMX I see
>> max
>> > 4GB used of these 8 GB in the new cluster
>> >
>> >
>> > *Detailed:*
>> >
>> > The above was just a test. The real scenario is:
>> >
>> > I migrated some tables from an old cassa (2.0) cluster with 9 nodes into
>> > another with 6 nodes and with cassa 3.0.9 and there was a lot of
>> > problems
>> >
>> > I have a table like this:
>> >
>> > CREATE TABLE table (
>> >   id text,
>> >   ts text,
>> >   score decimal,
>> >   type text,
>> >   values text,
>> >   PRIMARY KEY (id, ts)
>> > ) WITH CLUSTERING ORDER BY (ts DESC)
>> >
>> > and the following query (which returns instantly):
>> >
>> > SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
>> ('2017-10-15','2017-10-16','2017-10-17','2017-10-18','2017-10-19','2017-10-20','2017-10-21','2017-10-22','2017-10-23','2017-10-24','2017-10-25','2017-10-26','2017-10-27','2017-10-28','2017-10-29','2017-10-30','2017-10-31','2017-11-01','2017-11-02','2017-11-03','2017-11-04','2017-11-05','2017-11-06');
>> >
>> > *If I add another day in the IN clause, the response never comes (even
>> > after 10 minutes!!!):*
>> >
>> > SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
>> > ('2017-10-15','2017-10-16','2017-10-17','2017-10-18',
>> > '2017-10-19','2017-10-20','2017-10-21','2017-10-22',
>> > '2017-10-23','2017-10-24','2017-10-25','2017-10-26',
>> > '2017-10-27','2017-10-28','2017-10-29','2017-10-30',
>> > '2017-10-31','2017-11-01','2017-11-02','2017-11-03',
>> > '2017-11-04','2017-11-05','2017-11-06', *'2017-11-07'*);
>> >
>> > *The 'values' column may have large json data. *
>> >
>> > I managed to trace one of the timeouts by looking into system_trace
>> > keyspace. Please look into the attached image and see the last process
>> took
>> > 10 minutes!!!
>> >
>> > I think there is some size limit somewhere because in* the IN clause *if
>> > I have 23 params it works(under 1 second), but with more(1+) it fails.
>> The
>> > rows are the same size (same json size on all). In node2 of those 6 it
>> > works with 24 params. In node1 and node3 no. The other nodes I haven't
>> > checked yet.
>> >
>> > I saw no concluding logs except this one from cassa's debug.log (in the
>> > moment of the timeout or very close to that):
>> >
>> > *DEBUG [Thrift:2608] 2017-11-15 13:48:05,611 ReadCallback.java:126 -
>> Timed
>> > out; received 0 of 1 responses*
>> >
>> > I think this problem has the same root cause as the one from the test
>> > (large html text) and it is related to some memory limit by code
>> somewhere.
>> >
>> >
>> > Thank you,
>> >
>> > Alex.
>> > [image: screenshot.png]
>> >
>> >
>>
>
>

-- 

Murukesh Mohanan,
Yahoo! Japan


Re: Possibly cassandra 3.0.9 bug?

2017-11-16 Thread Alex Circus
Hi Pavel,

I'm attaching it again. I use gmail app from browser. Please check now.

Thanks,
Alex.

On Thu, Nov 16, 2017 at 9:17 AM, Pavel Drankov  wrote:

> Hi Alex,
>
> I don't see any attached image. Can you please send it one more time?
>
> Best wishes,
> Pavel
>
> On 16 November 2017 at 01:04, Alex Circus 
> wrote:
>
> > Hi,
> >
> > *On short:*
> > I use cassandra 3.0.9 in a cluster of 6 nodes.
> > 1. I create a keyspace called test:
> > CREATE KEYSPACE business WITH replication = {'class':
> > 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
> > 2. I create table called test:
> >
> > CREATE TABLE test.test (
> >
> > test_id bigint,
> >
> > test_value text
> >
> > PRIMARY KEY (test_id)
> >
> > )
> >
> > 3. I insert test_id=23 and test_value=some very large string/html (like
> > 406088 chars utf8).
> >
> > 4. I query for test_id=35 and I get timeout (even with clqsh
> > --request-timeout=3600)...
> >
> > 5. If I run the above on an existing cassandra cluster with cassa 2.0 the
> > select returns instantlyThe Java heap size is 8GB and in JMX I see
> max
> > 4GB used of these 8 GB in the new cluster
> >
> >
> > *Detailed:*
> >
> > The above was just a test. The real scenario is:
> >
> > I migrated some tables from an old cassa (2.0) cluster with 9 nodes into
> > another with 6 nodes and with cassa 3.0.9 and there was a lot of
> > problems
> >
> > I have a table like this:
> >
> > CREATE TABLE table (
> >   id text,
> >   ts text,
> >   score decimal,
> >   type text,
> >   values text,
> >   PRIMARY KEY (id, ts)
> > ) WITH CLUSTERING ORDER BY (ts DESC)
> >
> > and the following query (which returns instantly):
> >
> > SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
> > ('2017-10-15','2017-10-16','2017-10-17','2017-10-18',
> > '2017-10-19','2017-10-20','2017-10-21','2017-10-22',
> > '2017-10-23','2017-10-24','2017-10-25','2017-10-26',
> > '2017-10-27','2017-10-28','2017-10-29','2017-10-30',
> > '2017-10-31','2017-11-01','2017-11-02','2017-11-03',
> > '2017-11-04','2017-11-05','2017-11-06');
> >
> > *If I add another day in the IN clause, the response never comes (even
> > after 10 minutes!!!):*
> >
> > SELECT * FROM keyspace.table WHERE id='someId' AND ts IN
> > ('2017-10-15','2017-10-16','2017-10-17','2017-10-18',
> > '2017-10-19','2017-10-20','2017-10-21','2017-10-22',
> > '2017-10-23','2017-10-24','2017-10-25','2017-10-26',
> > '2017-10-27','2017-10-28','2017-10-29','2017-10-30',
> > '2017-10-31','2017-11-01','2017-11-02','2017-11-03',
> > '2017-11-04','2017-11-05','2017-11-06', *'2017-11-07'*);
> >
> > *The 'values' column may have large json data. *
> >
> > I managed to trace one of the timeouts by looking into system_trace
> > keyspace. Please look into the attached image and see the last process
> took
> > 10 minutes!!!
> >
> > I think there is some size limit somewhere because in* the IN clause *if
> > I have 23 params it works(under 1 second), but with more(1+) it fails.
> The
> > rows are the same size (same json size on all). In node2 of those 6 it
> > works with 24 params. In node1 and node3 no. The other nodes I haven't
> > checked yet.
> >
> > I saw no concluding logs except this one from cassa's debug.log (in the
> > moment of the timeout or very close to that):
> >
> > *DEBUG [Thrift:2608] 2017-11-15 13:48:05,611 ReadCallback.java:126 -
> Timed
> > out; received 0 of 1 responses*
> >
> > I think this problem has the same root cause as the one from the test
> > (large html text) and it is related to some memory limit by code
> somewhere.
> >
> >
> > Thank you,
> >
> > Alex.
> > [image: screenshot.png]
> >
> >
>
