date:20190609

[jira] [Commented] (CASSANDRA-15152) Batch Log - Mutation too large while bootstrapping a newly added node

2019-06-09 Thread Avraham Kalvo (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859724#comment-16859724
 ] 

Avraham Kalvo commented on CASSANDRA-15152:
---

Switching the log level to TRACE disclosed the following line just before the 
error we're getting:
`TRACE [BatchlogTasks:1] 2019-06-10 05:45:40,251 BatchlogManager.java:309 - 
Replaying batch 5694cca0-8834-11e9-b262-b3ace0831935`

How should one query the `system.batches` table to see the actual list of 
mutations (blob to text? casting?)
Would this table disclose the exact keyspace.table the mutations relate to? 
Thanks.
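
One thing the trace line reveals without querying anything: the batch id is a 
time-based (version 1) UUID, so its embedded timestamp indicates when the 
batch was written. A minimal sketch using only the Python standard library; 
the helper name `timeuuid_to_datetime` is made up for illustration and is not 
part of Cassandra or cqlsh:

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUID timestamps count 100-nanosecond intervals since the
# start of the Gregorian calendar (1582-10-15 00:00:00 UTC).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def timeuuid_to_datetime(value: str) -> datetime:
    """Return the UTC creation time embedded in a version 1 UUID."""
    parsed = uuid.UUID(value)
    if parsed.version != 1:
        raise ValueError(f"{value} is not a time-based (version 1) UUID")
    # uuid.UUID.time is in 100 ns units; integer-divide down to microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=parsed.time // 10)


# Batch id copied from the TRACE line above.
batch_id = "5694cca0-8834-11e9-b262-b3ace0831935"
created = timeuuid_to_datetime(batch_id)
print(created.isoformat())
```

The result only shows how long the batch has been sitting in the batchlog; it 
does not identify the keyspace or table involved.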




[jira] [Created] (CASSANDRA-15152) Batch Log - Mutation too large while bootstrapping a newly added node

2019-06-09 Thread Avraham Kalvo (JIRA)

Avraham Kalvo created CASSANDRA-15152:
-

 Summary: Batch Log - Mutation too large while bootstrapping a 
newly added node
 Key: CASSANDRA-15152
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15152
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Batch Log
Reporter: Avraham Kalvo


While scaling our six-node cluster out by three more nodes, we came upon 
behavior in which bootstrap appears hung in the `UJ` state (the two nodes 
added previously joined within approximately 2.5 hours).

Examining the logs, the following became apparent shortly after the bootstrap 
process commenced for this node:
```
ERROR [BatchlogTasks:1] 2019-06-05 14:43:46,508 CassandraDaemon.java:207 - Exception in thread Thread[BatchlogTasks:1,5,main]
java.lang.IllegalArgumentException: Mutation of 108035175 bytes is too large for the maximum size of 16777216
    at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:256) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:520) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.db.Keyspace.applyNotDeferrable(Keyspace.java:399) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.db.Mutation.apply(Mutation.java:213) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.batchlog.BatchlogManager$ReplayingBatch.sendSingleReplayMutation(BatchlogManager.java:427) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.batchlog.BatchlogManager$ReplayingBatch.sendReplays(BatchlogManager.java:402) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.batchlog.BatchlogManager$ReplayingBatch.replay(BatchlogManager.java:318) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.batchlog.BatchlogManager.processBatchlogEntries(BatchlogManager.java:238) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.batchlog.BatchlogManager.replayFailedBatches(BatchlogManager.java:207) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_201]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_201]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_201]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_201]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_201]
```

The error has been repeating in the logs ever since.
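
To put the numbers in the exception in perspective: 16777216 bytes is exactly 
16 MiB (which would be consistent with half of a 32 MiB commit log segment, 
assuming default settings), and the rejected mutation is several times 
larger. A small sketch that pulls both sizes out of the message text; the 
regular expression and the name `parse_mutation_error` are illustrative 
assumptions, written against the message wording shown above rather than any 
Cassandra API:

```python
import re
from typing import Tuple

# Pattern written against the exception text quoted above.
MUTATION_ERROR = re.compile(
    r"Mutation of (\d+) bytes is too large for the maximum size of (\d+)"
)


def parse_mutation_error(message: str) -> Tuple[int, int]:
    """Return (mutation_size, max_size) in bytes from the error text."""
    match = MUTATION_ERROR.search(message)
    if match is None:
        raise ValueError("not a 'Mutation too large' message")
    return int(match.group(1)), int(match.group(2))


message = (
    "java.lang.IllegalArgumentException: Mutation of 108035175 bytes "
    "is too large for the maximum size of 16777216"
)
size, limit = parse_mutation_error(message)
mib = 1024 * 1024
print(f"mutation: {size / mib:.2f} MiB, limit: {limit / mib:.2f} MiB, "
      f"ratio: {size / limit:.2f}x")
```

Running the same parser over each node's log would make it easy to compare 
the three reported sizes mentioned in step 6 below.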

We decided to discard the newly added, apparently still joining node by doing 
the following:
1. First, we simply restarted it, after which it started up apparently 
normally.
2. Then we decommissioned it by issuing `nodetool decommission`. This took a 
long time (over 2.5 hours) and was eventually terminated by issuing `nodetool 
removenode`.
3. The node removal hung on a specific token, which led us to complete it by 
force.
4. Forcing the node removal corrupted one of the `system.batches` table 
SSTables, which we removed (after backing it up) from its underlying data 
directory as mitigation (78MB worth).
5. A cluster-wide repair was run.
6. The `Mutation too large` error now repeats in three different permutations 
(reported sizes) on three different nodes (our standard replication factor is 
three).

We're not sure whether we're hitting 
https://issues.apache.org/jira/browse/CASSANDRA-11670 or not, as it's said to 
be resolved in our current version, 3.0.10.
We would still like to verify the root cause, as we need to establish whether 
we should expect this to happen in production environments.

How would you recommend verifying which keyspace.table this mutation belongs 
to?

Thanks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Comment Edited] (CASSANDRA-10190) Python 3 support for cqlsh

2019-06-09 Thread Patrick Bannister (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859468#comment-16859468
 ] 

Patrick Bannister edited comment on CASSANDRA-10190 at 6/9/19 2:15 PM:
---

[~djoshi3], thanks for the feedback. I've incorporated all of your 
recommendations on the branch.

I meant to delete FrozenType completely after testing with it commented out, 
but I forgot to follow through. I wanted to remove the FrozenType class from 
cqlsh.py because its comment reads "Needed until the bundled python driver 
adds FrozenType.", and I noticed that [the Python driver has included 
FrozenType since version 
2.5.0|https://github.com/datastax/python-driver/blob/master/CHANGELOG.rst#250].

It looks like the bundled driver is at least version 3.7.0 (CASSANDRA-12736), 
so I think we should be able to remove FrozenType completely. I've made this 
change on my branch too.
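
The reasoning above boils down to a version comparison: the local FrozenType 
shim is only needed when the bundled driver predates 2.5.0. A rough sketch of 
that check; `parse_version` and `shim_needed` are hypothetical helpers for 
illustration, not code from cqlsh.py or the driver, and the parser only 
handles plain dotted numeric versions:

```python
from typing import Tuple

# First driver release that ships FrozenType, per the changelog cited above.
FROZEN_TYPE_SINCE = (2, 5, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '3.7.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def shim_needed(bundled_version: str) -> bool:
    """True when the bundled driver is older than the first FrozenType release."""
    return parse_version(bundled_version) < FROZEN_TYPE_SINCE


# 3.7.0 is the minimum bundled version mentioned above.
print(shim_needed("3.7.0"))  # prints False
```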

 


was (Author: ptbannister):
[~djoshi3], thanks for the feedback. I've incorporated all of your 
recommendations on the branch.

I meant to completely delete FrozenType after testing with it commented out, 
but I forgot to follow through. The reason I wanted to remove the FrozenType 
class from cqlsh.py is because it was commented "Needed until the bundled 
python driver adds FrozenType.", and I noticed that [the Python driver includes 
FrozenType since version 
2.5.0|[https://github.com/datastax/python-driver/blob/master/CHANGELOG.rst#250].]

It looks like the bundled driver is at least version 3.7.0 (CASSANDRA-12736), 
so I think we should be able to remove FrozenType completely. I've made this 
change on my branch too.

 

> Python 3 support for cqlsh
> --
>
> Key: CASSANDRA-10190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Tools
>Reporter: Andrew Pennebaker
>Assignee: Patrick Bannister
>Priority: Normal
>  Labels: cqlsh
> Attachments: coverage_notes.txt
>
>
> Users who operate in a Python 3 environment may have trouble launching cqlsh. 
> Could we please update cqlsh's syntax to run in Python 3?
> As a workaround, users can setup pyenv, and cd to a directory with a 
> .python-version containing "2.7". But it would be nice if cqlsh supported 
> modern Python versions out of the box.




[jira] [Commented] (CASSANDRA-10190) Python 3 support for cqlsh

2019-06-09 Thread Patrick Bannister (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859468#comment-16859468
 ] 

Patrick Bannister commented on CASSANDRA-10190:
---

[~djoshi3], thanks for the feedback. I've incorporated all of your 
recommendations on the branch.

I meant to completely delete FrozenType after testing with it commented out, 
but I forgot to follow through. The reason I wanted to remove the FrozenType 
class from cqlsh.py is because it was commented "Needed until the bundled 
python driver adds FrozenType.", and I noticed that [the Python driver includes 
FrozenType since version 
2.5.0|[https://github.com/datastax/python-driver/blob/master/CHANGELOG.rst#250].]

It looks like the bundled driver is at least version 3.7.0 (CASSANDRA-12736), 
so I think we should be able to remove FrozenType completely. I've made this 
change on my branch too.

 


[jira] [Updated] (CASSANDRA-14494) Investigate possibility of a cqlsh terminfo

2019-06-09 Thread Patrick Bannister (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Bannister updated CASSANDRA-14494:
--
Resolution: Won't Do
Status: Resolved  (was: Open)

This is unnecessary at this time.

> Investigate possibility of a cqlsh terminfo
> ---
>
> Key: CASSANDRA-14494
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14494
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Legacy/Tools
> Environment: This behavior has been observed in xterm on CentOS 7.5 
> platforms. The test_cqlsh_output.py unit tests 
> (pylib/cqlshlib/test/test_cqlsh_output.py) are a good place to see it in 
> action.
>Reporter: Patrick Bannister
>Assignee: Patrick Bannister
>Priority: Normal
>  Labels: Python, cqlsh, test
> Fix For: 4.x
>
>
> Summary: investigate whether we could use a cqlsh-specific terminfo file to 
> prevent use of the set-meta-mode escape sequence in xterm without breaking 
> colors. If it works, see if we can install it in an appropriate place using 
> Python distutils. If yes to both, generate a cqlsh terminfo file and work it 
> into the install process.
> Long detailed explanation:
> In some more recent environments, in Python REPL applications that use the 
> readline module, the set meta mode escape sequence is output before each 
> prompt. This escape sequence has caused problems for some applications, and 
> in our case, some of our cqlsh unit tests 
> (pylib/cqlshlib/test/test_cqlsh_output.py) choke on this output because of 
> the way our tests are designed to detect the cqlsh prompt. This behavior was 
> observed on a CentOS 7.5 platform.
> The set-meta-mode escape sequence normally appears as "[?1034h" in output; 
> it's normally defined as the bytes 1b 5b 3f 31 30 33 34 68.  The exact value 
> of the escape sequence is configurable and may be found on a GNU/Linux 
> platform by running the command:
> {code:java}
> tput smm | hexdump{code}
> If this command gives no output, then the set meta mode sequence is not 
> defined on this platform for the terminal in use. Refer to the xterm and 
> terminfo man pages for more information on this sequence.
> There are easier ways to solve this problem for the sake of the unit test, 
> but if time allows, I'd like to look into this to achieve a more consistent 
> output behavior for cqlsh on GNU/Linux platforms.
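
For the simpler test-side workaround mentioned above, one option is to strip 
the sequence from captured output before looking for the prompt. A minimal 
sketch, assuming the default byte values quoted in the description; 
`strip_set_meta_mode` is a hypothetical helper and not part of cqlshlib:

```python
# Default set-meta-mode sequence, built from the byte values quoted in the
# description: ESC [ ? 1 0 3 4 h
SET_META_MODE = bytes.fromhex("1b5b3f3130333468")


def strip_set_meta_mode(output: bytes) -> bytes:
    """Remove every occurrence of the set-meta-mode sequence from output."""
    return output.replace(SET_META_MODE, b"")


# Simulated terminal output: the escape sequence followed by the prompt.
raw = SET_META_MODE + b"cqlsh> "
print(strip_set_meta_mode(raw))
```

Because the sequence is configurable per terminal, a sturdier version would 
read the actual value from `tput smm` rather than hard-coding it.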




[jira] [Assigned] (CASSANDRA-14494) Investigate possibility of a cqlsh terminfo

2019-06-09 Thread Patrick Bannister (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Bannister reassigned CASSANDRA-14494:
-

Assignee: Patrick Bannister

