[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

2018-08-02 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-13983:
-
Labels: fqltool  (was: )

> Support a means of logging all queries as they were invoked
> ---
>
> Key: CASSANDRA-13983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13983
> Project: Cassandra
> Issue Type: New Feature
> Components: CQL, Observability, Testing, Tools
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Priority: Major
> Labels: fqltool
> Fix For: 4.0
>
>
> For correctness testing it's useful to be able to capture production traffic 
> so that it can be replayed against both the old and new versions of Cassandra 
> while comparing the results.
> Implementing this functionality once inside the database performs well and 
> reduces operational complexity.
> In [this patch|https://github.com/apache/cassandra/pull/169] there is an 
> implementation of a full query log that uses chronicle-queue (Apache 
> licensed; the Maven artifacts are labeled incorrectly in some cases, and the 
> dependencies are also Apache licensed) to implement a rotating log of queries.
> * Single thread asynchronously writes log entries to disk to reduce impact on 
> query latency
> * Heap memory usage bounded by a weighted queue with configurable maximum 
> weight sitting in front of logging thread
> * If the weighted queue is full producers can be blocked or samples can be 
> dropped
> * Disk utilization is bounded by deleting old log segments once a 
> configurable size is reached
> * The on-disk serialization uses a flexible-schema binary format 
> (chronicle-wire), making it easy to skip unrecognized fields, add new ones, 
> and omit old ones.
> * Can be enabled, configured, disabled, and reset (deleting the on-disk 
> data) via JMX; the logging path is configurable via both JMX and YAML
> * Introduces a new {{fqltool}} in /bin that currently implements {{Dump}}, 
> which can dump full query logs in a human-readable format as well as follow 
> active full query logs
> Follow up work:
> * Introduce new {{fqltool}} command Replay which can replay N full query logs 
> to two different clusters and compare the result and check for 
> inconsistencies. <- Actively working on getting this done
> * Log not just queries but their results to facilitate a comparison between 
> the original query result and the replayed result. <- Really just don't have 
> a specific use case at the moment
> * "Consistent" query logging allowing replay to fully replicate the original 
> order of execution and completion even in the face of races (including CAS). 
> <- This is more speculative
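
As a rough illustration of the weighted-queue and single-writer bullets in the description above (this is not the code from the patch), here is a minimal Python sketch. The names WeightedQueue, offer, take, and run_writer are hypothetical and chosen only for illustration, and the weight is assumed to be a caller-supplied estimate of an entry's heap footprint.

```python
import threading
from collections import deque


class WeightedQueue:
    """Bounded queue whose capacity is a total weight, not an item count."""

    def __init__(self, max_weight):
        self.max_weight = max_weight
        self._items = deque()
        self._weight = 0
        self._cond = threading.Condition()

    def offer(self, item, weight, block=False, timeout=None):
        """Add an item; return False if it was dropped because the queue is full."""
        with self._cond:
            if block:
                # Producer waits until enough weight has been drained.
                fits = self._cond.wait_for(
                    lambda: self._weight + weight <= self.max_weight, timeout)
                if not fits:
                    return False
            elif self._weight + weight > self.max_weight:
                # Drop the sample rather than stall the query path.
                return False
            self._items.append((item, weight))
            self._weight += weight
            self._cond.notify_all()
            return True

    def take(self):
        """Remove and return the oldest item, waiting until one is available."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._items) > 0)
            item, weight = self._items.popleft()
            self._weight -= weight
            self._cond.notify_all()
            return item


def run_writer(queue, sink, stop_marker):
    """Single consumer: drain entries into sink until stop_marker is seen."""
    while True:
        entry = queue.take()
        if entry is stop_marker:
            return
        sink.append(entry)  # stand-in for the disk write
```

Producers on the query path would call offer with block=False to drop samples under pressure, or block=True to apply back-pressure, while one thread runs run_writer. An entry heavier than max_weight can never fit in this sketch, so a blocking offer for it only returns once its timeout expires.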



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

2017-12-04 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13983:
---
Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

Thanks, committed as 
[ae837806bd07dbb8b881960fb90c1a665d93|https://github.com/apache/cassandra/commit/ae837806bd07dbb8b881960fb90c1a665d93]







[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

2017-11-30 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13983:

Status: Ready to Commit  (was: Patch Available)







[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

2017-11-01 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13983:
---
Status: Patch Available  (was: Open)

|[code|https://github.com/apache/cassandra/pull/169]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13983-trunk]|[dtests|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/405/]|







[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

2017-10-31 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13983:
---
Reviewer: Blake Eggleston







[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

2017-10-31 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13983:
---
Description: 
For correctness testing it's useful to be able to capture production traffic so 
that it can be replayed against both the old and new versions of Cassandra 
while comparing the results.

Implementing this functionality once inside the database performs well and 
reduces operational complexity.

In [this patch|https://github.com/apache/cassandra/pull/169] there is an 
implementation of a full query log that uses chronicle-queue (Apache 
licensed; the Maven artifacts are labeled incorrectly in some cases, and the 
dependencies are also Apache licensed) to implement a rotating log of queries.

* Single thread asynchronously writes log entries to disk to reduce impact on 
query latency
* Heap memory usage bounded by a weighted queue with configurable maximum 
weight sitting in front of logging thread
* If the weighted queue is full producers can be blocked or samples can be 
dropped
* Disk utilization is bounded by deleting old log segments once a configurable 
size is reached
* The on-disk serialization uses a flexible-schema binary format 
(chronicle-wire), making it easy to skip unrecognized fields, add new ones, and 
omit old ones.
* Can be enabled, configured, disabled, and reset (deleting the on-disk data) 
via JMX; the logging path is configurable via both JMX and YAML
* Introduces a new {{fqltool}} in /bin that currently implements {{Dump}}, 
which can dump full query logs in a human-readable format as well as follow 
active full query logs
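
To make the "disk utilization is bounded by deleting old log segments" bullet concrete, here is a minimal Python sketch of that retention idea. It does not reproduce the patch or chronicle-queue's file handling: the function name enforce_size_limit, the ".seg" suffix, and the assumption that segment file names sort in creation order are all hypothetical.

```python
import os


def enforce_size_limit(log_dir, max_total_bytes, suffix=".seg"):
    """Delete the oldest segment files until the directory fits the limit.

    Assumes segment names sort chronologically (for example, zero-padded
    sequence numbers). Returns the deleted names, oldest first.
    """
    segments = sorted(
        name for name in os.listdir(log_dir) if name.endswith(suffix)
    )
    sizes = {
        name: os.path.getsize(os.path.join(log_dir, name)) for name in segments
    }
    total = sum(sizes.values())
    deleted = []
    # Never delete the newest segment: it is assumed to be the one still
    # being written, even if it alone exceeds the limit.
    for name in segments[:-1]:
        if total <= max_total_bytes:
            break
        os.remove(os.path.join(log_dir, name))
        total -= sizes[name]
        deleted.append(name)
    return deleted
```

A logger built this way would call enforce_size_limit each time it rolls to a new segment, so the directory stays near the configured size without the query path ever waiting on cleanup.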

Follow up work:
* Introduce new {{fqltool}} command Replay which can replay N full query logs 
to two different clusters and compare the result and check for inconsistencies. 
<- Actively working on getting this done
* Log not just queries but their results to facilitate a comparison between the 
original query result and the replayed result. <- Really just don't have 
a specific use case at the moment
* "Consistent" query logging allowing replay to fully replicate the original 
order of execution and completion even in the face of races (including CAS). <- 
This is more speculative

  was:
For correctness testing it's useful to be able to capture production traffic so 
that it can be replayed against both the old and new versions of Cassandra 
while comparing the results.

Implementing this functionality once inside the database performs well and 
reduces operational complexity.

In [this patch|https://github.com/apache/cassandra/compare/trunk...aweisberg:fql-trunk-temp?expand=1] 
there is an implementation of a full query log that uses chronicle-queue 
(Apache licensed; the Maven artifacts are labeled incorrectly in some cases, and the 
dependencies are also Apache licensed) to implement a rotating log of queries.

* Single thread asynchronously writes log entries to disk to reduce impact on 
query latency
* Heap memory usage bounded by a weighted queue with configurable maximum 
weight sitting in front of logging thread
* If the weighted queue is full producers can be blocked or samples can be 
dropped
* Disk utilization is bounded by deleting old log segments once a configurable 
size is reached
* The on-disk serialization uses a flexible-schema binary format 
(chronicle-wire), making it easy to skip unrecognized fields, add new ones, and 
omit old ones.
* Can be enabled, configured, disabled, and reset (deleting the on-disk data) 
via JMX; the logging path is configurable via both JMX and YAML
* Introduces a new {{fqltool}} in /bin that currently implements {{Dump}}, 
which can dump full query logs in a human-readable format as well as follow 
active full query logs

Follow up work:
* Introduce new {{fqltool}} command Replay which can replay N full query logs 
to two different clusters and compare the result and check for inconsistencies. 
<- Actively working on getting this done
* Log not just queries but their results to facilitate a comparison between the 
original query result and the replayed result. <- Really just don't have 
a specific use case at the moment
* "Consistent" query logging allowing replay to fully replicate the original 
order of execution and completion even in the face of races (including CAS). <- 
This is more speculative


> Support a means of logging all queries as they were invoked
> ---
>
> Key: CASSANDRA-13983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13983
> Project: Cassandra
> Issue Type: New Feature
> Components: CQL, Observability, Testing, Tools
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Fix For: 4.0
>
>
> For correctness testing it's useful to be able to capture production traffic 
> so that it can be replayed against both the old