[jira] [Commented] (CASSANDRA-13535) Error decoding JSON for timestamp smaller than Integer.MAX_VALUE

2017-11-29 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272292#comment-16272292
 ] 

ZhaoYang commented on CASSANDRA-13535:
--

Could you try adding double quotes around the long value?

INSERT INTO bar JSON '{"myfield":"0"}';
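For context on why the unquoted {{0}} is rejected while {{2147483648}} is accepted (see the description below): Cassandra parses the JSON payload with a Jackson {{ObjectMapper}}, which maps each numeric literal to the narrowest Java type that holds it, and per the error message the timestamp codec only accepts a long or a date string. A small standalone sketch (plain Jackson 2.x here, outside Cassandra; the Jackson version bundled with Cassandra may differ, but the widening behaviour is the same):

{code:java}
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Map;

public class JsonNumberWidths
{
    public static void main(String[] args) throws Exception
    {
        ObjectMapper mapper = new ObjectMapper();
        // Jackson picks the narrowest Java type that fits each numeric literal.
        Map<?, ?> small = mapper.readValue("{\"myfield\":0}", Map.class);
        Map<?, ?> large = mapper.readValue("{\"myfield\":2147483648}", Map.class);
        System.out.println(small.get("myfield").getClass()); // class java.lang.Integer
        System.out.println(large.get("myfield").getClass()); // class java.lang.Long
    }
}
{code}

The error message in the description reflects exactly this: the codec sees an {{Integer}} for values that fit in 32 bits and a {{Long}} only for larger ones.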

> Error decoding JSON for timestamp smaller than Integer.MAX_VALUE
> 
>
> Key: CASSANDRA-13535
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13535
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeremy Nguyen Xuan
>
> When trying to insert JSON with a field of type timestamp, the field is 
> decoded as an Integer instead of as a Long.
> {code}
> CREATE TABLE foo.bar (
> myfield timestamp,
> PRIMARY KEY (myfield)
> );
> cqlsh:foo> INSERT INTO bar JSON '{"myfield":0}';
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Error 
> decoding JSON value for myfield: Expected a long or a datestring 
> representation of a timestamp value, but got a Integer: 0"
> cqlsh:foo> INSERT INTO bar JSON '{"myfield":2147483647}';
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Error 
> decoding JSON value for myfield: Expected a long or a datestring 
> representation of a timestamp value, but got a Integer: 2147483647"
> cqlsh:foo> INSERT INTO bar JSON '{"myfield":2147483648}';
> cqlsh:foo> 
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (CASSANDRA-14078) Fix dTest test_bulk_round_trip_blogposts_with_max_connections

2017-11-29 Thread Jaydeepkumar Chovatia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272271#comment-16272271
 ] 

Jaydeepkumar Chovatia commented on CASSANDRA-14078:
---

Hi [~KurtG]

Thanks for the review!

{quote}
My understanding was that the max # connections was configured so that {{COPY 
TO}} would always exceed the max and fail-over.
{quote}
Yes, it is designed to fail over to a peer node, but this configuration 
{{'native_transport_max_concurrent_connections': '12'}} applies to all 
the nodes in the cluster, not just a few. So the client tries to fail over to a peer and 
finds that the peer is also busy; as a result {{COPY TO}} times out on the client 
side and the test fails.
In this 
[run|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/]
 we can see that the {{COPY FROM}} command tries to fail over to a peer node, but all 
the nodes have exhausted their connections so it cannot fail over; it retries by 
sleeping, etc., and finally gives up.

{code}
All replicas busy, sleeping for 4 second(s)...
...
All replicas busy, sleeping for 1 second(s)...
...
All replicas busy, sleeping for 23 second(s)...
Replicas too busy, given up
{code}
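For reference, the give-up behaviour behind these messages is a bounded retry loop with growing sleeps. Below is a rough, self-contained Java sketch of that pattern only; the real logic lives in cqlsh's Python {{COPY}} code and differs in details such as the exact sleep lengths and retry limits.

{code:java}
import java.util.concurrent.ThreadLocalRandom;

// Sketch only: not the actual cqlsh COPY implementation, just the retry shape.
public class ReplicaBackoff
{
    // Pretend attempt to hand a batch to some replica; fails while every node is
    // at its native_transport_max_concurrent_connections limit.
    static boolean tryOnce()
    {
        return ThreadLocalRandom.current().nextDouble() > 0.7;
    }

    static boolean sendWithBackoff(int maxAttempts) throws InterruptedException
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (tryOnce())
                return true;
            long sleepSeconds = 1L << Math.min(attempt, 4); // grow the wait each round
            System.out.printf("All replicas busy, sleeping for %d second(s)...%n", sleepSeconds);
            Thread.sleep(sleepSeconds * 1000);
        }
        System.out.println("Replicas too busy, given up");
        return false;
    }

    public static void main(String[] args) throws InterruptedException
    {
        sendWithBackoff(5);
    }
}
{code}

Whether the loop succeeds before giving up depends entirely on how quickly connections free up on the other nodes, which is why the outcome varies from run to run.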

{quote}
We've had no record of this particular failure before (at least in JIRA), seems 
like it could actually be something that needs fixing.
{quote}
[This run|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/]
 on {{trunk}} shows a 7/30 failure rate.

{quote}
To some degree all tests kind of depend on what hardware you run, this is the 
nature of the beast with C* and dtests. Can you elaborate on why it's not 
deterministic?
{quote}
I agree that many of the dtests depend on the hardware we run on, but this 
particular test is hardware dependent *and* *timing related*. If threads are less 
busy then it may not require more connections and the test will pass, but if 
threads are busier then connections will pile up, resulting in timeouts, 
etc. 

We can try tweaking options such as {{INGESTRATE}}, 
{{'native_transport_max_concurrent_connections': '<>'}}, etc. to make 
this test work, but in my opinion it would be difficult to find an ideal tuning 
that always works.

Jaydeep

> Fix dTest test_bulk_round_trip_blogposts_with_max_connections
> -
>
> Key: CASSANDRA-14078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14078
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
>
> This ticket is regarding following dTest 
> {{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}
> This test tries to limit the number of client connections and assumes that 
> once the connection limit has been reached the client will fail over to another 
> node and retry the request. The problem is that this is not a deterministic 
> test case, as it totally depends on what hardware you run on, timing, etc.
> For example
> If we look at 
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/
> {quote}
> ...
> Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s
> All replicas busy, sleeping for 4 second(s)...
> Processed: 1 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s
> All replicas busy, sleeping for 1 second(s)...
> Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s
> All replicas busy, sleeping for 11 second(s)...
> Processed: 2 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s
> Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s
> Processed: 3 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s
> Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s
> Processed: 4 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s
> Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s
> Processed: 5 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s
> Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s
> All replicas busy, sleeping for 23 second(s)...
> Replicas too busy, given up
> ...
> {quote}
> Here we can see the request timing out; sometimes it resumes after 1 second, 
> next time after 11 seconds, and sometimes it doesn't work at all. 
> In my opinion this test is not a good fit for dTest, as dTest(s) should be 
> deterministic.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (CASSANDRA-13857) Allow MV with only partition key

2017-11-29 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272230#comment-16272230
 ] 

ZhaoYang commented on CASSANDRA-13857:
--

Sorry for the late reply. I don't see any technical issue with supporting those 
cases.

> Allow MV with only partition key
> 
>
> Key: CASSANDRA-13857
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13857
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Kurt Greaves
>
> We currently disallow creation of a view that has the exact same primary key 
> as the base when no clustering keys are present; however, a potential use 
> case would be a view where part of the PK is filtered so as to have a subset 
> of data in the view, which is faster for range queries. We actually already 
> allow this, but only if you have a clustering key defined. If you only have a 
> partition key it's not possible.
> From the mailing list, the below example works:
> {code:java}
> CREATE TABLE users (
>   site_id int,
>   user_id text,
>   n int,
>   data set,
>   PRIMARY KEY ((site_id, user_id), n));
> user data is updated and read by PK and sometimes I have to fetch all user 
> for some specific site_id. It appeared that full scan by 
> token(site_id,user_id) filtered by WHERE site_id =  works much 
> slower than unfiltered full scan on
> CREATE MATERIALIZED VIEW users_1 AS
> SELECT site_id, user_id, n, data
> FROM users
> WHERE site_id = 1 AND user_id IS NOT NULL AND n IS NOT NULL
> PRIMARY KEY ((site_id, user_id), n);
> {code}
> However the following does not:
> {code:java}
> CREATE TABLE users (
> site_id int,
> user_id text,
> data set,
> PRIMARY KEY ((site_id, user_id)));
> CREATE MATERIALIZED VIEW users_1 AS
> SELECT site_id, user_id, data
> FROM users
> WHERE site_id = 1 AND user_id IS NOT NULL 
> PRIMARY KEY ((site_id, user_id));
> InvalidRequest: Error from server: code=2200 [Invalid query] message="No 
> columns are defined for Materialized View other than primary key"
> {code}
> This is because if the clustering key is empty we assume they've only defined 
> the primary key in the partition key and we haven't accounted for this use 
> case. 
> On that note, we also don't allow the following narrowing of the partition 
> key:
> {code}
> CREATE TABLE kurt.base (
> id int,
> uid text,
> data text,
> PRIMARY KEY (id, uid)
> ) 
> CREATE MATERIALIZED VIEW kurt.mv2 AS SELECT * from kurt.base where id IS NOT 
> NULL and uid='1' PRIMARY KEY ((id, uid));
> InvalidRequest: Error from server: code=2200 [Invalid query] message="No 
> columns are defined for Materialized View other than primary key"
> {code}
> But we do allow the following, which works because there is still a 
> clustering key, despite not changing the PK.
> {code}
> CREATE MATERIALIZED VIEW kurt.mv2 AS SELECT * from kurt.base where id IS NOT 
> NULL and uid='1' PRIMARY KEY (id, uid);
> {code}
> And we also allow the following, which is a narrowing of the partition key as 
> above, but with an extra clustering key.
> {code}
> create table kurt.base3 (id int, uid int, clus1 int, clus2 int, data text, 
> PRIMARY KEY ((id, uid), clus1, clus2));
> CREATE MATERIALIZED VIEW kurt.mv4 AS SELECT * from kurt.base3 where id IS NOT 
> NULL and uid IS NOT NULL and clus1 IS NOT NULL AND clus2 IS NOT NULL  PRIMARY 
> KEY ((id, uid, clus1), clus2);
> {code}
> I _think_ supporting these cases is trivial and mostly already handled in the 
> underlying MV write path, so we might be able to get away with just a simple 
> change of [this 
> condition|https://github.com/apache/cassandra/blob/83822d12d87dcb3aaad2b1e670e57ebef4ab1c36/src/java/org/apache/cassandra/cql3/statements/CreateViewStatement.java#L291].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (CASSANDRA-14070) Add new method for returning list of primary/clustering key values

2017-11-29 Thread Himani Arora (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272199#comment-16272199
 ] 

Himani Arora commented on CASSANDRA-14070:
--

Hi Kurt,

I am trying to implement a trigger on a Cassandra table to fetch the values of all 
columns. In order to get the clustering key values, I could find this method in 
Cassandra's code on GitHub:

public default String toCQLString(TableMetadata metadata)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < size(); i++)
    {
        ColumnMetadata c = metadata.clusteringColumns().get(i);
        sb.append(i == 0 ? "" : ", ").append(c.type.getString(get(i)));
    }
    return sb.toString();
}
But as you can see, it provides a concatenated string of all the clustering key 
values; it would be better if a list were provided instead of a concatenated 
string.
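Something along these lines is what is being asked for; this is only a hedged sketch, in the same fragmentary context as the quoted method, and the method name is hypothetical rather than an existing Cassandra API:

{code:java}
// Hypothetical sibling of toCQLString above: same loop, but collects each
// clustering value into a java.util.List instead of concatenating them.
// (List and ArrayList are java.util types.)
public default List<String> toCQLValueList(TableMetadata metadata)
{
    List<String> values = new ArrayList<>(size());
    for (int i = 0; i < size(); i++)
    {
        ColumnMetadata c = metadata.clusteringColumns().get(i);
        values.add(c.type.getString(get(i)));
    }
    return values;
}
{code}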

> Add new method for returning list of primary/clustering key values
> --
>
> Key: CASSANDRA-14070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Himani Arora
>Priority: Minor
> Fix For: 4.x
>
>
> Add a method to return a list of primary/clustering key values so that it 
> will be easier to process data. Currently, we are getting a string 
> concatenated with either colon (: ) or comma (,) which makes it quite 
> difficult to fetch one single key value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (CASSANDRA-14070) Add new method for returning list of primary/clustering key values

2017-11-29 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272157#comment-16272157
 ] 

Kurt Greaves commented on CASSANDRA-14070:
--

Can you elaborate? Which string are you referring to?

> Add new method for returning list of primary/clustering key values
> --
>
> Key: CASSANDRA-14070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Himani Arora
>Priority: Minor
> Fix For: 4.x
>
>
> Add a method to return a list of primary/clustering key values so that it 
> will be easier to process data. Currently, we are getting a string 
> concatenated with either colon (: ) or comma (,) which makes it quite 
> difficult to fetch one single key value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (CASSANDRA-14078) Fix dTest test_bulk_round_trip_blogposts_with_max_connections

2017-11-29 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272141#comment-16272141
 ] 

Kurt Greaves commented on CASSANDRA-14078:
--

This test has been problematic in the past but it has also caught a few issues 
in the C* code base that did actually get fixed. Not really sure simply 
skipping it is a good idea.

To some degree all tests kind of depend on what hardware you run, this is the 
nature of the beast with C* and dtests. Can you elaborate on why it's not 
deterministic? My understanding was that the max # connections was configured 
so that COPY TO would always exceed the max and fail-over.

We've had no record of this particular failure before (at least in JIRA), seems 
like it could actually be something that needs fixing. 

> Fix dTest test_bulk_round_trip_blogposts_with_max_connections
> -
>
> Key: CASSANDRA-14078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14078
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
>
> This ticket is regarding following dTest 
> {{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}
> This test tries to limit the number of client connections and assumes that 
> once the connection limit has been reached the client will fail over to another 
> node and retry the request. The problem is that this is not a deterministic 
> test case, as it totally depends on what hardware you run on, timing, etc.
> For example
> If we look at 
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/
> {quote}
> ...
> Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s
> All replicas busy, sleeping for 4 second(s)...
> Processed: 1 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s
> All replicas busy, sleeping for 1 second(s)...
> Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s
> All replicas busy, sleeping for 11 second(s)...
> Processed: 2 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s
> Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s
> Processed: 3 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s
> Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s
> Processed: 4 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s
> Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s
> Processed: 5 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s
> Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s
> All replicas busy, sleeping for 23 second(s)...
> Replicas too busy, given up
> ...
> {quote}
> Here we can see the request timing out; sometimes it resumes after 1 second, 
> next time after 11 seconds, and sometimes it doesn't work at all. 
> In my opinion this test is not a good fit for dTest, as dTest(s) should be 
> deterministic.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Updated] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes

2017-11-29 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-13948:

Status: Patch Available  (was: Open)

> Reload compaction strategies when JBOD disk boundary changes
> 
>
> Key: CASSANDRA-13948
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Fix For: 3.11.x, 4.x
>
> Attachments: debug.log, dtest13948.png, dtest2.png, 
> threaddump-cleanup.txt, threaddump.txt, trace.log
>
>
> The thread dump below shows a race between an sstable replacement by the 
> {{IndexSummaryRedistribution}} and 
> {{AbstractCompactionTask.getNextBackgroundTask}}:
> {noformat}
> Thread 94580: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information 
> may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
> line=175 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() 
> @bci=1, line=836 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
>  int) @bci=67, line=870 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) 
> @bci=17, line=1199 (Compiled frame)
>  - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, 
> line=943 (Compiled frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.handleListChangedNotification(java.lang.Iterable,
>  java.lang.Iterable) @bci=359, line=483 (Interpreted frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(org.apache.cassandra.notifications.INotification,
>  java.lang.Object) @bci=53, line=555 (Interpreted frame)
>  - 
> org.apache.cassandra.db.lifecycle.Tracker.notifySSTablesChanged(java.util.Collection,
>  java.util.Collection, org.apache.cassandra.db.compaction.OperationType, 
> java.lang.Throwable) @bci=50, line=409 (Interpreted frame)
>  - 
> org.apache.cassandra.db.lifecycle.LifecycleTransaction.doCommit(java.lang.Throwable)
>  @bci=157, line=227 (Interpreted frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit(java.lang.Throwable)
>  @bci=61, line=116 (Compiled frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit()
>  @bci=2, line=200 (Interpreted frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish()
>  @bci=5, line=185 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryRedistribution.redistributeSummaries()
>  @bci=559, line=130 (Interpreted frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionManager.runIndexSummaryRedistribution(org.apache.cassandra.io.sstable.IndexSummaryRedistribution)
>  @bci=9, line=1420 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(org.apache.cassandra.io.sstable.IndexSummaryRedistribution)
>  @bci=4, line=250 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries() 
> @bci=30, line=228 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow() 
> @bci=4, line=125 (Interpreted frame)
>  - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 
> (Interpreted frame)
>  - 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run()
>  @bci=4, line=118 (Compiled frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 
> (Compiled frame)
>  - java.util.concurrent.FutureTask.runAndReset() @bci=47, line=308 (Compiled 
> frame)
>  - 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
>  @bci=1, line=180 (Compiled frame)
>  - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() 
> @bci=37, line=294 (Compiled frame)
>  - 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
>  @bci=95, line=1149 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 
> (Interpreted frame)
>  - 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable)
>  @bci=1, line=81 (Interpreted frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$8.run() @bci=4 
> (Interpreted frame)
>  - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
> {noformat}
> {noformat}
> Thread 94573: (state = 

[jira] [Comment Edited] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes

2017-11-29 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267272#comment-16267272
 ] 

Paulo Motta edited comment on CASSANDRA-13948 at 11/30/17 1:00 AM:
---

Thanks for the review!

After rebasing this on top of CASSANDRA-13215 and addressing your latest 
comments, I noticed a few things which could be improved and did the following 
updates:
* Since blacklisting a directory will refresh the disk boundaries, we only need 
to reload strategies when the disk boundary changes or the table parameters 
change. To avoid an equals comparison every time we call {{maybeReload}}, I moved 
the {{isOutOfDate}} check from the {{DiskBoundaryManager}} to the 
{{DiskBoundaries}} object - which is invalidated when there are any boundary 
changes. 
([commit|https://github.com/pauloricardomg/cassandra/commit/662cd063ca2e1c382ba3cd5dc8032b0d3f12683c])
* I thought it no longer makes sense to expose the compaction strategy 
index outside the compaction strategy manager, since it's possible to get the 
correct disk placement directly from {{CFS.getDiskBoundaries}}. This should 
prevent races when the {{CompactionStrategyManager}} reloads boundaries between 
successive calls to {{CSM.getCompactionStrategyIndex}}. [This 
commit|https://github.com/pauloricardomg/cassandra/commit/abd1340b000d4596d71f00e5de8507de967ee7a5]
 updates {{relocatesstables}} and {{scrub}} to use {{CFS.getDiskBoundaries}} 
instead, and makes {{CSM.getCompactionStrategyIndex}} private.
* While writing the documentation I found it a bit hard to reason about when to 
use {{maybeReload}}, so I made its use consistent across 
{{CompactionStrategyManager}} in [this 
commit|https://github.com/pauloricardomg/cassandra/commit/c0926e99edb1ffdcda16640eda6faf8e78da9e46]
 (as you suggested before), along with the documentation. I kept the previous 
call to {{maybeReload}} from {{ColumnFamilyStore.reload}}, but we could 
probably avoid this and make {{maybeReload}} private, as it is being 
called on pretty much every operation.

It feels like we can simplify this and get rid of these locks altogether (or at 
least greatly reduce their scope) by encapsulating the disk boundaries and 
compaction strategies in an immutable object accessed with an atomic reference 
and pessimistically cancel any tasks with an old placement when the strategies 
are reloaded. This is a significant refactor of {{CompactionStrategyManager}}, 
so we should probably do it in another ticket.
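A minimal, self-contained sketch of that shape (names are hypothetical, not the real {{CompactionStrategyManager}} types): the boundaries and the strategies derived from them live in one immutable holder, a reload is a single atomic swap, and a task planned against a stale holder cancels itself instead of relying on a lock.

{code:java}
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class StrategyHolderSketch
{
    // Immutable snapshot: disk boundaries plus the per-disk strategies derived from them.
    static final class Snapshot
    {
        final List<Long> boundaries;
        final List<String> strategiesPerDisk;

        Snapshot(List<Long> boundaries, List<String> strategiesPerDisk)
        {
            this.boundaries = boundaries;
            this.strategiesPerDisk = strategiesPerDisk;
        }
    }

    private final AtomicReference<Snapshot> current = new AtomicReference<>();

    // Reload replaces the whole snapshot in one atomic swap; readers never need a lock.
    void reload(Snapshot fresh)
    {
        current.set(fresh);
    }

    Snapshot snapshot()
    {
        return current.get();
    }

    // A task remembers the snapshot it was planned against and pessimistically
    // cancels itself if the boundaries changed in the meantime.
    boolean runIfStillCurrent(Snapshot plannedAgainst, Runnable work)
    {
        if (plannedAgainst != current.get())
            return false;
        work.run();
        return true;
    }
}
{code}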

I submitted internal CI with the [latest 
branch|https://github.com/pauloricardomg/cassandra/tree/3.11-13948] and will 
post the results here when ready. I will create a trunk version after this 
follow-up is reviewed.


was (Author: pauloricardomg):
Thanks for the review!

After rebasing this on top of CASSANDRA-13215 and addressing your latest 
comments, I noticed a few things which could be improved and did the following 
updates:
* Since blacklisting a directory will refresh the disk boundaries, we only need 
to reload strategies when the disk boundary changes or the table parameters 
change. To avoid equals comparison every time we call {{maybeReload}}, I moved 
the {{isOutOfDate}} check from the {{DiskBoundaryManager}} to the 
{{DiskBoundaries}} object - which is invalidated when there are any boundary 
changes. 
([commit|https://github.com/pauloricardomg/cassandra/commit/662cd063ca2e1c382ba3cd5dc8032b0d3f12683c])
* I thought that it no longer makes sense to expose the compaction strategy 
index to outside the compaction strategy manager since it's possible to get the 
correct disk placement directly from {{CFS.getDiskBoundaries}}. This should 
prevent races when the {{CompactionStrategyManager}} reloads boundaries between 
successive calls to {{CSM.getCompactionStrategyIndex}}. [This 
commit|https://github.com/pauloricardomg/cassandra/commit/abd1340b000d4596d71f00e5de8507de967ee7a5]
 updates {{relocatesstables}} and {{scrub}} to use {{CFS.getDiskBoundaries}} 
instead, and make {{CSM.getCompactionStrategyIndex}} private.
* I found it a bit hard to reason about when to use {{maybeReload}} to write 
the documentation and made its use consistent across 
{{CompactionStrategyManager}} on [this 
commit|https://github.com/pauloricardomg/cassandra/commit/8518d6c4f001641da36d6fd58474ed3b50476326])
 (as you suggested before) along with the documentation. I kept the previous 
call to {{maybeReload}} from {{ColumnFamilyStore.reload}}, but we could 
probably avoid this and make {{maybeReload}} private-only as this is being 
called on pretty much every operation.

It feels like we can simplify this and get rid of these locks altogether (or at 
least greatly reduce their scope) by encapsulating the disk boundaries and 
compaction strategies in an immutable object accessed with an atomic reference 
and pessimistically cancel any tasks with an old placement when the strategies 
are 

[jira] [Commented] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes

2017-11-29 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271933#comment-16271933
 ] 

Paulo Motta commented on CASSANDRA-13948:
-

There were test failures on:
* testall: {{CompactionsCQLTest.testSetLocalCompactionStrategy}} and 
{{testTriggerMinorCompactionSTCSNodetoolEnabled}}
* dtest: {{disk_balance_test.TestDiskBalance.disk_balance_bootstrap_test}}

{{testSetLocalCompactionStrategy}} and 
{{testTriggerMinorCompactionSTCSNodetoolEnabled}} were failing because when the 
strategy was updated via JMX, these manually set configurations were not 
surviving the compaction strategy reload. This was not introduced by this patch - 
it would also happen before if a directory was blacklisted. It was fixed 
[in this 
commit|https://github.com/pauloricardomg/cassandra/commit/11c9a130d9cb7a6cfc5a039fdf79963f7e779d08].

While investigating why the strategies were reloaded even without a ring change 
in the tests above, I noticed that {{Keyspace.createReplicationStrategy}} was 
being called multiple times (on {{Keyspace}} construction and {{setMetadata}}), 
so I updated it to only invalidate the disk boundaries when the replication 
settings actually change 
([here|https://github.com/pauloricardomg/cassandra/commit/8a398a5d0d261178547946ac4e457f9abeb90f18]).
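The guard is essentially an equality check on the replication settings before dropping the cached boundaries; a tiny self-contained sketch of the idea, with hypothetical names rather than the actual Cassandra classes:

{code:java}
import java.util.Map;

// Hypothetical sketch, not the actual Keyspace/DiskBoundaryManager code.
public class BoundaryInvalidationGuard
{
    private Map<String, String> replicationSettings;
    private volatile boolean boundariesValid;

    // Keyspace metadata may be (re)applied several times with identical settings;
    // only an actual change in replication should invalidate the cached boundaries.
    void onMetadataApplied(Map<String, String> newReplicationSettings)
    {
        if (!newReplicationSettings.equals(replicationSettings))
        {
            replicationSettings = newReplicationSettings;
            boundariesValid = false; // boundaries will be recomputed lazily on next use
        }
    }

    boolean boundariesValid()
    {
        return boundariesValid;
    }
}
{code}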

After the fix above, {{disk_balance_bootstrap_test}} started failing with 
imbalanced disks because the disk boundaries were not being invalidated after 
the joining node broadcasted its tokens via gossip, so 
{{TokenMetadata.getPendingRanges(keyspace, FBUtilities.getBroadcastAddress())}} 
was returning empty during disk boundary creation and causing imbalance. This 
is not failing on trunk because the double invalidation above during keyspace 
creation was causing the compaction strategy manager to reload the strategies 
with the correct ring placement during streaming. The fix to this is to 
invalidate the cached ring after gossiping the local tokens 
([here|https://github.com/pauloricardomg/cassandra/commit/007d596ffe0c5f965cf398646c52daa8f73c5c46]).

This made me realize that when replacing a node with the same address, even 
though the node is in bootstrap mode, it doesn't have any pending ranges, 
because it sets its tokens to normal state during bootstrap, which will cause its 
boundaries not to be computed correctly. I added a 
[dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/8d48b166c9bfce51f9ab6c3abd73dfd4779a7c04]
 to show this and a 
[fix|https://github.com/pauloricardomg/cassandra/commit/6efd9cd454ce2fbfd40e592b6aaeda9debdb1c2b].

Finally, I didn't find a good reason to pass {{ColumnFamilyStore}} as argument 
to {{getDiskBoundaries}}, so I updated it to make it a field instead 
([here|https://github.com/pauloricardomg/cassandra/commit/5df0d5ebed67aaae6ef9350d25b602af2a1702cf]).

I submitted internal CI, and testall is green and dtest failures [seem 
unrelated|https://issues.apache.org/jira/secure/attachment/12899922/dtest2.png].
 Setting to patch available as this should be ready for a new round of review 
now. Thanks!

> Reload compaction strategies when JBOD disk boundary changes
> 
>
> Key: CASSANDRA-13948
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Fix For: 3.11.x, 4.x
>
> Attachments: debug.log, dtest13948.png, dtest2.png, 
> threaddump-cleanup.txt, threaddump.txt, trace.log
>
>
> The thread dump below shows a race between an sstable replacement by the 
> {{IndexSummaryRedistribution}} and 
> {{AbstractCompactionTask.getNextBackgroundTask}}:
> {noformat}
> Thread 94580: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information 
> may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
> line=175 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() 
> @bci=1, line=836 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
>  int) @bci=67, line=870 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) 
> @bci=17, line=1199 (Compiled frame)
>  - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, 
> line=943 (Compiled frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.handleListChangedNotification(java.lang.Iterable,
>  java.lang.Iterable) @bci=359, line=483 (Interpreted frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(org.apache.cassandra.notifications.INotification,
>  java.lang.Object) 

[jira] [Commented] (CASSANDRA-13873) Ref bug in Scrub

2017-11-29 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271931#comment-16271931
 ] 

Joel Knighton commented on CASSANDRA-13873:
---

Sorry for the latency here - my fault. The patch looks good to me. I considered 
a few other cases where a similar problem might exist. It seems to me the same 
issue could exist in the Splitter/Upgrader, but since they're offline, I 
don't know what future changes would require another operation to reference 
canonical sstables in parallel. I also don't see anything in anticompaction 
grabbing a ref; am I missing something there?

The patches look good for existing cases. Unfortunately, I let the dtests age 
out before taking a closer look, but I can rerun them after you look at the 
question above. I'm +1 to merging the relatively trivial patches through to 
trunk and opening a ticket to improve it later. As you've seen, I don't have a 
huge amount of bandwidth for this right now, so I'd rather not delay a definite 
improvement with only the promise of a better one. Thanks for the patience.
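For readers following the "Spinning trying to capture readers" symptom in the quoted description below: the pattern at issue is an all-or-nothing reference grab over a set of readers, which spins forever if one of them has already been released but is still listed. A generic, self-contained sketch of that pattern, using hypothetical names rather than the actual {{Ref}}/{{Refs}} API:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RefGrabSketch
{
    static final class Reader
    {
        final String name;
        final AtomicInteger refs = new AtomicInteger(1);

        Reader(String name) { this.name = name; }

        // Take an extra reference unless the count already hit zero.
        boolean tryRef()
        {
            int c;
            do
            {
                c = refs.get();
                if (c <= 0)
                    return false; // already released: can never be re-referenced
            }
            while (!refs.compareAndSet(c, c + 1));
            return true;
        }

        void release() { refs.decrementAndGet(); }
    }

    // Grab a ref on every reader or none; a caller typically retries ("spins")
    // on failure, which never terminates if a released reader stays in the set.
    static List<Reader> tryRefAll(List<Reader> candidates)
    {
        List<Reader> taken = new ArrayList<>();
        for (Reader r : candidates)
        {
            if (r.tryRef())
            {
                taken.add(r);
            }
            else
            {
                taken.forEach(Reader::release);
                return null;
            }
        }
        return taken;
    }
}
{code}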

> Ref bug in Scrub
> 
>
> Key: CASSANDRA-13873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13873
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: T Jake Luciani
>Assignee: Joel Knighton
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> I'm hitting a Ref bug when many scrubs run against a node.  This doesn't 
> happen on 3.0.X. I'm not sure whether or not this happens with compactions too, 
> but I suspect it does.
> I'm not seeing any Ref leaks or double frees.
> To Reproduce:
> {quote}
> ./tools/bin/cassandra-stress write n=10m -rate threads=100
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> {quote}
> Eventually in the logs you get:
> WARN  [RMI TCP Connection(4)-127.0.0.1] 2017-09-14 15:51:26,722 
> NoSpamLogger.java:97 - Spinning trying to capture readers 
> [BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-5-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-32-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-31-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-29-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-27-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-26-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-20-big-Data.db')],
> *released: 
> [BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-5-big-Data.db')],*
>  
> This released table has a selfRef of 0 but is in the Tracker



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Updated] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes

2017-11-29 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-13948:

Attachment: dtest2.png

> Reload compaction strategies when JBOD disk boundary changes
> 
>
> Key: CASSANDRA-13948
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Fix For: 3.11.x, 4.x
>
> Attachments: debug.log, dtest13948.png, dtest2.png, 
> threaddump-cleanup.txt, threaddump.txt, trace.log
>
>
> The thread dump below shows a race between an sstable replacement by the 
> {{IndexSummaryRedistribution}} and 
> {{AbstractCompactionTask.getNextBackgroundTask}}:
> {noformat}
> Thread 94580: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information 
> may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
> line=175 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() 
> @bci=1, line=836 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
>  int) @bci=67, line=870 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) 
> @bci=17, line=1199 (Compiled frame)
>  - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, 
> line=943 (Compiled frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.handleListChangedNotification(java.lang.Iterable,
>  java.lang.Iterable) @bci=359, line=483 (Interpreted frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(org.apache.cassandra.notifications.INotification,
>  java.lang.Object) @bci=53, line=555 (Interpreted frame)
>  - 
> org.apache.cassandra.db.lifecycle.Tracker.notifySSTablesChanged(java.util.Collection,
>  java.util.Collection, org.apache.cassandra.db.compaction.OperationType, 
> java.lang.Throwable) @bci=50, line=409 (Interpreted frame)
>  - 
> org.apache.cassandra.db.lifecycle.LifecycleTransaction.doCommit(java.lang.Throwable)
>  @bci=157, line=227 (Interpreted frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit(java.lang.Throwable)
>  @bci=61, line=116 (Compiled frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit()
>  @bci=2, line=200 (Interpreted frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish()
>  @bci=5, line=185 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryRedistribution.redistributeSummaries()
>  @bci=559, line=130 (Interpreted frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionManager.runIndexSummaryRedistribution(org.apache.cassandra.io.sstable.IndexSummaryRedistribution)
>  @bci=9, line=1420 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(org.apache.cassandra.io.sstable.IndexSummaryRedistribution)
>  @bci=4, line=250 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries() 
> @bci=30, line=228 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow() 
> @bci=4, line=125 (Interpreted frame)
>  - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 
> (Interpreted frame)
>  - 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run()
>  @bci=4, line=118 (Compiled frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 
> (Compiled frame)
>  - java.util.concurrent.FutureTask.runAndReset() @bci=47, line=308 (Compiled 
> frame)
>  - 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
>  @bci=1, line=180 (Compiled frame)
>  - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() 
> @bci=37, line=294 (Compiled frame)
>  - 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
>  @bci=95, line=1149 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 
> (Interpreted frame)
>  - 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable)
>  @bci=1, line=81 (Interpreted frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$8.run() @bci=4 
> (Interpreted frame)
>  - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
> {noformat}
> {noformat}
> Thread 94573: (state = IN_JAVA)
>  - 

[jira] [Comment Edited] (CASSANDRA-12245) initial view build can be parallel

2017-11-29 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271895#comment-16271895
 ] 

Paulo Motta edited comment on CASSANDRA-12245 at 11/30/17 12:37 AM:


bq. Good catch! Done here. I have added an overloaded version of 
ViewBuilderTask.stop to throw the CompactionInterruptedException only if the 
stop call comes from a different place than ViewBuilder. That is, the exception 
is not thrown in the case of a schema change (such as a drop), when the current 
build should be stopped without errors and maybe restarted. 

Good job! One thing I noticed is that even though the builder task and the view 
builder are aborted, the other tasks of the same builder keep running. At least 
until we have the ability to start and stop view builders, I think that 
stopping a subtask should also abort the other subtasks of the same view 
builder - since the view builder will not complete anyway. What do you think? 
I've done this 
[here|https://github.com/pauloricardomg/cassandra/commit/81853218eee702b778ba801426ba19d48336cf77]
 and the tests didn't need any change. I've also extended {{SplitterTest}} with 
a couple more test cases 
[here|https://github.com/pauloricardomg/cassandra/commit/428a990d6b3d79df9a4848d0f0f87502e72e470e].
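A compact sketch of that suggestion (hypothetical names, not the real {{ViewBuilder}}/{{ViewBuilderTask}} classes): once one subtask is stopped externally, the builder marks itself aborted and stops the remaining subtasks, since the overall build can no longer complete.

{code:java}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ViewBuildSketch
{
    static final class Subtask
    {
        private volatile boolean stopped;
        void stop() { stopped = true; }
        boolean isStopped() { return stopped; }
    }

    private final List<Subtask> subtasks = new CopyOnWriteArrayList<>();
    private volatile boolean aborted;

    Subtask newSubtask()
    {
        Subtask t = new Subtask();
        subtasks.add(t);
        return t;
    }

    // Called when an operator stops one subtask (e.g. via nodetool): since the
    // overall build can no longer complete, stop the sibling subtasks as well.
    void onSubtaskStopped(Subtask stopped)
    {
        aborted = true;
        for (Subtask t : subtasks)
            if (t != stopped)
                t.stop();
    }

    boolean isAborted() { return aborted; }
}
{code}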

bq. I have added a couple of dtests here. test_resume_stopped_build uses 
`nodetool stop VIEW_BUILD` to interrupt the running task of an ongoing view 
build and verifies that the unmarked build is resumed after restarting the 
nodes. test_drop_with_stopped_build verifies that a view with interrupted tasks 
can still be dropped, which is something that has been problematic while 
writing the patch.

The tests look good, but sometimes they were failing on my machine because the 
view builder task finished on some nodes before they were stopped, and also 
{{_wait_for_view_build_start}} did not guarantee the view builder had started on 
all nodes before issuing {{nodetool stop VIEW_BUILD}}, so I fixed this [in this 
commit|https://github.com/pauloricardomg/cassandra-dtest/commit/667315e42bd2b7d04ac038e79149f1b0e63ba0f2].
 I also extended {{test_resume_stopped_build}} to verify that the view was not 
built after the abort 
([here|https://github.com/pauloricardomg/cassandra-dtest/commit/f4c3ad7ac9e4ea64576d669a1cf30b0ef4e02a3f]).

I've rebased and submitted a new CI run with the suggestions above 
[here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-dtest/]
 and 
[here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-testall/].
 

Besides these minor nits, I'm happy with the latest version of the patch and 
tests. If you agree with the suggestions above and CI looks good, feel free to 
incorporate them into your branches and commit. Excellent job and thanks for 
your patience! :)


was (Author: pauloricardomg):
bq. Good catch! Done here. I have added an overloaded version of 
ViewBuilderTask.stop to throw the CompactionInterruptedException only if the 
stop call comes from a different place than ViewBuilder. That is, the exception 
is not thrown in the case of a schema change (such as a drop), when the current 
build should be stopped without errors and maybe restarted. 

Good job! One thing I noticed is that even though the builder task and the view 
builder is aborted, the other tasks of the same builder keep running. At least 
until we have the ability to start and stop view builders, I think that 
stopping a subtask should also abort the other subtasks of the same view 
builder - since the view builder will not complete anyway. What do you think? 
I've done this 
[here|https://github.com/pauloricardomg/cassandra/commit/81853218eee702b778ba801426ba19d48336cf77]
 and the tests didn't need any change. I've also extended {{SplitterTest}} with 
a couple more test cases 
[here|https://github.com/pauloricardomg/cassandra/commit/428a990d6b3d79df9a4848d0f0f87502e72e470e].

bq. I have added a couple of dtests here. test_resume_stopped_build uses 
`nodetool stop VIEW_BUILD` to interrupt the running task of an ongoing view 
build and verifies that the unmarked build is resumed after restarting the 
nodes. test_drop_with_stopped_build verifies that a view with interrupted taks 
can still be dropped, which is something that has been problematic while 
writting the patch.

The tests looks good, but sometimes they were failing on my machine because the 
view builder task finished on some nodes before they were stopped and also 
{{_wait_for_view_build_start}} did not guarantee the view builder started in 
all nodes before issuing {{nodetool stop VIEW_BUILD}}, so I fixed this [on this 
commit|https://github.com/pauloricardomg/cassandra-dtest/commit/fc62dc849d5a4d5e24d2bada6e6f8ce0f2d32b4d].
 I also extended {{test_resume_stopped_build}} to verify that view was not 
built after abort 

[jira] [Commented] (CASSANDRA-12245) initial view build can be parallel

2017-11-29 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271895#comment-16271895
 ] 

Paulo Motta commented on CASSANDRA-12245:
-

bq. Good catch! Done here. I have added an overloaded version of 
ViewBuilderTask.stop to throw the CompactionInterruptedException only if the 
stop call comes from a different place than ViewBuilder. That is, the exception 
is not thrown in the case of a schema change (such as a drop), when the current 
build should be stopped without errors and maybe restarted. 

Good job! One thing I noticed is that even though the builder task and the view 
builder are aborted, the other tasks of the same builder keep running. At least 
until we have the ability to start and stop view builders, I think that 
stopping a subtask should also abort the other subtasks of the same view 
builder - since the view builder will not complete anyway. What do you think? 
I've done this 
[here|https://github.com/pauloricardomg/cassandra/commit/81853218eee702b778ba801426ba19d48336cf77]
 and the tests didn't need any change. I've also extended {{SplitterTest}} with 
a couple more test cases 
[here|https://github.com/pauloricardomg/cassandra/commit/428a990d6b3d79df9a4848d0f0f87502e72e470e].

bq. I have added a couple of dtests here. test_resume_stopped_build uses 
`nodetool stop VIEW_BUILD` to interrupt the running task of an ongoing view 
build and verifies that the unmarked build is resumed after restarting the 
nodes. test_drop_with_stopped_build verifies that a view with interrupted tasks 
can still be dropped, which is something that has been problematic while 
writing the patch.

The tests look good, but sometimes they were failing on my machine because the 
view builder task finished on some nodes before they were stopped, and also 
{{_wait_for_view_build_start}} did not guarantee the view builder had started on 
all nodes before issuing {{nodetool stop VIEW_BUILD}}, so I fixed this [in this 
commit|https://github.com/pauloricardomg/cassandra-dtest/commit/fc62dc849d5a4d5e24d2bada6e6f8ce0f2d32b4d].
 I also extended {{test_resume_stopped_build}} to verify that the view was not 
built after the abort 
([here|https://github.com/pauloricardomg/cassandra-dtest/commit/6e38919d3c64a54688ae97bcf03611fff7d59dfe]).

I've rebased and submitted a new CI run with the suggestions above 
[here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-dtest/]
 and 
[here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-testall/].
 

Besides these minor nits, I'm happy with the latest version of the patch and 
tests. If you agree with the suggestions above and CI looks good, feel free to 
incorporate them into your branches and commit. Excellent job and thanks for 
your patience! :)

> initial view build can be parallel
> --
>
> Key: CASSANDRA-12245
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12245
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Tom van der Woerdt
>Assignee: Andrés de la Peña
> Fix For: 4.x
>
>
> On a node with lots of data (~3TB) building a materialized view takes several 
> weeks, which is not ideal. It's doing this in a single thread.
> There are several potential ways this can be optimized :
>  * do vnodes in parallel, instead of going through the entire range in one 
> thread
>  * just iterate through sstables, not worrying about duplicates, and include 
> the timestamp of the original write in the MV mutation. since this doesn't 
> exclude duplicates it does increase the amount of work and could temporarily 
> surface ghost rows (yikes) but I guess that's why they call it eventual 
> consistency. doing it this way can avoid holding references to all tables on 
> disk, allows parallelization, and removes the need to check other sstables 
> for existing data. this is essentially the 'do a full repair' path



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Updated] (CASSANDRA-14078) Fix dTest test_bulk_round_trip_blogposts_with_max_connections

2017-11-29 Thread Jaydeepkumar Chovatia (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaydeepkumar Chovatia updated CASSANDRA-14078:
--
Description: 
This ticket is regarding following dTest 
{{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}

This test tries to limit the number of client connections and assumes that once 
the connection limit has been reached the client will fail over to another node 
and retry the request. The problem is that this is not a deterministic test 
case, as it totally depends on what hardware you run on, timing, etc.
For example
If we look at 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/

{quote}
...
Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s

All replicas busy, sleeping for 4 second(s)...
Processed: 1 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s

All replicas busy, sleeping for 1 second(s)...
Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s

All replicas busy, sleeping for 11 second(s)...
Processed: 2 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s

Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s

Processed: 3 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s

Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s

Processed: 4 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s

Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s

Processed: 5 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s

Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s

All replicas busy, sleeping for 23 second(s)...
Replicas too busy, given up
...
{quote}

Here we can see the request timing out; sometimes it resumes after 1 second, 
next time after 11 seconds, and sometimes it doesn't work at all. 

In my opinion this test is not a good fit for dTest, as dTest(s) should be 
deterministic.

  was:
This ticket is regarding following dTest 
{{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}

This test is trying to limit number of client connections and assumes that once 
connection count has reached then client will fail-over to other node and do 
the request. The reason is, it is not deterministic test case as it totally 
depends on what hardware you run, timing, etc.
For example
If we look at 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/

{{
...
Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s

All replicas busy, sleeping for 4 second(s)...
Processed: 1 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s

All replicas busy, sleeping for 1 second(s)...
Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s

All replicas busy, sleeping for 11 second(s)...
Processed: 2 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s

Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s

Processed: 3 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s

Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s

Processed: 4 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s

Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s

Processed: 5 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s

Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s

All replicas busy, sleeping for 23 second(s)...
Replicas too busy, given up
...
}}

Here we can see request is timing out, sometimes it resumes after 1 second, 
next time 11 seconds and some times it doesn't work at all. 

In my opinion this test is not a good fit for dTest as dTest(s) should be 
deterministic.


> Fix dTest test_bulk_round_trip_blogposts_with_max_connections
> -
>
> Key: CASSANDRA-14078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14078
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
>
> This ticket is regarding following dTest 
> {{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}
> This test tries to limit the number of client connections and assumes that 
> once the connection limit has been reached the client will fail over to another 
> node and retry the request. The problem is that this is not a deterministic 
> test case, as it totally depends on what hardware you run on, timing, etc.
> For example
> If we look at 
> 

[jira] [Updated] (CASSANDRA-14078) Fix dTest test_bulk_round_trip_blogposts_with_max_connections

2017-11-29 Thread Jaydeepkumar Chovatia (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaydeepkumar Chovatia updated CASSANDRA-14078:
--
Status: Patch Available  (was: Open)

> Fix dTest test_bulk_round_trip_blogposts_with_max_connections
> -
>
> Key: CASSANDRA-14078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14078
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
>
> This ticket is regarding following dTest 
> {{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}
> This test tries to limit the number of client connections and assumes that 
> once the connection limit has been reached the client will fail over to another 
> node and retry the request. The problem is that this is not a deterministic 
> test case, as it totally depends on what hardware you run on, timing, etc.
> For example
> If we look at 
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/
> {{
> ...
> Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s
> All replicas busy, sleeping for 4 second(s)...
> Processed: 1 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s
> All replicas busy, sleeping for 1 second(s)...
> Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s
> All replicas busy, sleeping for 11 second(s)...
> Processed: 2 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s
> Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s
> Processed: 3 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s
> Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s
> Processed: 4 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s
> Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s
> Processed: 5 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s
> Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s
> All replicas busy, sleeping for 23 second(s)...
> Replicas too busy, given up
> ...
> }}
> Here we can see the request timing out; sometimes it resumes after 1 second, 
> next time after 11 seconds, and sometimes it doesn't work at all. 
> In my opinion this test is not a good fit for dTest, as dTest(s) should be 
> deterministic.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (CASSANDRA-14078) Fix dTest test_bulk_round_trip_blogposts_with_max_connections

2017-11-29 Thread Jaydeepkumar Chovatia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271882#comment-16271882
 ] 

Jaydeepkumar Chovatia commented on CASSANDRA-14078:
---

|| branch || build ||
| [master|https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:CASSANDRA-14078?expand=1] | [circleci|https://circleci.com/gh/jaydeepkumar1984/cassandra-dtest/4] |

> Fix dTest test_bulk_round_trip_blogposts_with_max_connections
> -
>
> Key: CASSANDRA-14078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14078
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
>
> This ticket is regarding following dTest 
> {{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}
> This test limits the number of client connections and assumes that, once the 
> connection limit is reached, the client will fail over to another node and 
> complete the request. The reason for this ticket is that the test is not 
> deterministic: it depends entirely on the hardware it runs on, timing, etc.
> For example
> If we look at 
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/
> {{
> ...
> Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s
> All replicas busy, sleeping for 4 second(s)...
> Processed: 1 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s
> All replicas busy, sleeping for 1 second(s)...
> Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s
> All replicas busy, sleeping for 11 second(s)...
> Processed: 2 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s
> Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s
> Processed: 3 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s
> Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s
> Processed: 4 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s
> Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s
> Processed: 5 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s
> Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s
> All replicas busy, sleeping for 23 second(s)...
> Replicas too busy, given up
> ...
> }}
> Here we can see the request timing out; sometimes it resumes after 1 second, 
> sometimes after 11 seconds, and sometimes it does not recover at all. 
> In my opinion this test is not a good fit for dTest, as dTests should be 
> deterministic.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14047) test_simple_strategy_each_quorum_users - consistency_test.TestAccuracy fails: Missing: ['127.0.0.3.* now UP']:

2017-11-29 Thread Vincent White (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent White updated CASSANDRA-14047:
--
Attachment: trunk-patch_passes_test-debug.log
3_11-debug.log
trunk-debug.log
trunk-debug.log-2

On 3.11 I still see the {{UnknownColumnFamilyException}}, which does cause 
the test to fail because it triggers the "Unexpected error in log" assertion 
error when tearing down the test. Strangely, the test passes and doesn't hit 
this on trunk with my patch even though the log still contains the 
UnknownColumnFamilyException (not sure if that's related to C*-version-specific 
config in dtests or something). 

 So the netty issue is unrelated to the flakiness of this test, not sure if it 
should have its own ticket? I've attached a few sets of debug logs that 
demonstrate the various behaviours with/without netty and with/without my patch 
from the previous comment. 

In regard to the test itself, it appears that the reads triggering the 
{{UnknownColumnFamilyException}} actually come from the initialisation of 
CassandraRoleManager, since they are for {{system_auth.roles}} (I believe 
{{hasExistingRoles()}} in {{setupDefaultRole()}}). I'm not exactly sure what 
the best way to resolve this is. This error isn't an issue for the role manager 
itself, as it will simply retry later, and it doesn't affect the tests apart from 
triggering the unexpected error in log. For the tests I guess we could leave a 
gap between starting nodes, but it's probably more correct to just ignore these 
errors. I've tested that 
[https://github.com/vincewhite/cassandra-dtest/commit/7e48704713123a253a914802975f7163474ede9b]
 resolves the failures, and I assume it's probably safe to ignore this 
error for all of the tests in consistency_test, but I haven't looked into that 
at this stage. 

Also these tests don't do anything fancy in regard to how they start the 
cluster; they just use the normal {{cluster.start(wait_for_binary_proto=True, 
wait_other_notice=True)}} call, so I guess this could cause random failures in 
a lot of tests.
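
For reference, a minimal sketch of the ignore-these-errors approach, assuming the usual cassandra-dtest mechanism of listing expected log patterns on the test class (the attribute name and placement are assumptions here; the linked commit is authoritative):

{code}
# Minimal sketch, not the actual commit above: whitelist the expected startup
# race so the "Unexpected error in log" teardown check does not fail the run.
from dtest import Tester


class TestAccuracy(Tester):
    # cassandra-dtest skips node log lines matching these patterns when it
    # checks for unexpected errors during teardown.  Treat the attribute name
    # as an assumption; see the linked commit for the real change.
    ignore_log_patterns = (
        r'UnknownColumnFamilyException',
    )
{code}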

> test_simple_strategy_each_quorum_users - consistency_test.TestAccuracy fails: 
> Missing: ['127.0.0.3.* now UP']:
> --
>
> Key: CASSANDRA-14047
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14047
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Vincent White
> Attachments: 3_11-debug.log, trunk-debug.log, trunk-debug.log-2, 
> trunk-patch_passes_test-debug.log
>
>
> test_simple_strategy_each_quorum_users - consistency_test.TestAccuracy fails: 
> Missing: ['127.0.0.3.* now UP']:
> 15 Nov 2017 11:23:37 [node1] Missing: ['127.0.0.3.* now UP']:
> INFO  [main] 2017-11-15 11:21:32,452 YamlConfigura.
> See system.log for remainder
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-v3VgyS
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> dtest: DEBUG: Testing single dc, users, each quorum reads
> - >> end captured logging << -
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/cassandra/cassandra-dtest/tools/decorators.py", line 48, in 
> wrapped
> f(obj)
>   File "/home/cassandra/cassandra-dtest/consistency_test.py", line 621, in 
> test_simple_strategy_each_quorum_users
> 
> self._run_test_function_in_parallel(TestAccuracy.Validation.validate_users, 
> [self.nodes], [self.rf], combinations)
>   File "/home/cassandra/cassandra-dtest/consistency_test.py", line 535, in 
> _run_test_function_in_parallel
> self._start_cluster(save_sessions=True, 
> requires_local_reads=requires_local_reads)
>   File "/home/cassandra/cassandra-dtest/consistency_test.py", line 141, in 
> _start_cluster
> cluster.start(wait_for_binary_proto=True, wait_other_notice=True)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/cluster.py", 
> line 428, in start
> node.watch_log_for_alive(other_node, from_mark=mark)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 520, in watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", 

[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE inserts on tables without clusterings accept hidden column1 and value columns

2017-11-29 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-13917:
---
Fix Version/s: (was: 3.11.1)
   (was: 3.0.15)
   3.11.x
   3.0.x

> COMPACT STORAGE inserts on tables without clusterings accept hidden column1 
> and value columns
> -
>
> Key: CASSANDRA-13917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13917
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Aleksandr Sorokoumov
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0.x, 3.11.x
>
>
> Test for the issue:
> {code}
> @Test
> public void testCompactStorage() throws Throwable
> {
> createTable("CREATE TABLE %s (a int PRIMARY KEY, b int, c int) WITH 
> COMPACT STORAGE");
> assertInvalid("INSERT INTO %s (a, b, c, column1) VALUES (?, ?, ?, 
> ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
> // This one fails with Some clustering keys are missing: column1, 
> which is still wrong
> assertInvalid("INSERT INTO %s (a, b, c, value) VALUES (?, ?, ?, ?)", 
> 1, 1, 1, ByteBufferUtil.bytes('a'));   
> assertInvalid("INSERT INTO %s (a, b, c, column1, value) VALUES (?, ?, 
> ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'), ByteBufferUtil.bytes('b'));
> assertEmpty(execute("SELECT * FROM %s"));
> }
> {code}
> Gladly, these writes are no-op, even though they succeed.
> {{value}} and {{column1}} should be completely hidden. Fixing this one should 
> be as easy as just adding validations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-14078) Fix dTest test_bulk_round_trip_blogposts_with_max_connections

2017-11-29 Thread Jaydeepkumar Chovatia (JIRA)
Jaydeepkumar Chovatia created CASSANDRA-14078:
-

 Summary: Fix dTest 
test_bulk_round_trip_blogposts_with_max_connections
 Key: CASSANDRA-14078
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14078
 Project: Cassandra
  Issue Type: Test
  Components: Testing
Reporter: Jaydeepkumar Chovatia
Assignee: Jaydeepkumar Chovatia
Priority: Minor


This ticket is regarding following dTest 
{{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}

This test limits the number of client connections and assumes that, once the 
connection limit is reached, the client will fail over to another node and 
complete the request. The reason for this ticket is that the test is not 
deterministic: it depends entirely on the hardware it runs on, timing, etc.
For example
If we look at 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/

{{
...
Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s

All replicas busy, sleeping for 4 second(s)...
Processed: 1 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s

All replicas busy, sleeping for 1 second(s)...
Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s

All replicas busy, sleeping for 11 second(s)...
Processed: 2 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s

Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s

Processed: 3 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s

Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s

Processed: 4 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s

Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s

Processed: 5 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s

Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s

All replicas busy, sleeping for 23 second(s)...
Replicas too busy, given up
...
}}

Here we can see the request timing out; sometimes it resumes after 1 second, 
sometimes after 11 seconds, and sometimes it does not recover at all. 

In my opinion this test is not a good fit for dTest, as dTests should be 
deterministic.
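
To illustrate the non-determinism, here is a purely illustrative sketch (plain Python, not cqlsh/COPY code) of the kind of failover-and-backoff loop whose output appears in the log above; names such as ReplicaBusy and send_batch are made up for the example:

{code}
import random
import time


class ReplicaBusy(Exception):
    """Hypothetical error: the replica has hit its connection limit."""


def copy_batch_with_failover(replicas, send_batch, max_attempts=5):
    # Try every replica; if all are at their connection limit, back off and
    # retry, and eventually give up -- mirroring the "All replicas busy,
    # sleeping for N second(s)" / "Replicas too busy, given up" messages.
    for attempt in range(max_attempts):
        for replica in replicas:
            try:
                return send_batch(replica)
            except ReplicaBusy:
                continue
        delay = random.randint(1, 2 ** (attempt + 2))
        print("All replicas busy, sleeping for %d second(s)..." % delay)
        time.sleep(delay)
    raise RuntimeError("Replicas too busy, given up")
{code}

Because whether any replica frees up in time depends on hardware speed, timing and scheduling, the same test can pass on one run and exhaust its retries on the next.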



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13987) Multithreaded commitlog subtly changed durability

2017-11-29 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271720#comment-16271720
 ] 

Jason Brown commented on CASSANDRA-13987:
-

[~jolynch] It looks like 2.1's shutdown hook in {{StorageService}} does 
[shutdown the commit 
log|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L716].
 So, I think you should be OK, assuming that a gentle {{kill}} is issued (not 
{{kill -9}} or similar ilk). 

> Multithreaded commitlog subtly changed durability
> -
>
> Key: CASSANDRA-13987
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13987
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.x
>
>
> When multithreaded commitlog was introduced in CASSANDRA-3578, we subtly 
> changed the way that commitlog durability worked. Everything still gets 
> written to an mmap file. However, not everything is replayable from the 
> mmaped file after a process crash, in periodic mode.
> In brief, the reason this changed is due to the chained markers that are 
> required for the multithreaded commit log. At each msync, we wait for 
> outstanding mutations to serialize into the commitlog, and update a marker 
> before and after the commits that have accumulated since the last sync. With 
> those markers, we can safely replay that section of the commitlog. Without 
> the markers, we have no guarantee that the commits in that section were 
> successfully written, thus we abandon those commits on replay.
> If you have correlated process failures of multiple nodes at "nearly" the 
> same time (see ["There Is No 
> Now"|http://queue.acm.org/detail.cfm?id=2745385]), it is possible to have 
> data loss if none of the nodes msync the commitlog. For example, with RF=3, 
> if quorum write succeeds on two nodes (and we acknowledge the write back to 
> the client), and then the process on both nodes OOMs (say, due to reading the 
> index for a 100GB partition), the write will be lost if neither process 
> msync'ed the commitlog. More exactly, the commitlog cannot be fully replayed. 
> The reason why this data is silently lost is due to the chained markers that 
> were introduced with CASSANDRA-3578.
> The problem we are addressing with this ticket is incrementally improving 
> 'durability' due to process crash, not host crash. (Note: operators should 
> use batch mode to ensure greater durability, but batch mode in its current 
> implementation is a) borked, and b) will burn through, *very* rapidly, SSDs 
> that don't have a non-volatile write cache sitting in front.) 
> The current default for {{commitlog_sync_period_in_ms}} is 10 seconds, which 
> means that a node could lose up to ten seconds of data due to process crash. 
> The unfortunate thing is that the data is still available, in the mmap file, 
> but we can't replay it due to incomplete chained markers.
> ftr, I don't believe we've ever had a stated policy about commitlog 
> durability wrt process crash. Pre-2.0 we naturally piggy-backed off the 
> memory mapped file and the fact that every mutation acquired a lock and 
> wrote into the mmap buffer, and the ability to replay everything out of it 
> came for free. With CASSANDRA-3578, that was subtly changed. 
> Something [~jjirsa] pointed out to me is that [MySQL provides a way to adjust 
> the durability 
> guarantees|https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit]
>  of each commit in innodb via the {{innodb_flush_log_at_trx_commit}}. I'm 
> using that idea as a loose springboard for what to do here.
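
As a rough mental model of the chained-marker behaviour described above (an illustrative sketch only, not Cassandra's replay code):

{code}
def replayable_mutations(sections):
    """Illustrative model of chained-marker replay.

    Each section is (markers_synced, mutations): markers_synced is True only
    if the surrounding markers were msync'ed to disk.  Replay walks the chain
    and stops at the first section whose markers never made it to disk, so
    commits written after the last msync are abandoned even though their
    bytes may be sitting in the mmap'ed file.
    """
    replayed = []
    for markers_synced, mutations in sections:
        if not markers_synced:
            break
        replayed.extend(mutations)
    return replayed


# Example: two synced sections, then a process crash before the third msync.
print(replayable_mutations([(True, ['m1', 'm2']),
                            (True, ['m3']),
                            (False, ['m4', 'm5'])]))
# -> ['m1', 'm2', 'm3']  (m4/m5 cannot be replayed, as described above)
{code}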



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14010) NullPointerException when creating keyspace

2017-11-29 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14010:
---
Component/s: Distributed Metadata

> NullPointerException when creating keyspace
> ---
>
> Key: CASSANDRA-14010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14010
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Jonathan Pellby
> Fix For: 3.11.x, 4.x
>
>
> We have a test environment where we drop and create keyspaces and tables 
> several times within a short time frame. Since upgrading from 3.11.0 to 
> 3.11.1, we are seeing a lot of create statements failing. See the logs below:
> {code:java}
> 2017-11-13T14:29:20.037986449Z WARN Directory /tmp/ramdisk/commitlog doesn't 
> exist
> 2017-11-13T14:29:20.038009590Z WARN Directory /tmp/ramdisk/saved_caches 
> doesn't exist
> 2017-11-13T14:29:20.094337265Z INFO Initialized prepared statement caches 
> with 10 MB (native) and 10 MB (Thrift)
> 2017-11-13T14:29:20.805946340Z INFO Initializing system.IndexInfo
> 2017-11-13T14:29:21.934686905Z INFO Initializing system.batches
> 2017-11-13T14:29:21.973914733Z INFO Initializing system.paxos
> 2017-11-13T14:29:21.994550268Z INFO Initializing system.local
> 2017-11-13T14:29:22.014097194Z INFO Initializing system.peers
> 2017-11-13T14:29:22.124211254Z INFO Initializing system.peer_events
> 2017-11-13T14:29:22.153966833Z INFO Initializing system.range_xfers
> 2017-11-13T14:29:22.174097334Z INFO Initializing system.compaction_history
> 2017-11-13T14:29:22.194259920Z INFO Initializing system.sstable_activity
> 2017-11-13T14:29:22.210178271Z INFO Initializing system.size_estimates
> 2017-11-13T14:29:22.223836992Z INFO Initializing system.available_ranges
> 2017-11-13T14:29:22.237854207Z INFO Initializing system.transferred_ranges
> 2017-11-13T14:29:22.253995621Z INFO Initializing 
> system.views_builds_in_progress
> 2017-11-13T14:29:22.264052481Z INFO Initializing system.built_views
> 2017-11-13T14:29:22.283334779Z INFO Initializing system.hints
> 2017-11-13T14:29:22.304110311Z INFO Initializing system.batchlog
> 2017-11-13T14:29:22.318031950Z INFO Initializing system.prepared_statements
> 2017-11-13T14:29:22.326547917Z INFO Initializing system.schema_keyspaces
> 2017-11-13T14:29:22.337097407Z INFO Initializing system.schema_columnfamilies
> 2017-11-13T14:29:22.354082675Z INFO Initializing system.schema_columns
> 2017-11-13T14:29:22.384179063Z INFO Initializing system.schema_triggers
> 2017-11-13T14:29:22.394222027Z INFO Initializing system.schema_usertypes
> 2017-11-13T14:29:22.414199833Z INFO Initializing system.schema_functions
> 2017-11-13T14:29:22.427205182Z INFO Initializing system.schema_aggregates
> 2017-11-13T14:29:22.427228345Z INFO Not submitting build tasks for views in 
> keyspace system as storage service is not initialized
> 2017-11-13T14:29:22.652838866Z INFO Scheduling approximate time-check task 
> with a precision of 10 milliseconds
> 2017-11-13T14:29:22.732862906Z INFO Initializing system_schema.keyspaces
> 2017-11-13T14:29:22.746598744Z INFO Initializing system_schema.tables
> 2017-11-13T14:29:22.759649011Z INFO Initializing system_schema.columns
> 2017-11-13T14:29:22.766245435Z INFO Initializing system_schema.triggers
> 2017-11-13T14:29:22.778716809Z INFO Initializing system_schema.dropped_columns
> 2017-11-13T14:29:22.791369819Z INFO Initializing system_schema.views
> 2017-11-13T14:29:22.839141724Z INFO Initializing system_schema.types
> 2017-11-13T14:29:22.852911976Z INFO Initializing system_schema.functions
> 2017-11-13T14:29:22.852938112Z INFO Initializing system_schema.aggregates
> 2017-11-13T14:29:22.869348526Z INFO Initializing system_schema.indexes
> 2017-11-13T14:29:22.874178682Z INFO Not submitting build tasks for views in 
> keyspace system_schema as storage service is not initialized
> 2017-11-13T14:29:23.700250435Z INFO Initializing key cache with capacity of 
> 25 MBs.
> 2017-11-13T14:29:23.724357053Z INFO Initializing row cache with capacity of 0 
> MBs
> 2017-11-13T14:29:23.724383599Z INFO Initializing counter cache with capacity 
> of 12 MBs
> 2017-11-13T14:29:23.724386906Z INFO Scheduling counter cache save to every 
> 7200 seconds (going to save all keys).
> 2017-11-13T14:29:23.984408710Z INFO Populating token metadata from system 
> tables
> 2017-11-13T14:29:24.032687075Z INFO Global buffer pool is enabled, when pool 
> is exhausted (max is 125.000MiB) it will allocate on heap
> 2017-11-13T14:29:24.214123695Z INFO Token metadata:
> 2017-11-13T14:29:24.304218769Z INFO Completed loading (14 ms; 8 keys) 
> KeyCache cache
> 2017-11-13T14:29:24.363978406Z INFO No commitlog files found; skipping replay
> 2017-11-13T14:29:24.364005238Z INFO Populating token metadata from system 
> tables
> 

[jira] [Updated] (CASSANDRA-14010) NullPointerException when creating keyspace

2017-11-29 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14010:
---
Fix Version/s: 4.x
   3.11.x

> NullPointerException when creating keyspace
> ---
>
> Key: CASSANDRA-14010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14010
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Jonathan Pellby
> Fix For: 3.11.x, 4.x
>
>
> We have a test environment where we drop and create keyspaces and tables 
> several times within a short time frame. Since upgrading from 3.11.0 to 
> 3.11.1, we are seeing a lot of create statements failing. See the logs below:
> {code:java}
> 2017-11-13T14:29:20.037986449Z WARN Directory /tmp/ramdisk/commitlog doesn't 
> exist
> 2017-11-13T14:29:20.038009590Z WARN Directory /tmp/ramdisk/saved_caches 
> doesn't exist
> 2017-11-13T14:29:20.094337265Z INFO Initialized prepared statement caches 
> with 10 MB (native) and 10 MB (Thrift)
> 2017-11-13T14:29:20.805946340Z INFO Initializing system.IndexInfo
> 2017-11-13T14:29:21.934686905Z INFO Initializing system.batches
> 2017-11-13T14:29:21.973914733Z INFO Initializing system.paxos
> 2017-11-13T14:29:21.994550268Z INFO Initializing system.local
> 2017-11-13T14:29:22.014097194Z INFO Initializing system.peers
> 2017-11-13T14:29:22.124211254Z INFO Initializing system.peer_events
> 2017-11-13T14:29:22.153966833Z INFO Initializing system.range_xfers
> 2017-11-13T14:29:22.174097334Z INFO Initializing system.compaction_history
> 2017-11-13T14:29:22.194259920Z INFO Initializing system.sstable_activity
> 2017-11-13T14:29:22.210178271Z INFO Initializing system.size_estimates
> 2017-11-13T14:29:22.223836992Z INFO Initializing system.available_ranges
> 2017-11-13T14:29:22.237854207Z INFO Initializing system.transferred_ranges
> 2017-11-13T14:29:22.253995621Z INFO Initializing 
> system.views_builds_in_progress
> 2017-11-13T14:29:22.264052481Z INFO Initializing system.built_views
> 2017-11-13T14:29:22.283334779Z INFO Initializing system.hints
> 2017-11-13T14:29:22.304110311Z INFO Initializing system.batchlog
> 2017-11-13T14:29:22.318031950Z INFO Initializing system.prepared_statements
> 2017-11-13T14:29:22.326547917Z INFO Initializing system.schema_keyspaces
> 2017-11-13T14:29:22.337097407Z INFO Initializing system.schema_columnfamilies
> 2017-11-13T14:29:22.354082675Z INFO Initializing system.schema_columns
> 2017-11-13T14:29:22.384179063Z INFO Initializing system.schema_triggers
> 2017-11-13T14:29:22.394222027Z INFO Initializing system.schema_usertypes
> 2017-11-13T14:29:22.414199833Z INFO Initializing system.schema_functions
> 2017-11-13T14:29:22.427205182Z INFO Initializing system.schema_aggregates
> 2017-11-13T14:29:22.427228345Z INFO Not submitting build tasks for views in 
> keyspace system as storage service is not initialized
> 2017-11-13T14:29:22.652838866Z INFO Scheduling approximate time-check task 
> with a precision of 10 milliseconds
> 2017-11-13T14:29:22.732862906Z INFO Initializing system_schema.keyspaces
> 2017-11-13T14:29:22.746598744Z INFO Initializing system_schema.tables
> 2017-11-13T14:29:22.759649011Z INFO Initializing system_schema.columns
> 2017-11-13T14:29:22.766245435Z INFO Initializing system_schema.triggers
> 2017-11-13T14:29:22.778716809Z INFO Initializing system_schema.dropped_columns
> 2017-11-13T14:29:22.791369819Z INFO Initializing system_schema.views
> 2017-11-13T14:29:22.839141724Z INFO Initializing system_schema.types
> 2017-11-13T14:29:22.852911976Z INFO Initializing system_schema.functions
> 2017-11-13T14:29:22.852938112Z INFO Initializing system_schema.aggregates
> 2017-11-13T14:29:22.869348526Z INFO Initializing system_schema.indexes
> 2017-11-13T14:29:22.874178682Z INFO Not submitting build tasks for views in 
> keyspace system_schema as storage service is not initialized
> 2017-11-13T14:29:23.700250435Z INFO Initializing key cache with capacity of 
> 25 MBs.
> 2017-11-13T14:29:23.724357053Z INFO Initializing row cache with capacity of 0 
> MBs
> 2017-11-13T14:29:23.724383599Z INFO Initializing counter cache with capacity 
> of 12 MBs
> 2017-11-13T14:29:23.724386906Z INFO Scheduling counter cache save to every 
> 7200 seconds (going to save all keys).
> 2017-11-13T14:29:23.984408710Z INFO Populating token metadata from system 
> tables
> 2017-11-13T14:29:24.032687075Z INFO Global buffer pool is enabled, when pool 
> is exhausted (max is 125.000MiB) it will allocate on heap
> 2017-11-13T14:29:24.214123695Z INFO Token metadata:
> 2017-11-13T14:29:24.304218769Z INFO Completed loading (14 ms; 8 keys) 
> KeyCache cache
> 2017-11-13T14:29:24.363978406Z INFO No commitlog files found; skipping replay
> 2017-11-13T14:29:24.364005238Z INFO Populating token metadata from system 
> tables
> 

[jira] [Commented] (CASSANDRA-14010) NullPointerException when creating keyspace

2017-11-29 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271446#comment-16271446
 ] 

Jeremiah Jordan commented on CASSANDRA-14010:
-

I just saw this on some tests today as well.  The issue seems to be that the 
drop is happening concurrently with tables being initialized:

{quote}
2017-11-13T14:29:40.566922871Z INFO Initializing my_keyspace.schema_version
2017-11-13T14:29:42.719380089Z INFO Drop Keyspace 'my_keyspace'
2017-11-13T14:29:43.124510221Z INFO Create new Keyspace: 
{quote}

> NullPointerException when creating keyspace
> ---
>
> Key: CASSANDRA-14010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14010
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Pellby
>
> We have a test environment where we drop and create keyspaces and tables 
> several times within a short time frame. Since upgrading from 3.11.0 to 
> 3.11.1, we are seeing a lot of create statements failing. See the logs below:
> {code:java}
> 2017-11-13T14:29:20.037986449Z WARN Directory /tmp/ramdisk/commitlog doesn't 
> exist
> 2017-11-13T14:29:20.038009590Z WARN Directory /tmp/ramdisk/saved_caches 
> doesn't exist
> 2017-11-13T14:29:20.094337265Z INFO Initialized prepared statement caches 
> with 10 MB (native) and 10 MB (Thrift)
> 2017-11-13T14:29:20.805946340Z INFO Initializing system.IndexInfo
> 2017-11-13T14:29:21.934686905Z INFO Initializing system.batches
> 2017-11-13T14:29:21.973914733Z INFO Initializing system.paxos
> 2017-11-13T14:29:21.994550268Z INFO Initializing system.local
> 2017-11-13T14:29:22.014097194Z INFO Initializing system.peers
> 2017-11-13T14:29:22.124211254Z INFO Initializing system.peer_events
> 2017-11-13T14:29:22.153966833Z INFO Initializing system.range_xfers
> 2017-11-13T14:29:22.174097334Z INFO Initializing system.compaction_history
> 2017-11-13T14:29:22.194259920Z INFO Initializing system.sstable_activity
> 2017-11-13T14:29:22.210178271Z INFO Initializing system.size_estimates
> 2017-11-13T14:29:22.223836992Z INFO Initializing system.available_ranges
> 2017-11-13T14:29:22.237854207Z INFO Initializing system.transferred_ranges
> 2017-11-13T14:29:22.253995621Z INFO Initializing 
> system.views_builds_in_progress
> 2017-11-13T14:29:22.264052481Z INFO Initializing system.built_views
> 2017-11-13T14:29:22.283334779Z INFO Initializing system.hints
> 2017-11-13T14:29:22.304110311Z INFO Initializing system.batchlog
> 2017-11-13T14:29:22.318031950Z INFO Initializing system.prepared_statements
> 2017-11-13T14:29:22.326547917Z INFO Initializing system.schema_keyspaces
> 2017-11-13T14:29:22.337097407Z INFO Initializing system.schema_columnfamilies
> 2017-11-13T14:29:22.354082675Z INFO Initializing system.schema_columns
> 2017-11-13T14:29:22.384179063Z INFO Initializing system.schema_triggers
> 2017-11-13T14:29:22.394222027Z INFO Initializing system.schema_usertypes
> 2017-11-13T14:29:22.414199833Z INFO Initializing system.schema_functions
> 2017-11-13T14:29:22.427205182Z INFO Initializing system.schema_aggregates
> 2017-11-13T14:29:22.427228345Z INFO Not submitting build tasks for views in 
> keyspace system as storage service is not initialized
> 2017-11-13T14:29:22.652838866Z INFO Scheduling approximate time-check task 
> with a precision of 10 milliseconds
> 2017-11-13T14:29:22.732862906Z INFO Initializing system_schema.keyspaces
> 2017-11-13T14:29:22.746598744Z INFO Initializing system_schema.tables
> 2017-11-13T14:29:22.759649011Z INFO Initializing system_schema.columns
> 2017-11-13T14:29:22.766245435Z INFO Initializing system_schema.triggers
> 2017-11-13T14:29:22.778716809Z INFO Initializing system_schema.dropped_columns
> 2017-11-13T14:29:22.791369819Z INFO Initializing system_schema.views
> 2017-11-13T14:29:22.839141724Z INFO Initializing system_schema.types
> 2017-11-13T14:29:22.852911976Z INFO Initializing system_schema.functions
> 2017-11-13T14:29:22.852938112Z INFO Initializing system_schema.aggregates
> 2017-11-13T14:29:22.869348526Z INFO Initializing system_schema.indexes
> 2017-11-13T14:29:22.874178682Z INFO Not submitting build tasks for views in 
> keyspace system_schema as storage service is not initialized
> 2017-11-13T14:29:23.700250435Z INFO Initializing key cache with capacity of 
> 25 MBs.
> 2017-11-13T14:29:23.724357053Z INFO Initializing row cache with capacity of 0 
> MBs
> 2017-11-13T14:29:23.724383599Z INFO Initializing counter cache with capacity 
> of 12 MBs
> 2017-11-13T14:29:23.724386906Z INFO Scheduling counter cache save to every 
> 7200 seconds (going to save all keys).
> 2017-11-13T14:29:23.984408710Z INFO Populating token metadata from system 
> tables
> 2017-11-13T14:29:24.032687075Z INFO Global buffer pool is enabled, when pool 
> is exhausted (max is 125.000MiB) it will allocate on heap
> 2017-11-13T14:29:24.214123695Z INFO Token 

[jira] [Commented] (CASSANDRA-12728) Handling partially written hint files

2017-11-29 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271255#comment-16271255
 ] 

Jeff Jirsa commented on CASSANDRA-12728:


[~alekiv] can you open a new JIRA and link it back to this one? It's possible 
that the original patch didn't consider 0 byte files (I don't have time to go 
back and look at the commit, and it was long enough ago that I've forgotten) - 
were all of your files 0 bytes?





> Handling partially written hint files
> -
>
> Key: CASSANDRA-12728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sharvanath Pathak
>Assignee: Garvit Juniwal
>  Labels: lhf
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 
> HintsDispatchExecutor.java:225 - Failed to dispatch hints file 
> d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_77]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> ... 15 common frames omitted
> {noformat}
> We've found out that the hint file was truncated because there was a hard 
> reboot around the time of last write to the file. I think we basically need 
> to handle partially written hint files. Also, the CRC file does not exist in 
> this case (probably because it crashed while writing the hints file). Maybe 
> ignoring and cleaning up such partially written hint files can be a way to 
> fix this?
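
A minimal sketch of the mitigation floated above (illustrative Python, not Cassandra's hint dispatch code; the ".crc32" companion-file naming is an assumption):

{code}
import os


def dispatchable_hint_files(hints_dir):
    # Only hand non-empty hint files that have a CRC companion to the
    # dispatcher; anything else is assumed partially written (e.g. cut off
    # by a hard reboot) and is cleaned up instead of failing dispatch.
    keep, removed = [], []
    for name in os.listdir(hints_dir):
        if not name.endswith('.hints'):
            continue
        path = os.path.join(hints_dir, name)
        crc_path = path[:-len('.hints')] + '.crc32'
        if os.path.getsize(path) > 0 and os.path.exists(crc_path):
            keep.append(path)
        else:
            removed.append(path)
            os.remove(path)
    return keep, removed
{code}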



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To 

[jira] [Updated] (CASSANDRA-14037) sstableloader_with_failing_2i_test - sstable_generation_loading_test.TestSSTableGenerationAndLoading fails: Expected [['k', 'idx']] ... but got [[u'k', u'idx', None]

2017-11-29 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-14037:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the review. Patch committed to cassandra-dtest as 
{{fc68a0de8d05082a0a78196695572ff2346179c4}}.

> sstableloader_with_failing_2i_test - 
> sstable_generation_loading_test.TestSSTableGenerationAndLoading fails: 
> Expected [['k', 'idx']] ... but got [[u'k', u'idx', None]]
> --
>
> Key: CASSANDRA-14037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14037
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Joel Knighton
>
> sstableloader_with_failing_2i_test - 
> sstable_generation_loading_test.TestSSTableGenerationAndLoading fails: 
> Expected [['k', 'idx']] ... but got [[u'k', u'idx', None]]
> Expected [['k', 'idx']] from SELECT * FROM system."IndexInfo" WHERE 
> table_name='k', but got [[u'k', u'idx', None]]
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-2My0fh
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> cassandra.cluster: WARNING: [control connection] Error connecting to 
> 127.0.0.1:
> Traceback (most recent call last):
>   File "cassandra/cluster.py", line 2781, in 
> cassandra.cluster.ControlConnection._reconnect_internal
> return self._try_connect(host)
>   File "cassandra/cluster.py", line 2803, in 
> cassandra.cluster.ControlConnection._try_connect
> connection = self._cluster.connection_factory(host.address, 
> is_control_connection=True)
>   File "cassandra/cluster.py", line 1195, in 
> cassandra.cluster.Cluster.connection_factory
> return self.connection_class.factory(address, self.connect_timeout, 
> *args, **kwargs)
>   File "cassandra/connection.py", line 332, in 
> cassandra.connection.Connection.factory
> conn = cls(host, *args, **kwargs)
>   File 
> "/home/cassandra/env/src/cassandra-driver/cassandra/io/asyncorereactor.py", 
> line 344, in __init__
> self._connect_socket()
>   File "cassandra/connection.py", line 371, in 
> cassandra.connection.Connection._connect_socket
> raise socket.error(sockerr.errno, "Tried connecting to %s. Last error: 
> %s" % ([a[4] for a in addresses], sockerr.strerror or sockerr))
> error: [Errno 111] Tried connecting to [('127.0.0.1', 9042)]. Last error: 
> Connection refused
> cassandra.cluster: WARNING: [control connection] Error connecting to 
> 127.0.0.1:
> Traceback (most recent call last):
>   File "cassandra/cluster.py", line 2781, in 
> cassandra.cluster.ControlConnection._reconnect_internal
> return self._try_connect(host)
>   File "cassandra/cluster.py", line 2803, in 
> cassandra.cluster.ControlConnection._try_connect
> connection = self._cluster.connection_factory(host.address, 
> is_control_connection=True)
>   File "cassandra/cluster.py", line 1195, in 
> cassandra.cluster.Cluster.connection_factory
> return self.connection_class.factory(address, self.connect_timeout, 
> *args, **kwargs)
>   File "cassandra/connection.py", line 332, in 
> cassandra.connection.Connection.factory
> conn = cls(host, *args, **kwargs)
>   File 
> "/home/cassandra/env/src/cassandra-driver/cassandra/io/asyncorereactor.py", 
> line 344, in __init__
> self._connect_socket()
>   File "cassandra/connection.py", line 371, in 
> cassandra.connection.Connection._connect_socket
> raise socket.error(sockerr.errno, "Tried connecting to %s. Last error: 
> %s" % ([a[4] for a in addresses], sockerr.strerror or sockerr))
> error: [Errno 111] Tried connecting to [('127.0.0.1', 9042)]. Last error: 
> Connection refused
> cassandra.cluster: WARNING: [control connection] Error connecting to 
> 127.0.0.1:
> Traceback (most recent call last):
>   File "cassandra/cluster.py", line 2781, in 
> cassandra.cluster.ControlConnection._reconnect_internal
> return self._try_connect(host)
>   File "cassandra/cluster.py", line 2803, in 
> cassandra.cluster.ControlConnection._try_connect
> connection = self._cluster.connection_factory(host.address, 
> is_control_connection=True)
>   File "cassandra/cluster.py", line 1195, in 
> cassandra.cluster.Cluster.connection_factory
> return self.connection_class.factory(address, self.connect_timeout, 
> *args, **kwargs)
>   File 

cassandra-dtest git commit: Expect value column in sstableloader_with_failing_2i_test when reading IndexInfo table

2017-11-29 Thread jkni
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 616f952f5 -> fc68a0de8


Expect value column in sstableloader_with_failing_2i_test when reading 
IndexInfo table

patch by Joel Knighton; reviewed by Alex Petrov for CASSANDRA-14037


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/fc68a0de
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/fc68a0de
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/fc68a0de

Branch: refs/heads/master
Commit: fc68a0de8d05082a0a78196695572ff2346179c4
Parents: 616f952
Author: Joel Knighton 
Authored: Tue Nov 28 23:47:04 2017 -0600
Committer: Joel Knighton 
Committed: Wed Nov 29 09:52:01 2017 -0600

--
 sstable_generation_loading_test.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/fc68a0de/sstable_generation_loading_test.py
--
diff --git a/sstable_generation_loading_test.py 
b/sstable_generation_loading_test.py
index b45b338..335f384 100644
--- a/sstable_generation_loading_test.py
+++ b/sstable_generation_loading_test.py
@@ -330,7 +330,7 @@ class 
TestSSTableGenerationAndLoading(BaseSStableLoaderTest):
 create_schema_with_2i(session)
 
 # The table should exist and be empty, and the index should be empty 
and marked as built
-assert_one(session, """SELECT * FROM system."IndexInfo" WHERE 
table_name='k'""", ['k', 'idx'])
+assert_one(session, """SELECT * FROM system."IndexInfo" WHERE 
table_name='k'""", ['k', 'idx', None])
 assert_none(session, "SELECT * FROM k.t")
 assert_none(session, "SELECT * FROM k.t WHERE v = 8")
 


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14074) Remove "OpenJDK is not recommended" Startup Warning

2017-11-29 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271073#comment-16271073
 ] 

Eric Evans commented on CASSANDRA-14074:


+1

> Remove "OpenJDK is not recommended" Startup Warning
> ---
>
> Key: CASSANDRA-14074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14074
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Michael Kjellman
>  Labels: lhf
>
> We should remove the following warning on C* startup that OpenJDK is not 
> recommended. Now that with JDK8 OpenJDK is the reference JVM implementation 
> and things are much more stable -- and that all of our tests run on OpenJDK 
> builds due to the Oracle JDK license, this warning isn't helpful and is 
> actually wrong and we should remove it to prevent any user confusion.
> WARN  [main] 2017-11-28 19:39:08,446 StartupChecks.java:202 - OpenJDK is not 
> recommended. Please upgrade to the newest Oracle Java release



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14037) sstableloader_with_failing_2i_test - sstable_generation_loading_test.TestSSTableGenerationAndLoading fails: Expected [['k', 'idx']] ... but got [[u'k', u'idx', Non

2017-11-29 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270792#comment-16270792
 ] 

Alex Petrov commented on CASSANDRA-14037:
-

+1 [~jkni], I had the same patch locally but I didn't get to submit it. Thank 
you!

> sstableloader_with_failing_2i_test - 
> sstable_generation_loading_test.TestSSTableGenerationAndLoading fails: 
> Expected [['k', 'idx']] ... but got [[u'k', u'idx', None]]
> --
>
> Key: CASSANDRA-14037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14037
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Joel Knighton
>
> sstableloader_with_failing_2i_test - 
> sstable_generation_loading_test.TestSSTableGenerationAndLoading fails: 
> Expected [['k', 'idx']] ... but got [[u'k', u'idx', None]]
> Expected [['k', 'idx']] from SELECT * FROM system."IndexInfo" WHERE 
> table_name='k', but got [[u'k', u'idx', None]]
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-2My0fh
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> cassandra.cluster: WARNING: [control connection] Error connecting to 
> 127.0.0.1:
> Traceback (most recent call last):
>   File "cassandra/cluster.py", line 2781, in 
> cassandra.cluster.ControlConnection._reconnect_internal
> return self._try_connect(host)
>   File "cassandra/cluster.py", line 2803, in 
> cassandra.cluster.ControlConnection._try_connect
> connection = self._cluster.connection_factory(host.address, 
> is_control_connection=True)
>   File "cassandra/cluster.py", line 1195, in 
> cassandra.cluster.Cluster.connection_factory
> return self.connection_class.factory(address, self.connect_timeout, 
> *args, **kwargs)
>   File "cassandra/connection.py", line 332, in 
> cassandra.connection.Connection.factory
> conn = cls(host, *args, **kwargs)
>   File 
> "/home/cassandra/env/src/cassandra-driver/cassandra/io/asyncorereactor.py", 
> line 344, in __init__
> self._connect_socket()
>   File "cassandra/connection.py", line 371, in 
> cassandra.connection.Connection._connect_socket
> raise socket.error(sockerr.errno, "Tried connecting to %s. Last error: 
> %s" % ([a[4] for a in addresses], sockerr.strerror or sockerr))
> error: [Errno 111] Tried connecting to [('127.0.0.1', 9042)]. Last error: 
> Connection refused
> cassandra.cluster: WARNING: [control connection] Error connecting to 
> 127.0.0.1:
> Traceback (most recent call last):
>   File "cassandra/cluster.py", line 2781, in 
> cassandra.cluster.ControlConnection._reconnect_internal
> return self._try_connect(host)
>   File "cassandra/cluster.py", line 2803, in 
> cassandra.cluster.ControlConnection._try_connect
> connection = self._cluster.connection_factory(host.address, 
> is_control_connection=True)
>   File "cassandra/cluster.py", line 1195, in 
> cassandra.cluster.Cluster.connection_factory
> return self.connection_class.factory(address, self.connect_timeout, 
> *args, **kwargs)
>   File "cassandra/connection.py", line 332, in 
> cassandra.connection.Connection.factory
> conn = cls(host, *args, **kwargs)
>   File 
> "/home/cassandra/env/src/cassandra-driver/cassandra/io/asyncorereactor.py", 
> line 344, in __init__
> self._connect_socket()
>   File "cassandra/connection.py", line 371, in 
> cassandra.connection.Connection._connect_socket
> raise socket.error(sockerr.errno, "Tried connecting to %s. Last error: 
> %s" % ([a[4] for a in addresses], sockerr.strerror or sockerr))
> error: [Errno 111] Tried connecting to [('127.0.0.1', 9042)]. Last error: 
> Connection refused
> cassandra.cluster: WARNING: [control connection] Error connecting to 
> 127.0.0.1:
> Traceback (most recent call last):
>   File "cassandra/cluster.py", line 2781, in 
> cassandra.cluster.ControlConnection._reconnect_internal
> return self._try_connect(host)
>   File "cassandra/cluster.py", line 2803, in 
> cassandra.cluster.ControlConnection._try_connect
> connection = self._cluster.connection_factory(host.address, 
> is_control_connection=True)
>   File "cassandra/cluster.py", line 1195, in 
> cassandra.cluster.Cluster.connection_factory
> return self.connection_class.factory(address, self.connect_timeout, 
> *args, **kwargs)
>   File "cassandra/connection.py", line 332, in 
> 

[jira] [Commented] (CASSANDRA-13873) Ref bug in Scrub

2017-11-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270733#comment-16270733
 ] 

Marcus Eriksson commented on CASSANDRA-13873:
-

ping [~jkni] - should we port the trivial patch to trunk as well and get this 
in? We could improve it in another ticket I guess

> Ref bug in Scrub
> 
>
> Key: CASSANDRA-13873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13873
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: T Jake Luciani
>Assignee: Joel Knighton
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> I'm hitting a Ref bug when many scrubs run against a node.  This doesn't 
> happen on 3.0.X.  I'm not sure whether or not this happens with compactions too, 
> but I suspect it does.
> I'm not seeing any Ref leaks or double frees.
> To Reproduce:
> {quote}
> ./tools/bin/cassandra-stress write n=10m -rate threads=100
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> {quote}
> Eventually in the logs you get:
> WARN  [RMI TCP Connection(4)-127.0.0.1] 2017-09-14 15:51:26,722 
> NoSpamLogger.java:97 - Spinning trying to capture readers 
> [BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-5-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-32-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-31-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-29-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-27-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-26-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-20-big-Data.db')],
> *released: 
> [BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-5-big-Data.db')],*
>  
> This released table has a selfRef of 0 but is in the Tracker



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-3200) Repair: compare all trees together (for a given range/cf) instead of by pair in isolation

2017-11-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270669#comment-16270669
 ] 

Marcus Eriksson commented on CASSANDRA-3200:


pushed another commit with the review fixes here: 
https://github.com/krummas/cassandra/commits/marcuse/CASSANDRA-3200
tests ran here: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/440/
 - looks like one of the new tests failed but it passes locally and it looks 
like an environment issue. Rerunning 
[here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/441/]
 to make sure

dtest branch is: 
https://github.com/krummas/cassandra-dtest/commits/marcuse/mt_calcs

> Repair: compare all trees together (for a given range/cf) instead of by pair 
> in isolation
> -
>
> Key: CASSANDRA-3200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3200
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: repair
> Fix For: 4.x
>
>
> Currently, repair compares merkle trees pair by pair, in isolation from any other 
> tree. What that means concretely is that if I have three nodes A, B and C 
> (RF=3) with A and B in sync, but C having some range r inconsistent with both 
> A and B (since those are consistent), we will do the following transfers of r: 
> A -> C, C -> A, B -> C, C -> B.
> The fact that we do both A -> C and C -> A is fine, because we cannot know 
> whether A or C is more up to date. However, the transfer B -> C is 
> useless provided we do A -> C, since A and B are in sync. Not doing that transfer 
> would be a 25% improvement in that case. With RF=5 and only one node 
> inconsistent with all the others, that's almost a 40% improvement, etc...
> Given that this situation of one node not in sync while the others are is 
> probably fairly common (one node died so it is behind), this could be a fair 
> improvement over what is transferred. In the case where we use repair to 
> completely rebuild a node, this will be a dramatic improvement, because it 
> will avoid the rebuilt node getting RF times the data it should get.
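
To make the arithmetic above concrete, a back-of-the-envelope sketch (plain Python, not repair code) for the single-stale-replica scenario described in the ticket:

{code}
def transfer_counts(rf):
    """One replica C is inconsistent with the other rf-1 replicas, which are
    all in sync with each other.

    Pairwise comparison (current): every inconsistent pair streams both ways.
    Grouped comparison (proposed): C still streams to every other replica
    (we cannot tell which side is newer), but only one member of the in-sync
    group needs to stream to C.
    """
    inconsistent_pairs = rf - 1
    pairwise = 2 * inconsistent_pairs   # RF=3 -> 4 transfers, RF=5 -> 8
    grouped = (rf - 1) + 1              # C -> each other node, plus one -> C
    saving = 1.0 - float(grouped) / pairwise
    return pairwise, grouped, saving


for rf in (3, 5):
    p, g, s = transfer_counts(rf)
    print("RF=%d: pairwise=%d grouped=%d saving=%.0f%%" % (rf, p, g, s * 100))
# RF=3: pairwise=4 grouped=3 saving=25%
# RF=5: pairwise=8 grouped=5 saving=38%  (the "almost 40%" in the description)
{code}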



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Resolved] (CASSANDRA-14077) SAP BI Open Doc URL for retrieving pdf

2017-11-29 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp resolved CASSANDRA-14077.
--
Resolution: Invalid

This is the OSS Cassandra JIRA and not a support or development service for SAP 
BI.

> SAP BI Open Doc URL for retrieving pdf
> --
>
> Key: CASSANDRA-14077
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14077
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: emma blisa
>
> In a reporting application we use, we were using the BI 3.x API to produce Web 
> reports. While migrating to the 4.x version, we thought it would be 
> fine to go with the OpenDoc URL rather than doing the report generation through 
> the API.
> Many of the samples I have seen use sIDType and iDocID parameters along with 
> a Token value to retrieve the document by constructing a URL like below
>  
> http://server:port/BOE/OpenDocument/opendoc/openDocument.jsp?token=[LogonToken]=[]=CUID
> But all those URLs get an HTML page as the response from the BI 4.x SAP webservice; the 
> JavaScript in that HTML page does the task of retrieving the pdf file.
> I am just wondering if there is any way I could retrieve the pdf report as the 
> response from the BI Webservice directly? Please assist me on this. Thanks
> https://mindmajix.com/sap-bi-training



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-14077) SAP BI Open Doc URL for retrieving pdf

2017-11-29 Thread emma blisa (JIRA)
emma blisa created CASSANDRA-14077:
--

 Summary: SAP BI Open Doc URL for retrieving pdf
 Key: CASSANDRA-14077
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14077
 Project: Cassandra
  Issue Type: Improvement
Reporter: emma blisa


In a reporting application we use, we were using the BI 3.x API to produce Web 
reports. While migrating to the 4.x version, we thought it would be 
fine to go with the OpenDoc URL rather than doing the report generation through 
the API.
Many of the samples I have seen use sIDType and iDocID parameters along with 
a Token value to retrieve the document by constructing a URL like below

 
http://server:port/BOE/OpenDocument/opendoc/openDocument.jsp?token=[LogonToken]=[]=CUID

But all those URLs get an HTML page as the response from the BI 4.x SAP webservice; the 
JavaScript in that HTML page does the task of retrieving the pdf file.
I am just wondering if there is any way I could retrieve the pdf report as the 
response from the BI Webservice directly? Please assist me on this. Thanks
https://mindmajix.com/sap-bi-training




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14074) Remove "OpenJDK is not recommended" Startup Warning

2017-11-29 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-14074:
---
Labels: lhf  (was: )

> Remove "OpenJDK is not recommended" Startup Warning
> ---
>
> Key: CASSANDRA-14074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14074
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Michael Kjellman
>  Labels: lhf
>
> We should remove the following warning on C* startup that OpenJDK is not 
> recommended. With JDK 8, OpenJDK is now the reference JVM implementation and is 
> much more stable, and all of our tests already run on OpenJDK builds because of 
> the Oracle JDK license. The warning isn't helpful and is actually wrong, so we 
> should remove it to prevent user confusion.
> WARN  [main] 2017-11-28 19:39:08,446 StartupChecks.java:202 - OpenJDK is not 
> recommended. Please upgrade to the newest Oracle Java release
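
For illustration only: a minimal, hypothetical sketch of the kind of JVM-vendor startup check a warning like this typically comes from (the class and method names below are assumptions, not the actual {{StartupChecks}} code); removing the warning amounts to deleting the {{logger.warn}} branch.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch, not Cassandra's actual StartupChecks implementation.
public class JvmVendorCheckSketch
{
    private static final Logger logger = LoggerFactory.getLogger(JvmVendorCheckSketch.class);

    public static void checkJvmVendor()
    {
        String vmName = System.getProperty("java.vm.name", "");
        if (vmName.contains("OpenJDK"))
        {
            // This is the branch the ticket proposes to remove: with JDK 8,
            // OpenJDK is the reference implementation, so the advice below is obsolete.
            logger.warn("OpenJDK is not recommended. Please upgrade to the newest Oracle Java release");
        }
    }
}
{code}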






[jira] [Updated] (CASSANDRA-14071) Materialized view on table with TTL issue

2017-11-29 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14071:

Reviewer: Paulo Motta

> Materialized view on table with TTL issue
> -
>
> Key: CASSANDRA-14071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination, Materialized Views
> Environment: Cassandra 3
>Reporter: Silviu Butnariu
>Assignee: ZhaoYang
>  Labels: correctness
>
> Materialized views that cluster by a column that is not part of the table's PK 
> and are created from tables that have *default_time_to_live* seem to 
> malfunction.
> Having this table
> {code:java}
> CREATE TABLE sbutnariu.test_bug (
> field1 smallint,
> field2 smallint,
> date timestamp,
> PRIMARY KEY ((field1), field2)
> ) WITH default_time_to_live = 1000;
> {code}
> and the materialized view
> {code:java}
> CREATE MATERIALIZED VIEW sbutnariu.test_bug_by_date AS SELECT * FROM 
> sbutnariu.test_bug WHERE field1 IS NOT NULL AND field2 IS NOT NULL AND date 
> IS NOT NULL PRIMARY KEY ((field1), date, field2) WITH CLUSTERING ORDER BY 
> (date desc, field2 asc);
> {code}
> After inserting 3 rows with the same PK (which should upsert), the materialized 
> view will have 3 rows.
> {code:java}
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> select * from sbutnariu.test_bug; /*1 row*/
> select * from sbutnariu.test_bug_by_date;/*3 rows*/
> {code}
> If I remove the ttl and try again, it works as expected:
> {code:java}
> truncate sbutnariu.test_bug;
> alter table sbutnariu.test_bug with default_time_to_live = 0;
> select * from sbutnariu.test_bug; /*1 row*/
> select * from sbutnariu.test_bug_by_date;/*1 row*/
> {code}
> I've tested on versions 3.0.14 and 3.0.15. The bug was introduced in 3.0.15, 
> as in 3.0.14 it works as expected.






[jira] [Updated] (CASSANDRA-14071) Materialized view on table with TTL issue

2017-11-29 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-14071:
-
Status: Patch Available  (was: Open)

> Materialized view on table with TTL issue
> -
>
> Key: CASSANDRA-14071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination, Materialized Views
> Environment: Cassandra 3
>Reporter: Silviu Butnariu
>Assignee: ZhaoYang
>  Labels: correctness
>
> Materialized views that cluster by a column that is not part of the table's PK 
> and are created from tables that have *default_time_to_live* seem to 
> malfunction.
> Having this table
> {code:java}
> CREATE TABLE sbutnariu.test_bug (
> field1 smallint,
> field2 smallint,
> date timestamp,
> PRIMARY KEY ((field1), field2)
> ) WITH default_time_to_live = 1000;
> {code}
> and the materialized view
> {code:java}
> CREATE MATERIALIZED VIEW sbutnariu.test_bug_by_date AS SELECT * FROM 
> sbutnariu.test_bug WHERE field1 IS NOT NULL AND field2 IS NOT NULL AND date 
> IS NOT NULL PRIMARY KEY ((field1), date, field2) WITH CLUSTERING ORDER BY 
> (date desc, field2 asc);
> {code}
> After inserting 3 rows with the same PK (which should upsert), the materialized 
> view will have 3 rows.
> {code:java}
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> select * from sbutnariu.test_bug; /*1 row*/
> select * from sbutnariu.test_bug_by_date;/*3 rows*/
> {code}
> If I remove the ttl and try again, it works as expected:
> {code:java}
> truncate sbutnariu.test_bug;
> alter table sbutnariu.test_bug with default_time_to_live = 0;
> select * from sbutnariu.test_bug; /*1 row*/
> select * from sbutnariu.test_bug_by_date;/*1 row*/
> {code}
> I've tested on versions 3.0.14 and 3.0.15. The bug was introduced in 3.0.15, 
> as in 3.0.14 it works as expected.






[jira] [Commented] (CASSANDRA-14071) Materialized view on table with TTL issue

2017-11-29 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270458#comment-16270458
 ] 

ZhaoYang commented on CASSANDRA-14071:
--

| source | utest | dtest |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-14071] | [3.0|https://circleci.com/gh/jasonstack/cassandra/658] | failures not related |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-14071-3.11] | [3.11|https://circleci.com/gh/jasonstack/cassandra/656] | running |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-14071-trunk] | [trunk|https://circleci.com/gh/jasonstack/cassandra/657] | failures not related |
| [dtest|https://github.com/apache/cassandra-dtest/compare/master...jasonstack:CASSANDRA-14071?expand=1] |

{code}
Changes:
1. Added ExpiredLivenessInfo as a subclass of ExpiringLivenessInfo; it is always
{{expired}} (acts as a tombstone) regardless of the {{nowInSecs}} local time.
2. Added an additional check in {{LivenessInfo.supersedes()}}: when timestamps tie,
an ExpiredLivenessInfo (identified by ttl == MAX) supersedes a
non-ExpiredLivenessInfo.
{code}
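
A rough sketch of the tie-break idea, assuming a sentinel TTL marks the expired info; the field and method names here are illustrative and not necessarily those of the actual patch:

{code:java}
// Illustrative sketch only; names are assumptions, not the actual patch.
class LivenessInfoSketch
{
    // Sentinel TTL marking an "expired" liveness info that acts as a tombstone.
    static final int EXPIRED_LIVENESS_TTL = Integer.MAX_VALUE;

    final long timestamp;
    final int ttl;

    LivenessInfoSketch(long timestamp, int ttl)
    {
        this.timestamp = timestamp;
        this.ttl = ttl;
    }

    boolean isExpired()
    {
        return ttl == EXPIRED_LIVENESS_TTL;
    }

    boolean supersedes(LivenessInfoSketch other)
    {
        // Higher timestamp always wins.
        if (timestamp != other.timestamp)
            return timestamp > other.timestamp;
        // On a timestamp tie, the expired info shadows the live one, so the
        // view row it is meant to delete does not resurface.
        return isExpired() && !other.isExpired();
    }
}
{code}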

[~pauloricardomg] what do you think?

> Materialized view on table with TTL issue
> -
>
> Key: CASSANDRA-14071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination, Materialized Views
> Environment: Cassandra 3
>Reporter: Silviu Butnariu
>Assignee: ZhaoYang
>  Labels: correctness
>
> Materialized views that cluster by a column that is not part of the table's PK 
> and are created from tables that have *default_time_to_live* seem to 
> malfunction.
> Having this table
> {code:java}
> CREATE TABLE sbutnariu.test_bug (
> field1 smallint,
> field2 smallint,
> date timestamp,
> PRIMARY KEY ((field1), field2)
> ) WITH default_time_to_live = 1000;
> {code}
> and the materialized view
> {code:java}
> CREATE MATERIALIZED VIEW sbutnariu.test_bug_by_date AS SELECT * FROM 
> sbutnariu.test_bug WHERE field1 IS NOT NULL AND field2 IS NOT NULL AND date 
> IS NOT NULL PRIMARY KEY ((field1), date, field2) WITH CLUSTERING ORDER BY 
> (date desc, field2 asc);
> {code}
> After inserting 3 rows with the same PK (which should upsert), the materialized 
> view will have 3 rows.
> {code:java}
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> insert into sbutnariu.test_bug(field1, field2, date) values (1, 2, 
> toTimestamp(now()));
> select * from sbutnariu.test_bug; /*1 row*/
> select * from sbutnariu.test_bug_by_date;/*3 rows*/
> {code}
> If I remove the ttl and try again, it works as expected:
> {code:java}
> truncate sbutnariu.test_bug;
> alter table sbutnariu.test_bug with default_time_to_live = 0;
> select * from sbutnariu.test_bug; /*1 row*/
> select * from sbutnariu.test_bug_by_date;/*1 row*/
> {code}
> I've tested on versions 3.0.14 and 3.0.15. The bug was introduced in 3.0.15, 
> as in 3.0.14 it works as expected.


