Re: Does Java driver v3.1.x degrade cluster connect/close performance?

2017-03-06 Thread Satoshi Hikida
Hi Matija, Andrew

Thank you for your reply.

Matija:
> Do you plan to misuse it and create a new cluster object and open a new
connection for each request?
No, my app never creates a new cluster for each request. However, each of its
unit tests creates a new cluster and closes it every time.
Of course I can change the tests to create and close the cluster just once (or
a few times). But I just wondered why the connect/close
performance degrades when I update the driver version.
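For what it's worth, here is a minimal sketch of the "create and close the cluster only once" variant for a test class, assuming JUnit 4 and the DataStax Java driver (class name, contact point, and query are illustrative):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

public class ReusedClusterTest {
    private static Cluster cluster;
    private static Session session;

    @BeforeClass
    public static void connectOnce() {
        // One Cluster/Session pair for the whole class instead of one per test.
        cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        session = cluster.connect();
    }

    @Test
    public void readsSystemLocal() {
        session.execute("SELECT * FROM system.local;");
    }

    @AfterClass
    public static void closeOnce() {
        cluster.close(); // closing the Cluster also closes its Sessions
    }
}
```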


Andrew:
Thanks for the information about the driver's mailing list. I'll use it next time.

> One correction on my previous email, at 2.1.8 of the driver, Netty 4.0
was in use, so please disregard my comments about the netty dependency
changing from 3.9 to 4.0; there is a difference in version, but it's only at
the patch level (4.0.27 to 4.0.37)
Does your comment mean that Cluster#close also takes at least 2 seconds with
v2.1.8 of the driver? If so, that is strange, because the response time of
Cluster#close was around 20ms with v2.1.8 of the driver in my test.

> I'd be interested to see if running the same test in your environment
creates different results.
I'll run the test in my test environment and share the result. Thank you
again.

Regards,
Satoshi

On Tue, Mar 7, 2017 at 12:38 AM, Andrew Tolbert  wrote:

> One correction on my previous email, at 2.1.8 of the driver, Netty 4.0 was
> in use, so please disregard my comments about the netty dependency changing
> from 3.9 to 4.0; there is a difference in version, but it's only at the
> patch level (4.0.27 to 4.0.37)
>
> Just to double check, I reran that connection initialization test (source
> )
> where I got my previous numbers from (as that was from nearly 2 years ago)
> and compared driver version 2.1.8 against 3.1.3.  I first ran against a
> single node that is located in California, where my client is in Minnesota,
> so roundtrip latency is a factor:
>
> v2.1.8:
>
> Single attempt took 1837ms.
>
> 10 warmup iterations (first 10 attempts discarded), 100 trials
>
>
> -- Timers 
> --
> connectTimer
>  count = 100
>min = 458.40 milliseconds
>max = 769.43 milliseconds
>   mean = 493.45 milliseconds
> stddev = 38.54 milliseconds
> median = 488.38 milliseconds
>   75% <= 495.71 milliseconds
>   95% <= 514.73 milliseconds
>   98% <= 724.05 milliseconds
>   99% <= 769.02 milliseconds
> 99.9% <= 769.43 milliseconds
>
> v3.1.3:
>
> Single attempt took 1781ms.
>
> 10 warmup iterations (first 10 attempts discarded), 100 trials
>
>  -- Timers 
> --
> connectTimer
>  count = 100
>min = 457.32 milliseconds
>max = 539.77 milliseconds
>   mean = 485.68 milliseconds
> stddev = 10.76 milliseconds
> median = 485.52 milliseconds
>   75% <= 490.39 milliseconds
>   95% <= 499.83 milliseconds
>   98% <= 511.52 milliseconds
>   99% <= 535.56 milliseconds
> 99.9% <= 539.77 milliseconds
>
> As you can see, at least for this test, initialization times are pretty
> much identical.
>
> I ran another set of trials using a local C* node (running on same host as
> client) to limit the impact of round trip time:
>
> v2.1.8:
>
> Single attempt took 477ms.
>
> 10 warmup iterations, 100 trials
>
> -- Timers 
> --
> connectTimer
>  count = 100
>min = 2.38 milliseconds
>max = 32.69 milliseconds
>   mean = 3.79 milliseconds
> stddev = 3.49 milliseconds
> median = 3.05 milliseconds
>   75% <= 3.49 milliseconds
>   95% <= 6.05 milliseconds
>   98% <= 19.55 milliseconds
>   99% <= 32.56 milliseconds
> 99.9% <= 32.69 milliseconds
>
> v3.1.3:
>
> Single attempt took 516ms.
>
> -- Timers 
> --
> connectTimer
>  count = 100
>min = 1.67 milliseconds
>max = 8.03 milliseconds
>   mean = 3.00 milliseconds
> stddev = 0.97 milliseconds
> median = 2.85 milliseconds
>   75% <= 3.10 milliseconds
>   95% <= 4.01 milliseconds
>   98% <= 6.55 milliseconds
>   99% <= 7.93 milliseconds
> 99.9% <= 8.03 milliseconds
>
> Similarly, when using a local C* node, initialization times are pretty
> similar.
>
> I'd be interested to see if running the same test in your environment
> creates different results.

Changed node ID?

2017-03-06 Thread Joe Olson

I have a 9-node cluster that I had shut down (Cassandra stopped on all nodes,
all nodes shut down) and just tried to start back up. I have done this several
times successfully. However, on this attempt, one of the nodes failed to join
the cluster. Upon inspection of /var/log/cassandra/system.log, I found the
following:

WARN [GossipStage:1] 2017-03-06 21:06:36,648 TokenMetadata.java:252 - Changing 
/192.168.211.82's host ID from cff3ef25-9a47-4ea4-9519-b85d20bef3ee to 
59f2da9f-0b85-452f-b61a-fa990de53e4b 

further down: 

ERROR [main] 2017-03-06 21:20:14,718 CassandraDaemon.java:747 - Exception 
encountered during startup 
java.lang.RuntimeException: A node with address /192.168.211.82 already exists, 
cancelling join. Use cassandra.replace_address if you want to replace this 
node. 
at 
org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:491)
 ~[apache-cassandra-3.9.0.jar:3.9.0] 
at 
org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:778)
 ~[apache-cassandra-3.9.0.jar:3.9.0] 
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:648) 
~[apache-cassandra-3.9.0.jar:3.9.0] 
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:548) 
~[apache-cassandra-3.9.0.jar:3.9.0] 
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:385) 
[apache-cassandra-3.9.0.jar:3.9.0] 
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:601) 
[apache-cassandra-3.9.0.jar:3.9.0] 
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:730) 
[apache-cassandra-3.9.0.jar:3.9.0] 

nodetool status: 

UN 192.168.211.88 2.58 TiB 256 32.0% 9de2d3ef-5ae1-4c7f-8560-730757a6d1ae rack1 
UN 192.168.211.80 2.26 TiB 256 33.9% d83829d3-a1d3-4e6c-b014-7cfe45e22d67 rack1 
UN 192.168.211.81 2.91 TiB 256 34.1% 0cafd24e-d3ed-4e51-b586-0b496835a931 rack1 
DN 192.168.211.82 551.45 KiB 256 31.9% 59f2da9f-0b85-452f-b61a-fa990de53e4b rack1
UN 192.168.211.83 2.32 TiB 256 32.7% db006e31-03fa-486a-8512-f88eb583bd0c rack1 
UN 192.168.211.84 2.54 TiB 256 34.3% a9a50a74-2fc2-4866-a03a-ec95a7866183 rack1 
UN 192.168.211.85 2.4 TiB 256 35.9% 733e6703-c18f-432f-a787-3731f80ba42d rack1 
UN 192.168.211.86 2.34 TiB 256 32.1% 0daa06fa-708f-4ff8-a15e-861f1a53113a rack1 
UN 192.168.211.87 4.07 TiB 256 33.1% 2aa578c6-1332-4b94-81c6-c3ce005a52ef rack1 

My questions:
1. Why did the host ID change?
2. If I modify cassandra-env.sh to include
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=192.168.211.82", will I recover
the data on the original node? It is still on the node's hard drive. I really
don't want to have to restream 2.6 TB of data onto a "new" node.





Is it possible to recover a deleted-in-future record?

2017-03-06 Thread Michael Fong
Hi, all,


We recently encountered an issue in production where some records were
mysteriously deleted with a timestamp 100+ years in the future. Everything is
normal as of now; how the deletion happened, and whether the system timestamp
was accurate at that moment, are unknown. We were wondering if there is a
general way to recover the mysteriously-deleted data when the timestamp
metadata is screwed up.
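As background on the mechanism being described (not a recovery recipe), here is a minimal sketch with the DataStax Java driver, assuming an illustrative table ks.t: a delete issued with a far-future USING TIMESTAMP shadows every write carrying a lower timestamp, and a surviving cell's write timestamp can be inspected with WRITETIME().

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class FutureTombstoneSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {

            // Inspect the write timestamp (microseconds since the epoch) of a cell.
            Row row = session.execute(
                "SELECT val, WRITETIME(val) AS ts FROM ks.t WHERE id = 1").one();
            if (row != null) {
                System.out.println("writetime(val) = " + row.getLong("ts"));
            }

            // A delete like this carries a far-future timestamp (the value shown is
            // illustrative, roughly the year 2100). It shadows every write whose
            // timestamp is lower, so ordinary re-inserts stay invisible until they
            // are written with an even higher timestamp.
            session.execute(
                "DELETE FROM ks.t USING TIMESTAMP 4102444800000000 WHERE id = 1");
        }
    }
}
```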

Thanks in advance,

Regards,

Michael Fong


Re: Attached profiled data but need help understanding it

2017-03-06 Thread Romain Hardouin
Hi Kant,
You'll find more information about ixgbevf here:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/sriov-networking.html
I repeat myself, but don't underestimate VM placement: same AZ? same placement
group? etc. Note that LWT are not discouraged, but as the doc says: "[...]
reserve lightweight transactions for those situations where they are absolutely
necessary;" I hope you'll be able to achieve what you want with more powerful
VMs. Let us know!

Best,
Romain
 

On Monday, 6 March 2017 at 10:49, Kant Kodali wrote:

Hi Romain,
We may be able to achieve what we need without LWT, but that would require a
bunch of changes on the application side and possibly introducing caching
layers and designing the solution around that. But for now, we are constrained
to use LWTs for another month or so. That said, I still would like to see the
discouraged features such as LWTs, secondary indexes, and triggers get better
over time, so they would really benefit users.

Agreed, high park/unpark is a sign of excessive context switching, but any
ideas why this is happening? Yes, today we will be experimenting with
c3.2xlarge, see what the numbers look like, and slowly scale up from there.

How do I make sure I install the ixgbevf driver? Do m4.xlarge or c3.2xlarge
instances not already have it? When I googled "ixgbevf driver" it tells me it
is an Ethernet driver... I thought all instances by default run on Ethernet on
AWS. Can you please give more context on this?

Thanks,
kant
On Fri, Mar 3, 2017 at 4:42 AM, Romain Hardouin  wrote:

Also, I should have mentioned that it would be a good idea to spawn your three
benchmark instances in the same AZ, then try with one instance in each AZ to
see how network latency affects your LWT rate. The lowest latency is achievable
with three instances in the same placement group, of course, but it's kinda
dangerous for production.





   

Re: Does Java driver v3.1.x degrade cluster connect/close performance?

2017-03-06 Thread Andrew Tolbert
One correction on my previous email: at 2.1.8 of the driver, Netty 4.0 was
in use, so please disregard my comments about the netty dependency changing
from 3.9 to 4.0; there is a difference in version, but it's only at the
patch level (4.0.27 to 4.0.37).

Just to double check, I reran that connection initialization test (source
) where
I got my previous numbers from (as that was from nearly 2 years ago) and
compared driver version 2.1.8 against 3.1.3.  I first ran against a single
node that is located in California, where my client is in Minnesota, so
roundtrip latency is a factor:

v2.1.8:

Single attempt took 1837ms.

10 warmup iterations (first 10 attempts discarded), 100 trials


-- Timers
--
connectTimer
 count = 100
   min = 458.40 milliseconds
   max = 769.43 milliseconds
  mean = 493.45 milliseconds
stddev = 38.54 milliseconds
median = 488.38 milliseconds
  75% <= 495.71 milliseconds
  95% <= 514.73 milliseconds
  98% <= 724.05 milliseconds
  99% <= 769.02 milliseconds
99.9% <= 769.43 milliseconds

v3.1.3:

Single attempt took 1781ms.

10 warmup iterations (first 10 attempts discarded), 100 trials

 -- Timers
--
connectTimer
 count = 100
   min = 457.32 milliseconds
   max = 539.77 milliseconds
  mean = 485.68 milliseconds
stddev = 10.76 milliseconds
median = 485.52 milliseconds
  75% <= 490.39 milliseconds
  95% <= 499.83 milliseconds
  98% <= 511.52 milliseconds
  99% <= 535.56 milliseconds
99.9% <= 539.77 milliseconds

As you can see, at least for this test, initialization times are pretty
much identical.

I ran another set of trials using a local C* node (running on same host as
client) to limit the impact of round trip time:

v2.1.8:

Single attempt took 477ms.

10 warmup iterations, 100 trials

-- Timers
--
connectTimer
 count = 100
   min = 2.38 milliseconds
   max = 32.69 milliseconds
  mean = 3.79 milliseconds
stddev = 3.49 milliseconds
median = 3.05 milliseconds
  75% <= 3.49 milliseconds
  95% <= 6.05 milliseconds
  98% <= 19.55 milliseconds
  99% <= 32.56 milliseconds
99.9% <= 32.69 milliseconds

v3.1.3:

Single attempt took 516ms.

-- Timers
--
connectTimer
 count = 100
   min = 1.67 milliseconds
   max = 8.03 milliseconds
  mean = 3.00 milliseconds
stddev = 0.97 milliseconds
median = 2.85 milliseconds
  75% <= 3.10 milliseconds
  95% <= 4.01 milliseconds
  98% <= 6.55 milliseconds
  99% <= 7.93 milliseconds
99.9% <= 8.03 milliseconds

Similarly, when using a local C* node, initialization times are pretty
similar.

I'd be interested to see if running the same test in
your environment creates different results.
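For anyone who wants to reproduce a comparable measurement, here is a rough sketch of such a connect-timing loop. This is not the original test from the (now-missing) source link; it assumes the DataStax Java driver plus Dropwizard Metrics, whose ConsoleReporter prints output in the format shown above, and the contact point is illustrative.

```java
import java.util.concurrent.TimeUnit;
import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;
import com.datastax.driver.core.Cluster;

public class ConnectBenchmark {
    public static void main(String[] args) {
        MetricRegistry registry = new MetricRegistry();
        Timer connectTimer = registry.timer("connectTimer");

        int warmup = 10, trials = 100;
        for (int i = 0; i < warmup + trials; i++) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Timer.Context ctx = (i < warmup) ? null : connectTimer.time();
            try {
                cluster.connect();               // only the connect call is timed
            } finally {
                if (ctx != null) {
                    ctx.stop();                  // warmup iterations are discarded
                }
                cluster.close();
            }
        }

        // Prints count/min/max/mean/percentiles in the style shown above.
        ConsoleReporter.forRegistry(registry)
                .convertDurationsTo(TimeUnit.MILLISECONDS)
                .build()
                .report();
    }
}
```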

Thanks!
Andy


On Mon, Mar 6, 2017 at 8:53 AM, Andrew Tolbert 
wrote:

> Hi Satoshi,
>
> This question would be better for the 'DataStax Java Driver for Apache
> Cassandra mailing list
> ',
> but I do have a few thoughts about what you are observing:
>
> Between java-driver 2.1 and 3.0 the driver updated its Netty dependency
> from 3.9.x to 4.0.x.  Cluster#close is likely taking two seconds longer
> because the driver uses AbstractEventExecutor.shutdownGracefully()
> 
> which waits for a quiet period of 2 seconds to allow any inflight requests
> to complete.  You can disable that by passing a custom NettyOptions
> 
> to a Cluster.Builder using withNettyOptions, i.e.:
>
> /**
>  * A custom {@link NettyOptions} that shuts down the {@link
> EventLoopGroup} after
>  * no quiet time.  This is useful for tests that consistently close
> clusters as
>  * otherwise there is a 2 second delay (from JAVA-914
> ).
>  */
> public static NettyOptions nonQuietClusterCloseOptions = new
> NettyOptions() {
> @Override
> public void onClusterClose(EventLoopGroup 

Re: Any way to control/limit off-heap memory?

2017-03-06 Thread Thakrar, Jayesh
Thanks Hannu - I also considered that option.
However, that's trial and error, and I will have to play with the
collision/false-positive fraction.
And each iteration will most likely result in a compaction storm - so I was
hoping for a way to throttle/limit the max off-heap size.

The reason I was thinking of eliminating bloom filters is that, due to the
application design, we search for data using a partial key (prefix columns),
so the bloom filters do not add any value in such a use case.

Is my assumption correct?
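For context, the per-table knob behind Hannu's suggestion below is bloom_filter_fp_chance: raising it shrinks the filter, and a value of 1.0 effectively disables it. A minimal sketch via the Java driver (keyspace/table names are illustrative; whether disabling is sensible for the prefix-search workload is exactly the assumption being questioned above):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class AdjustBloomFilter {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Raise the target false-positive chance to shrink the filter,
            // or use 1.0 to effectively disable it for this table.
            session.execute("ALTER TABLE my_keyspace.logs_by_user "
                          + "WITH bloom_filter_fp_chance = 0.1");
            // Existing SSTables keep their old filters until they are rewritten
            // (compaction or `nodetool upgradesstables -a`), which is where the
            // compaction-storm concern mentioned above comes in.
        }
    }
}
```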

From: Hannu Kröger 
Date: Sunday, March 5, 2017 at 6:34 AM
To: 
Subject: Re: Any way to control/limit off-heap memory?

If bloom filters are taking too much memory, you can adjust bloom filters:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_tuning_bloom_filters_c.html

Hannu

On 4 Mar 2017, at 22:54, Thakrar, Jayesh 
> wrote:

I have a situation where the off-heap memory is bloating the JVM process
memory, making it a candidate to be killed by the oom_killer.
My server has 256 GB RAM and a Cassandra heap of 16 GB.

Below is the output of "nodetool info" and "nodetool cfstats" for a
culprit table which causes the bloom filter bloat.
Of course one option is to turn off the bloom filter, but I need to look into
the application access pattern, etc.


xss =  -ea -Dorg.xerial.snappy.tempdir=/home/vchadoop/var/tmp 
-javaagent:/home/vchadoop/apps/apache-cassandra-2.2.5//lib/jamm-0.3.0.jar 
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms16G -Xmx16G -Xmn4800M 
-XX:+HeapDumpOnOutOfMemoryError -Xss256k
ID : 2b9b4252-0760-49c1-8d14-544be0183271
Gossip active  : true
Thrift active  : false
Native Transport active: true
Load   : 953.19 GB
Generation No  : 1488641545
Uptime (seconds)   : 15706
Heap Memory (MB)   : 7692.93 / 16309.00
Off Heap Memory (MB)   : 175115.07
Data Center: ord
Rack   : rack3
Exceptions : 0
Key Cache  : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 
requests, NaN recent hit rate, 14400 save period in seconds
Row Cache  : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 
requests, NaN recent hit rate, 0 save period in seconds
Counter Cache  : entries 0, size 0 bytes, capacity 50 MB, 0 hits, 0 
requests, NaN recent hit rate, 7200 save period in seconds
Token  : (invoke with -T/--tokens to see all 256 tokens)


Table: logs_by_user
SSTable count: 622
SSTables in each level: [174/4, 447/10, 0, 0, 0, 0, 0, 0, 0]
Space used (live): 313156769247
Space used (total): 313156769247
Space used by snapshots (total): 0
Off heap memory used (total): 180354511884
SSTable Compression Ratio: 0.25016314078395613
Number of keys (estimate): 147261312
Memtable cell count: 44796
Memtable data size: 57578717
Memtable off heap memory used: 0
Memtable switch count: 21
Local read count: 0
Local read latency: NaN ms
Local write count: 1148687
Local write latency: 0.123 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 180269125192
Bloom filter off heap memory used: 180269120216
Index summary off heap memory used: 24335340
Compression metadata off heap memory used: 61056328
Compacted partition minimum bytes: 150
Compacted partition maximum bytes: 668489532
Compacted partition mean bytes: 3539
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0





Re: Does Java driver v3.1.x degrade cluster connect/close performance?

2017-03-06 Thread Andrew Tolbert
Hi Satoshi,

This question would be better for the 'DataStax Java Driver for Apache
Cassandra mailing list
',
but I do have a few thoughts about what you are observing:

Between java-driver 2.1 and 3.0 the driver updated its Netty dependency
from 3.9.x to 4.0.x.  Cluster#close is likely taking two seconds longer
because the driver uses AbstractEventExecutor.shutdownGracefully(),
which waits for a quiet period of 2 seconds to allow any inflight requests
to complete.  You can disable that by passing a custom NettyOptions
to a Cluster.Builder using withNettyOptions, i.e.:

// Requires: com.datastax.driver.core.NettyOptions, io.netty.channel.EventLoopGroup,
// and a static import of java.util.concurrent.TimeUnit.SECONDS.
/**
 * A custom {@link NettyOptions} that shuts down the {@link EventLoopGroup} after
 * no quiet time. This is useful for tests that consistently close clusters, as
 * otherwise there is a 2 second delay (from JAVA-914).
 */
public static NettyOptions nonQuietClusterCloseOptions = new NettyOptions() {
    @Override
    public void onClusterClose(EventLoopGroup eventLoopGroup) {
        // Quiet period of 0: shut down immediately, waiting at most 15 seconds.
        eventLoopGroup.shutdownGracefully(0, 15, SECONDS).syncUninterruptibly();
    }
};
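For completeness, a small sketch of wiring that into the builder (contact point is illustrative; withNettyOptions is the hook mentioned above):

```java
Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .withNettyOptions(nonQuietClusterCloseOptions) // skip the 2-second quiet period on close
        .build();
```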

However, I wouldn't recommend doing this unless you have a requirement for
Cluster.close to be as quick as possible. After all, closing a Cluster
frequently is not something you should expect to be doing, as a
Cluster and its Session are meant to be reused over the lifetime of an
application.

With regards to Cluster.connect being slower, I'm not sure I have an
explanation for that and it is not something I have noticed.  I would not
expect Cluster.connect to even take a second with a single-node cluster
(for example, I recorded some numbers a while back and the mean
initialization time with a 40-node cluster with auth was ~251ms).  Have you
tried executing several trials of Cluster.connect within a single JVM
process? Does the initialization time improve with a subsequent
Cluster.connect?  I'm wondering if maybe there is some additional
first-time initialization required that was not needed before.
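A minimal sketch of that experiment, for reference (contact point is illustrative; plain System.nanoTime timings are enough to see whether only the first connect is slow):

```java
import com.datastax.driver.core.Cluster;

public class RepeatedConnect {
    public static void main(String[] args) {
        for (int i = 1; i <= 5; i++) {
            long start = System.nanoTime();
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
                cluster.connect();
                long connectMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println("trial " + i + ": connect took " + connectMs + " ms");
            }
            // If only trial 1 is slow, the cost is one-time initialization
            // (class loading, Netty bootstrap) rather than per-connect overhead.
        }
    }
}
```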

Thanks,
Andy

On Mon, Mar 6, 2017 at 6:01 AM, Matija Gobec  wrote:

> Interesting question since I never measured connect and close times.
> Usually this is something you do once the application starts and that's it.
> Do you plan to misuse it and create a new cluster object and open a new
> connection for each request?
>
> On Mon, Mar 6, 2017 at 7:19 AM, Satoshi Hikida  wrote:
>
>> Hi,
>>
>> I'm going to try to update the DataStax Java Driver version from 2.1.8
>> to 3.1.3.
>> First I ran the test program and measured the time with both drivers
>> v2.1.8 and v3.1.3.
>>
>> The test program simply builds a Cluster, connects to it, executes a
>> simple select statement, and closes the Cluster.
>>
>> The read performance was almost the same for both versions (around 20 ms).
>> However, the performance of connecting to the cluster and closing the
>> cluster was significantly different.
>>
>> The test environment is as following:
>> - EC2 instance: m4.large(2vCPU, 8GB Memory), 1 node
>> - java1.8
>> - Cassandra v2.2.8
>>
>> Here is the result of the test. I ran the test program several times,
>> but the results were almost the same as this one.
>>
>> | Method   | Time in sec (v2.1.8/v3.1.3)|
>> |---||
>> | Cluster#connect |   1.178/2.468 |
>> | Cluster#close |   0.022/2.240 |
>>
>> With the v3.1.3 driver, Cluster#connect() is roughly 2x slower and
>> Cluster#close() roughly 100x slower.  I want to know the cause of these
>> performance degradations. Could someone advise me?
>>
>>
>> The Snippet of the test program is as following.
>> ```
>> Cluster cluster = Cluster
>> .builder()
>> .addContactPoints(endpoints)
>> .withCredentials(USER, PASS)
>> .withClusterName(CLUSTER_NAME)
>> .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
>> // .withLoadBalancingPolicy(new TokenAwarePolicy(new
>> DCAwareRoundRobinPolicy(DC_NAME))) // for driver 2.1.8
>> .withLoadBalancingPolicy(new 
>> TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
>> // for driver 3.1.3
>> .build();
>>
>> Session session = cluster.connect();
>> ResultSet rs = session.execute("select * from system.local;");
>>
>> session.close();
>> cluster.close();
>> ```
>>
>> Regards,
>> Satoshi
>>
>>
>


Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-06 Thread Eric Evans
On Fri, Mar 3, 2017 at 11:18 AM, Shravan Ch  wrote:
> More than 30 Cassandra servers in the primary DC went down with the OOM
> exception below. What puzzles me is the scale at which it happened (at the
> same minute). I will share some more details below.

You'd be surprised; when it's the result of aberrant data or workload,
having many nodes OOM at once is more common than you might
think.

> System Log: http://pastebin.com/iPeYrWVR

The traceback shows the OOM occurring during a read (a slice), not a
write.  What do your data model and queries look like?  Do you do
deletes (TTLs maybe)? Did the OOM result in a heap dump?

> GC Log: http://pastebin.com/CzNNGs0r
>
> During the OOM I saw lot of WARNings like the below (these were there for
> quite sometime may be weeks)
> WARN  [SharedPool-Worker-81] 2017-03-01 19:55:41,209 BatchStatement.java:252
> - Batch of prepared statements for [keyspace.table] is of size 225455,
> exceeding specified threshold of 65536 by 159919.
>
> Environment:
> We are using ApacheCassandra-2.1.9 on Multi DC cluster. Primary DC (more C*
> nodes on SSD and apps run here)  and secondary DC (geographically remote and
> more like a DR to primary) on SAS drives.
> Cassandra config:
>
> Java 1.8.0_65
> Garbage Collector: G1GC
> memtable_allocation_type: offheap_objects
>
> Post this OOM I am seeing huge hints pile up on the majority of the nodes and
> the pending hints keep going up. I have increased HintedHandoff CoreThreads
> to 6 but that did not help (I admit that I tried this on only one node).
>
> nodetool compactionstats -H
> pending tasks: 3
> compaction typekeyspace  table
> completed  totalunit   progress
> Compaction  system  hints
> 28.5 GB   92.38 GB   bytes 30.85%



-- 
Eric Evans
john.eric.ev...@gmail.com


Re: Archive node

2017-03-06 Thread Gábor Auth
Hi,

On Mon, Mar 6, 2017 at 12:46 PM Carlos Rolo  wrote:

> I would not suggest to do that, because the new "Archive" node would be a
> new DC that you would need to build (Operational wise).
>

Yes, but in our case it is a simple copy of an existing Puppet script and it
works... and the automated clean, cleanup, and repair job will move
the old keyspaces to the 'Archive' DC without any operational overhead...
hm.

> You could also snapshot the old one once it finishes and use SSTableloader
> to push it into your Development DC. This way you have isolation from
> Production. Plus no operational overhead.
>

I think this is also an operational overhead... :)

Bye,
Gábor Auth


Re: Does Java driver v3.1.x degrade cluster connect/close performance?

2017-03-06 Thread Matija Gobec
Interesting question since I never measured connect and close times.
Usually this is something you do once the application starts and that's it.
Do you plan to misuse it and create a new cluster object and open a new
connection for each request?

On Mon, Mar 6, 2017 at 7:19 AM, Satoshi Hikida  wrote:

> Hi,
>
> I'm going to try to update the DataStax Java Driver version from 2.1.8
> to 3.1.3.
> First I ran the test program and measured the time with both drivers
> v2.1.8 and v3.1.3.
>
> The test program simply builds a Cluster, connects to it, executes a
> simple select statement, and closes the Cluster.
>
> The read performance was almost the same for both versions (around 20 ms).
> However, the performance of connecting to the cluster and closing the
> cluster was significantly different.
>
> The test environment is as following:
> - EC2 instance: m4.large(2vCPU, 8GB Memory), 1 node
> - java1.8
> - Cassandra v2.2.8
>
> Here is the result of the test. I ran the test program several times,
> but the results were almost the same as this one.
>
> | Method   | Time in sec (v2.1.8/v3.1.3)|
> |---||
> | Cluster#connect |   1.178/2.468 |
> | Cluster#close |   0.022/2.240 |
>
> With the v3.1.3 driver, Cluster#connect() is roughly 2x slower and
> Cluster#close() roughly 100x slower.  I want to know the cause of these
> performance degradations. Could someone advise me?
>
>
> The Snippet of the test program is as following.
> ```
> Cluster cluster = Cluster
> .builder()
> .addContactPoints(endpoints)
> .withCredentials(USER, PASS)
> .withClusterName(CLUSTER_NAME)
> .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
> // .withLoadBalancingPolicy(new TokenAwarePolicy(new
> DCAwareRoundRobinPolicy(DC_NAME))) // for driver 2.1.8
> .withLoadBalancingPolicy(new 
> TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
> // for driver 3.1.3
> .build();
>
> Session session = cluster.connect();
> ResultSet rs = session.execute("select * from system.local;");
>
> session.close();
> cluster.close();
> ```
>
> Regards,
> Satoshi
>
>


Re: Archive node

2017-03-06 Thread Carlos Rolo
I would not suggest doing that, because the new "Archive" node would be a
new DC that you would need to build (operationally speaking).

You could also snapshot the old one once it finishes and use SSTableloader
to push it into your Development DC. This way you have isolation from
Production. Plus no operational overhead.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
*linkedin.com/in/carlosjuzarterolo
*
Mobile: +351 918 918 100
www.pythian.com

On Mon, Mar 6, 2017 at 10:43 AM, Gábor Auth  wrote:

> Hi,
>
> The background story: we are developing an MMO strategy game, and every two
> weeks the game world ends and we start a new one with a slightly
> different database scheme. As a result, we have over ~100 keyspaces in our
> cluster, and we want to archive the old schemes onto a separate Cassandra
> node (or something else) so they stay available online to support development.
> The archived keyspaces are mostly read-only and rarely used (~once a year or
> less often).
>
> We have a two-DC Cassandra cluster with 4+4 nodes; the idea is the following:
> we add a new Cassandra node with DC name 'Archive' and change the
> replication factor of old keyspaces from {'class':
> 'NetworkTopologyStrategy', 'DC01': '3', 'DC02': '3'} to {'class':
> 'NetworkTopologyStrategy', 'Archive': '1'}, and repair the keyspace.
>
> What do you think? Any other idea? :)
>
> Bye,
> Gábor Auth
>
>






Archive node

2017-03-06 Thread Gábor Auth
Hi,

The background story: we are developing an MMO strategy game, and every two
weeks the game world ends and we start a new one with a slightly
different database scheme. As a result, we have over ~100 keyspaces in our
cluster, and we want to archive the old schemes onto a separate Cassandra
node (or something else) so they stay available online to support development.
The archived keyspaces are mostly read-only and rarely used (~once a year or
less often).

We have a two-DC Cassandra cluster with 4+4 nodes; the idea is the following:
we add a new Cassandra node with DC name 'Archive' and change the
replication factor of old keyspaces from {'class':
'NetworkTopologyStrategy', 'DC01': '3', 'DC02': '3'} to {'class':
'NetworkTopologyStrategy', 'Archive': '1'}, and repair the keyspace.
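In CQL terms, the replication change described above would look roughly like the sketch below (keyspace name is illustrative; it is executed via the Java driver only for consistency with the other snippets in this digest, cqlsh works just as well):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class ArchiveKeyspace {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Move all replicas of an old keyspace to the 'Archive' DC.
            session.execute("ALTER KEYSPACE world_old WITH replication = "
                          + "{'class': 'NetworkTopologyStrategy', 'Archive': '1'}");
            // Afterwards, run `nodetool repair world_old` on the Archive node so it
            // holds a full replica, then `nodetool cleanup` on the DC01/DC02 nodes
            // to drop the data they no longer own.
        }
    }
}
```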

What do you think? Any other idea? :)

Bye,
Gábor Auth


Re: Attached profiled data but need help understanding it

2017-03-06 Thread Kant Kodali
Hi Romain,

We may be able to achieve what we need without LWT, but that would require a
bunch of changes on the application side and possibly introducing caching
layers and designing the solution around that. But for now, we are constrained
to use LWTs for another month or so. That said, I still would like to see
the discouraged features such as LWTs, secondary indexes, and triggers get
better over time, so they would really benefit users.
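For readers unfamiliar with LWTs, here is a minimal sketch of the kind of conditional (Paxos-backed) write being benchmarked, using the Java driver (table and columns are illustrative):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

public class LwtSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // A lightweight transaction: the write only happens if the row
            // does not already exist, at the cost of a Paxos round trip.
            ResultSet rs = session.execute(
                "INSERT INTO ks.users (id, name) VALUES (1, 'alice') IF NOT EXISTS");
            System.out.println("applied = " + rs.wasApplied());
        }
    }
}
```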

Agreed, high park/unpark is a sign of excessive context switching, but any
ideas why this is happening? Yes, today we will be experimenting with
c3.2xlarge, see what the numbers look like, and slowly scale up from
there.

How do I make sure I install the ixgbevf driver? Do m4.xlarge or c3.2xlarge
instances not already have it? When I googled "ixgbevf driver" it tells me it
is an Ethernet driver... I thought all instances by default run on Ethernet on
AWS. Can you please give more context on this?

Thanks,
kant

On Fri, Mar 3, 2017 at 4:42 AM, Romain Hardouin  wrote:

> Also, I should have mentioned that it would be a good idea to spawn your
> three benchmark instances in the same AZ, then try with one instance in
> each AZ to see how network latency affects your LWT rate. The lowest latency
> is achievable with three instances in the same placement group, of course,
> but it's kinda dangerous for production.
>
>
>