Can't reach itself

2013-06-04 Thread Alain RODRIGUEZ
Hi,

I have had an issue since switching to multiple DCs. I use AWS EC2 instances,
C* 1.2.2, with 12 nodes in eu-west + 6 nodes in us-east (the new DC).

Datacenter: eu-west
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Owns   Host ID
UN  public ip  133.43 GB  8.3%   ae33d60c-1c24-4c10-b58c-59d06faac5ca
UN  public ip  171.3 GB   8.3%   bb94c428-c98d-454d-af80-6612548a8125
UN  public ip  140.26 GB  8.3%   136bbced-25ed-4a37-abd9-7ab0d146d1c7
UN  public ip  132.14 GB  8.3%   086ebf3e-c58f-4b76-b4d5-6600f7b79cf7
UN  public ip  178.26 GB  8.3%   9255d30f-848f-4251-800b-2c61b4e0cfbf
UN  public ip  153.79 GB  8.3%   7b4fd83a-ca9c-4115-b146-222ab040abd6
UN  public ip  146.82 GB  8.3%   bf233d59-d7a4-482f-adaf-d48531d16305
UN  public ip  151.1 GB   8.3%   fa3b617d-5d31-4db2-87bf-494ee8a9f95f
UN  public ip  131.78 GB  8.3%   dac399dc-ac7c-4ee3-9503-f55e8a9f1675
UN  public ip  130.18 GB  8.3%   56b8654a-f8b3-43d4-8b15-2e74d5dfe81b
UN  public ip  161.96 GB  8.3%   97624d02-ba48-42e7-88f7-2d3b0175d6ef
UN  public ip  130.26 GB  8.3%   868c45b3-4afc-43db-b2d0-5c0f89d018fb
Datacenter: us-east
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Owns   Host ID
UN  public ip  246.74 GB  0.0%   212888f6-ecf8-4953-8f83-c5653fb176cb
UN  public ip  320.15 GB  0.0%   bcd696da-433b-4e6b-8030-11629eaf5b84
UN  public ip  353.22 GB  0.0%   3f5cb04a-3ac3-46f3-b101-31a9ae7682bc
UN  public ip  348.91 GB  0.0%   836b3b76-418a-4a22-bab4-c1a0bd49de65
UN  public ip  269.37 GB  0.0%   9408c7ff-ec47-4824-af81-92aa311a1984
UN  public ip  244.94 GB  0.0%   668eb3ca-8ee4-40ae-98e7-987c471bd675

According to nodetool status, each node of the new DC owns 0%. Running
nodetool ring myks gives me:

Datacenter: eu-west
==
Replicas: 3

Address    Rack  Status  State   Load       Owns    Token
public ip  1b    Up      Normal  131.78 GB  25.00%  113427455640312821154458202477256070485
public ip  1b    Up      Normal  161.96 GB  25.00%  141784319550391026443072753096570088106
public ip  1b    Up      Normal  153.43 GB  25.00%  70892159775195513221536376548285044053
public ip  1b    Up      Normal  151.1 GB   25.00%  99249023685273718510150927167599061674
public ip  1b    Up      Normal  130.26 GB  25.00%  155962751505430129087380028406227096917
public ip  1b    Up      Normal  146.82 GB  25.00%  85070591730234615865843651857942052864
public ip  1b    Up      Normal  171.35 GB  25.00%  14178431955039102644307275309657008810
public ip  1b    Up      Normal  132.14 GB  25.00%  42535295865117307932921825928971026432
public ip  1b    Up      Normal  140.26 GB  25.00%  28356863910078205288614550619314017621
public ip  1b    Up      Normal  133.43 GB  25.00%  0
public ip  1b    Up      Normal  130.18 GB  25.00%  127605887595351923798765477786913079296
public ip  1b    Up      Normal  178.27 GB  25.00%  56713727820156410577229101238628035242

Datacenter: us-east
==
Replicas: 3

Address    Rack  Status  State   Load       Owns    Token
                                                    100
public ip  1b    Up      Normal  320.15 GB  50.00%  28356863910078205288614550619314017721
public ip  1b    Up      Normal  353.14 GB  50.00%  56713727820156410577229101238628035342
public ip  1b    Up      Normal  348.35 GB  50.00%  85070591730234615865843651857942052964
public ip  1b    Up      Normal  269.35 GB  50.00%  113427455640312821154458202477256070585
public ip  1b    Up      Normal  244.94 GB  50.00%  141784319550391026443072753096570088206
public ip  1b    Up      Normal  246.74 GB  50.00%  100

This seems to be ok.
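
As a sanity check on the ring output, here is a minimal sketch that recomputes the tokens and the per-node ownership. It assumes the tokens were assigned with the usual evenly spaced RandomPartitioner layout (i * 2**127 / node count), that the us-east tokens are shifted by an offset of 100, and that the keyspace keeps 3 replicas in each DC. The function names are only illustrative, not part of any Cassandra tool.

```python
# Assumption: evenly spaced RandomPartitioner tokens, i * 2**127 // node_count,
# plus a small per-datacenter offset so that no two nodes share a token.
RING_SIZE = 2 ** 127  # RandomPartitioner tokens range from 0 to 2**127


def balanced_tokens(node_count, offset=0):
    """Return evenly spaced tokens for one datacenter, shifted by offset."""
    return [i * RING_SIZE // node_count + offset for i in range(node_count)]


def effective_ownership(replicas, node_count):
    """Percentage of the keyspace held by each node when tokens are balanced."""
    return 100.0 * replicas / node_count


eu_west = balanced_tokens(12)              # 12 nodes, no offset
us_east = balanced_tokens(6, offset=100)   # 6 nodes, offset of 100 (assumed)

for dc_name, tokens in (("eu-west", eu_west), ("us-east", us_east)):
    print(dc_name)
    for token in tokens:
        print("  ", token)

# Assumes 3 replicas per DC, as reported by nodetool ring.
print("eu-west ownership per node:", effective_ownership(3, 12))
print("us-east ownership per node:", effective_ownership(3, 6))
```

The computed tokens can be compared against the Token column, and the ownership figures against the Owns column, of the nodetool ring output.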

When I run describe cluster; in cassandra-cli on an eu-west node:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.Ec2MultiRegionSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
      e968865b-3b96-3c87-af0a-6294067a832f: [my 18 public IPs]

So far so good.
Now from a us-east node:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.Ec2MultiRegionSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
      UNREACHABLE: [public IP of the node itself]

      e968865b-3b96-3c87-af0a-6294067a832f: [the 17 other public IPs]


Why isn't this node able to see itself? Which port or service is used
when describing the cluster? I have tried opening all ports, with no success.
I also tried the following script to help the node find itself, but it
doesn't seem to work...

- script
---
#!/bin/bash
PUBLIC_IP=$(wget -qO- 

Re: Can't reach itself

2013-06-04 Thread Alain RODRIGUEZ
I also see a lot of hinted handoff compactions.

To be clearer: I see many compactions of system.hints, which I interpret
as a sign that a lot of data could not reach its destination.

