Re: High Bloom Filter FP Ratio

2014-12-19 Thread Chris Hart
Hi Tyler,

I tried what you said and false positives look much more reasonable there.  
Thanks for looking into this.

-Chris

- Original Message -
From: "Tyler Hobbs" 
To: user@cassandra.apache.org
Sent: Friday, December 19, 2014 1:25:29 PM
Subject: Re: High Bloom Filter FP Ratio

I took a look at the code where the bloom filter true/false positive
counters are updated and noticed that the true-positive count isn't being
updated on key cache hits:
https://issues.apache.org/jira/browse/CASSANDRA-8525.  That may explain
your ratios.
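
For reference, the reported ratio is effectively

    false ratio = false positives / (false positives + true positives)

so if key cache hits never increment the true-positive counter, the
denominator stays artificially small and the ratio drifts toward 1.0 even
when the filter is behaving well.  Working from the cfstats quoted below,
11096 false positives at a ratio of 0.99197 implies only about 90 recorded
true positives against roughly 2.5 million reads.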

Can you try querying for a few non-existent partition keys in cqlsh with
tracing enabled (just run "TRACING ON") and see if you really do get that
high of a false-positive ratio?
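
Something along these lines (the table and key value here are just
placeholders):

cqlsh> TRACING ON;
cqlsh> SELECT * FROM contacts.contact WHERE id = 999999999999;

The trace output shows, per SSTable, whether the bloom filter let the read
skip it, so a handful of misses should reveal the real false-positive
behavior independently of the cfstats counters.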

On Fri, Dec 19, 2014 at 9:59 AM, Mark Greene  wrote:
>
> We're seeing similar behavior except our FP ratio is closer to 1.0 (100%).
>
> We're using Cassandra 2.1.2.
>
>
> Schema
> ---
> CREATE TABLE contacts.contact (
> id bigint,
> property_id int,
> created_at bigint,
> updated_at bigint,
> value blob,
> PRIMARY KEY (id, property_id)
> ) WITH CLUSTERING ORDER BY (property_id ASC)
> AND bloom_filter_fp_chance = 0.001
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class':
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy',
> 'max_threshold': '32'}
> AND compression = {'sstable_compression':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
>
> CF Stats Output:
> -
> Keyspace: contacts
> Read Count: 2458375
> Read Latency: 0.852844076675 ms.
> Write Count: 10357
> Write Latency: 0.1816912233272183 ms.
> Pending Flushes: 0
> Table: contact
> SSTable count: 61
> SSTables in each level: [1, 10, 50, 0, 0, 0, 0, 0, 0]
> Space used (live): 9047112471
> Space used (total): 9047112471
> Space used by snapshots (total): 0
> SSTable Compression Ratio: 0.34119240020241487
> Memtable cell count: 24570
> Memtable data size: 1299614
> Memtable switch count: 2
> Local read count: 2458290
> Local read latency: 0.853 ms
> Local write count: 10044
> Local write latency: 0.186 ms
> Pending flushes: 0
> Bloom filter false positives: 11096
> Bloom filter false ratio: 0.99197
> Bloom filter space used: 3923784
> Compacted partition minimum bytes: 373
> Compacted partition maximum bytes: 152321
> Compacted partition mean bytes: 9938
> Average live cells per slice (last five minutes): 37.57851240677983
> Maximum live cells per slice (last five minutes): 63.0
> Average tombstones per slice (last five minutes): 0.0
> Maximum tombstones per slice (last five minutes): 0.0
>
> --
> about.me <http://about.me/markgreene>
>
> On Wed, Dec 17, 2014 at 1:32 PM, Chris Hart  wrote:
>>
>> Hi,
>>
>> I have created the following table with bloom_filter_fp_chance=0.01:
>>
>> CREATE TABLE logged_event (
>>   time_key bigint,
>>   partition_key_randomizer int,
>>   resource_uuid timeuuid,
>>   event_json text,
>>   event_type text,
>>   field_error_list map,
>>   javascript_timestamp timestamp,
>>   javascript_uuid uuid,
>>   page_impression_guid uuid,
>>   page_request_guid uuid,
>>   server_received_timestamp timestamp,
>>   session_id bigint,
>>   PRIMARY KEY ((time_key, partition_key_randomizer), resource_uuid)
>> ) WITH
>>   bloom_filter_fp_chance=0.01 AND
>>   caching='KEYS_ONLY' AND
>>   comment='' AND
>>   dclocal_read_repair_chance=0.00 AND
>>   gc_grace_seconds=864000 AND
>>   index_interval=128 AND
>>   read_repair_chance=0.00 AND
>>   replicate_on_write='true' AND
>>   populate_io_cache_on_flush='false' AND
>>   default_time_to_live=0 AND
>>   speculative_retry='99.0PERCENTILE' AND
>>   memtable_flush_period_in_ms=0 AND
>>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>   compression={'sstable_compression': 'LZ4Compressor'};

High Bloom Filter FP Ratio

2014-12-17 Thread Chris Hart
Hi,

I have created the following table with bloom_filter_fp_chance=0.01:

CREATE TABLE logged_event (
  time_key bigint,
  partition_key_randomizer int,
  resource_uuid timeuuid,
  event_json text,
  event_type text,
  field_error_list map,
  javascript_timestamp timestamp,
  javascript_uuid uuid,
  page_impression_guid uuid,
  page_request_guid uuid,
  server_received_timestamp timestamp,
  session_id bigint,
  PRIMARY KEY ((time_key, partition_key_randomizer), resource_uuid)
) WITH
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.00 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};


When I run cfstats, I see a much higher false positive ratio:

Table: logged_event
SSTable count: 15
Space used (live), bytes: 104128214227
Space used (total), bytes: 104129482871
SSTable Compression Ratio: 0.3295840184239226
Number of keys (estimate): 199293952
Memtable cell count: 56364
Memtable data size, bytes: 20903960
Memtable switch count: 148
Local read count: 1396402
Local read latency: 0.362 ms
Local write count: 2345306
Local write latency: 0.062 ms
Pending tasks: 0
Bloom filter false positives: 147705
Bloom filter false ratio: 0.49020
Bloom filter space used, bytes: 249129040
Compacted partition minimum bytes: 447
Compacted partition maximum bytes: 315852
Compacted partition mean bytes: 1636
Average live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
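
(As a sanity check on the filter itself: 249129040 bytes of bloom filter * 8
bits / ~199 million keys works out to about 10 bits per key, which is in the
right range for a 1% target, so the filter doesn't appear undersized.)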

Any idea what could be causing this?  This is time-series data.  Every time we 
read from this table, we read a single time_key with 1000 
partition_key_randomizer values.  I'm running Cassandra 2.0.11.  I tried 
running upgradesstables to rewrite the SSTables, which didn't change this 
behavior at all.  I'm using size-tiered compaction and I haven't done any 
major compactions.
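
For concreteness, each logical read looks something like this (the time_key
value is illustrative, assuming randomizer buckets 0-999):

SELECT * FROM logged_event
WHERE time_key = 1418774400
  AND partition_key_randomizer IN (0, 1, 2);  -- extended out to all 1000 buckets

so a single logical read can probe up to 1000 partitions against the bloom
filter of every SSTable.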

Thanks,
Chris


Re: really bad select performance

2012-04-05 Thread Chris Hart
Thanks for all the help everyone.  The values were meant to be binary.  I ended 
up making the possible values between 0 and 50 instead of just 0 or 1, so that 
no single index row gets too wide.  I now run queries for everything from 1 to 
50 to get 'queued' items and set the value to 0 when I'm done (I will never 
query for row_loaded = 0).  It's unfortunate that Cassandra doesn't delegate 
the query execution to a node that has the index row on it, but instead tries 
to move the entire index row to the node that is queried.
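
A sketch of the resulting access pattern (the bucket number and row key are
illustrative):

-- a consumer sweeps buckets 1..50 looking for queued items
SELECT * FROM Access_Log WHERE row_loaded = 37 LIMIT 100;

-- once an item is processed, park it in the never-queried bucket
UPDATE Access_Log SET row_loaded = 0 WHERE KEY = 'item-key-123';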

-Chris

- Original Message -
From: "David Leimbach" 
To: user@cassandra.apache.org
Sent: Monday, April 2, 2012 8:51:46 AM
Subject: Re: really bad select performance


This is all very hypothetical, but I've been bitten by this before. 

Does row_loaded happen to be a binary or boolean value?  If so, the secondary 
index generated by Cassandra will have at most 2 rows, and they'll be REALLY 
wide if you have a lot of entries.  Since Cassandra doesn't split a single 
row's columns across nodes, those potentially very wide index rows must live 
in SSTables in their entirety on the nodes that own them (and likewise for 
their replicas).

Even though you LIMIT 1, I'm not sure what "behind the scenes" work Cassandra 
does.  I've received advice to avoid the built-in secondary indexes in 
Cassandra for some of these reasons.  Also, if row_loaded is meant to 
implement some kind of queuing behavior, it could be the wrong problem space 
for Cassandra as a result of all of the above.

On Sat, Mar 31, 2012 at 12:22 PM, aaron morton < aa...@thelastpickle.com > wrote:

Is there anything in the logs when you run the queries ?


Try turning the logging up to DEBUG on the node that fails to return and see 
what happens. You will see it send messages to other nodes and do work itself. 
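
For a 1.x node that means flipping the root logger in
conf/log4j-server.properties and restarting (assuming the stock file, where
this line defaults to INFO):

log4j.rootLogger=DEBUG,stdout,R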

One thing to note: a query that uses secondary indexes runs on a node for each 
token range, so it will use more than CL number of nodes.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com


On 30/03/2012, at 11:52 AM, Chris Hart wrote:

Hi,

I have the following cluster: 

136112946768375385385349842972707284580
MountainView  RAC1  Up  Normal  1.86 GB  20.00%  0
MountainView  RAC1  Up  Normal  2.17 GB  33.33%  56713727820156410577229101238628035242
MountainView  RAC1  Up  Normal  2.41 GB  33.33%  113427455640312821154458202477256070485
Rackspace     RAC1  Up  Normal  3.9 GB   13.33%  136112946768375385385349842972707284580

The following query runs quickly on all nodes except 1 MountainView node: 

select * from Access_Log where row_loaded = 0 limit 1; 

There is a secondary index on row_loaded. The query usually doesn't complete 
(but sometimes does) on the bad node and returns very quickly on all other 
nodes. I've upped the rpc timeout to a full minute (rpc_timeout_in_ms: 60000) 
in the yaml, but it still often doesn't complete in a minute. It seems just as 
likely to complete, and takes about the same amount of time, whether the limit 
is 1, 100 or 1000.


Thanks for any help, 
Chris 




really bad select performance

2012-03-29 Thread Chris Hart
Hi,

I have the following cluster:

136112946768375385385349842972707284580
  MountainView  RAC1  Up  Normal  1.86 GB  20.00%  0
  MountainView  RAC1  Up  Normal  2.17 GB  33.33%  56713727820156410577229101238628035242
  MountainView  RAC1  Up  Normal  2.41 GB  33.33%  113427455640312821154458202477256070485
  Rackspace     RAC1  Up  Normal  3.9 GB   13.33%  136112946768375385385349842972707284580

The following query runs quickly on all nodes except 1 MountainView node:

 select * from Access_Log where row_loaded = 0 limit 1;

There is a secondary index on row_loaded.  The query usually doesn't complete 
(but sometimes does) on the bad node and returns very quickly on all other 
nodes.  I've upped the rpc timeout to a full minute (rpc_timeout_in_ms: 60000) 
in the yaml, but it still often doesn't complete in a minute.  It seems just as 
likely to complete, and takes about the same amount of time, whether the limit 
is 1, 100 or 1000.


Thanks for any help,
Chris


Re: internode communication using multiple network interfaces

2012-02-10 Thread Chris Hart
Thanks.  Setting the broadcast_address to the external IP address and setting 
the listen_address to 0.0.0.0 seems to have fixed it.  Does that mean that all 
other nodes, even those on the same local network, will communicate with that 
node using its external IP address?  It would be much better if nodes on the 
local network could use the internal IP address and only nodes not on the same 
network would use the external one.
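
A minimal sketch of the cassandra.yaml settings described above (the external
address is a placeholder):

listen_address: 0.0.0.0            # bind all interfaces, internal and external
broadcast_address: 203.0.113.10    # external IP advertised to the off-site DC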

- Original Message -
From: "aaron morton" 
To: user@cassandra.apache.org
Sent: Thursday, February 9, 2012 12:42:54 AM
Subject: Re: internode communication using multiple network interfaces

I have 3 Cassandra nodes in one data center all on the same local network, 
which needs to replicate from an off site data center. Only 1 of the 3 nodes, 
called dw01, is externally accessible. 


If you want to run a multi data centre cluster, all the nodes in both data 
centers need to be able to connect to each other. 


When it comes to exposing nodes behind a firewall, broadcast_address can help; 
see the help in cassandra.yaml and 
https://issues.apache.org/jira/browse/CASSANDRA-2491 


Hope that helps.

-
Aaron Morton 
Freelance Developer 
@aaronmorton 
http://www.thelastpickle.com 


On 9/02/2012, at 9:56 AM, Chris Hart wrote:

Hi,

I have 3 Cassandra nodes in one data center all on the same local network, 
which needs to replicate from an off site data center. Only 1 of the 3 nodes, 
called dw01, is externally accessible. dw01 has 2 network interfaces, one 
externally accessible and one internal. All 3 nodes talk to each other fine 
when I set dw01's listen_address to the internal IP address. As soon as I set 
the listen_address to the external IP address, there is no communication 
between dw01 and other 2 nodes. The other nodes should be able to send to 
dw01's external IP address (I can telnet from them to dw01 on port 7000 and 
7001 just fine), but dw01 obviously would need to use its internal network 
interface to send anything to the other 2 nodes. Is this a setup that is 
possible with Cassandra? If not, any recommendations on how I could implement 
this? 

Thanks, 
Chris 



internode communication using multiple network interfaces

2012-02-08 Thread Chris Hart
Hi, 

I have 3 Cassandra nodes in one data center all on the same local network, 
which needs to replicate from an off site data center.  Only 1 of the 3 nodes, 
called dw01, is externally accessible.  dw01 has 2 network interfaces, one 
externally accessible and one internal.  All 3 nodes talk to each other fine 
when I set dw01's listen_address to the internal IP address.  As soon as I set 
the listen_address to the external IP address, there is no communication 
between dw01 and other 2 nodes.  The other nodes should be able to send to 
dw01's external IP address (I can telnet from them to dw01 on port 7000 and 
7001 just fine), but dw01 obviously would need to use its internal network 
interface to send anything to the other 2 nodes.  Is this a setup that is 
possible with Cassandra?  If not, any recommendations on how I could implement 
this?

Thanks,
Chris