Ihor Prokopov created CASSANDRA-13556:
-----------------------------------------

             Summary: Corrupted SSTables
                 Key: CASSANDRA-13556
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13556
             Project: Cassandra
          Issue Type: Bug
          Components: Compaction
         Environment: CentOS Linux release 7.3.1611 (Core)
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)
Python cassandra (DataStax) driver v3.6.0
            Reporter: Ihor Prokopov
             Fix For: 3.9


After 3 months of operation, we noticed that the number of pending compaction 
tasks was growing (~600 pending tasks). SSTable verification showed that some 
SSTables were corrupted. Repair didn't help (it crashed with an error). 
Some requests (e.g. select * from fetcher where 
domain=8289511971670945261 and uri=-5417197141545933706; ) also fail with the 
following error:
{color:red}
Traceback (most recent call last):
  File "/var/cassandra/apache-cassandra-3.9/bin/cqlsh.py", line 1264, in perform_simple_statement
    result = future.result()
  File "/var/cassandra/apache-cassandra-3.9/bin/../lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", line 3650, in result
    raise self._final_exception
error: unpack requires a string argument of length 4
{color}
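
For context, the wording of this error matches what Python's struct module reports when asked to unpack a fixed-width value from too few bytes. The sketch below is only an illustration under that assumption: it does not use the driver's actual code path, and the helper names (unpack_int32, try_decode) are hypothetical. It shows that a 4-byte big-endian value decodes normally, while a truncated buffer raises struct.error. The exact message text depends on the Python version.

```python
import struct


def unpack_int32(buf):
    """Decode a big-endian signed 32-bit integer from exactly 4 bytes."""
    return struct.unpack(">i", buf)[0]


def try_decode(buf):
    """Return ("ok", value) on success or ("error", message) on failure."""
    try:
        return ("ok", unpack_int32(buf))
    except struct.error as exc:
        # Raised when the buffer is not exactly 4 bytes long,
        # e.g. when the bytes read back for the value are truncated.
        return ("error", str(exc))


good = struct.pack(">i", 200)   # a well-formed 4-byte value
truncated = good[:2]            # simulate a value cut short

print(try_decode(good))
print(try_decode(truncated))
```

If the bytes returned for a 4-byte column (such as an int or date column in the table below) were shorter than expected, client-side decoding could fail in this way. Whether that is what happens here is not confirmed.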

Table schema:
{quote}
CREATE TABLE fetcher (
    domain bigint,
    uri bigint,
    date date,
    content_length int,
    elapsed float,
    encoding text,
    fetched_time bigint,
    flinks frozen<set<tuple<bigint, bigint>>>,
    flinks_count int,
    html_fingerprint bigint,
    indexed boolean,
    adult boolean,
    kws_count int,
    lang_id int,
    last_updated bigint,
    redirect_url tuple<bigint, bigint>,
    revisit_date date,
    revisit_interval int,
    status_code int,
    tokens_fingerprint bigint,
    uris frozen<set<bigint>>,
    url text,
    PRIMARY KEY (domain, uri, date)
) WITH CLUSTERING ORDER BY (uri ASC, date DESC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'fetcher history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy',
                      'sstable_size_in_mb': '256',
                      'tombstone_threshold': '.2'}
    AND compression = {'chunk_length_in_kb': '64',
                       'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.5
    AND speculative_retry = '99PERCENTILE';
{quote}

Corrupted 
[SSTable|https://drive.google.com/file/d/0B4ZaUOv0G9oMcHpERTdlb3ozSVk/view?usp=sharing].






