[jira] [Updated] (CASSANDRA-12992) when mapreduce create sstables and load to cassandra cluster,then drop the table there are much data file not moved to snapshot

2017-09-06 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-12992:
---
Description: 
When a MapReduce job creates SSTables and bulk-loads them into the Cassandra cluster, and the table is then dropped, many data files are not moved into the snapshot.

{{nodetool clearsnapshot}} cannot free the disk space,

so we must delete the files manually.
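To see which files are affected, here is a minimal sketch; the directory layout it assumes (table files under {{<data_dir>/<keyspace>/<table>/}}, with dropped-table snapshots under a {{snapshots/}} subdirectory) follows the Cassandra 2.1 convention, and the function name and paths are illustrative, not part of any Cassandra API. It lists the {{*-Data.db}} files still sitting directly in the table directory, i.e. the leftovers that {{clearsnapshot}} never sees:

```python
# Sketch (assumed layout): DROP TABLE with auto_snapshot enabled should move
# the table's SSTables into <table_dir>/snapshots/<tag>/. Any *-Data.db file
# still directly in table_dir after the drop was left behind and will not be
# freed by `nodetool clearsnapshot`.
import os

def leftover_data_files(table_dir):
    """Return (filename, size_in_bytes) for SSTable data files that were
    not moved under snapshots/ when the table was dropped."""
    leftovers = []
    for name in sorted(os.listdir(table_dir)):
        path = os.path.join(table_dir, name)
        if os.path.isfile(path) and name.endswith("-Data.db"):
            leftovers.append((name, os.path.getsize(path)))
    return leftovers
```

Anything this reports after the drop is what we currently have to delete by hand.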


cassandra table schema:
{code}

CREATE TABLE test.st_platform_api_restaurant_export (
id_date text PRIMARY KEY,
dt text,
eleme_order_total double,
order_amt bigint,
order_date text,
restaurant_id int,
total double
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = 'restaurant'
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 2592000
AND gc_grace_seconds = 1800
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
{code}


mapreduce job:
{code}

CREATE EXTERNAL TABLE st_platform_api_restaurant_export_h2c_sstable
(
id_date string,
order_amt bigint,
total double,
eleme_order_total double,
order_date string,
restaurant_id int,
dt string)  STORED BY 
'org.apache.hadoop.hive.cassandra.bulkload.CqlBulkStorageHandler'
TBLPROPERTIES (
'cassandra.output.keyspace.username' = 'cassandra',
'cassandra.output.keyspace'='test',
'cassandra.output.partitioner.class'='org.apache.cassandra.dht.Murmur3Partitioner',
'cassandra.output.keyspace.passwd'='cassandra',
'mapreduce.output.basename'='st_platform_api_restaurant_export',
'cassandra.output.thrift.address'='cassandra cluster ips',
'cassandra.output.delete.source'='true',
'cassandra.columnfamily.insert.st_platform_api_restaurant_export'='insert into 
test.st_platform_api_restaurant_export(id_date,order_amt,total,eleme_order_total,order_date,restaurant_id,dt)values(?,?,?,?,?,?,?)',
'cassandra.columnfamily.schema.st_platform_api_restaurant_export'='CREATE TABLE 
test.st_platform_api_restaurant_export (id_date text PRIMARY KEY,dt 
text,eleme_order_total double,order_amt bigint,order_date text,restaurant_id 
int,total double)');
{code}

[jira] [Updated] (CASSANDRA-12992) when mapreduce create sstables and load to cassandra cluster,then drop the table there are much data file not moved to snapshot

2017-09-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

翟玉勇 updated CASSANDRA-12992:

Description: 
[jira] [Updated] (CASSANDRA-12992) when mapreduce create sstables and load to cassandra cluster,then drop the table there are much data file not moved to snapshot

2016-12-05 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

翟玉勇 updated CASSANDRA-12992:


Summary: when mapreduce create sstables and load to cassandra 
cluster,then drop the table there are much data file not moved to snapshot  
(was: when mapreduce create sstables and load to cassandra cluster,then drop 
the table there are much data file not move to snapshot)

> when mapreduce create sstables and load to cassandra cluster,then drop the 
> table there are much data file not moved to snapshot
> ---
>
> Key: CASSANDRA-12992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12992
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cassandra 2.1.15
>Reporter: 翟玉勇
>Priority: Minor
> Attachments: after-droptable.png, before-droptable.png
>