Re: missing rows while importing data using sstable loader
Arindam,

What can you share about the source from which you are importing data? Is it a separate Cassandra cluster? If so, how many nodes and datacenters does it have? What is the replication factor (RF) of the source cluster? How certain are you that the rows actually exist in the set of sstables you are feeding to sstableloader?

I ask because, as a hypothetical, if you load sstables from a single node of a 3-node, single-DC source cluster with RF=2, you won't import the full set of data that existed in the source cluster. In that case you'd need to load sstables from at least two nodes to import a full set of the data, because of the RF. (With RF=3, a single node would be enough. With RF=1, you'd need the sstables from all three nodes.)

On Fri, Jan 29, 2016 at 7:33 AM, Arindam Choudhury <arindam.choudh...@ackstorm.com> wrote:
> Hi,
>
> I am importing data to a new cassandra cluster using sstableloader. The
> sstableloader runs without any warning or error. But I am missing around
> 1000 rows.
>
> Any feedback will be highly appreciated.
>
> Kind Regards,
> Arindam Choudhury
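The replication-factor reasoning above can be checked with a small sketch. It assumes a simplified ring in which each of N equally sized ranges is stored on its owner and on the next RF-1 nodes (roughly what SimpleStrategy does with one token per node). The function names are illustrative and are not part of any Cassandra tool. The sketch brute-forces the smallest number of source nodes such that any choice of that many nodes is guaranteed to hold every range.

```python
from itertools import combinations


def replicas(range_index, n_nodes, rf):
    """Nodes holding one range: its owner plus the next rf-1 nodes on the ring."""
    return {(range_index + k) % n_nodes for k in range(rf)}


def covers_all_ranges(nodes, n_nodes, rf):
    """True if every range has at least one replica among the chosen nodes."""
    chosen = set(nodes)
    return all(replicas(r, n_nodes, rf) & chosen for r in range(n_nodes))


def min_nodes_for_full_copy(n_nodes, rf):
    """Smallest k such that ANY k nodes together hold a full copy of the data."""
    for size in range(1, n_nodes + 1):
        if all(
            covers_all_ranges(subset, n_nodes, rf)
            for subset in combinations(range(n_nodes), size)
        ):
            return size
    return n_nodes


if __name__ == "__main__":
    # The 3-node, single-DC hypothetical from the message above.
    for rf in (1, 2, 3):
        print(f"RF={rf}: load sstables from {min_nodes_for_full_copy(3, rf)} node(s)")
```

Under this model the result is always N - RF + 1, which matches the three cases in the message. A carefully chosen smaller set of nodes can sometimes cover everything on larger rings, so treat the number as a safe bound rather than a strict minimum.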
Re: missing rows while importing data using sstable loader
I sent a message to DataStax Docs asking them to add this nodetool flush suggestion to the documentation for sstableloader.

-- Jack Krupansky

On Fri, Feb 5, 2016 at 3:35 AM, Romain Hardouin wrote:
> > What is the best practice to create sstables?
>
> When you run "nodetool flush", Cassandra persists all the memtables to
> disk, i.e. it produces sstables.
> (You can also create sstables yourself with CQLSSTableWriter, but I
> don't think that was the point of your question.)
Re: missing rows while importing data using sstable loader
What is the best practice to create sstables?

On 1 February 2016 at 15:21, Romain Hardouin wrote:
> Did you run "nodetool flush" on the source node? If not, the missing rows
> could be in memtables.
Re: missing rows while importing data using sstable loader
Did you run "nodetool flush" on the source node? If not, the missing rows could be in memtables.
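A sketch of the flush-then-load sequence suggested here, written as a small Python helper that only builds and prints the two command lines rather than running them. `nodetool flush <keyspace> <table>` and `sstableloader -d <hosts> <directory>` are the standard invocations. The keyspace and table names come from this thread, while the target host address and the staging directory are placeholders chosen for illustration.

```python
import shlex


def flush_command(keyspace, table):
    """Command to run on the SOURCE node so memtables are written out as sstables."""
    return ["nodetool", "flush", keyspace, table]


def load_command(target_hosts, table_dir):
    """Command that streams the sstables in table_dir to the target cluster.

    table_dir is expected to end in <keyspace>/<table>.
    """
    return ["sstableloader", "-d", ",".join(target_hosts), table_dir]


if __name__ == "__main__":
    steps = [
        flush_command("mordor", "things_values_meta"),
        # Placeholder host and staging path: substitute your own values.
        load_command(["10.0.0.2"], "/tmp/load/mordor/things_values_meta"),
    ]
    for step in steps:
        print(shlex.join(step))
```

The order matters: the flush has to happen before the sstable files are copied or loaded, otherwise recent writes that are still in memtables are not in the files at all.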
Re: missing rows while importing data using sstable loader
Hi Romain,

The RF was set to 2. I changed it to one:

CREATE KEYSPACE mordor WITH replication = {'class' : 'SimpleStrategy', 'replication_factor' : 1} AND durable_writes = true;

I re-inserted the columns and rows are still missing.

Regards,
Arindam

On 29 January 2016 at 15:14, Romain Hardouin wrote:
> Hi,
>
> I assume an RF > 1, right?
> What is the consistency level you used? cqlsh uses ONE by default.
> Try:
> cqlsh> CONSISTENCY ALL
> And run your query again.
>
> Best,
> Romain
Re: missing rows while importing data using sstable loader
Hi,

I assume an RF > 1, right?
What is the consistency level you used? cqlsh uses ONE by default.
Try:
cqlsh> CONSISTENCY ALL
And run your query again.

Best,
Romain

On Friday, 29 January 2016 at 13:45, Arindam Choudhury wrote:
> I am just running "select count(*) from things_values_meta ;" to get the
> count.
Re: missing rows while importing data using sstable loader
I will check the output of nodetool cfstats.

The sstables are from version 2.1.2 and are being imported into version 2.1.9.

On 29 January 2016 at 16:02, Jack Krupansky wrote:
> Are these sstables from an existing Cassandra cluster or generated by a
> program?
>
> If the former, do a nodetool tablestats or cfstats to get the sstable
> count and compare it to both the number of sstables that the loader is
> reading from and the number that end up in the target cluster.
>
> What Cassandra version did the sstables come from and what version are you
> importing into?
>
> -- Jack Krupansky
Re: missing rows while importing data using sstable loader
Are these sstables from an existing Cassandra cluster or generated by a program?

If the former, do a nodetool tablestats or cfstats to get the sstable count and compare it to both the number of sstables that the loader is reading from and the number that end up in the target cluster.

What Cassandra version did the sstables come from and what version are you importing into?

-- Jack Krupansky

On Fri, Jan 29, 2016 at 9:34 AM, Arindam Choudhury <arindam.choudh...@ackstorm.com> wrote:
> Hi Romain,
>
> The RF was set to 2.
>
> I changed it to one.
>
> CREATE KEYSPACE mordor WITH replication = {'class' : 'SimpleStrategy',
> 'replication_factor' : 1} AND durable_writes = true;
>
> re-inserted the columns, still missing rows.
>
> Regards,
> Arindam
Re: missing rows while importing data using sstable loader
I am counting the rows with "select count(*) from mordor.things_values_meta;"

For testing, I am loading from a one-node cluster into a one-node cluster.

On 29 January 2016 at 16:20, Jack Krupansky wrote:
> And how are you counting the rows? With a query? If so, what is the
> query? Using the nodetool cfstats (estimated) key count? Or... what?
>
> Are the tokens for the missing rows in the same range, and is that range
> distinct from the rest of the data in the original cluster?
>
> How many nodes in the original cluster?
>
> -- Jack Krupansky
Re: missing rows while importing data using sstable loader
Arindam,

What's the table schema, and what does your query to retrieve the rows look like?

On Fri, Jan 29, 2016 at 7:33 AM, Arindam Choudhury <arindam.choudh...@ackstorm.com> wrote:
> Hi,
>
> I am importing data to a new cassandra cluster using sstableloader. The
> sstableloader runs without any warning or error. But I am missing around
> 1000 rows.
>
> Any feedback will be highly appreciated.
>
> Kind Regards,
> Arindam Choudhury
Re: missing rows while importing data using sstable loader
Hi Kai,

The table schema is:

CREATE TABLE mordor.things_values_meta (
    thing_id text,
    key text,
    bucket_timestamp timestamp,
    total_rows counter,
    PRIMARY KEY ((thing_id, key), bucket_timestamp)
) WITH CLUSTERING ORDER BY (bucket_timestamp ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

I am just running "select count(*) from things_values_meta ;" to get the count.

Regards,
Arindam

On 29 January 2016 at 13:39, Kai Wang wrote:
> Arindam,
>
> what's the table schema and what does your query to retrieve the rows look
> like?
Re: missing rows while importing data using sstable loader
Why does cqlsh report 4692 when I run "select count(*) from mordor.things_values_meta ;", while nodetool cfstats reports "Number of keys (estimate): 4720"?

On 29 January 2016 at 16:25, Arindam Choudhury <arindam.choudh...@ackstorm.com> wrote:
> I am counting the rows with "select count(*) from
> mordor.things_values_meta;"
>
> I am doing one node cluster to one node cluster for testing.
Re: missing rows while importing data using sstable loader
I agree that there should be clearer documentation on exactly how the estimate is calculated. When I asked about this recently, the response was that it should be within about 2% of the actual key count. I started looking at the code, but I ran out of time before I chased down all the subsidiary factors in the calculation.

It would be nice to have an explicit nodetool option to count actual keys. Presumably that would be more efficient than a select count(*).

-- Jack Krupansky

On Fri, Jan 29, 2016 at 11:27 AM, Arindam Choudhury <arindam.choudh...@ackstorm.com> wrote:
> Why in cqlsh when I query "select count(*) from mordor.things_values_meta
> ;" it says: 4692
>
> But in nodetool cfstats it says Number of keys (estimate): 4720?
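As a quick check of the "within about 2%" figure against the two numbers reported in this thread, here is a minimal sketch of the arithmetic. The helper name is made up for illustration. Note also that the two numbers do not measure quite the same thing: count(*) counts CQL rows, while the cfstats figure estimates partition keys, and this table has a clustering column, so an exact match should not be expected in general.

```python
def relative_difference(estimate, actual):
    """Absolute gap between estimate and actual, as a fraction of actual."""
    return abs(estimate - actual) / actual


cql_count = 4692         # "select count(*)" result reported in the thread
cfstats_estimate = 4720  # "Number of keys (estimate)" from nodetool cfstats

diff = relative_difference(cfstats_estimate, cql_count)

if __name__ == "__main__":
    print(f"relative difference: {diff:.2%}")         # 0.60%
    print(f"within 2% tolerance: {diff <= 0.02}")     # True
```

So the gap between 4692 and 4720 is well inside the quoted tolerance and does not by itself indicate lost data on the target.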
Re: missing rows while importing data using sstable loader
And how are you counting the rows? With a query? If so, what is the query? Using the nodetool cfstats (estimated) key count? Or... what?

Are the tokens for the missing rows in the same range, and is that range distinct from the rest of the data in the original cluster?

How many nodes are in the original cluster?

-- Jack Krupansky

On Fri, Jan 29, 2016 at 10:12 AM, Arindam Choudhury <arindam.choudh...@ackstorm.com> wrote:
> I will check the output of nodetool cfstats.
>
> Its from version 2.1.2 to version 2.1.9.