Re: Cassandra Writes Duplicated/Concatenated List Data

2017-08-18 Thread Sagar Jambhulkar
For the example you provided, are you saying you are getting two rows
for the same pk1, pk2, time?
It may be a problem with your inserts when you are inserting multiple
distinct rows; or, to validate that all nodes are in sync, try fetching
with CONSISTENCY ALL in cqlsh.
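
A minimal sketch of that CONSISTENCY ALL check from the DataStax Python
driver (the contact point, keyspace, and the placeholder table/column names
are assumptions taken from the schema quoted below):

```
from datetime import datetime
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])          # assumed contact point
session = cluster.connect('my_keyspace')  # assumed keyspace name

# Read at CONSISTENCY ALL so every replica must answer; a mismatch or a
# timeout here points at replicas being out of sync rather than at the
# write path itself.
stmt = SimpleStatement(
    "SELECT probability FROM table WHERE pk1 = %s AND pk2 = %s AND time = %s",
    consistency_level=ConsistencyLevel.ALL)
row = session.execute(
    stmt, ('2269202-onstreet_high', 2017, datetime(2017, 7, 18, 3, 15))).one()
print(row.probability if row else 'no row found')
```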

On 18-Aug-2017 9:37 PM, "Nathan McLean"  wrote:

> @Sagar,
>
> A query to get the data looks like this (primary key values included in
> the query).
>
> SELECT * FROM table WHERE pk1='2269202-onstreet_high' AND pk2=2017 AND
> time='2017-07-18 03:15:00+';
>
> (In actual practice, the queries in our code would query a range of
> time values.)
>
> @Cristophe
>
> I actually haven't been able to reproduce this problem while testing. Rows
> like the example I gave just seem to show up very occasionally in our
> production data.
>
> On Wed, Aug 16, 2017 at 9:11 PM, Sagar Jambhulkar <
> sagar.jambhul...@gmail.com> wrote:
>
>> What is your query to fetch rows? Can you share pk1, pk2, time for the
>> sample rows you pasted?
>>
>> On 17-Aug-2017 2:20 AM, "Nathan McLean" 
>> wrote:
>>
>>> Hello All,
>>>
>>> I have a Cassandra cluster with a table similar to the following:
>>>
>>> ```
>>> CREATE TABLE table (
>>> pk1 text,
>>> pk2 int,
>>> time timestamp,
>>> ...
>>> probability list,
>>> PRIMARY KEY ((pk1, pk2), time)
>>> ) WITH CLUSTERING ORDER BY (time DESC)
>>> ```
>>>
>>> Python processes write to this table using the DataStax python Cassandra
>>> driver package. I am occasionally seeing rows written to the table where
>>> the "probability" column list is the same list, duplicated and concatenated.
>>>
>>> e.g.
>>>
>>> probability
>>> ---
>>> [3.0951e-43, 1.695e-37, 2.7641e-32, 2.8028e-27, 1.9887e-22, 1.0165e-17,
>>> 3.7058e-13, 9.2127e-09, 0.000141, 0.999859,
>>>  3.0951e-43, 1.695e-37, 2.7641e-32, 2.8028e-27, 1.9887e-22, 1.0165e-17,
>>> 3.7058e-13, 9.2127e-09, 0.000141, 0.999859]
>>>
>>> The code that writes to Cassandra uses "INSERT" statements and validates
>>> that "probability" lists must always approximately sum to 1.0, so it does
>>> not seem possible that the python code that writes to Cassandra has a bug
>>> which is generating this data. The code may occasionally write to the same
>>> row multiple times.
>>>
>>> It appears that there may be a bug in either Cassandra or the python
>>> driver package which results in this list column being written to and
>>> appended to with the same data.
>>>
>>> Similar invalid data was also generated by a PySpark data migration
>>> script (using the DataStax spark Cassandra connector) that copied this list
>>> data to a new table.
>>>
>>> Here are the versions of libraries we are using:
>>>
>>> Cassandra version 3.6
>>> Spark version 1.6.0-hadoop2.6
>>> Python Cassandra driver 3.7.1
>>> (https://github.com/datastax/python-driver)
>>>
>>> Any help/insight into this problem would be greatly appreciated.
>>>
>>> Regards,
>>>
>>> Nathan
>>>
>>
>


Re: Getting all unique keys

2017-08-18 Thread kurt greaves
You can SELECT DISTINCT in CQL; however, I would recommend against such a
pattern, as it is very unlikely to be efficient and is prone to errors. A
distinct query will search every partition for the first live cell, which
could be buried behind a lot of tombstones. It's safe to say at some point
you will run into serious issues. Selecting all the keys in a table purely
to see what exists is not going to be cheap, and sounds awfully like an
anti-pattern. Why do you need this behaviour?
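
The original question (further down in this digest) mentions maintaining a
separate table of unique keys as an option. A minimal sketch of that
write-time approach with the DataStax Python driver follows; the contact
point, keyspace, and the keys-table name are assumptions, and only the
my_table columns come from the question:

```
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])          # assumed contact point
session = cluster.connect('my_keyspace')  # assumed keyspace

# A small side table holding each key exactly once (hypothetical name).
session.execute(
    "CREATE TABLE IF NOT EXISTS my_table_keys (key text PRIMARY KEY)")

insert_data = session.prepare(
    "INSERT INTO my_table (key, timestamp, value) VALUES (?, ?, ?)")
insert_key = session.prepare(
    "INSERT INTO my_table_keys (key) VALUES (?)")

def write(key, ts, value):
    # Upserting the key on every write is idempotent, so the side table can
    # never hold duplicates; a client-side cache of recently written keys
    # would cut the extra writes if 20M inserts/minute makes this too costly.
    session.execute(insert_data, (key, ts, value))
    session.execute(insert_key, (key,))

# Reading the key list is then a scan of a much smaller table.
keys = [row.key for row in session.execute("SELECT key FROM my_table_keys")]
```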


Re: Moving all LCS SSTables to a repaired state

2017-08-18 Thread kurt greaves
You need to run an incremental repair for SSTables to be marked repaired.
However, an SSTable only ends up marked repaired if all of the data in it is
repaired during that repair; otherwise an anticompaction will occur and split
the unrepaired data into its own SSTable.
It's pretty unlikely you will get all SSTables marked as repaired unless
you stop writing data and run incremental repair multiple times.


ExceptionInInitializerError encountered during startup

2017-08-18 Thread Russell Bateman

Cassandra version 3.9, cassandra-unit version 3.1.3.2.

In my (first ever) unit test, I've coded:

@BeforeClass
public static void initFakeCassandra() throws InterruptedException, IOException, TTransportException
{
    EmbeddedCassandraServerHelper.startEmbeddedCassandra( 2L );
}

Execution crashes down inside at

at org.apache.cassandra.transport.Server.start(Server.java:128)
at java.util.Collections$SingletonSet.forEach(Collections.java:4767)
at org.apache.cassandra.service.NativeTransportService.start(NativeTransportService.java:128)
at org.apache.cassandra.service.CassandraDaemon.startNativeTransport(CassandraDaemon.java:649)
at org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:511)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:616)
at org.cassandraunit.utils.EmbeddedCassandraServerHelper$1.run(EmbeddedCassandraServerHelper.java:129)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: name
at io.netty.util.internal.logging.AbstractInternalLogger.<init>(AbstractInternalLogger.java:39)
at io.netty.util.internal.logging.Slf4JLogger.<init>(Slf4JLogger.java:30)
at io.netty.util.internal.logging.Slf4JLoggerFactory.newInstance(Slf4JLoggerFactory.java:73)
at io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:84)
at io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:77)
at io.netty.bootstrap.ServerBootstrap.<init>(ServerBootstrap.java:46)
... 10 more

I am following the tutorial at Baeldung and am not sure where to go from here.
The Stack Overflow response I found was not helpful to me; I probably don't
know enough yet.


Thanks.



Re: Cassandra isn't compacting old files

2017-08-18 Thread Sotirios Delimanolis
There seem to be a lot of SSTables in a repaired state and a lot in an 
unrepaired state. For example, for this one table, the logs report

TRACE [main] 2017-08-15 23:50:30,732 LeveledManifest.java:473 - L0 contains 2 
SSTables (176997267 bytes) in Manifest@1217144872
TRACE [main] 2017-08-15 23:50:30,732 LeveledManifest.java:473 - L1 contains 10 
SSTables (2030691642 bytes) in Manifest@1217144872
TRACE [main] 2017-08-15 23:50:30,732 LeveledManifest.java:473 - L2 contains 94 
SSTables (19352545435 bytes) in Manifest@1217144872

and 

TRACE [main] 2017-08-15 23:50:30,731 LeveledManifest.java:473 - L0 contains 1 
SSTables (65038718 bytes) in Manifest@499561185
TRACE [main] 2017-08-15 23:50:30,731 LeveledManifest.java:473 - L2 contains 5 
SSTables (11722 bytes) in Manifest@499561185
TRACE [main] 2017-08-15 23:50:30,731 LeveledManifest.java:473 - L3 contains 39 
SSTables (7377654173 bytes) in Manifest@499561185

Is it possible that, with that many SSTables, there is always a compaction to
run in the "repaired" set, so that compactions of unrepaired SSTables are
essentially "starved", given that the WrappingCompactionStrategy prioritizes
the "repaired" set?

On Wednesday, August 2, 2017, 2:35:02 PM PDT, Sotirios Delimanolis wrote:

Turns out there are already logs for this in Tracker.java. I enabled those and 
clearly saw the old files are being tracked.
What else can I look at for hints about whether these files are later 
invalidated/filtered out somehow?

On Tuesday, August 1, 2017, 3:29:38 PM PDT, Sotirios Delimanolis 
 wrote:

There aren't any ERROR logs for failure to load these files, and they do get
compacted away. I'll try to plug some DEBUG logs into a custom Cassandra
version.

On Tuesday, August 1, 2017, 12:13:09 PM PDT, Jeff Jirsa wrote:

I don't have time to dive deep into the code of your version, but it may be
CASSANDRA-13620 (https://issues.apache.org/jira/browse/CASSANDRA-13620), or it
may be something else.
I wouldn't expect compaction to touch them if they're invalid. The handle may 
be a leftover from trying to load them. 


On Tue, Aug 1, 2017 at 10:01 AM, Sotirios Delimanolis 
 wrote:

@Jeff, why does compaction clear them, and why does Cassandra keep a handle to
them? Shouldn't they be ignored entirely? Is there an error log I can enable to
detect them?

@kurt, there are no such logs for any of these tables. We have a custom log in
our build of Cassandra that shows compactions are happening for that table, but
they only ever include the files from July.

On Tuesday, August 1, 2017, 12:55:53 AM PDT, kurt greaves wrote:

Seeing as there aren't even 100 SSTables in L2, LCS should be gradually trying 
to compact L3 with L2. You could search the logs for "Adding high-level (L3)" 
to check if this is happening.



Moving all LCS SSTables to a repaired state

2017-08-18 Thread Sotirios Delimanolis
I have a table that uses LeveledCompactionStrategy on Cassandra 2.2. At the 
moment, it has two SSTables, both in level 1, one that's repaired and one that 
isn't.
$ sstablemetadata lb-135366-big-Data.db | head
SSTable: /home/cassandra/data/my_keyspace/my_table/lb-135366-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.10
Minimum timestamp: 635879632038598571
Maximum timestamp: 636386914930971960
SSTable max local deletion time: 2147483647
Compression ratio: 0.3329089923791937
Estimated droppable tombstones: 0.04952060383516932
SSTable Level: 1
Repaired at: 1503094842214
$ sstablemetadata lb-135367-big-Data.db | head
SSTable: /home/cassandra/data/my_keyspace/my_table/lb-135367-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.10
Minimum timestamp: 636386903663409770
Maximum timestamp: 636386932592309420
SSTable max local deletion time: 2147483647
Compression ratio: 0.34908682568154525
Estimated droppable tombstones: 0.4720670391061452
SSTable Level: 1
Repaired at: 0

What can I do to get these both into a repaired state? I tried running a full
repair, but that didn't set the "Repaired at" field in the metadata, and
Cassandra still manages the SSTables separately as "repaired" and "unrepaired".

I've never run this cluster through the migration steps detailed here. Is that
what's necessary in my case?





Cassandra-count gives wrong results

2017-08-18 Thread Alain Rastoul

Hi,

I use cassandra-count (https://github.com/brianmhess/cassandra-count) to count
records in a table, but I am getting wrong results.


When I export the data to CSV with cqlsh COPY, I get 1M records in my test
table, but when I use cassandra-count I get different results on each node:
build/cassandra-count -host cstar1 -user cassandra -pw cassandra 
-keyspace metrics -table datapoints

metrics.datapoints: 379285

build/cassandra-count -host cstar2 -user cassandra -pw cassandra 
-keyspace metrics -table datapoints

metrics.datapoints: 324856

build/cassandra-count -host cstar3 -user cassandra -pw cassandra 
-keyspace metrics -table datapoints

metrics.datapoints: 340615

It used to work in previous runs, but suddenly the results went wrong, and I
can't understand why.


I downloaded the cassandra-count project, built and debugged it, but I still
can't see what is wrong.


The program reads the token ranges from system.size_estimates, then for each
range executes

SELECT COUNT(*) FROM keyspaceName.tableName WHERE Token("path") > ? AND
Token("path") <= ?

with the range's start and end tokens bound as parameters.
The ring is correct; each node has the same ring.
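
A rough sketch of that loop (hypothetical names, not the actual cassandra-count
code), assuming the partition key column is "path" as in the query above:

```
from cassandra.cluster import Cluster

cluster = Cluster(['cstar1'])   # assumed contact point
session = cluster.connect()

# system.size_estimates is a node-local table, so the ranges (and estimates)
# it returns can differ from host to host.
ranges = session.execute(
    "SELECT range_start, range_end FROM system.size_estimates "
    "WHERE keyspace_name = 'metrics' AND table_name = 'datapoints'")

count_stmt = session.prepare(
    "SELECT COUNT(*) FROM metrics.datapoints "
    "WHERE token(path) > ? AND token(path) <= ?")

total = 0
for r in ranges:
    start, end = int(r.range_start), int(r.range_end)
    if start >= end:
        # A wrapping range would need to be split in two; skipped in this
        # sketch for brevity.
        continue
    total += session.execute(count_stmt, (start, end)).one()[0]
print(total)
```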


Any clue about this?

TIA


--
best,
Alain





Re: Removing Columns from production table

2017-08-18 Thread Jonathan Baynes
Thanks Jeff, I have a horrible feeling it may be. I'll get the errors from the
dev guys on Monday and email the group; hopefully I can tie this down.

Thanks

Sent from my iPhone

> On 18 Aug 2017, at 17:24, Jeff Jirsa  wrote:
>
> Cassandra-13004







Re: Removing Columns from production table

2017-08-18 Thread Jeff Jirsa
Regrettably, this may be a manifestation of CASSANDRA-13004, which could
corrupt data being read at the time you issued the ALTER TABLE command.

What type of issues are you seeing? Anything in the logs? 

-- 
Jeff Jirsa


> On Aug 18, 2017, at 8:41 AM, Jonathan Baynes  
> wrote:
> 
> Hi
>  
> Is there anything I need to do after dropping a column and adding a new column
> to make Cassandra flush the changes? We are experiencing issues with our front-end
> application, and the developers are asking whether the issue was caused by the
> schema change. Since I've done the drop and add of a column, have I missed
> something? Do I need to make any further changes to the system after schema
> changes?
>  
> Using Cassandra 3.0.11
>  
> Thanks
> J
>  
>  
> 
> 


Re: Cassandra Writes Duplicated/Concatenated List Data

2017-08-18 Thread Nathan McLean
@Sagar,

A query to get the data looks like this (primary key values included in the
query).

SELECT * FROM table WHERE pk1='2269202-onstreet_high' AND pk2=2017 AND
time='2017-07-18 03:15:00+';

(In actual practice, the queries in our code would query a range of
time values.)

@Cristophe

I actually haven't been able to reproduce this problem while testing. Rows
like the example I gave just seem to show up very occasionally in our
production data.

On Wed, Aug 16, 2017 at 9:11 PM, Sagar Jambhulkar <
sagar.jambhul...@gmail.com> wrote:

> What is your query to fetch rows? Can you share pk1, pk2, time for the sample
> rows you pasted?
>
> On 17-Aug-2017 2:20 AM, "Nathan McLean" 
> wrote:
>
>> Hello All,
>>
>> I have a Cassandra cluster with a table similar to the following:
>>
>> ```
>> CREATE TABLE table (
>> pk1 text,
>> pk2 int,
>> time timestamp,
>> ...
>> probability list,
>> PRIMARY KEY ((pk1, pk2), time)
>> ) WITH CLUSTERING ORDER BY (time DESC)
>> ```
>>
>> Python processes write to this table using the DataStax python Cassandra
>> driver package. I am occasionally seeing rows written to the table where
>> the "probability" column list is the same list, duplicated and concatenated.
>>
>> e.g.
>>
>> probability
>> ---
>> [3.0951e-43, 1.695e-37, 2.7641e-32, 2.8028e-27, 1.9887e-22, 1.0165e-17,
>> 3.7058e-13, 9.2127e-09, 0.000141, 0.999859,
>>  3.0951e-43, 1.695e-37, 2.7641e-32, 2.8028e-27, 1.9887e-22, 1.0165e-17,
>> 3.7058e-13, 9.2127e-09, 0.000141, 0.999859]
>>
>> The code that writes to Cassandra uses "INSERT" statements and validates
>> that "probability" lists must always approximately sum to 1.0, so it does
>> not seem possible that the python code that writes to Cassandra has a bug
>> which is generating this data. The code may occasionally write to the same
>> row multiple times.
>>
>> It appears that there may be a bug in either Cassandra or the python
>> driver package which results in this list column being written to and
>> appended to with the same data.
>>
>> Similar invalid data was also generated by a PySpark data migration
>> script (using the DataStax spark Cassandra connector) that copied this list
>> data to a new table.
>>
>> Here are the versions of libraries we are using:
>>
>> Cassandra version 3.6
>> Spark version 1.6.0-hadoop2.6
>> Python Cassandra driver 3.7.1
>> (https://github.com/datastax/python-driver)
>>
>> Any help/insight into this problem would be greatly appreciated.
>>
>> Regards,
>>
>> Nathan
>>
>
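
For what it's worth, two list-write shapes are worth ruling out on the client
side; this is an assumption about possible causes, not a confirmed diagnosis of
the behaviour above. A sketch using the placeholder names from the schema
quoted above (contact point and keyspace are assumptions too):

```
from datetime import datetime
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])          # assumed contact point
session = cluster.connect('my_keyspace')  # assumed keyspace

key = ('2269202-onstreet_high', 2017, datetime(2017, 7, 18, 3, 15))
probs = [0.000141, 0.999859]

# 1) Full overwrite: replaces the whole list. Internally this writes a
#    covering tombstone at (write timestamp - 1) plus the new elements, so two
#    overwrites of the same row that land with the *same* write timestamp can
#    leave both sets of elements live, i.e. a concatenated list.
overwrite = session.prepare(
    "INSERT INTO table (pk1, pk2, time, probability) VALUES (?, ?, ?, ?)")
session.execute(overwrite, key + (probs,))

# 2) Append: not idempotent. If an application-level retry re-runs this after
#    a timeout on a write that actually succeeded, the elements are appended a
#    second time.
append = session.prepare(
    "UPDATE table SET probability = probability + ? "
    "WHERE pk1 = ? AND pk2 = ? AND time = ?")
session.execute(append, (probs,) + key)  # run after the INSERT above, this
                                         # yields exactly the doubled list
```

If either of those can happen, declaring the column as a frozen list (a single
cell that is overwritten atomically) or forcing distinct write timestamps
side-steps the element-level merging, at the cost of a schema change in the
frozen case.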


Removing Columns from production table

2017-08-18 Thread Jonathan Baynes
Hi

Is there anything I need to do after dropping a column and adding a new column
to make Cassandra flush the changes? We are experiencing issues with our front-end
application, and the developers are asking whether the issue was caused by the
schema change. Since I've done the drop and add of a column, have I missed
something? Do I need to make any further changes to the system after schema
changes?

Using Cassandra 3.0.11

Thanks
J







Re: Getting all unique keys

2017-08-18 Thread Sruti S
hi:

Is this sensor data, hence the timestamp? How are you generating this 'key'
field? Can you have only the 'key' field as the primary key? Even if not, the
fact that the field is part of the PK may make such queries fast.

However, are there other attributes that can be added to define unique
business keys? If so, you can make those attributes primary keys so that an
index is used.
It would help to understand why the list of distinct keys is needed, and at
this constant query rate.

Do you need to ensure the same key is not reused? If so, make your key
field a PK or add a uniqueness constraint, so duplicate inserts will fail.

If you *must* query constantly, you can also consider using a secondary index
to help speed things up.
HTH.
Looking forward to further clarification.


Friday, August 18, 2017, Avi Levi  wrote:

> Hi
>
> what is the most efficient way to get a distinct key list from a big table
> (approx 20 million inserts per minute)?
>
> equivalent to *select distinct key from my_table* for this table
>
> *CREATE TABLE my_table (*
>
> *key text,*
>
> *timestamp bigint,*
>
> *value double,*
>
> *PRIMARY KEY (key, timestamp) )*
>
> I need to execute this query quite often (every couple of minutes).
>
> I can of course maintain a table to hold only the unique set of keys, but this
> is of course error prone, so I would rather avoid it; it is an option, though.
>
> Cheers
>
> Avi
>


Getting all unique keys

2017-08-18 Thread Avi Levi
Hi

what is the most efficient way to get a distinct key list from a big table
(approx 20 million inserts per minute)?

equivalent to *select distinct key from my_table* for this table

*CREATE TABLE my_table (*

*key text,*

*timestamp bigint,*

*value double,*

*PRIMARY KEY (key, timestamp) )*

I need to execute this query quite often (every couple of minutes).

I can of course maintain a table to hold only the unique set of keys, but this
is of course error prone, so I would rather avoid it; it is an option, though.

Cheers

Avi
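
For reference, a sketch of running that DISTINCT query from the DataStax Python
driver with paging (kurt's reply earlier in this digest explains why this scan
over every partition gets expensive); the contact point and keyspace are
assumptions:

```
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])          # assumed contact point
session = cluster.connect('my_keyspace')  # assumed keyspace

# DISTINCT only works on the partition key columns, which "key" is here.
# fetch_size keeps each page small; the driver pages transparently while
# the result set is iterated.
stmt = SimpleStatement("SELECT DISTINCT key FROM my_table", fetch_size=1000)
unique_keys = [row.key for row in session.execute(stmt)]
print(len(unique_keys))
```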


RE: Adding a new node with the double of disk space

2017-08-18 Thread Durity, Sean R
I am doing some on-the-job-learning on this newer feature of the 3.x line, 
where the token generation algorithm will compensate for different size nodes 
in a cluster. In fact, it is one of the main reasons I upgraded to 3.0.13, 
because I have a number of original nodes in a cluster that are about half the 
size of the newer nodes. With the same number of vnodes, they can get 
overwhelmed with too much data and have to be rebuilt, etc.

So, I am cutting vnodes in half on those original nodes and rebuilding them. So 
far, it is working as designed. The data size is about half on the smaller 
nodes.

With the more current advice being to use fewer vnodes, for the original
question below, I might consider adding the new node in at 256 vnodes and then 
rebuilding all the other nodes at 128. Of course the cluster size and amount of 
data would be important factors, as well as the future growth of the cluster 
and the expected size of any additional nodes.


Sean Durity

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Thursday, August 17, 2017 4:20 PM
To: cassandra 
Subject: Re: Adding a new node with the double of disk space

If you really double the hardware in every way, it's PROBABLY reasonable to 
double num_tokens. It won't be quite the same as doubling all-the-things, 
because you still have a single JVM, and you'll still have to deal with GC as 
you're now reading twice as much and generating twice as much garbage, but you 
can probably adjust the tuning of the heap to compensate.



On Thu, Aug 17, 2017 at 1:00 PM, Kevin O'Connor 
> wrote:
Are you saying if a node had double the hardware capacity in every way it would 
be a bad idea to up num_tokens? I thought that was the whole idea of that 
setting though?

On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo 
> wrote:
No.

If you would double all the hardware on that node vs the others would still be 
a bad idea.
Keep the cluster uniform vnodes wise.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: 
linkedin.com/in/carlosjuzarterolo

Mobile: +351 918 918 100
www.pythian.com

On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha 
> wrote:
Hi all,

I need to add a new node to my cluster, but this time the new node will
have double the disk space compared to the other nodes.

I'm using the default vnodes (num_tokens: 256). To fully use the disk
space in the new node I just have to configure num_tokens: 512?

Thanks in advance.



-
To unsubscribe, e-mail: 
user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: 
user-h...@cassandra.apache.org



--









Re: Adding a new node with the double of disk space

2017-08-18 Thread Carlos Rolo
I would rather spin up 2 JVMs on the same hardware (if you double
everything) than have to deal with what Jeff stated.

Also, certain operations are not really fond of a large number of vnodes
(e.g. repair). There were a lot of improvements in the 3.x release cycle, but
I do still tend to reduce the vnode count rather than increase it.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
*linkedin.com/in/carlosjuzarterolo
*
Mobile: +351 918 918 100
www.pythian.com

On Thu, Aug 17, 2017 at 9:19 PM, Jeff Jirsa  wrote:

> If you really double the hardware in every way, it's PROBABLY reasonable
> to double num_tokens. It won't be quite the same as doubling
> all-the-things, because you still have a single JVM, and you'll still have
> to deal with GC as you're now reading twice as much and generating twice as
> much garbage, but you can probably adjust the tuning of the heap to
> compensate.
>
>
>
> On Thu, Aug 17, 2017 at 1:00 PM, Kevin O'Connor 
> wrote:
>
>> Are you saying if a node had double the hardware capacity in every way it
>> would be a bad idea to up num_tokens? I thought that was the whole idea of
>> that setting though?
>>
>> On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo  wrote:
>>
>>> No.
>>>
>>> If you would double all the hardware on that node vs the others would
>>> still be a bad idea.
>>> Keep the cluster uniform vnodes wise.
>>>
>>> Regards,
>>>
>>> Carlos Juzarte Rolo
>>> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>>>
>>> Pythian - Love your data
>>>
>>> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
>>> *linkedin.com/in/carlosjuzarterolo
>>> *
>>> Mobile: +351 918 918 100 <+351%20918%20918%20100>
>>> www.pythian.com
>>>
>>> On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha <
>>> cogumelosmaravi...@sapo.pt> wrote:
>>>
 Hi all,

 I need to add a new node to my cluster, but this time the new node will
 have double the disk space compared to the other nodes.

 I'm using the default vnodes (num_tokens: 256). To fully use the disk
 space in the new node I just have to configure num_tokens: 512?

 Thanks in advance.



 -
 To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
 For additional commands, e-mail: user-h...@cassandra.apache.org


>>>
>>> --
>>>
>>>
>>>
>>>
>>
>
