If you're experiencing high I/O load and not getting any Java OutOfMemory
(OOM) errors, you should keep your heap size as low as possible; this
leaves the OS filesystem cache more memory, which will reduce read I/O
load significantly. I'm not familiar with the performance of Windows
I am having timeout errors while reading.
I have 5 CFs, but two CFs with high write/read load.
The data is organized in time-series rows; in CF1 the new rows are read
every 10 seconds and then the whole rows are deleted, while in CF2 the rows
are read in different time-range slices and eventually
If you're bottlenecking on read I/O, making proper use of Cassandra's key
cache and row cache will improve things dramatically.
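For illustration (not from the thread), enabling those caches on a 0.7-era column family via cassandra-cli might look like the following; the CF name and cache sizes are hypothetical, and on 0.6 the same knobs are the KeysCached/RowsCached attributes in storage-conf.xml:

```
update column family CF2 with keys_cached=200000 and rows_cached=10000;
```

The key cache is cheap (it only caches index positions), so it is usually safe to size generously; the row cache stores whole rows on-heap, so it only pays off for small, hot rows.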
A little maths using the numbers you've provided tells me that you have
about 80GB of hot data (data valid in a 4-hour period). That's obviously
too much to directly
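That ~80GB figure can be checked from the numbers quoted elsewhere in this thread (about 1800 columns/sec per node at 3KB each, deleted after 4 hours):

```python
# Back-of-the-envelope check of the "~80GB of hot data" estimate,
# using the throughput figures quoted in this thread.
cols_per_sec = 1800                # writes per second per node
col_size_bytes = 3 * 1024          # 3KB columns
retention_sec = 4 * 60 * 60        # rows deleted after 4 hours

hot_bytes = cols_per_sec * col_size_bytes * retention_sec
hot_gb = hot_bytes / (1024 ** 3)
print(round(hot_gb, 1))  # ~74.2 GiB, i.e. roughly 80GB before any overhead
```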
It's a little bit different from what most people use it for, and that's
why we are trying to test it: to see if we can benefit from the speed of
writing/reading, the scalability when and if we need it, and also the cost.
Part of the testing we are doing is trying to see how many nodes do we
Thanks for the advice...
We are running on Windows, and I just added more memory to my system, to
16GB. I will run the test again with an 8GB heap.
The load is continuous; however, CPU usage is around 40%, with a max of 70%.
As for caching, I am not using the cache, because I am under the impression
that
SSDs are not reliable after a (relatively low, compared to spinning
disk) number of writes.
They may significantly boost performance if used for the journal
storage, but will suffer short lifetimes under highly random write
patterns.
In general, plan to replace them frequently. Whether they are worth
SSDs will not generally improve your write performance very much, but they
can significantly improve read performance.
You do *not* want to waste an SSD on the commitlog drive, as even a slow HDD
can write sequentially very quickly. For the data drive, they might make
sense.
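A hedged sketch of the split described above, putting the sequential-write commitlog on a cheap HDD and the randomly-read data files on the SSD. The directory paths are illustrative; the keys match cassandra.yaml (0.7-era):

```yaml
# Hypothetical layout: commitlog on HDD (sequential appends are fast
# even on spinning disk), data files on SSD (random reads benefit most).
commitlog_directory: /mnt/hdd/cassandra/commitlog
data_file_directories:
    - /mnt/ssd/cassandra/data
```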
As Jonathan talks
Thanks for the reply.
Ah, point taken on the random-access SSD performance. I was trying to
emphasize the relative failure rates given the two scenarios. I didn't
mean to imply that SSD random-access performance was not a likely
improvement here, just that it is a complicated trade-off in the
grand scheme of things.
How high is high, and how much data do you have (Cassandra disk usage)?
Regards,
Terje
On 4 Nov 2010, at 04:32, Alaa Zubaidi alaa.zuba...@pdf.com wrote:
Hi,
we have a continuous high throughput writes, read and delete, and we are
trying to find the best hardware.
Is using SSD for Cassandra
around 1800 col/sec per node, 3KB columns; reading is the same.
Data will be deleted after 4 hours.
On 11/3/2010 5:00 PM, Terje Marthinussen wrote:
Some comments inline...
On Wed, Nov 3, 2010 at 1:44 PM, Jonathan Shook jsh...@gmail.com wrote: