Re: 0.7 release

2010-10-15 Thread Juho Mäkinen
probably not going to upgrade to 0.7 until it has been stable for at least a month or two. - Juho Mäkinen On Thu, Oct 14, 2010 at 11:07 PM, Chris Oei chris@nestria.com wrote: Hi all, I'm trying to figure out whether I should migrate from 0.6.5 to 0.6.6 or go directly to 0.7 when it's

Re: 10G Ethernet / Infiniband

2010-10-27 Thread Juho Mäkinen
The disks in cassandra node will most probably be your bottleneck. I'd suggest (haven't tried, this is just based on my intuition) to invest in SSD disks first and only after that think about going 10Gbps. - Garo On Tue, Oct 26, 2010 at 10:11 PM, Wayne wav...@gmail.com wrote: Is anyone out

Re: cassandra data spreading across the cluster

2010-11-04 Thread Juho Mäkinen
The load contains duplicate data which is created due to compaction. Run the 'cleanup' command with nodetool on those big nodes and you should see the load drop to the actual usage. - Garo On Thu, Nov 4, 2010 at 11:08 AM, Mark Zitnik mark.zit...@gmail.com wrote: Hi All, I'm having a problem in

TSocket: Could not write 85 bytes -exception on some get_column calls

2010-07-06 Thread Juho Mäkinen
,]) But the requests which fail with the "TSocket: Could not write 85 bytes" exception don't print anything to the logs. I'm doing the same request all the time with the same parameters. How should I debug this issue? All ideas and tips are greatly appreciated. - Juho Mäkinen

Re: TSocket: Could not write 85 bytes -exception on some get_column calls

2010-07-06 Thread Juho Mäkinen
started at version 0.6.1, but updating to 0.6.3 did not fix the problem. - Juho Mäkinen On Tue, Jul 6, 2010 at 12:05 PM, Pieter Maes maesc...@gmail.com wrote: I'm having a similar problem, but I get very random read problems, 4 bytes timeouts. (mailed before about this) The only fix I got/found

Re: Digg 4 Preview on TWiT

2010-07-07 Thread Juho Mäkinen
doing this, or have you switched to store the key-value -pairs in cassandra instead of mysql? What else are you storing in cassandra than just the inbox search? - Juho Mäkinen On Tue, Jul 6, 2010 at 10:01 PM, Prashant Malik pma...@gmail.com wrote: This is a ridiculous statement by some newbie I

Re: a_long_is_exactly_8_bytes

2010-07-07 Thread Juho Mäkinen
on 32bit environment. - Juho Mäkinen On Wed, Jul 7, 2010 at 2:08 PM, john xie shanfengg...@gmail.com wrote: http://wiki.apache.org/cassandra/FAQ#a_long_is_exactly_8_bytes /** * Takes php integer and packs it to 64bit (8 bytes) long big endian binary representation
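
The FAQ entry quoted above is about packing an integer into an 8-byte big-endian long. As a rough illustration only (this is Python, not the PHP code from the FAQ, and the function names are made up for this sketch), the same byte layout can be produced either directly or from two 32-bit halves, which is the approach a 32-bit environment is forced into:

```python
import struct


def pack_long(value: int) -> bytes:
    """Pack a signed integer into 8 bytes, big-endian."""
    return struct.pack(">q", value)


def pack_long_halves(value: int) -> bytes:
    """Build the same 8 bytes from two unsigned 32-bit halves.

    Hypothetical helper: it mimics what has to be done when the
    platform only has 32-bit integers available.
    """
    high = (value >> 32) & 0xFFFFFFFF  # upper 32 bits
    low = value & 0xFFFFFFFF           # lower 32 bits
    return struct.pack(">II", high, low)


def unpack_long(data: bytes) -> int:
    """Reverse of pack_long: 8 big-endian bytes back to a signed integer."""
    return struct.unpack(">q", data)[0]
```

Both packing functions should give identical bytes for any value that fits in a signed 64-bit integer, so either can be used to check the other.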

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread Juho Mäkinen
yet, any tips on that? Could you, David, send me the stress.py command line which you used? - Juho Mäkinen On Mon, Jul 19, 2010 at 10:51 PM, David Schoonover david.schoono...@gmail.com wrote: Sorry, mixed signals in my response. I was partially replying to suggestions that we were limited

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-20 Thread Juho Mäkinen
time to run the test with three servers, but I'll do it later anyway to see what kind of results it will produce. Also doing the test with RF=2 should confirm that we can increase the cluster throughput by increasing the RF count even if the requests don't hit the disks. - Juho Mäkinen On Tue

Re: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-27 Thread Juho Mäkinen
Off topic, but what was this tool which prints per cpu utilization? - Garo On Mon, Jul 26, 2010 at 10:22 PM, Dathan Pattishall datha...@gmail.com wrote: But the 16 cores are hardly utilized. Which indicates to me there is some bad thread thrashing, but why?   1  [|   

Re: Thrift + PHP: help!

2010-08-23 Thread Juho Mäkinen
I have had to build a wrapper around php thrift calls which automatically retries the cassandra thrift operation in case there was a failure. It's not a proper solution, but it has worked in our case well enough to be reliable. Of course it would be nice if I didn't need such an ugly hack. - Garo
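
A retry wrapper of the kind described here could look roughly like the following. This is a minimal Python sketch, not the actual PHP wrapper; `call_with_retries`, its parameters and the default of three attempts are assumptions made for illustration.

```python
import time


def call_with_retries(operation, max_attempts=3, delay_seconds=0.0):
    """Run operation() and retry it when it raises.

    operation is any zero-argument callable (hypothetical stand-in for a
    Thrift call). The last error is re-raised once all attempts are used.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:  # a real wrapper would catch narrower types
            last_error = error
            if attempt < max_attempts and delay_seconds > 0:
                time.sleep(delay_seconds)
    raise last_error
```

Catching every exception keeps the sketch short; in practice only transport-level failures are worth retrying, since retrying a request that is itself invalid just repeats the error.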

Re: Poor performance; PHP & Thrift to blame

2010-08-23 Thread Juho Mäkinen
Beware that the native thrift php bindings have a bug which might change provided argument types. Check out the bug report which I filed: https://issues.apache.org/jira/browse/THRIFT-796 - Garo On Fri, Aug 20, 2010 at 10:35 AM, sasha sasha2...@gmail.com wrote: Julian Simon jsimon at

get_slice sometimes returns previous result on php

2010-08-30 Thread Juho Mäkinen
what's really going on, but I'd be very happy if someone could have any clue or helpful ideas how to debug this out. - Juho Mäkinen

Re: Thrift + PHP: help!

2010-08-30 Thread Juho Mäkinen
the list back to thrift connection function. In case all nodes have been tried (and thus removed) it refills the node list and starts looping it again. In practice this will never happen but the code is there just to be sure :) - Juho Mäkinen If one node is failing (let's assume it's overloaded
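
The node selection described in this message (take a node from the list, remove it once tried, refill the list when every node has been tried) might be sketched like this. It is a Python illustration under assumptions, not the original PHP; the class name `NodeList` and its method are hypothetical.

```python
class NodeList:
    """Hands out nodes one at a time and refills when all have been tried."""

    def __init__(self, nodes):
        self._all_nodes = list(nodes)
        if not self._all_nodes:
            raise ValueError("at least one node is required")
        self._untried = list(self._all_nodes)

    def next_node(self):
        """Return the next untried node, refilling the list if it is empty."""
        if not self._untried:
            # Every node has been tried (and removed): start over.
            self._untried = list(self._all_nodes)
        return self._untried.pop(0)
```

A connection function would call `next_node()` each time a connection attempt fails, so one failing node does not block the request as long as another node is reachable.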

Re: NodeTool won't connect remotely

2010-08-30 Thread Juho Mäkinen
I think that JMX needs additional ports to function correctly. Try to disable all firewalls between the client and the server so that client can connect to any port in the server and try again. - Juho Mäkinen On Mon, Aug 30, 2010 at 7:07 PM, Allan Carroll alla...@gmail.com wrote: Hi, I'm

Re: get_slice sometimes returns previous result on php

2010-08-30 Thread Juho Mäkinen
happens in the between. Tomorrow I'm going to implement a history buffer which logs all cassandra operations within the php request and logs it out in case I detect this anomaly again. Hopefully that gives some light to the problem. - Juho Mäkinen On Mon, Aug 30, 2010 at 10:50 PM, Benjamin Black

Re: get_slice sometimes returns previous result on php

2010-08-31 Thread Juho Mäkinen
:) - Juho Mäkinen On Mon, Aug 30, 2010 at 11:15 PM, Juho Mäkinen juho.maki...@gmail.com wrote: I'm not using connection pooling where the same tcp socket is used between different php requests. I open a new thrift connection with a new socket to the node and I use the node through the request and I

Re: IndexingLocking in Cassandra

2010-09-16 Thread Juho Mäkinen
capabilities. You can always use some external locking mechanism like zookeeper [http://hadoop.apache.org/zookeeper/] or implement your own solution on top of cassandra (not recommended as it's quite hard to get it right). - Juho Mäkinen / Garo

Re: Schema question

2010-09-20 Thread Juho Mäkinen
and insert it back to TalkLastMessages There are also other operations and the actual payload is a bit more complex. I'm happy to answer questions if somebody is interested :) - Juho Mäkinen On Mon, Sep 20, 2010 at 12:57 PM, Morten Wegelbye Nissen m...@monit.dk wrote:  Hello List, No matter where

Cassandra operation success ratio survey results

2010-09-21 Thread Juho Mäkinen
preparing to try up to ten times is not a bad idea. The cluster uses 0.6.5 with RF=3. Each operation is executed until it succeeds or until 10 retries using this php wrapper http://github.com/dynamoid/cassandra-utilities Have others found similar results? Please discuss :) - Juho Mäkinen

Re: Schema question

2010-09-21 Thread Juho Mäkinen
formatted json text in a column, a supercolumn could have served you? Yes but that wouldn't benefit us so I just chose to use a simple CF with JSON as column payload as they're easier to handle. Also check my post from today, "Cassandra operation success ratio survey results". - Juho Mäkinen ./Morten

Re: Cassandra operation success ratio survey results

2010-09-21 Thread Juho Mäkinen
by request type (get_slice takes longer than get_column) - Juho Mäkinen On Tue, Sep 21, 2010 at 5:56 PM, Morten Wegelbye Nissen m...@monit.dk wrote:  On 21-09-2010 15:29, Juho Mäkinen wrote: It's known that compaction hurts the node performance so that it might miss some requests. That's why

Re: first step with Cassandra

2010-10-04 Thread Juho Mäkinen
I posted a real life example how we used cassandra to store data for a facebook chat like application. Check it out at http://www.juhonkoti.net/2010/09/25/example-how-to-model-your-data-into-nosql-with-cassandra - Juho Mäkinen On Mon, Oct 4, 2010 at 7:04 PM, Petr Odut petr.o...@gmail.com wrote

Question on how to run incremental repairs

2014-10-22 Thread Juho Mäkinen
I'm having problems understanding how incremental repairs are supposed to be run. If I try to do nodetool repair -inc cassandra will complain that It is not possible to mix sequential repair and incremental repairs. However it seems that running nodetool repair -inc -par does the job, but I

Re: Multi Datacenter / MultiRegion on AWS Best practice ?

2014-10-27 Thread Juho Mäkinen
Hi! 2014-10-23 11:16 GMT+02:00 Alain RODRIGUEZ arodr...@gmail.com: We are currently wondering about the best way to configure network architecture to have a Cassandra cluster multi DC. On solution 2, we would need to open IPs one by one on 3 ports (7000, 9042, 9160) at least. 100 entries

Re: bulk data load

2014-10-30 Thread Juho Mäkinen
You should split your batch statements into smaller batches, say 100 operations per batch (or less if you keep getting those errors). You can also grow the batch_size_warn_threshold_in_kb in your cassandra.yaml a bit, I'm using 20kb in my cluster. You can read more from the relevant Jira:
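
Splitting a large set of operations into batches of at most 100 could be done along these lines. This is a plain Python sketch that only shows the chunking; the `chunked` helper is hypothetical, and sending each chunk as a batch statement is left to whatever driver is in use.

```python
def chunked(operations, batch_size=100):
    """Yield consecutive slices of operations, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(operations), batch_size):
        yield operations[start:start + batch_size]
```

If the batch size warnings continue, lowering `batch_size` is the knob to turn on the client side before touching the server configuration.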

Did not get positive replies from all endpoints error on incremental repair

2014-10-30 Thread Juho Mäkinen
I'm having problems running nodetool repair -inc -par -pr on my 2.1.1 cluster due to Did not get positive replies from all endpoints error. Here's an example output: root@db08-3:~# nodetool repair -par -inc -pr [2014-10-30 10:33:02,396] Nothing to repair for keyspace 'system' [2014-10-30

Re: Did not get positive replies from all endpoints error on incremental repair

2014-10-30 Thread Juho Mäkinen
...@rahul.be wrote: It appears to come from the ActiveRepairService.prepareForRepair portion of the Code. Are you sure all nodes are reachable from the node you are initiating repair on, at the same time? Any Node up/down/died messages? Rahul Neelakantan On Oct 30, 2014, at 6:37 AM, Juho Mäkinen

Re: Did not get positive replies from all endpoints error on incremental repair

2014-10-31 Thread Juho Mäkinen
a different code path which would explain this. I can't yet call this conclusive, but it seems that I can't run incremental repairs on the current 2.1.1 and I'm still wondering if anybody else is experiencing the same problem. On Thu, Oct 30, 2014 at 1:14 PM, Juho Mäkinen juho.maki...@gmail.com wrote

Re: Re[2]: Redundancy inside a cassandra node

2014-11-08 Thread Juho Mäkinen
I have used Supermicro servers in my previous work and they give excellent quality for their money. They have been considered a bit cheap quality wise in the past, but the current models are pretty good. They offer all standard stuff like remote control cards (IPMI), dual power supplies (if you

Re: My cluster shows high system load without any apparent reason

2016-07-22 Thread Juho Mäkinen
> > -Mark > > On Fri, Jul 22, 2016 at 8:10 AM, Juho Mäkinen <juho.maki...@gmail.com> > wrote: > > After a few days I've also tried disabling Linux kernel huge pages > > defragmentation (echo never > /sys/kernel/mm/transparent_hugepage/defrag) and > > turning coale

Re: Open source equivalents of OpsCenter

2016-07-14 Thread Juho Mäkinen
I'm doing some work on replacing OpsCenter in our setup. I ended up creating a Docker container which contains the following features: - Cassandra 2.2.7 - MX4J (a JMX to REST bridge) as a java-agent - metrics-graphite-3.1.0.jar (export some but not all JMX to graphite) - a custom ruby which uses

Re: Ring connection timeouts with 2.2.6

2016-07-20 Thread Juho Mäkinen
Just to pick this up: Did you see any system load spikes? I'm tracing a problem on 2.2.7 where my cluster sees load spikes up to 20-30, when the normal average load is around 3-4. So far I haven't found any good reason, but I'm going to try otc_coalescing_strategy: disabled tomorrow. - Garo On

My cluster shows high system load without any apparent reason

2016-07-20 Thread Juho Mäkinen
I just recently upgraded our cluster to 2.2.7 and after turning the cluster under production load the instances started to show high load (as shown by uptime) without any apparent reason and I'm not quite sure what could be causing it. We are running on i2.4xlarge, so we have 16 cores, 120GB of

Questions on LCS behaviour after big BulkLoad cluster bootstrap

2016-07-06 Thread Juho Mäkinen
Hello. I'm in the process of migrating my old 60 node cluster into a new 72 node cluster running 2.2.6. I fired BulkLoader on the old cluster to stream all data from every node in the old cluster to my new cluster, and I'm now watching as my new cluster is doing compactions. What I like is to

Re: My cluster shows high system load without any apparent reason

2016-07-23 Thread Juho Mäkinen
og is very latency > sensitive, even under low load. Do be sure you're using the deadline > or noop scheduler for that reason, too. > > -Mark > > On Fri, Jul 22, 2016 at 4:44 PM, Juho Mäkinen <juho.maki...@gmail.com> > wrote: > >> Are you using XFS or Ext4 for data?

Re: My cluster shows high system load without any apparent reason

2016-07-22 Thread Juho Mäkinen
any major change except the system/kernel CPU usage. All further ideas how to debug this are greatly appreciated. On Wed, Jul 20, 2016 at 7:13 PM, Juho Mäkinen <juho.maki...@gmail.com> wrote: > I just recently upgraded our cluster to 2.2.7 and after turning the > cluster under pr