Re: [RELEASE] Apache Cassandra 4.0.8 released

2023-03-09 Thread Brandon Williams
It was reported in CASSANDRA-18307 that the Debian and RedHat packages
for 4.0.8 did not make it to the JFrog repository; this has now been
corrected. Sorry for any inconvenience.

Kind Regards,
Brandon

On Tue, Feb 14, 2023 at 3:39 PM Miklosovic, Stefan
 wrote:
>
> The Cassandra team is pleased to announce the release of Apache Cassandra 
> version 4.0.8.
>
> Apache Cassandra is a fully distributed database. It is the right choice when 
> you need scalability and high availability without compromising performance.
>
>  http://cassandra.apache.org/
>
> Downloads of source and binary distributions are listed in our download 
> section:
>
>  http://cassandra.apache.org/download/
>
> This version is a bug fix release[1] on the 4.0 series. As always, please pay 
> attention to the release notes[2] and let us know[3] if you encounter 
> any problems.
>
> [WARNING] Debian and RedHat package repositories have moved! Debian 
> /etc/apt/sources.list.d/cassandra.sources.list and RedHat 
> /etc/yum.repos.d/cassandra.repo files must be updated to the new repository 
> URLs. For Debian it is now https://debian.cassandra.apache.org . For RedHat 
> it is now https://redhat.cassandra.apache.org/40x/ .
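>
> For illustration, the updated files would look roughly like this (the exact
> lines are on the download page; treat this as a sketch, not the canonical
> contents):
>
>   # /etc/apt/sources.list.d/cassandra.sources.list
>   deb https://debian.cassandra.apache.org 40x main
>
>   # /etc/yum.repos.d/cassandra.repo
>   [cassandra]
>   name=Apache Cassandra
>   baseurl=https://redhat.cassandra.apache.org/40x/
>   gpgcheck=1
>   gpgkey=https://downloads.apache.org/cassandra/KEYS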
>
> Enjoy!
>
> [1]: CHANGES.txt 
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-4.0.8
> [2]: NEWS.txt 
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-4.0.8
> [3]: https://issues.apache.org/jira/browse/CASSANDRA


Re: [RELEASE] Apache Cassandra 4.0.2 released

2022-02-11 Thread Brandon Williams
Agreed, I've opened CASSANDRA-17376 to handle this.

On Fri, Feb 11, 2022 at 4:44 PM Jeff Jirsa  wrote:
>
> We don't HAVE TO remove the Config.java entry - we can mark it as deprecated 
> and ignored and remove it in a future version (and you could update 
> Config.java to log a message about having a deprecated config option). It's a 
> much better operator experience: log for a major version, then remove in the 
> next.
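>
> A minimal sketch of that idea (illustrative only, not the actual patch;
> Config.java exposes the option as a plain field):
>
>   // After parsing cassandra.yaml: warn once, then ignore the old key.
>   if (conf.otc_coalescing_strategy != null)
>   {
>       logger.warn("otc_coalescing_strategy is deprecated and ignored; " +
>                   "it will be removed in a future major version");
>       conf.otc_coalescing_strategy = null; // treat as unset from here on
>   }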
>
> On Fri, Feb 11, 2022 at 2:41 PM Ekaterina Dimitrova  
> wrote:
>>
>> This had to be removed in 4.0 but it wasn’t. The patch mentioned did it to
>> fix a bug that gave the impression those options still work. Confirmed with
>> Benedict on the ticket.
>>
>> I agree I absolutely should have documented it better. A ticket for
>> documentation was opened, but it slipped my mind with this emergency release
>> this week. It is unfortunate that it is still in our backlog after the ADOC
>> migration.
>>
>> Note taken. I truly apologize and I am going to prioritize CASSANDRA-17135. 
>> Let me know if there is anything else I can/should do at this point.
>>
>> On Fri, 11 Feb 2022 at 17:26, Erick Ramirez  
>> wrote:
>>>
>>> (moved dev@ to BCC)
>>>

 It looks like the otc_coalescing_strategy config key is no longer 
 supported in cassandra.yaml in 4.0.2, despite this not being mentioned 
 anywhere in CHANGES.txt or NEWS.txt.
>>>
>>>
>>> James, you're right -- it was removed by CASSANDRA-17132 in 4.0.2 and 4.1.
>>>
>>> I agree that the CHANGES.txt entry should be clearer and we'll improve it 
>>> plus add detailed info in NEWS.txt. I'll get this done soon in 
>>> CASSANDRA-17135. Thanks for the feedback. Cheers!


Re: Log4j vulnerability

2021-12-11 Thread Brandon Williams
https://issues.apache.org/jira/browse/CASSANDRA-5883

As that ticket shows, Apache Cassandra has never used log4j2.
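
If you want to double-check a particular install, listing the bundled jars is
enough (note that some versions ship log4j-over-slf4j, which is only an API
bridge and does not contain the vulnerable log4j2 code):

  # run from the Cassandra install directory
  ls lib/ | grep -i log4j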

On Sat, Dec 11, 2021 at 11:07 AM Abdul Patel  wrote:
>
> Hi all,
>
> Any idea if any of the open source Cassandra versions are impacted by the log4j 
> vulnerability that was reported on Dec 9th?


[RELEASE] Apache Cassandra 3.11.11 released

2021-07-28 Thread Brandon Williams
The Cassandra team is pleased to announce the release of Apache
Cassandra version 3.11.11.

Apache Cassandra is a fully distributed database. It is the right
choice when you need scalability and high availability without
compromising performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 3.11 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!

[1]: CHANGES.txt
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-3.11.11
[2]: NEWS.txt 
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-3.11.11
[3]: https://issues.apache.org/jira/browse/CASSANDRA


[RELEASE] Apache Cassandra 3.0.25 released

2021-07-28 Thread Brandon Williams
The Cassandra team is pleased to announce the release of Apache
Cassandra version 3.0.25.

Apache Cassandra is a fully distributed database. It is the right
choice when you need scalability and high availability without
compromising performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 3.0 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!

[1]: CHANGES.txt
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-3.0.25
[2]: NEWS.txt 
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-3.0.25
[3]: https://issues.apache.org/jira/browse/CASSANDRA


[RELEASE] Apache Cassandra 4.0.0 released

2021-07-26 Thread Brandon Williams
The Cassandra team is pleased to announce the release of Apache
Cassandra version 4.0.0.

Apache Cassandra is a fully distributed database. It is the right
choice when you need scalability and high availability without
compromising performance.

http://cassandra.apache.org/

Downloads of source and binary distributions are available in our
download section:

http://cassandra.apache.org/download/

This version is the initial release in the 4.0 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!

[1]: CHANGES.txt
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-4.0.0
[2]: NEWS.txt 
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-4.0.0
[3]: https://issues.apache.org/jira/browse/CASSANDRA


CVE-2016-3427 Apache Cassandra Unspecified vulnerability related to JMX

2020-08-31 Thread Brandon Williams
Versions Affected:
All versions prior to: 2.1.22, 2.2.18, 3.0.22, 3.11.8 and 4.0-beta2

Description:
Unspecified vulnerability in Oracle Java SE 6u113, 7u99, and 8u77;
Java SE Embedded 8u77; and JRockit R28.3.9 allows remote attackers to
affect confidentiality, integrity, and availability via vectors
related to JMX. By default, Cassandra only binds JMX locally.

Mitigation:
2.1.x users should upgrade to 2.1.22
2.2.x users should upgrade to 2.2.18
3.0.x users should upgrade to 3.0.22
3.11.x users should upgrade to 3.11.8
4.0-beta1 users should upgrade to 4.0-beta2

Alternatively, users can upgrade their JVM to versions after those in
the description.
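
For reference, the local-only JMX default is controlled in
conf/cassandra-env.sh; a sketch of the relevant settings (verify the exact
names against your version's file):

  # conf/cassandra-env.sh
  LOCAL_JMX=yes     # default: JMX is reachable from localhost only
  JMX_PORT="7199"
  # With LOCAL_JMX=yes the JVM gets -Dcassandra.jmx.local.port=$JMX_PORT
  # instead of the com.sun.management.jmxremote.* remote options.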

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Brandon Williams
The only progress from this point is what Jon said: enumerate and detail
your issues in jira tickets.

On Wed, Feb 21, 2018 at 4:53 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Hi Akash,
>
> I get the part about outside work, which is why, in replying to Jeff Jirsa, I
> was suggesting the big companies could justify taking it on easily enough and,
> you know, actually pay the people who would be working on it so those people
> could have a life.
>
> The part I don't get is the aversion to usability.  Isn't that what you
> think about when you are coding?  "Am I making this thing I'm building easy
> to use?"  If you were programming for me, we would be constantly talking
> about what we are building and how we can make things easier for users.  If
> I had to fight with a developer, architect or engineer about usability all
> the time, they would be gone, and quickly.  How do you approach programming
> if you aren't trying to make things easy?
>
> Kenneth Brotman
>
> -Original Message-
> From: Akash Gangil [mailto:akashg1...@gmail.com]
> Sent: Wednesday, February 21, 2018 2:24 PM
> To: d...@cassandra.apache.org
> Cc: user@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> I would second Jon in the arguments he made. Contributing outside work is
> draining and really requires a lot of commitment. If someone requires
> features around usability etc, just pay for it, period.
>
> On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman <
> kenbrot...@yahoo.com.invalid> wrote:
>
> > Jon,
> >
> > Very sorry that you don't see the value of the time I'm taking for this.
> > I don't have demands; I do have a stern warning, and I'm right, Jon.
> > Please be very careful not to mischaracterize my words, Jon.
> >
> > You suggest I put things in JIRAs, then seem to suggest that I'd be
> > lucky if anyone looked at them and did anything. That's what I figured too.
> >
> > I don't appreciate the hostility.  You will understand more fully in
> > the next post where I'm coming from.  Try to keep the conversation civilized.
> > I'm trying, or at least, so you understand, I think what I'm doing is
> > saving your gig and mine.  I really like a lot of people in this group.
> >
> > I've come to a preliminary assessment of things.  Soon the cloud will
> > clear or I'll be gone.  Don't worry.  I'm a very peaceful person and,
> > like you, I am driven by real, important projects that I feel compelled
> > to work on for the good of others.  I don't have time to hand-hold a
> > database, and I can't get stuck with my projects on the wrong stuff.
> >
> > Kenneth Brotman
> >
> >
> > -Original Message-
> > From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon
> > Haddad
> > Sent: Wednesday, February 21, 2018 12:44 PM
> > To: user@cassandra.apache.org
> > Cc: d...@cassandra.apache.org
> > Subject: Re: Cassandra Needs to Grow Up by Version Five!
> >
> > Ken,
> >
> > Maybe it’s not clear how open source projects work, so let me try to
> > explain.  There’s a bunch of us who either get paid by someone or
> > volunteer in our free time.  The folks that get paid (yay!) usually
> > take direction on what the priorities are, and work on projects that
> > directly affect our jobs.  That means that someone needs to care
> > enough about the features you want to work on them, if you’re not going
> to do it yourself.
> >
> > Now as others have said already, please put your list of demands in
> > JIRA, if someone is interested, they will work on it.  You may need to
> > contribute a little more than you’ve done already, be prepared to get
> > involved if you actually want to see something get done.  Perhaps
> > learning a little more about Cassandra’s internals and the people
> > involved will reveal some of the design decisions and priorities of the
> project.
> >
> > Third, you seem to be a little obsessed with market share.  While
> > market share is fun to talk about, *most* of us that are working on
> > and contributing to Cassandra do so because it does actually solve a
> > problem we have, and solves it reasonably well.  If some magic open
> > source DB appears out of nowhere and does everything you want
> > Cassandra to, and is bug free, keeps your data consistent,
> > automatically does backups, comes with really nice cert management, ad
> > hoc querying, amazing materialized views that are perfect, no caveats
> > to secondary indexes, and somehow still gives you linear scalability
> > without any mental overhead whatsoever then sure, people might start
> > using it.  And that’s actually OK, because if that happens we’ll all
> > be incredibly pumped out of our minds because we won’t have to work as
> > hard.  If on the slim chance that doesn’t manifest, those of us that
> > use Cassandra and are part of the community will keep working on the
> > things we care about, iterating, and improving things.  Maybe someone
> will even take a look at your JIRA issues.
> >
> > 

Re: Definition of QUORUM consistency level

2017-06-08 Thread Brandon Williams
I don't disagree with you there and have never liked TWO/THREE.  This is
somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338

I don't think going to CL.FOUR, etc, is a good long-term solution, but I'm
also not sure what is.


On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu  wrote:

> To me, CL.TWO and CL.THREE are more like workarounds for the problem; for
> example, they do not work if the number of replicas goes to 8, which is
> possible in our environment (2 replicas in each of 4 DCs).
>
> What people want from quorum is a strong consistency guarantee. As long as
> R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2), W=(n/2+1); c)
> R=(n/2+1), W=(n/2). What Cassandra is doing right now is option a), which
> is the most expensive one.
>
> I cannot think of a reason that people would want the quorum read other
> than for strong consistency, i.e. just to read from (n/2+1) nodes. If they
> want strong consistency, then the read only needs (n/2) nodes; we are
> purely wasting one extra request, and it hurts read latency as well.
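>
> A quick worked example with N=8: QUORUM today is floor(8/2)+1 = 5, so a
> consistent write-plus-read pair touches 5+5 = 10 replicas, an overlap of 2.
> Option b) would use W=5, R=4: W+R = 9 > 8 still guarantees at least one node
> of overlap, but saves one replica request on every read.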
>
> Thanks
> Dikang.
>
> On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall 
> wrote:
>
>>
>> We have CL.TWO.
>>>
>>>
>>>
>> This was actually the original motivation for CL.TWO and CL.THREE if
>> memory serves:
>> https://issues.apache.org/jira/browse/CASSANDRA-2013
>>
>
>
>
> --
> Dikang
>
>


Re: Definition of QUORUM consistency level

2017-06-08 Thread Brandon Williams
We have CL.TWO.

On Thu, Jun 8, 2017 at 10:03 PM, Dikang Gu  wrote:

> So, for the quorum, what we really want is one overlap between the nodes
> in the write path and the read path. It actually was my assumption for a
> long time that we need (N/2 + 1) for writes and just (N/2) for reads,
> because that's enough to provide strong consistency.
>
> On Thu, Jun 8, 2017 at 7:47 PM, Jonathan Haddad  wrote:
>
>> It would be a little weird to change the definition of QUORUM, which
>> means majority, to mean something other than majority for a single use
>> case. Sounds like you want to introduce a new CL, HALF.
>> On Thu, Jun 8, 2017 at 7:43 PM Dikang Gu  wrote:
>>
>>> Justin, what I suggest is that for the QUORUM consistency level, the block
>>> for writes should be (num_replica/2)+1, the same as today, but for read
>>> requests we just need to access (num_replica/2) nodes, which should provide
>>> strong consistency.
>>>
>>> Dikang.
>>>
>>> On Thu, Jun 8, 2017 at 7:38 PM, Justin Cameron 
>>> wrote:
>>>
 2/4 for write and 2/4 for read would not be sufficient to achieve
 strong consistency, as there is no overlap.

 In your particular case you could potentially use QUORUM for write and
 TWO for read (or vice-versa) and still achieve strong consistency. If you
 add additional nodes in the future this would obviously no longer work.
 Also the benefit of this is dubious, since 3/4 nodes still need to be
 accessible to perform writes. I'd also guess that it's unlikely to provide
 any significant performance increase.

 Justin

 On Fri, 9 Jun 2017 at 12:29 Dikang Gu  wrote:

> Hello there,
>
> We have some use cases doing consistent read/write requests, and
> we have 4 replicas in that cluster, according to our setup.
>
> What's interesting to me is that both read and write quorum
> requests are blocked for 4/2+1 = 3 replicas, so we are accessing 3
> (for writes) + 3 (for reads) = 6 replicas per quorum pair, which is 2
> replicas more than 4.
>
> I think it's not necessary to have 2 overlapping nodes in the even
> replication factor case.
>
> I suggest changing the `quorumFor(keyspace)` code to separate the cases
> for read and write requests, so that we can save one replica request in
> the read path.
>
> Any concerns?
>
> Thanks!
>
>
> --
> Dikang
>
> --


 *Justin Cameron*Senior Software Engineer


 


 This email has been sent on behalf of Instaclustr Pty. Limited
 (Australia) and Instaclustr Inc (USA).

 This email and any attachments may contain confidential and legally
 privileged information.  If you are not the intended recipient, do not copy
 or disclose its content, but please reply to this email immediately and
 highlight the error to the sender and then immediately delete the message.

>>>
>>>
>>>
>>> --
>>> Dikang
>>>
>>>
>
>
> --
> Dikang
>
>


Re: unbalanced ring

2013-02-12 Thread Brandon Williams
On Tue, Feb 12, 2013 at 6:13 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 Are vnodes on by default? It seems that many on the list are using this feature
 with small clusters.

They are not.
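
The switch is num_tokens in cassandra.yaml, which shipped commented out in
1.2; an illustrative sketch:

  # num_tokens: 256   # uncomment to enable vnodes
  initial_token:      # otherwise the node runs in single-token mode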

-Brandon


Re: Pig / Map Reduce on Cassandra

2013-01-16 Thread Brandon Williams
On Wed, Jan 16, 2013 at 2:37 PM,  cscetbon@orange.com wrote:
 Here is the point. You're right, this github repository has not been updated 
 for a year and a half. I thought brisk was just a bundle of some technologies 
 and that it was possible to install the same components and make them work 
 together without using this bundle :(

You can install hadoop manually alongside Cassandra as well as pig.
Pig support is in C*'s tree in o.a.c.hadoop.pig.  You won't get CFS,
but it's not a hard requirement, either.

-Brandon


Re: leveled compaction and tombstoned data

2012-11-08 Thread Brandon Williams
On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner synfina...@gmail.com wrote:
 There are also ways to bring up a test node and just run Leveled Compaction on
 that.  Wish I had a URL handy, but hopefully someone else can find it.

This rather handsome fellow wrote a blog about it:
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-live-traffic-sampling

-Brandon


Re: distribution of token ranges with virtual nodes

2012-11-01 Thread Brandon Williams
On Thu, Nov 1, 2012 at 10:05 PM, Manu Zhang owenzhang1...@gmail.com wrote:

 it will migrate you to virtual nodes by splitting the existing partition
 256 ways.


 Out of curiosity, is it for the purpose of avoiding streaming?

It splits into a contiguous range, because truly upgrading to vnode
functionality is another step.


  the former would require you to perform a shuffle to achieve that.


 Is there a nodetool option or are there other ways shuffle could be done
 automatically?

There is a shuffle command in bin/ that was recently committed; we'll
document this process in NEWS.txt shortly.

-Brandon


Re: Hinted Handoff runs every ten minutes

2012-10-24 Thread Brandon Williams
On Sun, Oct 21, 2012 at 6:44 PM, aaron morton aa...@thelastpickle.com wrote:
 I *think* this may be ghost rows which have not been compacted.

You would be correct in the case of 1.0.8:
https://issues.apache.org/jira/browse/CASSANDRA-3955

-Brandon


Re: Bringing a dead node back up after fixing hardware issues

2012-07-26 Thread Brandon Williams
On Wed, Jul 25, 2012 at 6:16 PM, Eran Chinthaka Withana
eran.chinth...@gmail.com wrote:

 Alright, let's assume I want to go this route. I have RF=2 in the data
 center and I believe I need at least RF=3 to set the consistency level to
 LOCAL_QUORUM and hide the node failures. But if I increase the RF to 3 now,
 won't it trigger more read misses until repair completes? Given this is
 a production cluster which cannot afford downtime, how can we do this?

Switch to LQ and increase the RF to 3, then repair to actually have
the RF bumped up.

As long as nothing fails during the first step (which should take
perhaps minutes) you'll be ok.
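
A sketch of that sequence (cassandra-cli syntax of that era; keyspace and DC
names are placeholders, and the exact strategy_options form varies by
version):

  [in cassandra-cli]
  update keyspace MyKS with strategy_options = {us-east : 3};

  [then on each node in that DC]
  nodetool -h <node> repair MyKS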

-Brandon


Re: Bringing a dead node back up after fixing hardware issues

2012-07-24 Thread Brandon Williams
On Mon, Jul 23, 2012 at 10:24 PM, Eran Chinthaka Withana
eran.chinth...@gmail.com wrote:
 Thanks Brandon for the answer (and I didn't know driftx = Brandon Williams.
 Thanks for your awesome support in Cassandra IRC)

Thanks :)

 Increasing CL is tricky for us for now, as our RF in that datacenter is 2
 and CL is set to ONE. If we make the CL LOCAL_QUORUM, then if a node
 goes down we will have trouble. I will try to increase the RF to 3 in that
 data center and set the CL to LOCAL_QUORUM if nothing works out.

Increasing the RF and using LOCAL_QUORUM is the right thing in
this case.  By choosing CL.ONE, you are agreeing that read misses are
acceptable.  If they are not, then adjusting your RF/CL is the only
path.

 About decommissioning: if the node goes down, there is no way of running
 that command on that node, right? IIUC, decommissioning should be
 run on the node that needs to be decommissioned.

Well, decom and removetoken are both ways of removing a node.  The
former is for a live node, and the latter is for a dead node.  Since
your node was actually alive you could have decommissioned it.

 Coming back to the original question: without touching the CL, can we bring
 back a dead node (after fixing it) and somehow tell Cassandra that the node
 is back up and not to send it read requests until it gets all the data?

No, as I said, you are accepting this behavior by choosing CL.ONE.

-Brandon


Re: Bringing a dead node back up after fixing hardware issues

2012-07-23 Thread Brandon Williams
On Mon, Jul 23, 2012 at 6:26 PM, Eran Chinthaka Withana
eran.chinth...@gmail.com wrote:
 Method 1: I copied the data from all the nodes in that data center, into the
 repaired node, and brought it back up. But because of the rate of updates
 happening, the read misses started going up.

That's not really a good method when you scale up and the amount of
data in the cluster won't fit on a single machine.

 Method 2: I issued a removetoken command for that node's token and let the
 cluster stream the data into relevant nodes. At the end of this process, the
 dead node was not showing up in the ring output. Then I brought the node
 back up. I was expecting Cassandra to first stream data into the new node
 (which happens to be the dead node which was in the cluster earlier) and,
 once it's done, make it serve reads. But in the server log, I can see that as
 soon as the node comes up, it starts serving reads, creating a large number of
 read misses.

Removetoken is for dead nodes, so the node has no way of locally
knowing it shouldn't be a cluster member any longer when it starts up.
Instead, if you had decommissioned, it would have saved a flag to
indicate it should bootstrap at the next startup.

 So the question is, what is the best way to bring back a dead node (once its
 hardware issues are fixed) without impacting read misses?

Increase your consistency level.  Run a repair on the node once it's
back up, unless the repair took longer than gc_grace, in which
case you need to removetoken it, delete all the data, and bootstrap it
back in if you don't want anything deleted to resurrect.
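
A sketch of that decision path (host, token, and paths are placeholders):

  # normal case: repair finished within gc_grace
  nodetool -h <node> repair

  # otherwise: remove it, wipe it, and bootstrap it back in
  nodetool -h <live-node> removetoken <token-of-failed-node>
  rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/*   # on the node
  # then restart the node with auto_bootstrap enabled so it streams a clean copy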

-Brandon


Re: bulk load glitch

2012-07-02 Thread Brandon Williams
On Mon, Jul 2, 2012 at 10:35 AM, Brian Jeltema
brian.jelt...@digitalenvoy.net wrote:
 I can't tell whether the bulk load process recovered from the transient dead 
 node, or whether I need to start over.

 Does anybody know?

You need to start over if the failure detector tripped, but it will
retry a few times for regular network errors.

-Brandon


Re: Problem joining new node to cluster in 1.1.1

2012-06-08 Thread Brandon Williams
This sounds related to https://issues.apache.org/jira/browse/CASSANDRA-4251

On Thu, Jun 7, 2012 at 5:28 PM, Bryce Godfrey bryce.godf...@azaleos.com wrote:
 As the new node starts up I get this error before bootstrap starts:



 INFO 08:20:51,584 Enqueuing flush of Memtable-schema_columns@1493418651(0/0
 serialized/live bytes, 1 ops)

 INFO 08:20:51,584 Writing Memtable-schema_columns@1493418651(0/0
 serialized/live bytes, 1 ops)

 INFO 08:20:51,589 Completed flushing
 /opt/cassandra/data/system/schema_columns/system-schema_columns-hc-1-Data.db
 (61 bytes)

 ERROR 08:20:51,889 Exception in thread Thread[MigrationStage:1,5,main]

 java.lang.IllegalArgumentException: value already present: 1015

     at
 com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)

     at
 com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:111)

     at
 com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)

     at com.google.common.collect.HashBiMap.put(HashBiMap.java:84)

     at org.apache.cassandra.config.Schema.load(Schema.java:385)

     at
 org.apache.cassandra.db.DefsTable.addColumnFamily(DefsTable.java:426)

     at
 org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:361)

     at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:270)

     at
 org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:248)

     at
 org.apache.cassandra.service.MigrationManager$MigrationTask.runMayThrow(MigrationManager.java:416)

     at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)

     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
 Source)

     at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)

     at java.util.concurrent.FutureTask.run(Unknown Source)

     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
 Source)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
 Source)

     at java.lang.Thread.run(Unknown Source)

 INFO 08:20:51,931 Enqueuing flush of
 Memtable-schema_keyspaces@833041663(943/1178 serialized/live bytes, 20 ops)

 INFO 08:20:51,932 Writing Memtable-schema_keyspaces@833041663(943/1178
 serialized/live bytes, 20 ops)





 Then it starts spewing these errors nonstop until I kill it.



 ERROR 08:21:45,959 Error in row mutation

 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
 cfId=1019

     at
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)

     at
 org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)

     at
 org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)

     at
 org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)

     at
 org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)

     at
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)

     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
 Source)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
 Source)

     at java.lang.Thread.run(Unknown Source)

 ERROR 08:21:45,814 Error in row mutation

 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
 cfId=1019

     at
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)

     at
 org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)

     at
 org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)

     at
 org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)

     at
 org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)

     at
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)

     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
 Source)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
 Source)

     at java.lang.Thread.run(Unknown Source)

 ERROR 08:21:45,813 Error in row mutation

 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
 cfId=1020

     at
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)

     at
 org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)

     at
 org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)

     at
 org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)

     at
 org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)

     at
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)

     at 

Re: memory issue on 1.1.0

2012-06-04 Thread Brandon Williams
Perhaps the deletes: https://issues.apache.org/jira/browse/CASSANDRA-3741

-Brandon

On Sun, Jun 3, 2012 at 6:12 PM, Poziombka, Wade L
wade.l.poziom...@intel.com wrote:
 Running a very write-intensive (new column, delete old column, etc.) process 
 and failing on memory.  Log file attached.

 Curiously, I have never seen this when just adding new data; I have in the past 
 sent hundreds of millions of new transactions.  It seems to happen when I 
 modify.  My process is as follows:

 I do a key slice to get columns to modify in batches of 100, then in separate 
 threads modify those columns.  I advance the slice start key each time with 
 the last key in the previous batch.  The mutations done are: update a column 
 value in one column family (token), delete a column and add a new column in 
 another (pan).

 It runs well until after about 5 million rows, then it seems to run out of 
 memory.  Note that these column families are quite small.

 WARN [ScheduledTasks:1] 2012-06-03 17:49:01,558 GCInspector.java (line 145) 
 Heap is 0.7967470834946492 full.  You may need to reduce memtable and/or 
 cache sizes.  Cassandra will now flush up to the two largest memtables to 
 free up memory.  Adjust flush_largest_memtables_at threshold in 
 cassandra.yaml if you don't want Cassandra to do this automatically
  INFO [ScheduledTasks:1] 2012-06-03 17:49:01,559 StorageService.java (line 
 2772) Unable to reduce heap usage since there are no dirty column families
  INFO [GossipStage:1] 2012-06-03 17:49:01,999 Gossiper.java (line 797) 
 InetAddress /10.230.34.170 is now UP
  INFO [ScheduledTasks:1] 2012-06-03 17:49:10,048 GCInspector.java (line 122) 
 GC for ParNew: 206 ms for 1 collections, 7345969520 used; max is 8506048512
  INFO [ScheduledTasks:1] 2012-06-03 17:49:53,187 GCInspector.java (line 122) 
 GC for ConcurrentMarkSweep: 12770 ms for 1 collections, 5714800208 used; max 
 is 8506048512

 
 Keyspace: keyspace
        Read Count: 50042632
        Read Latency: 0.23157864418482224 ms.
        Write Count: 44948323
        Write Latency: 0.019460829472992797 ms.
        Pending Tasks: 0
                Column Family: pan
                SSTable count: 5
                Space used (live): 1977467326
                Space used (total): 1977467326
                Number of Keys (estimate): 16334848
                Memtable Columns Count: 0
                Memtable Data Size: 0
                Memtable Switch Count: 74
                Read Count: 14985122
                Read Latency: 0.408 ms.
                Write Count: 19972441
                Write Latency: 0.022 ms.
                Pending Tasks: 0
                Bloom Filter False Postives: 829
                Bloom Filter False Ratio: 0.00073
                Bloom Filter Space Used: 37048400
                Compacted row minimum size: 125
                Compacted row maximum size: 149
                Compacted row mean size: 149

                Column Family: token
                SSTable count: 4
                Space used (live): 1250973873
                Space used (total): 1250973873
                Number of Keys (estimate): 14217216
                Memtable Columns Count: 0
                Memtable Data Size: 0
                Memtable Switch Count: 49
                Read Count: 30059563
                Read Latency: 0.167 ms.
                Write Count: 14985488
                Write Latency: 0.014 ms.
                Pending Tasks: 0
                Bloom Filter False Postives: 13642
                Bloom Filter False Ratio: 0.00322
                Bloom Filter Space Used: 28002984
                Compacted row minimum size: 150
                Compacted row maximum size: 258
                Compacted row mean size: 224

                Column Family: counters
                SSTable count: 2
                Space used (live): 561549994
                Space used (total): 561549994
                Number of Keys (estimate): 9985024
                Memtable Columns Count: 0
                Memtable Data Size: 0
                Memtable Switch Count: 38
                Read Count: 4997947
                Read Latency: 0.092 ms.
                Write Count: 9990394
                Write Latency: 0.023 ms.
                Pending Tasks: 0
                Bloom Filter False Postives: 191
                Bloom Filter False Ratio: 0.37525
                Bloom Filter Space Used: 18741152
                Compacted row minimum size: 125
                Compacted row maximum size: 179
                Compacted row mean size: 150

 


Re: nodes moving spontaneously

2012-06-02 Thread Brandon Williams
Nodes don't move themselves; you likely have some kind of 'bouncing
gossip' issue where a node was removed/replaced and is hanging around,
but is only periodically held in state between nodes.  Unfortunately
node removal is very prone to this before 0.8.3, and even after that
you can't fix it without
https://issues.apache.org/jira/browse/CASSANDRA-3337, so upgrading is
really your best bet here.

On Fri, Jun 1, 2012 at 4:01 PM, Curt Allred c...@mediosystems.com wrote:
 We have a 10 node cluster (v0.7.9) split into 2 datacenters.  Three times we
 have seen nodes move themselves to different locations in the ring.  In each
 case, the move unbalanced the ring. In one case a node moved to the opposite
 side of the ring.



 Sometime after the first spontaneous move we started using Datastax
 OpsCenter.  The next 2 moves showed up in its event log like:

 5/20/2012 11:23am - Info -  Host 12.34.56.78 moved from '12345' to '54321'



 where '12345' and '54321' are the old and new tokens.



 Anyone know what's causing this?




Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 12:41 PM, Henrik Schröder skro...@gmail.com wrote:
 We're running version 1.0.8. Is this fixed in a later release? Will this be
 fixed in a later release?

No, mixed-OS clusters are unsupported.

 Are there any other ways of doing the migration? What happens if we join the
 new servers without bootstrapping and run repair? Are there any other ugly
 hacks or workarounds we can do? We're not looking to run a mixed cluster, we
 just want to migrate all the data as painlessly as possible.

Start the linux cluster independently and use sstableloader from the
windows cluster to populate it.

-Brandon


Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 1:50 PM, Henrik Schröder skro...@gmail.com wrote:
 Ok. It's important for us to not have any downtime, so how about this
 solution:

 We start up the Linux cluster independently.
 We configure our application to send all Cassandra writes to both clusters,
 but only read from the Windows cluster.
 We run sstableloader on each Windows server (is it possible to do this in
 parallel?), sending whatever it has to the Linux cluster.
 When it's done on all Windows servers, we configure our application to only
 talk to the Linux cluster.

That sounds fine, with the caveat that you can't run sstableloader
from a machine running Cassandra before 1.1, so copying the sstables
manually (assuming both clusters are the same size and have the same
tokens) might be better.

 The only issue with this is the timestamps of the data and tombstones in
 each sstable: will they be preserved by sstableloader? What about deletes of
 non-existing keys? Will they be stored in the Linux cluster so that when
 sstableloader inserts the key later, it's resolved as being deleted?

None of that should be a problem.

-Brandon


Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 3:36 PM, Henrik Schröder skro...@gmail.com wrote:
 That sounds fine, with the caveat that you can't run sstableloader
 from a machine running Cassandra before 1.1, so copying the sstables
 manually (assuming both clusters are the same size and have the same
 tokens) might be better.


 Why is version 1.1 required for sstableloader? We're running 1.0.x on both
 clusters, but we can of course upgrade if that's required.

Before 1.1 sstableloader is a fat client, and thus can't coexist with
an existing Cassandra instance on the same machine.
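
For example, the 1.1-style invocation looks roughly like this (host and path
are placeholders; -d/--nodes takes the initial contact points in the live
cluster):

  bin/sstableloader -d 10.0.0.1 /path/to/sstables/MyKeyspace/MyCF/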

-Brandon


Re: Replication factor

2012-05-23 Thread Brandon Williams
On Wed, May 23, 2012 at 5:51 AM, Viktor Jevdokimov 
viktor.jevdoki...@adform.com wrote:

   When RF == number of nodes, and you read at CL ONE you will always be
 reading locally.

  “always be reading locally” – only if Dynamic Snitch is “off”. With
 dynamic snitch “on”, a request may be redirected to another node, which may
 introduce latency spikes.


Actually it's preventing spikes, since if it won't read locally that means
the local replica is in worse shape than the rest (compacting, repairing,
etc.)
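
For reference, the knobs involved live in cassandra.yaml (values shown are
the usual defaults of that era; verify against your file):

  dynamic_snitch_badness_threshold: 0.1
  dynamic_snitch_update_interval_in_ms: 100
  dynamic_snitch_reset_interval_in_ms: 600000

Raising the badness threshold makes the snitch more reluctant to route a
read away from the otherwise-preferred (e.g. local) replica.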

-Brandon


Re: How do I add a custom comparator class to a cassandra cluster ?

2012-05-15 Thread Brandon Williams
On Tue, May 15, 2012 at 12:53 AM, Ertio Lew ertio...@gmail.com wrote:
 @Brandon : I just created a jira issue to request this type of comparator
 along with Cassandra.

 It is about a UTF8 comparator that provides case-insensitive ordering of
 columns.
 See issue here : https://issues.apache.org/jira/browse/CASSANDRA-4245

Everything I said before still stands, as far as I can tell.

-Brandon


Re: Snapshot failing on JSON files in 1.1.0

2012-05-15 Thread Brandon Williams
Probably https://issues.apache.org/jira/browse/CASSANDRA-4230

On Tue, May 15, 2012 at 4:08 PM, Bryan Fernandez bfernande...@gmail.com wrote:
 Greetings,

 We recently upgraded from 1.0.8 to 1.1.0. Everything has been running fine
 with the exception of snapshots. When attempting to snapshot any of the
 nodes in our six node cluster we are seeing the following error.

 [root@cassandra-n6 blotter]# /opt/apache-cassandra-1.1.0/bin/nodetool -h
 10.20.50.58 snapshot
 Requested snapshot for: all keyspaces
 Exception in thread main java.io.IOError: java.io.IOException: Unable to
 create hard link from
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json to
 /var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
 (errno 17)
 at
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1454)
 at
 org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1483)
 at org.apache.cassandra.db.Table.snapshot(Table.java:205)
 at
 org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:1793)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
 at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
 at
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
 at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
 at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
 at
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
 at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
 at
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
 at
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
 at
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
 at
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
 at
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303)
 at sun.rmi.transport.Transport$1.run(Transport.java:159)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
 at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
 at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
 at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: Unable to create hard link from
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json to
 /var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
 (errno 17)
 at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:163)
 at
 org.apache.cassandra.db.Directories.snapshotLeveledManifest(Directories.java:343)
 at
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1450)
 ... 33 more


 However, an ls shows that both of these JSON files exist on the filesystem
 (although with slightly different sizes).

 [root@cassandra-n6 blotter]# ls -al
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json
 -rw-r--r-- 1 root root 38786 May 15 20:51
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json

 [root@cassandra-n6 blotter]# ls -al
 /var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
 -rw-r--r-- 1 root root 38778 May 15 20:50
 /var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json


 We are using Leveled Compaction on the twitter_users CF, which I assume is
 creating the JSON files.

 [root@cassandra-n6 blotter]# ls -al
 /var/lib/cassandra/data/blotter/twitter_users/*.json
 -rw-r--r-- 1 root root 38779 May 15 20:51
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json
 -rw-r--r-- 1 root root 38779 May 15 20:51
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users-old.json
 

Re: How do I add a custom comparator class to a cassandra cluster ?

2012-05-14 Thread Brandon Williams
On Mon, May 14, 2012 at 1:11 PM, Ertio Lew ertio...@gmail.com wrote:
 I need to add a custom comparator to a cluster, to sort columns in a certain
 customized fashion. How do I add the class to the cluster?

I highly recommend against doing this, because you'll be locked in to
your comparator and not have an easy way out.  I dare say if none of
the currently available comparators meet your needs, you're doing
something wrong.

-Brandon


Re: stream data using bulkoutputformat on hdfs?

2012-05-02 Thread Brandon Williams
On Wed, May 2, 2012 at 2:23 PM, Shawna Qian shaw...@yahoo-inc.com wrote:
 Hello:

 I am trying to use BulkOutputFormat and seeing some nice docs on how to use
 it to stream the data to an existing Cassandra cluster using the ConfigHelper
 class.  I am wondering if it is possible to use it just to stream the data
 (sstables etc.) into HDFS?

Not currently, but most of the groundwork is already laid there if
that's what you want to do, the main catch being that BOF wants to
stream to cassandra when it's done.

-Brandon


Re: how to increase compaction rate?

2012-03-12 Thread Brandon Williams
On Mon, Mar 12, 2012 at 4:44 AM, aaron morton aa...@thelastpickle.com wrote:
 I don't understand why I
 don't get multiple concurrent compactions running, that's what would
 make the biggest performance difference.

 concurrent_compactors
 Controls how many concurrent compactions to run, by default it's the number
 of cores on the machine.

With leveled compaction, I don't think you get any concurrency because
it has to compact an entire level, and it can't proceed to the next
level without completing the one before it.

In short, if you want maximum throughput, stick with size tiered.
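
If you do stay on size-tiered and want more parallelism, these are the
relevant cassandra.yaml knobs (illustrative values):

  concurrent_compactors: 8               # defaults to the number of cores
  compaction_throughput_mb_per_sec: 0    # default 16; 0 disables throttling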

-Brandon


Re: Several times hinted handoff for the same node with Cassandra 1.0.8

2012-03-12 Thread Brandon Williams
Just ignore it: https://issues.apache.org/jira/browse/CASSANDRA-3955

On Mon, Mar 12, 2012 at 9:31 PM, Roshan codeva...@gmail.com wrote:
 Hi

 I have upgraded our development Cassandra cluster (2 nodes) from 1.0.6 to
 1.0.8.

 After the upgrade to 1.0.8, one node keeps trying to send hints every 10
 minutes (it seems). There are no network issues between the two nodes, and we
 can ping using both the server name and the IP address. Also, nodetool ring is
 working fine from both ends. Could someone please help with this?

 Here is the Cassandra stack of the problem node:

 2012-03-13 13:12:56,138 INFO  [StorageService] Cassandra version: 1.0.8
 2012-03-13 13:12:56,138 INFO  [StorageService] Thrift API version: 19.20.0
 2012-03-13 13:12:56,138 INFO  [StorageService] Loading persisted ring state
 2012-03-13 13:12:56,149 INFO  [StorageService] Starting up server gossip
 2012-03-13 13:12:56,150 INFO  [ColumnFamilyStore] Enqueuing flush of
 Memtable-LocationInfo@1899843147(29/36 serialized/live bytes, 1 ops)
 2012-03-13 13:12:56,152 INFO  [Memtable] Writing
 Memtable-LocationInfo@1899843147(29/36 serialized/live bytes, 1 ops)
 2012-03-13 13:12:56,238 INFO  [Memtable] Completed flushing
 /data/cassandradb/data/system/LocationInfo-hc-50-Data.db (80 bytes)
 2012-03-13 13:12:56,253 INFO  [MessagingService] Starting Messaging Service
 on port 7000
 2012-03-13 13:12:56,259 INFO  [StorageService] Using saved token
 144939581272443147669723010154540982565
 2012-03-13 13:12:56,261 INFO  [ColumnFamilyStore] Enqueuing flush of
 Memtable-LocationInfo@1915784320(53/66 serialized/live bytes, 2 ops)
 2012-03-13 13:12:56,261 INFO  [Memtable] Writing
 Memtable-LocationInfo@1915784320(53/66 serialized/live bytes, 2 ops)
 2012-03-13 13:12:56,350 INFO  [Gossiper] Node /10.1.161.67 has restarted,
 now UP
 2012-03-13 13:12:56,350 INFO  [Gossiper] InetAddress /10.1.161.67 is now UP
 2012-03-13 13:12:56,351 INFO  [StorageService] Node /10.1.161.67 state jump
 to normal
 2012-03-13 13:12:56,378 INFO  [Memtable] Completed flushing
 /data/cassandradb/data/system/LocationInfo-hc-51-Data.db (163 bytes)
 2012-03-13 13:12:56,389 INFO  [CompactionTask] Compacting
 [SSTableReader(path='/data/cassandradb/data/system/LocationInfo-hc-51-Data.db'),
 SSTableReader(path='/data/cassandradb/data/system/LocationInfo-hc-50-Data.db'),
 SSTableReader(path='/data/cassandradb/data/system/LocationInfo-hc-48-Data.db'),
 SSTableReader(path='/data/cassandradb/data/system/LocationInfo-hc-49-Data.db')]
 2012-03-13 13:12:56,402 INFO  [StorageService] Node
 app8.dev1.net/10.1.161.68 state jump to normal
 2012-03-13 13:12:56,403 INFO  [StorageService] Bootstrap/Replace/Move
 completed! Now serving reads.
 2012-03-13 13:12:56,403 INFO  [Mx4jTool] Will not load MX4J, mx4j-tools.jar
 is not in the classpath
 2012-03-13 13:12:56,438 INFO  [CassandraDaemon] Binding thrift service to
 app8.dev1.net/10.1.161.68:9160
 2012-03-13 13:12:56,445 INFO  [CassandraDaemon] Using TFastFramedTransport
 with a max frame size of 62914560 bytes.
 2012-03-13 13:12:56,448 INFO  [CassandraDaemon] Using synchronous/threadpool
 thrift server on app8.dev1.net/10.1.161.68 : 9160
 2012-03-13 13:12:56,448 INFO  [CassandraDaemon] Listening for thrift
 clients...
 2012-03-13 13:12:56,498 INFO  [CompactionTask] Compacted to
 [/data/cassandradb/data/system/LocationInfo-hc-52-Data.db,].  844 to 438
 (~51% of original) bytes for 4 keys at 0.003941MB/s.  Time: 106ms.
 2012-03-13 13:13:42,410 INFO  [HintedHandOffManager] Started hinted handoff
 for token: 59868989542208531803879358296598929701 with IP: /10.1.161.67
 2012-03-13 13:13:42,490 INFO  [HintedHandOffManager] Finished hinted handoff
 of 0 rows to endpoint /10.1.161.67
 2012-03-13 13:23:28,946 INFO  [HintedHandOffManager] Started hinted handoff
 for token: 59868989542208531803879358296598929701 with IP: /10.1.161.67
 2012-03-13 13:23:28,948 INFO  [HintedHandOffManager] Finished hinted handoff
 of 0 rows to endpoint /10.1.161.67
 Waiting for data... (interrupt to abort)

 --
 View this message in context: 
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Several-times-hinted-handoff-for-the-same-node-with-Cassandra-1-0-8-tp7367386p7367386.html
 Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
 Nabble.com.


Re: Node joining / unknown

2012-03-07 Thread Brandon Williams
On Wed, Mar 7, 2012 at 3:37 AM, aaron morton aa...@thelastpickle.com wrote:
 2) Stop the node. Try to remove the token again from another node. Note
 that removing a token will stream data around the place as well.

A node that has never fully joined doesn't need to be removed (and
can't.)  Just shut it down and it will go away after a minute or so.

-Brandon


Re: avoid log spam with 0 HH rows delivered

2012-03-02 Thread Brandon Williams
https://issues.apache.org/jira/browse/CASSANDRA-3955
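
Until that lands, one possible interim workaround (assuming a major
compaction purges the tombstoned hint row on your version) is:

  nodetool -h <node> compact system HintsColumnFamily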

On Fri, Mar 2, 2012 at 1:53 AM, Radim Kolar h...@sendmail.cz wrote:
 Can something be done to remove these empty delivery attempts from the log?

 It's just a tombstoned row.

 [default@system] list HintsColumnFamily;
 Using default limit of 100
 ---
 RowKey: 00

 1 Row Returned.
 Elapsed time: 234 msec(s).


  INFO [HintedHandoff:1] 2012-03-02 05:44:32,359 HintedHandOffManager.java
 (line 296) Started hinted handoff for token: 0 with IP: /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 05:44:32,362 HintedHandOffManager.java
 (line 373) Finished hinted handoff of 0 rows to endpoint /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 05:54:31,641 HintedHandOffManager.java
 (line 296) Started hinted handoff for token: 0 with IP: /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 05:54:31,644 HintedHandOffManager.java
 (line 373) Finished hinted handoff of 0 rows to endpoint /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 06:04:25,253 HintedHandOffManager.java
 (line 296) Started hinted handoff for token: 0 with IP: /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 06:04:25,255 HintedHandOffManager.java
 (line 373) Finished hinted handoff of 0 rows to endpoint /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 06:14:57,984 HintedHandOffManager.java
 (line 296) Started hinted handoff for token: 0 with IP: /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 06:14:58,013 HintedHandOffManager.java
 (line 373) Finished hinted handoff of 0 rows to endpoint /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 06:24:15,206 HintedHandOffManager.java
 (line 296) Started hinted handoff for token: 0 with IP: /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 06:24:15,208 HintedHandOffManager.java
 (line 373) Finished hinted handoff of 0 rows to endpoint /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 06:34:43,108 HintedHandOffManager.java
 (line 296) Started hinted handoff for token: 0 with IP: /64.6.104.18
  INFO [HintedHandoff:1] 2012-03-02 06:34:43,110 HintedHandOffManager.java
 (line 373) Finished hinted handoff of 0 rows to endpoint /64.6.104.18



Re: Only the last added node is visible in the cluster

2012-02-25 Thread Brandon Williams
My guess would be you're using the same token everywhere.

-Brandon

On Sat, Feb 25, 2012 at 9:48 AM, Aditya Gupta ady...@gmail.com wrote:
 While creating a multinode cluster, my nodes are unable to identify all the
 nodes in the cluster.
 Only the 'last added' node is visible when I do:
  ./nodetool -h localhost ring


 I am trying to create a 4-node cluster. On starting the seed node, the
 above command shows just itself (ok.. good); then when I start the 2nd node,
 the first one disappears and only the 2nd is visible in the ring. On
 starting the 3rd one, just the 3rd one remains.

 In cassandra.yaml of each node, I configured the listen_address equal to the
 IP address of that node, and for seeds I just put the IP address of the 1st
 node everywhere.

 Can anyone point to me what may be causing this ?



Re: Only the last added node is visible in the cluster

2012-02-25 Thread Brandon Williams
Then my next guess is you cloned one system to make the others in a
virtual env, and the token is recorded in the system keyspace.  In any
case, some nodetool ring output at each node addition will clarify
this.

-Brandon

On Sat, Feb 25, 2012 at 1:20 PM, Aditya Gupta ady...@gmail.com wrote:
 Nope, I just re-verified :)
 I have split up the range into 4 parts for 4 nodes. I have specified that in
 the initial_token.


 On Sun, Feb 26, 2012 at 12:33 AM, Brandon Williams dri...@gmail.com wrote:

 My guess would be you're using the same token everywhere.

 -Brandon

 On Sat, Feb 25, 2012 at 9:48 AM, Aditya Gupta ady...@gmail.com wrote:
  While creating a multinode cluster, my nodes are unable to identify all
  the
  nodes in the cluster.
   Only the 'last added' node is visible when I do:
   ./nodetool -h localhost ring
 
 
   I am trying to create a 4-node cluster. On starting the seed node, the
   above command shows just itself (ok.. good); then when I start the 2nd
   node,
   the first one disappears and only the 2nd is visible in the ring. On
   starting the 3rd one, just the 3rd one remains.
 
   In cassandra.yaml of each node, I configured the listen_address equal to
   the IP
   address of that node, and for seeds I just put the IP address of the 1st
   node everywhere.
 
  Can anyone point to me what may be causing this ?
 




Re: Only the last added node is visible in the cluster

2012-02-25 Thread Brandon Williams
On Sat, Feb 25, 2012 at 3:39 PM, Aditya Gupta ady...@gmail.com wrote:
 The output of nodetool ring after each addition of nodes makes just the last
 added node visible in the ring.
 When I retry adding the nodes (which are not visible), it says they are
 already part of the ring.

 Could you indicate how I should rectify this now, as you seem to have
 figured out the issue?

The simplest thing to do is rm -rf /var/lib/cassandra on all the nodes.
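
If you want to regenerate the explicit 4-way split afterwards, the
RandomPartitioner tokens can be computed like this (illustrative; python 2
syntax):

  python -c 'for i in range(4): print i * (2**127 / 4)'

Each value then goes into initial_token on the corresponding node before its
first start.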

-Brandon


Re: Cassandra keeps on logging Finished hinted handoff of 0 rows to endpoint

2012-02-24 Thread Brandon Williams
It's a special case of a single sstable existing for hints:
https://issues.apache.org/jira/browse/CASSANDRA-3955

On Fri, Feb 24, 2012 at 5:43 AM, Manoj Mainali mainalima...@gmail.com wrote:
 Hi,

 I have been running Cassandra 1.0.7 and in the log file I see the log saying

  Finished hinted handoff of 0 rows to endpoint /{ipaddress}

 The above issue can be reproduced by the following steps,

 1. Start a cluster with 2 nodes, suppose node1 and node2
 2. Create a keyspace with rf=2, and create a column family
 3. Stop node2
 4. Insert some rows, suppose 100, into the cluster at consistency level 1
 5. Restart node2

 When node2 is restarted, node1 sends the hints to node2, and from the
 log I see that 100 rows are sent. But after that, at intervals of
 approximately 10 minutes, Cassandra logs Finished hinted handoff of 0 rows to
 the endpoint ..

 When I do list hintscolumnfamily from cassandra-cli, it shows a
 result of 1 row, but no column data.

 There seems to be an issue raised
 before, https://issues.apache.org/jira/browse/CASSANDRA-3733, and it says it
 is fixed in 1.0.7. However, I keep seeing the above log.

 It seems that Cassandra is trying to send hint messages even when all the
 hints have been delivered and there are no more hints left. Is there a way to
 solve the above issue?
 Recently, another issue was also
 raised https://issues.apache.org/jira/browse/CASSANDRA-3935 and they are
 similar, but I am not sure if they are caused by the same reason.

 Does anyone know how to solve the issue?

 Thanks,
 Manoj


Re: data model advice

2012-02-24 Thread Brandon Williams
On Fri, Feb 24, 2012 at 10:46 AM, David Leimbach leim...@gmail.com wrote:


 On Thu, Feb 23, 2012 at 7:54 PM, Martin Arrowsmith
 arrowsmith.mar...@gmail.com wrote:

 Hi Franc,

 Or, you can consider using composite columns. It is not recommended to use
 Super Columns anymore.


 Yes, but why?  Is it because composite columns effectively replace and
 simplify similar models?

http://www.quora.com/Cassandra-database/Why-is-it-bad-to-use-supercolumns-in-Cassandra
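
For illustration, a composite comparator declared in cassandra-cli looks
roughly like this (names are placeholders):

  create column family timeline
    with comparator = 'CompositeType(UTF8Type, LongType)'
    and key_validation_class = UTF8Type
    and default_validation_class = UTF8Type;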

-Brandon


Re: cassandra on ec2 lock-ups

2012-02-17 Thread Brandon Williams
http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs

On Fri, Feb 17, 2012 at 7:00 AM, Pierre-Yves Ritschard p...@spootnik.org wrote:
 Hi,

 I've experienced several node lock-ups on EC2 instances. I'm running
 with the following set-up:

 heap-new: 800M
 max-heap: 8G
 instance type: m2.xlarge

 java is
 java version "1.6.0_26"
 Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

 running on ubuntu 11.04 and the apache cassandra packages + jna installed

 All my initial tokens are computed and set in cassandra.yaml according
 to the number of nodes present, the seed list is the ip of the first
 node.

 I am seeing a very strange behavior where on startup a node will
 lock-up the machine, ssh is still possible but ps hangs on the
 cassandra process which seems to be doing nothing.


Re: nodetool removetoken

2012-02-14 Thread Brandon Williams
Before 1.0.8, use https://issues.apache.org/jira/browse/CASSANDRA-3337
to remove it.

On Tue, Feb 14, 2012 at 3:44 PM, Franc Carter franc.car...@sirca.org.au wrote:

 I terminated (ec2 destruction) a node that was wedged during bootstrap.
 However when I try to removetoken I get 'Token not found'.

 It looks a bit like this issue ?

 https://issues.apache.org/jira/browse/CASSANDRA-3737

 nodetool -h 127.0.0.1 ring gives this

 Address DC  Rack    Status State   Load
 Owns    Token

 85070591730234615865843651857942052864
 10.253.65.203   us-east 1a  Up Normal  11.18 GB
 50.00%  0
 10.252.82.64    us-east 1a  Down   Joining 320.45 KB
 25.00%  42535295865117307932921825928971026432
 10.253.86.224   us-east 1a  Up Normal  11.01 GB
 25.00%  85070591730234615865843651857942052864

 and

 nodetool -h 127.0.0.1 removetoken 42535295865117307932921825928971026432

 gives

 Exception in thread "main" java.lang.UnsupportedOperationException: Token not
 found.
     at
 org.apache.cassandra.service.StorageService.removeToken(StorageService.java:2369)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
     at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
     at
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
     at
 com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
     at
 com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
     at
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
     at
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
     at
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
     at
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
     at
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
     at
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
     at
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
     at sun.reflect.GeneratedMethodAccessor165.invoke(Unknown Source)
     at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at
 sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
     at sun.rmi.transport.Transport$1.run(Transport.java:159)
     at java.security.AccessController.doPrivileged(Native Method)
     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
     at
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
     at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
     at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
     at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)

 Any ideas on how to deal with this ?

 thanks

 --

 Franc Carter | Systems architect | Sirca Ltd

 franc.car...@sirca.org.au | www.sirca.org.au

 Tel: +61 2 9236 9118

 Level 9, 80 Clarence St, Sydney NSW 2000

 PO Box H58, Australia Square, Sydney NSW 1215




Re: truncate command fails

2012-02-07 Thread Brandon Williams
On Tue, Feb 7, 2012 at 2:17 AM, Patrik Modesto patrik.mode...@gmail.com wrote:
 Hi,

 I've a strange problem with my test cluster. Calling truncate on a
 small ColumnFamily on idle cluster of 4 nodes returns
 UnavailableException after 10s. That 10s is set in rpc_timeout_in_ms.

Make sure you have JNA; without it, the cost of forking 'ln' to snapshot
is expensive.

-Brandon


Re: Any tools like phpMyAdmin to see data stored in Cassandra ?

2012-01-30 Thread Brandon Williams
On Sun, Jan 29, 2012 at 11:52 PM, Ertio Lew ertio...@gmail.com wrote:

 On Mon, Jan 30, 2012 at 7:16 AM, Frisch, Michael michael.fri...@nuance.com
 wrote:

 OpsCenter?

 http://www.datastax.com/products/opscenter


 Thanks, that's a great product but unfortunately doesn't work with windows.

Now it does: http://www.datastax.com/products/opscenter/platforms

-Brandon


Re: Can I use BulkOutputFormat from 1.1 to load data to older Cassandra versions?

2012-01-09 Thread Brandon Williams
On Mon, Jan 9, 2012 at 1:18 AM, Erik Forsberg forsb...@opera.com wrote:
 Hi!

 Can the new BulkOutputFormat
 (https://issues.apache.org/jira/browse/CASSANDRA-3045) be used to load data
 to servers running cassandra 0.8.7 and/or Cassandra 1.0.6?

 I'm thinking of using jar files from the development version to load data
 onto a production cluster which I want to keep on a production version of
 Cassandra. Can I do that, or does BulkOutputFormat require an API level that
 is only in the development version of Cassandra?

Unfortunately BOF wants to stream the output files into the cluster,
which required streaming changes, so this won't work.  If you hacked
this part out, and then generated the sstables with 1.0, you could
then use the bulkloader to stream them and that should work.

-Brandon


Re: Copy a column family?

2012-01-09 Thread Brandon Williams
On Mon, Jan 9, 2012 at 9:14 AM, Brian O'Neill b...@alumni.brown.edu wrote:

 What is the fastest way to copy a column family?
 We were headed down the map/reduce path, but that seems silly.
 Any file level mechanisms for this?

Copy all the sstables 1:1 renaming them to the new CF name.  Then
create the schema for the CF.
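
For example (a sketch; 1.0-era file layout and paths are assumptions, and you
should flush or snapshot first so the set of files is stable):

  # copy every component of OldCF under the new CF's name
  cd /var/lib/cassandra/data/MyKeyspace
  for f in OldCF-*; do cp "$f" "NewCF-${f#OldCF-}"; done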

-Brandon


Re: Copy a column family?

2012-01-09 Thread Brandon Williams
On Mon, Jan 9, 2012 at 11:34 AM, Philippe watche...@gmail.com wrote:
 Would this apply to copying data from one cluster to another, assuming I do
 a rolling drain and shutdown ?
 Thanks

Only if the tokens also match 1:1 and you copy to the same tokens.  If
they don't match, the easiest thing to do is use the bulkloader, but
until 1.1 it requires becoming a fat client, which means cassandra
cannot be running on the machine you load from.

-Brandon


Re: What is the future of supercolumns ?

2012-01-07 Thread Brandon Williams
On Sat, Jan 7, 2012 at 5:42 PM, Rustam Aliyev rus...@code.az wrote:
 My suggestion is simple: don't use any deprecated stuff out there. In
 practically any case there is a good reason why it's deprecated.


 SuperColumns are not deprecated.

The supercolumn API will remain:
https://issues.apache.org/jira/browse/CASSANDRA-3237

-Brandon


Re: about decommissioned node that returns

2011-12-20 Thread Brandon Williams
On Tue, Dec 20, 2011 at 2:24 PM, aaron morton aa...@thelastpickle.com wrote:
 Sorry, got that a little wrong.

 At startup the node will use the higher of the current seconds since epoch
 or the stored generation number.

Technically stored generation + 1 so it's always increased on a restart.

-Brandon


Re: memory estimate for each key in the key cache

2011-12-16 Thread Brandon Williams
On Fri, Dec 16, 2011 at 8:52 PM, Kent Tong freemant2...@yahoo.com wrote:
 Hi,

 From the source code I can see that for each key, the hash (token), the key
 itself (ByteBuffer) and the position (long; the offset in the sstable) are
 stored into the key cache. The hash is an MD5 hash, so it is 16 bytes. So, the
 total size required is at least 16 + size-of(key) + 4, which is > 20 bytes. If
 we consider the overhead of the object references, then it will be even larger.
 Then, why does the wiki recommend multiplying the number of keys cached by
 10-12 to get the memory requirement?

In a word: java.

-Brandon


Re: memory estimate for each key in the key cache

2011-12-16 Thread Brandon Williams
On Fri, Dec 16, 2011 at 9:31 PM, Dave Brosius dbros...@mebigfatguy.com wrote:
 Wow, Java is a lot better than I thought if it can perform that kind of
 magic.  I'm guessing the wiki information is just old and out of date. It's
 probably more like 60 + sizeof(key)

With jamm and MAT it's fairly easy to test.  The number is accurate
last I checked.

-Brandon


Re: Keys for deleted rows visible in CLI

2011-12-14 Thread Brandon Williams
http://wiki.apache.org/cassandra/FAQ#range_ghosts

On Wed, Dec 14, 2011 at 4:36 AM, Radim Kolar h...@sendmail.cz wrote:
 On 14.12.2011 1:15, Maxim Potekhin wrote:

 Thanks. It could be hidden from a human operator, I suppose :)

 I agree. Open JIRA for it.


Re: configurable bloom filters (like hbase)

2011-12-14 Thread Brandon Williams
https://issues.apache.org/jira/browse/CASSANDRA-3497

On Wed, Dec 14, 2011 at 4:52 AM, Radim Kolar h...@sendmail.cz wrote:
 On 11.11.2011 7:55, Radim Kolar wrote:

 I have a problem with a large CF (about 200 billion entries per node). While
 I can configure index_interval to lower memory requirements, I still have to
 stick with huge bloom filters.

 Ideally bloom filters would be configurable like in hbase. The Cassandra
 standard is about 1.05% false positive, but in my case I would be fine even
 with a 20% false positive rate. Data are not often read back. Most of them
 will never be read before they expire via TTL.

 Does anybody else have the problem that bloom filters use too much memory in
 applications which do not need to read written data often?

 I am looking at the bloom filter memory used and it would be ideal to have,
 in cassandra-1.1, the ability to shrink bloom filters to about 1/10 of their
 size. Is it possible to code something like this: save bloom filters to disk
 as usual, but during load transform them into something smaller at the cost
 of an increased FP rate?


Re: Cannot Start Cassandra 1.0.5 with JNA on the CLASSPATH

2011-12-11 Thread Brandon Williams
On Sun, Dec 11, 2011 at 3:23 AM, Caleb Rackliffe ca...@steelhouse.com wrote:

 Hi All,

 I'm trying to start up Cassandra 1.0.5 on a CentOS 6 machine.  I
 installed JNA through yum and made a symbolic link to jna.jar in my
 Cassandra lib directory.  When I run bin/cassandra -f, I get the
 following:

  INFO 09:14:31,552 Logging initialized
  INFO 09:14:31,555 JVM vendor/version: Java HotSpot(TM) 64-Bit Server
 VM/1.6.0_29
  INFO 09:14:31,555 Heap size: 3405774848/3405774848
  INFO 09:14:31,555 Classpath:
 bin/../conf:bin/../build/classes/main:bin/../build/classes/thrift:bin/../lib/antlr-3.2.jar:bin/../lib/apache-cassandra-1.0.5.jar:bin/../lib/apache-cassandra-clientutil-1.0.5.jar:bin/../lib/apache-cassandra-thrift-1.0.5.jar:bin/../lib/avro-1.4.0-fixes.jar:bin/../lib/avro-1.4.0-sources-fixes.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/compress-lzf-0.8.4.jar:bin/../lib/concurrentlinkedhashmap-lru-1.2.jar:bin/../lib/guava-r08.jar:bin/../lib/high-scale-lib-1.1.2.jar:bin/../lib/jackson-core-asl-1.4.0.jar:bin/../lib/jackson-mapper-asl-1.4.0.jar:bin/../lib/jamm-0.2.5.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/jna.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-0.6.jar:bin/../lib/log4j-1.2.16.jar:bin/../lib/servlet-api-2.5-20081211.jar:bin/../lib/slf4j-api-1.6.1.jar:bin/../lib/slf4j-log4j12-1.6.1.jar:bin/../lib/snakeyaml-1.6.jar:bin/../lib/snappy-java-1.0.4.1.jar:bin/../lib/jamm-0.2.5.jar
 Killed


The 'Killed' line is your problem: the OOM killer decided to kill java.
You can confirm this in dmesg.  You either need more memory or less heap.
The reason it's happening instantly with JNA is that all the memory is
allocated up front, but without it you still have a timebomb waiting
to go off.
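
For example (the exact message wording varies by kernel):

  dmesg | grep -i -e kill -e oom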

-Brandon


Re: Really old files in the data directory

2011-12-09 Thread Brandon Williams
On Fri, Dec 9, 2011 at 1:57 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
 Are you taking snapshots? If the file is referenced in a snapshot it will
 not be deleted until it is also not part of any snapshot.

That isn't true.  The file will be removed from the data dir, but
still exist in the snapshot dir.

-Brandon


Re: Cassandra 0.8.8

2011-12-09 Thread Brandon Williams
http://cassandra.apache.org/download/

On Fri, Dec 9, 2011 at 3:37 PM, Maxim Potekhin potek...@bnl.gov wrote:
 Hello everyone,

 so what's the update on 0.8.8?

 Many thanks

 Maxim



 On 12/2/2011 4:49 AM, Patrik Modesto wrote:

 Hi,

 It's been almost 2 months since the release of the 0.8.7 version and
 there are quite some changes in 0.8.8, so I'd like to ask is there a
 release date?

 Regards,
 Patrik




Re: decommissioned not show being gossipped

2011-12-01 Thread Brandon Williams
On Thu, Dec 1, 2011 at 12:26 PM, huyle hu...@springpartners.com wrote:
 Hi,

 We have 2 nodes that have been decommissioned from the cluster running 1.0.3.
 However, the live nodes are still making references to the decommissioned
 nodes 3 days after the nodes were decommissioned.  Nodetool does not show the
 decommissioned nodes. Here are sample log entries:

How in sync are the clocks in the cluster?

-Brandon


Re: decommissioned not show being gossipped

2011-12-01 Thread Brandon Williams
On Thu, Dec 1, 2011 at 1:10 PM, huyle hu...@springpartners.com wrote:
 The clocks are well synced between the nodes as they have ntp running
 against our time servers.

Maybe they weren't 3 days after the token left, which
https://issues.apache.org/jira/browse/CASSANDRA-2961 requires.

If a node sees the token you can removetoken it, otherwise you'll need
https://issues.apache.org/jira/browse/CASSANDRA-3337

-Brandon


Re: [RELEASE] Apache Cassandra 1.0.5 released

2011-11-30 Thread Brandon Williams
On Wed, Nov 30, 2011 at 1:29 PM, Michael Vaknine micha...@citypath.com wrote:
 The files are not on the site
 The requested URL /apache//cassandra/1.0.5/apache-cassandra-1.0.5-bin.tar.gz
 was not found on this server.

It takes the mirrors some time to sync.

-Brandon


Re: Local quorum reads

2011-11-18 Thread Brandon Williams
On Fri, Nov 18, 2011 at 4:23 PM, Anthony Ikeda
anthony.ikeda@gmail.com wrote:
 This is the setup:
 Cassandra 0.8.6
 3 nodes
 Keyspace: NetworkTopologyStrategy, 1DC, RF=2

 1 node goes down and we cannot read from the ring.

 My expectation is that LOCAL_QUORUM dictates that it will return a record 
 once a majority (N/2 +1) of replicas reports back.

 2/2 + 1 = 2

 I originally thought N=3 for 3 nodes but someone has corrected me on this:
 it's the replicas (2). But when I use the cli and setConsistencyLevel AS
 LOCAL_QUORUM, nothing comes back. Set back to ONE, I can see the data.

 Is there something I'm missing?

You've arrived at the right answer, 2, but you're using N to refer to
the number of nodes (in the last part), where you need it to refer to
the RF.  RF=3 is the minimum where you can achieve quorum with
a member of the replica set being down.
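
To make the arithmetic concrete:

  quorum = (RF / 2) + 1      (integer division)
  RF=2:  quorum = 2  -- every replica must be up
  RF=3:  quorum = 2  -- one replica can be down and quorum still succeeds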

-Brandon


Re: BulkLoader

2011-11-16 Thread Brandon Williams
On Mon, Nov 14, 2011 at 2:49 PM, Giannis Neokleous
gian...@generalsentiment.com wrote:
 Hello everyone,

 We're using the bulk loader to load data every day to Cassandra. The
 machines that use the bulkloader are different every day so their IP
 addresses change. When I do describe cluster I see all the unreachable
 nodes that keep piling up for the past few days. Is there a way to remove
 those IP addresses without terminating the whole cluster at the same time
 and restarting it?

 The unreachable nodes cause issues when we want to make schema changes to
 all the nodes or when we want to truncate a CF.

 Any suggestions?


It sounds like you're running into
https://issues.apache.org/jira/browse/CASSANDRA-3351 so the first step
would be to upgrade to a version that has it fixed.

Unfortunately, this won't solve the problem, just prevent it from
happening in the future.  To remove the old nodes, you can apply
https://issues.apache.org/jira/browse/CASSANDRA-3337 on one node and
call the JMX method for the unreachable endpoints.

-Brandon


Re: Upgrade Cassandra Cluster to 1.0.2

2011-11-14 Thread Brandon Williams
On Mon, Nov 14, 2011 at 1:21 AM, Michael Vaknine micha...@citypath.com wrote:
 Hi,

 After configuring the encryption on Cassandra.yaml I get this error when
 upgrading from 1.0.0 to 1.0.2
 Attached the log file with the errors.

https://issues.apache.org/jira/browse/CASSANDRA-3466

-Brandon


Re: Upgrade Cassandra Cluster to 1.0.2

2011-11-14 Thread Brandon Williams
On Mon, Nov 14, 2011 at 7:53 AM, Michael Vaknine micha...@citypath.com wrote:
 Does this means that I have to wait to 1.0.3?

In the meantime you can just delete the hints and rely on read repair
or antientropy repair if you're concerned about the consistency of
your replicas.
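
In the 1.0.x layout the hints live in the system keyspace, so with the node
stopped something like this works (path per your data_file_directories):

  rm /var/lib/cassandra/data/system/HintsColumnFamily-*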

-Brandon


Re: Upgrade Cassandra Cluster to 1.0.2

2011-11-14 Thread Brandon Williams
On Mon, Nov 14, 2011 at 8:06 AM, Michael Vaknine micha...@citypath.com wrote:
 Well,
 I tried to delete the hints on the failed cluster but I could not start it
 I got other errors such as

 ERROR [MutationStage:34] 2011-11-14 15:37:43,813
 AbstractCassandraDaemon.java (line 133) Fatal exception in thread
 Thread[MutationStage:34,5,main]
 java.lang.StackOverflowError
        at
 java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.updateCheck(AtomicReferenceFieldUpdater.java:216)

This is different and unrelated to hints.  What jvm are you using?  It
looks like it just simply ran out of stack space, which is odd, but
you can control that with the -Xss option if needed.
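
For example, in conf/cassandra-env.sh (the size here is illustrative):

  JVM_OPTS="$JVM_OPTS -Xss256k"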

-Brandon


Re: Upgrade Cassandra Cluster to 1.0.2

2011-11-13 Thread Brandon Williams
On Sun, Nov 13, 2011 at 4:35 AM, Michael Vaknine micha...@citypath.com wrote:
 I am trying to upgrade to 1.0.2 and when I try to start the first upgraded
 server I get the following error



 ERROR [WRITE-/10.5.6.102] 2011-11-13 10:20:37,447
 AbstractCassandraDaemon.java (line 133) Fatal exception in thread
 Thread[WRITE-/10.5.6.102,5,main]

 java.lang.NullPointerException

     at
 org.apache.cassandra.net.OutboundTcpConnectionPool.isEncryptedChannel(OutboundTcpConnectionPool.java:93)

     at
 org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:77)

     at
 org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:209)

     at
 org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:93)

You are probably missing the encryption options in your yaml.  I
noticed this problem as part of
https://issues.apache.org/jira/browse/CASSANDRA-3045.
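
For reference, a sketch of the 1.0.x defaults for that section (adjust paths
and passwords for your deployment):

  encryption_options:
      internode_encryption: none
      keystore: conf/.keystore
      keystore_password: cassandra
      truststore: conf/.truststore
      truststore_password: cassandra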

-Brandon


Re: Upgrade Cassandra Cluster to 1.0.2

2011-11-13 Thread Brandon Williams
I believe https://issues.apache.org/jira/browse/CASSANDRA-2802 broke
it.  I've created https://issues.apache.org/jira/browse/CASSANDRA-3489
to address this separately.

On Sun, Nov 13, 2011 at 9:37 AM, Michael Vaknine micha...@citypath.com wrote:
 You are right this solved the problem.
 I do not understand why version 1.0.0 was not affected since I used the same
 configuration yaml file.

 Thank you.
 Michael Vaknine


 -Original Message-
 From: Brandon Williams [mailto:dri...@gmail.com]
 Sent: Sunday, November 13, 2011 4:48 PM
 To: user@cassandra.apache.org
 Cc: cassandra-u...@incubator.apache.org
 Subject: Re: Upgrade Cassandra Cluster to 1.0.2

 On Sun, Nov 13, 2011 at 4:35 AM, Michael Vaknine micha...@citypath.com
 wrote:
 I am trying to upgrade to 1.0.2 and when I try to start the first upgraded
 server I get the following error



 ERROR [WRITE-/10.5.6.102] 2011-11-13 10:20:37,447
 AbstractCassandraDaemon.java (line 133) Fatal exception in thread
 Thread[WRITE-/10.5.6.102,5,main]

 java.lang.NullPointerException

     at

 org.apache.cassandra.net.OutboundTcpConnectionPool.isEncryptedChannel(OutboundTcpConnectionPool.java:93)

     at

 org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:77)

     at

 org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:209)

     at

 org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:93)

 You are probably missing the encryption options in your yaml.  I
 noticed this problem as part of
 https://issues.apache.org/jira/browse/CASSANDRA-3045.

 -Brandon




Re: Mass deletion -- slowing down

2011-11-13 Thread Brandon Williams
On Sun, Nov 13, 2011 at 5:57 PM, Maxim Potekhin potek...@bnl.gov wrote:
 I've done more experimentation and the behavior persists: I start with a
 normal dataset which is searchable by a secondary index. I select by that
 index the entries that match a certain criterion, then delete those. I tried
 two methods of deletion -- individual cf.remove() as well as batch removal
 in Pycassa.
 What happens after that is as follows: attempts to read the same CF, using
 the same index values start to time out in the Pycassa client (there is a
 thrift message about timeout). The entries not touched by such attempted
 deletion are read just fine still.

 Has anyone seen such behavior?

What you're probably running into is a huge amount of tombstone
filtering on the read (see
http://wiki.apache.org/cassandra/DistributedDeletes)

Since you're dealing with timeseries data, using a row-bucketing
technique like http://rubyscale.com/2011/basic-time-series-with-cassandra/
might help by eliminating the need for an index.
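
A minimal pycassa sketch of the idea (the keyspace/CF names are illustrative
assumptions):

  import time
  import pycassa

  pool = pycassa.ConnectionPool('Keyspace1')
  jobs = pycassa.ColumnFamily(pool, 'JobsByDay')

  day = time.strftime('%Y%m%d')             # the row key is the bucket
  jobs.insert(day, {'job-123': 'payload'})  # one column per entry

  jobs.remove('20111101')  # expiring a day is one row delete, no index needed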

-Brandon


Re: Mass deletion -- slowing down

2011-11-13 Thread Brandon Williams
On Sun, Nov 13, 2011 at 6:55 PM, Maxim Potekhin potek...@bnl.gov wrote:
 Thanks to all for valuable insight!

 Two comments:
 a) this is not actually time series data, but yes, each item has
 a timestamp and thus chronological attribution.

 b) so, what do you practically recommend? I need to delete
 half a million to a million entries daily, then insert fresh data.
 What's the right operation procedure?

I'd have to know more about what your access pattern is like to give
you a fully informed answer.

 For some reason I can still select on the index in the CLI, it's
 the Pycassa module that gives me trouble, but I need it as this
 is my platform and we are a Python shop.

This seems odd, since the rpc_timeout is the same for all clients.
Maybe pycassa is asking for more data than the cli?

-Brandon


Re: Mass deletion -- slowing down

2011-11-13 Thread Brandon Williams
On Sun, Nov 13, 2011 at 7:25 PM, Maxim Potekhin potek...@bnl.gov wrote:
 Each row represents a computational task (a job) executed on the grid or in
 the cloud. It naturally has a timestamp as one of its attributes,
 representing the time of the last update. This timestamp
 is used to group the data into buckets each representing one day in the
 system's activity.
 I create the DATE attribute and add it to each row, e.g. it's a column
 {'DATE','2013'}.

Hmm, so why is pushing this into the row key and then deleting the
entire row not acceptable? (this is what the link I gave would
prescribe)  In other words, you bucket at the row level, instead of
relying on a column attribute that needs an index.

-Brandon


Re: decommissioned node still in LoadMap in JMX Management Console

2011-11-09 Thread Brandon Williams
On Wed, Nov 9, 2011 at 1:28 AM, Patrik Modesto patrik.mode...@gmail.com wrote:
 Hi,

 on our production cluster of 8 nodes which is running cassandra 0.8.7
 we still see in the MBean
 org.apache.cassandra.db:type=StorageService.LoadMap in JMX
 Management console the 9th node we added for testing for a short time.
 After the testing we decommissioned the 9th node and has been
 reinstalled for another use. The node is not reported by `nodetool
 ring` nor is it in
 org.apache.cassandra.db:type=StorageService.LiveNodes MBean.

 Is it a feature or a bug?

Appears to be a minor bug: https://issues.apache.org/jira/browse/CASSANDRA-3475

-Brandon


Re: Using Cli to create a column family with column name metadata question

2011-11-07 Thread Brandon Williams
On Mon, Nov 7, 2011 at 7:36 PM, Arsene Lee
arsene@ruckuswireless.com wrote:
 Hi,

 Thanks for the reply. I'm not talking about the column name. I'm talking
 about the column metadata's column name. Right now the cli can't display the
 column's meta name correctly if the comparator type is not UTF8.

Try 'help assume;'

-Brandon


Re: Second Cassandra users survey

2011-11-05 Thread Brandon Williams
On Fri, Nov 4, 2011 at 9:50 PM, Jim Newsham jnews...@referentia.com wrote:
 Our use case is time-series data (such as sampled sensor data).  Each row
 describes a particular statistic over time, the column name is a time, and
 the column value is the sample.  So it makes perfect sense to want to delete
 columns for a given time range.  I'm sure there must be numerous other use
 cases for which using a range of column names makes sense.

Assuming you are bucketing your rows at some interval (as in
http://rubyscale.com/2011/basic-time-series-with-cassandra/), why is
deleting the entire row for the interval not acceptable?

-Brandon


Re: Why SSTable is sorted by tokens instead of row keys?

2011-11-04 Thread Brandon Williams
On Fri, Nov 4, 2011 at 7:49 AM, Gary Shi gary...@gmail.com wrote:
 I want to save time series event logs into Cassandra, and I need to load
 them by key range (row key is time-based). But we can't use
 RandomPartitioner in this way, while OrderPreservingPartitioner leads to hot
 spot problem.

You should read this:
http://rubyscale.com/2011/basic-time-series-with-cassandra/

-Brandon


Re: Second Cassandra users survey

2011-11-04 Thread Brandon Williams
On Fri, Nov 4, 2011 at 9:19 PM, Jim Newsham jnews...@referentia.com wrote:
 - Bulk column deletion by (column name) range.  Without this feature, we are
 forced to perform a range query and iterate over all of the columns,
 deleting them one by one (we do this in a batch, but it's still a very slow
 approach).  See CASSANDRA-494/3448.  If anyone else has a need for this
 issue, please raise your voice, as the feature has been tabled due to lack
 of interest.

I think the lack of interest here has been this: it's unusual to want
to delete columns for which you do not know the names, but also not
want to delete the entire row.  Is there any chance you're trying to
delete the entire row, or is it truly the case I just described?

-Brandon


Re: Retreiving column by names Vs by range, which is more performant ?

2011-11-03 Thread Brandon Williams
On Thu, Nov 3, 2011 at 2:05 PM, Ertio Lew ertio...@gmail.com wrote:
 Retrieving columns by names vs by range which is more performant , when you
 have the options to do both ?

Assuming the columns have never been overwritten, range has a small advantage.

However, in the face of frequently updated (overwritten) columns,
names will tear it up with
https://issues.apache.org/jira/browse/CASSANDRA-2498

-Brandon


Re: Storing and querying IP ranges in Cassandra

2011-11-01 Thread Brandon Williams
On Tue, Nov 1, 2011 at 11:17 AM, Tamas Marki tma...@gmail.com wrote:
 Hello,

 I'm new to the list and also to Cassandra. I found it when I was searching
 for something to replace our busy mysql server.

 One of the things we use the server for is filtering IPs based on a list of
 IP ranges. These ranges can be small and big, and there are about 50k of
 them in the database.

 In mysql this is pretty quick: they are stored as integers, and the query
 basically looks like (say ip is the ip we want to find all the ranges
 for):

 select range from rangelist where ip_start <= ip and ip_end >= ip;

 I tried to move this schema to Cassandra, but it turned out to be very slow,
 even with indexes on both columns. Since I also had to have an EQ expression
 in the query, I added an indexed text field which was the same for all rows,
 so the query in cassandra was something like this:

 select range from rangelist where type='ip' and ip_start <= ip and ip_end >= ip;

 This was very slow, and I imagine it is because it has to scan through all
 the rows, making the index useless.

This basically boils down to a binary search problem, so you don't
really need an index.  Assuming IPv4, what I would do is make the
first two bytes (class A and B, respectively) the row key. This will
give you 65025 rows, each with possibly 65025 columns (each column
name will be the other two bytes.)  When you need to find an ip, you
go to the row key and then slice the columns to find a match.  This
works well until you need to search an entire class A, in which case
you'll need to do 255 checks, but in parallel this won't be too bad,
especially because the bloom filter will save you on non-existent
rows.  Presumably there is no need to search all class As, unless for
some reason you don't know the first byte, which would be somewhat
strange.  If you do need to span a few class As this will begin to
fall apart, but hopefully that's not a common use case.
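
A rough pycassa sketch of that layout (IPv4 only; the keyspace/CF names and
the packing helper are illustrative assumptions, and the CF is assumed to use
the default BytesType comparator so packed column names sort correctly):

  import struct
  import pycassa

  pool = pycassa.ConnectionPool('Keyspace1')
  ranges = pycassa.ColumnFamily(pool, 'IpRanges')

  def split(ip):
      # '10.1.2.3' -> ('\x0a\x01', '\x02\x03')
      packed = struct.pack('4B', *map(int, ip.split('.')))
      return packed[:2], packed[2:]

  # store a range that fits inside one /16: the column name is the last
  # two bytes of the range start, the value is the range end
  row, start = split('10.1.8.0')
  _, end = split('10.1.15.255')
  ranges.insert(row, {start: end})

  # lookup: find the nearest range start at or below the ip, check its end
  row, suffix = split('10.1.12.34')
  hit = ranges.get(row, column_start=suffix, column_finish='',
                   column_reversed=True, column_count=1)
  for range_start, range_end in hit.items():
      if range_start <= suffix <= range_end:
          print('match')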

-Brandon


Re: SimpleAuthenticator missing in Cassandra 1.0

2011-10-27 Thread Brandon Williams
On Thu, Oct 27, 2011 at 3:25 PM, RobinUs2 ro...@us2.nl wrote:
 It seems that org.apache.cassandra.auth.SimpleAuthenticator is missing in the
 cassandra 1.0 binaries. Is this on purpose or did I find a bug?

From NEWS.txt:

- The SimpleAuthenticator and SimpleAuthority classes have been moved to
  the example directory (and are thus not available from the binary
  distribution). They never provided actual security and in their current
  state are only meant as examples.

-Brandon


Re: Schema versions reflect schemas on unwanted nodes

2011-10-14 Thread Brandon Williams
On Fri, Oct 14, 2011 at 2:36 PM, Eric Czech e...@nextbigsound.com wrote:
 Thanks again.  I have truncated certain cf's recently and the cli didn't
 complain and listings of the cf rows return nothing after truncation.  Is
 that data not actually deleted?

Hmm, well, now I'm confused because if 3259 is your problem truncate
won't work.  If it did then it sounds like you have a schema
disagreement problem with nodes that are members of the ring.  If
truncate didn't return an error it did work, however.

-Brandon


Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Brandon Williams
You're running into https://issues.apache.org/jira/browse/CASSANDRA-3259

Try upgrading and doing a rolling restart.

-Brandon

On Thu, Oct 13, 2011 at 9:11 AM, Eric Czech e...@nextbigsound.com wrote:
 Nope, there was definitely no intersection of the seed nodes between the two
 clusters so I'm fairly certain that the second cluster found out about the
 first through what was in the LocationInfo* system tables.  Also, I don't
 think that procedure will really help because I don't actually want the
 schema on cass-analysis-1 to be consistent with the schema in the original
 cluster -- I just want to totally remove it.

 On Thu, Oct 13, 2011 at 8:01 AM, Mohit Anchlia mohitanch...@gmail.com
 wrote:

 Do you have the same seed node specified in cass-analysis-1 as cass-1,2,3?
 I am thinking that changing the seed node in cass-analysis-2 and
 following the directions in
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve
 the problem. Someone please correct me.

 On Thu, Oct 13, 2011 at 12:05 AM, Eric Czech e...@nextbigsound.com
 wrote:
  I don't think that's what I'm after here since the unwanted nodes were
  originally assimilated into the cluster with the same initial_token
  values
  as other nodes that were already in the cluster (that have, and still do
  have, useful data).  I know this is an awkward situation so I'll try to
  depict it in a simpler way:
  Let's say I have a simplified version of our production cluster that
  looks
  like this -
  cass-1   token = A
  cass-2   token = B
  cass-3   token = C
  Then I tried to create a second cluster that looks like this -
  cass-analysis-1   token = A  (and contains same data as cass-1)
  cass-analysis-2   token = B  (and contains same data as cass-2)
  cass-analysis-3   token = C  (and contains same data as cass-3)
  But after starting the second cluster, things got crossed up between the
  clusters and here's what the original cluster now looks like -
  cass-1   token = A   (has data and schema)
  cass-2   token = B   (has data and schema)
  cass-3   token = C   (had data and schema)
  cass-analysis-1   token = A  (has *no* data and is not part of the ring,
  but
  is trying to be included in cluster schema)
  A simplified version of 'describe cluster' for the original cluster now
  shows:
  Cluster Information:
     Schema versions:
  SCHEMA-UUID-1: [cass-1, cass-2, cass-3]
  SCHEMA-UUID-2: [cass-analysis-1]
  But the simplified ring looks like this (has only 3 nodes instead of 4):
  Host       Owns     Token
  cass-1     33%       A
  cass-2     33%       B
  cass-3     33%       C
  The original cluster is still working correctly but all live schema
  updates
  are failing because of the inconsistent schema versions introduced by
  the
  unwanted node.
  From my perspective, a simple fix seems to be for cassandra to exclude
  nodes
  that aren't part of the ring from the schema consistency requirements.
   Any
  reason that wouldn't work?
  And aside from a possible code patch, any recommendations as to how I
  can
  best fix this given the current 8.4 release?
 
  On Thu, Oct 13, 2011 at 12:14 AM, Jonathan Ellis jbel...@gmail.com
  wrote:
 
  Does nodetool removetoken not work?
 
  On Thu, Oct 13, 2011 at 12:59 AM, Eric Czech e...@nextbigsound.com
  wrote:
   Not sure if anyone has seen this before but it's really killing me
   right
   now.  Perhaps that was too long of a description of the issue so
   here's
   a
   more succinct question -- How do I remove nodes associated with a
   cluster
   that contain no data and have no reason to be associated with the
   cluster
   whatsoever?
   My last resort here is to stop cassandra (after recording all tokens
   for
   each node), set the initial token for each node in the cluster in
   cassandra.yaml, manually delete the LocationInfo* sstables in the
   system
   keyspace, and then restart.  I'm hoping there's a simpler, less
   seemingly
   risky way to do this so please, please let me know if that's true!
   Thanks again.
   - Eric
   On Tue, Oct 11, 2011 at 11:55 AM, Eric Czech e...@nextbigsound.com
   wrote:
  
   Hi, I'm having what I think is a fairly uncommon schema issue --
   My situation is that I had a cluster with 10 nodes and a consistent
   schema.  Then, in an experiment to setup a second cluster with the
   same
   information (by copying the raw sstables), I left the LocationInfo*
   sstables
   in the system keyspace in the new cluster and after starting the
   second
   cluster, I realized that the two clusters were discovering each
   other
   when
   they shouldn't have been.  Since then, I changed the cluster name
   for
   the
   second cluster and made sure to delete the LocationInfo* sstables
   before
   starting it and the two clusters are now operating independent of
   one
   another for the most part.  The only remaining connection between
   the
   two
   seems to be that the first cluster is still maintaining references
   to
   nodes
   in the second cluster ...

Re: MapReduce with two ethernet cards

2011-10-13 Thread Brandon Williams
What is your rpc_address set to?  If it's 0.0.0.0 (bind everything)
then that's not going to work if listen_address is blocked.
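
For example, in cassandra.yaml (the 172.28 address is illustrative):

  listen_address: 10.1.1.24   # internal, node-to-node
  rpc_address: 172.28.0.24    # external, Thrift clients; avoid 0.0.0.0 here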

-Brandon

On Thu, Oct 13, 2011 at 11:13 AM, Scott Fines scott.fi...@nisc.coop wrote:
 I upgraded to cassandra 0.8.7, and the problem persists.

 Scott
 
 From: Brandon Williams [dri...@gmail.com]
 Sent: Monday, October 10, 2011 12:28 PM
 To: user@cassandra.apache.org
 Subject: Re: MapReduce with two ethernet cards

 On Mon, Oct 10, 2011 at 11:47 AM, Scott Fines scott.fi...@nisc.coop wrote:
 Hi all,
 This may be a silly question, but I'm at a bit of a loss, and was hoping for
 some help.
 I have a Cassandra cluster set up with two NICs--one for internel
 communication between cassandra machines (10.1.1.*), and one to respond to
 Thrift RPC (172.28.*.*).
 I also have a Hadoop cluster set up, which, for unrelated reasons, has to
 remain separate from Cassandra, so I've written a little MapReduce job to
 copy data from Cassandra to Hadoop. However, when I try to run my job, I
 get
 java.io.IOException: failed connecting to all endpoints
 10.1.1.24,10.1.1.17,10.1.1.16
 which is puzzling to me. It seems like the MR is attempting to connect to
 the internal communication IPs instead of the external Thrift IPs. Since I
 set up a firewall to block external access to the internal IPs of Cassandra,
 this is obviously going to fail.
 So my question is: why does Cassandra MR seem to be grabbing the
 listen_address instead of the Thrift one. Presuming it's not a funky
 configuration error or something on my part, is that strictly necessary? All
 told, I'd prefer if it was connecting to the Thrift IPs, but if it can't,
 should I open up port 7000 or port 9160 between Hadoop and Cassandra?
 Thanks for your help,
 Scott

 Your cassandra is old, upgrade to the latest version.

 -Brandon



Re: Multi DC setup

2011-10-11 Thread Brandon Williams
On Tue, Oct 11, 2011 at 2:36 AM, Peter Schuller
peter.schul...@infidyne.com wrote:
 Google/check wiki/read docs about NetworkTopologyStrategy and
 PropertyFileSnitch. I don't have a good link to multi-dc off hand
 (anyone got a good link to suggest that goes through this?).

http://www.datastax.com/docs/0.8/cluster_architecture/replication is
pretty good imo.
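
A cassandra-cli sketch of a two-DC keyspace (0.8-era syntax assumed; the DC
names must match what your snitch reports):

  create keyspace ks
    with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
    and strategy_options = [{DC1:3, DC2:3}];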

-Brandon


Re: add bloomfilter results to nodetool?

2011-10-11 Thread Brandon Williams
On Tue, Oct 11, 2011 at 12:19 PM, Yang tedd...@gmail.com wrote:
 I find the info about bloomfilter very helpful, could we add that to NodeCmd ?

Feel free to create a ticket and tag it 'lhf'

-Brandon


Re: Volunteers needed - Wiki

2011-10-10 Thread Brandon Williams
On Mon, Oct 10, 2011 at 11:51 AM, hani elabed hani.ela...@gmail.com wrote:
 Hi Aaron,
 I can help with the documentation... I grabbed tons of screenshots as I was
 installing Cassandra source trunk(1.0.0.rc2?) on my Mac OS X Snow leopard on
 Eclipse Galileo and later Eclipse Indigo, I will be installing it on Eclipse
 for Ubuntu 10.04 soon. I took the sceenshots after I noticed the missing
 picts in here:
 http://wiki.apache.org/cassandra/RunningCassandraInEclipse

Unfortunately, the ASF no longer allows attachments on the wiki.

-Brandon


Re: MapReduce with two ethernet cards

2011-10-10 Thread Brandon Williams
On Mon, Oct 10, 2011 at 11:47 AM, Scott Fines scott.fi...@nisc.coop wrote:
 Hi all,
 This may be a silly question, but I'm at a bit of a loss, and was hoping for
 some help.
 I have a Cassandra cluster set up with two NICs--one for internel
 communication between cassandra machines (10.1.1.*), and one to respond to
 Thrift RPC (172.28.*.*).
 I also have a Hadoop cluster set up, which, for unrelated reasons, has to
 remain separate from Cassandra, so I've written a little MapReduce job to
 copy data from Cassandra to Hadoop. However, when I try to run my job, I
 get
 java.io.IOException: failed connecting to all endpoints
 10.1.1.24,10.1.1.17,10.1.1.16
 which is puzzling to me. It seems like the MR is attempting to connect to
 the internal communication IPs instead of the external Thrift IPs. Since I
 set up a firewall to block external access to the internal IPs of Cassandra,
 this is obviously going to fail.
 So my question is: why does Cassandra MR seem to be grabbing the
 listen_address instead of the Thrift one. Presuming it's not a funky
 configuration error or something on my part, is that strictly necessary? All
 told, I'd prefer if it was connecting to the Thrift IPs, but if it can't,
 should I open up port 7000 or port 9160 between Hadoop and Cassandra?
 Thanks for your help,
 Scott

Your cassandra is old, upgrade to the latest version.

-Brandon


Re: read on multiple SS tables

2011-10-06 Thread Brandon Williams
On Thu, Oct 6, 2011 at 3:56 PM, aaron morton aa...@thelastpickle.com wrote:
 -If you perform a query for a specific row key and a column name, does
 it read the most recent SSTable first and if it finds a hit, does it
 stop there or does it need to read through all the SStables (to find
 most recent one) regardless of whether if found a hit on the most
 recent SSTable or not?

 Reads all SSTables, as the only way to know which column instance has the
 highest time stamp is to read them all.

Until https://issues.apache.org/jira/browse/CASSANDRA-2498 which makes
this much faster.

-Brandon


Re: help needed interpreting Read/Write latency in cfstats and cfhistograms output

2011-10-04 Thread Brandon Williams
On Mon, Oct 3, 2011 at 3:57 PM, Ramesh Natarajan rames...@gmail.com wrote:
 Thanks Aaron. The ms in the latency: is it microseconds or milliseconds?
 I ran the 2 commands at the same time. I was expecting the values to be
 somewhat similar, but from my output earlier you can see the median
 read latency in the histogram output is about 10 milliseconds whereas
 cfstats showed 5 ms.  Is this normal?
 thanks
 Ramesh

You should be aware of https://issues.apache.org/jira/browse/CASSANDRA-3222

-Brandon


Re: dedicated gossip lan

2011-10-04 Thread Brandon Williams
On Tue, Oct 4, 2011 at 2:00 PM, Sorin Julean sorin.jul...@gmail.com wrote:
 Hi,

  Did anyone use dedicated interfaces and a LAN / VLAN for gossip traffic?

  Any benefits to such an approach?

I don't think there is any substantial benefit to doing this, but also
it's impossible: gossip is not separate from the storage protocol.  Of
course, I am assuming you mean just gossip, but if what you actually
mean is the entire storage protocol (listen_address) then yes, there
is benefit to having a dedicated network for that.

-Brandon


Re: frequent node UP/Down?

2011-09-25 Thread Brandon Williams
On Sat, Sep 24, 2011 at 4:54 PM, Yang tedd...@gmail.com wrote:
 I'm using 1.0.0


 there seem to be too many node Up/Dead events detected by the failure
 detector.
 I'm using  a 2 node cluster on EC2, in the same region, same security
 group, so I assume the message drop
 rate should be fairly low.
 but in about every 5 minutes, I'm seeing some node detected as down,
 and then Up again quickly

This is fairly common on ec2 due to wild variance in the network.
Increase your phi_convict_threshold to 10 or higher (but I wouldn't go
over 12, this is roughly an exponential increase)
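
That is a one-line change in cassandra.yaml (the default is 8):

  phi_convict_threshold: 10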

-Brandon


Re: frequent node UP/Down?

2011-09-25 Thread Brandon Williams
On Sun, Sep 25, 2011 at 12:52 PM, Yang tedd...@gmail.com wrote:
 Thanks Brandon.

 I suspected that, but I think that's precluded as a possibility since
 I set up another background job to do
 echo | nc other_box 7000
 in a loop,
 this job seems to be working fine all the time, so network seems fine.

This isn't measuring latency, however.  That is how the failure
detector works, using probability to estimate the likelihood that a
given host is alive, based on previous history.  The situation on ec2
is something like the following: 99% of pings are 1ms, but sometimes
there are brief periods of 100ms, and this is where the FD says "this
is not realistic, I think the host is dead" but then receives the
ping, and thus the flapping.  I've seen it a million times; increasing
the phi threshold always solves it.

-Brandon


Re: frequent node UP/Down?

2011-09-25 Thread Brandon Williams
On Sun, Sep 25, 2011 at 1:10 PM, Yang tedd...@gmail.com wrote:
 Thanks Brandon.

 I'll try this.

 but you can also see my later post regarding message drop :
 http://mail-archives.apache.org/mod_mbox/cassandra-user/201109.mbox/%3ccaanh3_8aehidyh9ybt82_emh3likbcdsenrak3jhfzaj2l+...@mail.gmail.com%3E

 that seems to show something in either code or background load causing
 messages to be really dropped

I see.  My guess is then this: there is a local clock problem, causing
generations to be the same, thus not notifying the FD.  So perhaps the
problem is not network-related, but it is something in the ec2
environment.

-Brandon


Re: Nodetool removetoken taking days to run.

2011-09-14 Thread Brandon Williams
On Wed, Sep 14, 2011 at 8:54 AM, Ryan Hadley r...@sgizmo.com wrote:
 Hi,

 So, here's the backstory:

 We were running Cassandra 0.7.4 and at one point in time had a node in the 
 ring at 10.84.73.18. We removed this node from the ring successfully in 
 0.7.4. It stopped showing in the nodetool ring command. But occasionally we'd 
 still get weird log entries about failing to write/read to IP 10.84.73.18.

 We upgraded to Cassandra 0.8.4. Now, nodetool ring shows this old node:

 10.84.73.18     datacenter1 rack1       Down   Leaving ?               6.71%  
  32695837177645752437561450928649262701

 So I started a nodetool removetoken on 32695837177645752437561450928649262701 
 last Friday. It's still going strong this morning, on day 5:

 ./bin/nodetool -h 10.84.73.47 -p 8080 removetoken status
 RemovalStatus: Removing token (32695837177645752437561450928649262701). 
 Waiting for replication confirmation from 
 [/10.84.73.49,/10.84.73.48,/10.84.73.51].

 Should I just be patient? Or is something really weird with this node?

5 days seems excessive unless there is a very large amount of data per
node.  I would check nodetool netstats, and if the streams don't look
active issue a 'removetoken force' against 10.84.73.47 and accept that
you may possibly need to run repair to restore the replica count.
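
For example, against the same node:

  nodetool -h 10.84.73.47 -p 8080 netstats
  nodetool -h 10.84.73.47 -p 8080 removetoken force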

-Brandon


Re: Nodetool removetoken taking days to run.

2011-09-14 Thread Brandon Williams
On Wed, Sep 14, 2011 at 4:25 PM, Ryan Hadley r...@sgizmo.com wrote:
 Hi Brandon,

 Thanks for the reply. Quick question though:

 1. We write all data to this ring with a TTL of 30 days
 2. This node hasn't been in the ring for at least 90 days, more like 120 days 
 since it's been in the ring.

 So, if I nodetool removetoken forced it, would I still have to be concerned 
 about running a repair?

There have probably been some writes that thought that node was part
of the replica set, so you may still be missing a replica in that
regard.  If you're only holding the data for 30 days though, it might
not be worth the trouble of repairing and instead bet that not all of
the live replicas will die in the next month.

 Also, after this node is removed, I'm going to rebalance with nodetool move. 
 Would that remove the repair requirement too?

If you intend to replace the node, it's better to bootstrap the new
node at the dead node's token minus one, and then do the removetoken
force.  This would actually obviate the need to repair (except for one
key, you can move the node to the old token once it has been removed)
assuming that your consistency level was greater than ONE for writes,
or your clients always replayed any failures. This holds true for
moving to the old token as well.
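
Roughly, with T being the dead node's token (an outline, not a script):

  # 1. bootstrap the replacement with initial_token = T - 1 in cassandra.yaml
  # 2. nodetool removetoken force        (completes the stuck removal)
  # 3. nodetool move T                   (on the new node, once T is free)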

-Brandon


Re: Cassandra 082 - Large swap memory

2011-08-26 Thread Brandon Williams
On Thu, Aug 25, 2011 at 11:42 PM, King JKing beuk...@gmail.com wrote:
 Dear Jonathan,
 The Cassandra process has 63.5 GB virtual size.
 I am referring to the RES column in top. RES is 8.3G, much larger than the
 2.5G of used memory shown in JConsole.

https://issues.apache.org/jira/browse/CASSANDRA-2868

-Brandon


Re: Completely removing a node from the cluster

2011-08-23 Thread Brandon Williams
On Tue, Aug 23, 2011 at 2:26 AM, aaron morton aa...@thelastpickle.com wrote:
 I'm running low on ideas for this one. Anyone else ?

 If the phantom node is not listed in the ring, other nodes should not be 
 storing hints for it. You can see what nodes they are storing hints for via 
 JConsole.

I think I found it in https://issues.apache.org/jira/browse/CASSANDRA-3071

--Brandon


Re: Ec2Snitch

2011-08-10 Thread Brandon Williams
You probably have other nodes that are NOT using the snitch yet, so
they haven't populated DC/RACK info yet.  The exceptions will stop
when all snitches have been changed.

On Wed, Aug 10, 2011 at 7:55 PM, Viliam Holub viliam.ho...@ucd.ie wrote:

 Hi,

 I tried to switch to Ec2Snitch. Although it correctly found the region:

 INFO 23:18:00,643 EC2Snitch using region: eu-west, zone: 1a.

 it started to report NullPointerException every second:

 ERROR 00:23:40,268 Internal error processing get_slice
 java.lang.NullPointerException
        at 
 org.apache.cassandra.locator.Ec2Snitch.getDatacenter(Ec2Snitch.java:93)
        at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.getDatacenter(DynamicEndpointSnitch.java:122)
        at 
 org.apache.cassandra.locator.OldNetworkTopologyStrategy.calculateNaturalEndpoints(OldNetworkTopologyStrategy.java:64)
        at 
 org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:99)
        at 
 org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:1708)
        at 
 org.apache.cassandra.service.StorageService.getLiveNaturalEndpoints(StorageService.java:1702)
        at 
 org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:511)
        at 
 org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:480)
        at 
 org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:126)
        at 
 org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:280)
        at 
 org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:362)
        at 
 org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:323)
        at 
 org.apache.cassandra.thrift.Cassandra$Processor$get_slice.process(Cassandra.java:3033)
        at 
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
        at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
        at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)

 Am I doing something wrong?

 Thanks,
 Viliam




Re: migrating from 0.6 to 0.8, java.io.IOError: ... cannot extend file to required size

2011-08-09 Thread Brandon Williams
On Tue, Aug 9, 2011 at 6:43 PM, ian douglas i...@armorgames.com wrote:
 Driftx on IRC:
 follow up on the list saying that disabling mmap fixed it

Which is my code for "I have no idea why this is happening, maybe
someone else does" :)

-Brandon


Re: Cassandra-2252

2011-08-02 Thread Brandon Williams
On Tue, Aug 2, 2011 at 10:22 PM, Bill Hastings bllhasti...@gmail.com wrote:
 Oops. Sorry. Any information would be great.

The class does not exist in trunk and appears unused in the 0.8 branch.

--
Eric Evans


Re: cqlsh error using assume

2011-07-21 Thread Brandon Williams
'assume' is only valid in the cli, not cql.
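
In cassandra-cli it looks like this (note the trailing semicolon):

  assume TransactionLogs comparator as ascii;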

On Thu, Jul 21, 2011 at 7:59 AM, Stephen Pope stephen.p...@quest.com wrote:
 I’m trying to use cqlsh (on Windows) to get some values from my database
 using secondary indexes. I’m not sure if it’s something I’m doing or not (I
 can’t seem to find any syntactical help for assume). I’m running:

 assume TransactionLogs comparator as ascii

 where TransactionLogs is my column family, and has string column names in
 it. The resulting (intuitive) error message is:

 line 1:0 no viable alternative at input 'assume'

 Anybody know what this means?

 Cheers,
 Steve


Re: JDBC CQL Driver unable to locate cassandra.yaml

2011-07-16 Thread Brandon Williams
Try another slash in file:/, i.e. file://

On Thu, Jul 14, 2011 at 10:55 AM, Derek Tracy trac...@gmail.com wrote:
 I tried putting the cassandra.yaml in the classpath but got the same error.
 Adding -Dcassandra.config=file:/path/to/cassandra.yaml did work.


 -
 Derek Tracy
 trac...@gmail.com
 -



 On Wed, Jul 13, 2011 at 6:22 PM, Jonathan Ellis jbel...@gmail.com wrote:

 The current version of the driver does require having the server's
 cassandra.yaml on the classpath.  This is a bug.

 On Wed, Jul 13, 2011 at 3:13 PM, Derek Tracy trac...@gmail.com wrote:
  I am trying to integrate the Cassandra JDBC CQL driver with my companies
  ETL
  product.
  We have an interface that performs database queries using their
  respective
  JDBC drivers.
  When I try to use the Cassandra CQL JDBC driver I keep getting a
  stacktrace:
 
  Unable to locate cassandra.yaml
 
  I am using Cassandra 0.8.1.  Is there a guide on how to utilize/setup
  the
  JDBC driver?
 
 
 
  Derek Tracy
  trac...@gmail.com
  -
 
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




Re: node stuck leaving

2011-07-12 Thread Brandon Williams
On Mon, Jul 11, 2011 at 11:51 PM, Casey Deccio ca...@deccio.net wrote:
 java.lang.RuntimeException: Cannot recover SSTable with version f (current
 version g).

You need to scrub before any streaming is performed.

-Brandon


Re: Repair doesn't work after upgrading to 0.8.1

2011-07-05 Thread Brandon Williams
On Fri, Jul 1, 2011 at 3:16 AM, Sylvain Lebresne sylv...@datastax.com wrote:
 To make it clear what the problem is, this is not a repair problem. This is
 a gossip problem. Gossip is reporting that the remote node is a 0.7 node
 and repair is just saying "I cannot use that node because repair has changed
 and the 0.7 node will not know how to answer me correctly", which is the
 correct behavior if the node happens to be a 0.7 node.

Technically, this is not part of gossip (in that no state is being
gossiped for this, but we do maintain this state in the Gossiper
class), but your analysis of the problem is correct.

The problem is that on an upgrade via rolling restart, the existing
nodes still remember the new ones as being old, so they mimic the old
version, thusly propagating the old version around.

 Hence, I'm kind of baffled that dropping a keyspace and recreating it fixed
 anything. Unless as part of removed the keyspace, you've deleted the
 system tables, in which case that could have triggered something.

I don't see how this could help either, since the version is bound in
Gossiper and set by IncomingTcpConnection.

I've created https://issues.apache.org/jira/browse/CASSANDRA-2860 to
get this resolved.

--Brandon

