[Cassandra Wiki] Update of ClientOptions by DeanHiller

2012-08-16 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The ClientOptions page has been changed by DeanHiller:
http://wiki.apache.org/cassandra/ClientOptions?action=diff&rev1=156&rev2=157

* Pycassa: http://github.com/pycassa/pycassa
* Telephus: http://github.com/driftx/Telephus (Twisted)
   * Java:
+   * PlayOrm: https://github.com/deanhiller/playorm
* Astyanax: https://github.com/Netflix/astyanax/wiki/Getting-Started
* Hector:
 * Site: http://hector-client.org


[jira] [Commented] (CASSANDRA-4512) Nodes removed with removetoken stay around preventing truncation

2012-08-16 Thread Manoj Kanta Mainali (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435958#comment-13435958
 ] 

Manoj Kanta Mainali commented on CASSANDRA-4512:


Do you have any concrete steps on how to reproduce this, and what error 
message do you get? Did you check the server logs? 

I tried truncating after creating two instances, killing one of them, and 
removing its token using nodetool, but the cli didn't report any error and the 
truncation was successful. 

The only reason truncate will report nodes as UNREACHABLE is if the endpoint 
still exists in the down-nodes (unreachable) set. If your removetoken was 
successful and didn't throw any error, the endpoint will have been removed from 
the unreachable nodes set.
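The gate described above can be sketched as a small model. All names here are hypothetical stand-ins for illustration, not Cassandra's actual classes:

```python
# Toy model of the behaviour described above: truncate refuses to run while
# any endpoint remains in the unreachable set, and a successful removetoken
# clears the dead endpoint from that set.

class ClusterState:
    def __init__(self, live, unreachable):
        self.live = set(live)
        self.unreachable = set(unreachable)

    def removetoken(self, endpoint):
        # Removing a dead node's token also drops it from the unreachable set.
        self.unreachable.discard(endpoint)

    def truncate(self):
        if self.unreachable:
            raise RuntimeError(
                "Cannot truncate: unreachable nodes %s" % sorted(self.unreachable))
        return "truncated"

state = ClusterState(live={"10.0.0.1"}, unreachable={"10.0.0.2"})
try:
    state.truncate()
except RuntimeError as e:
    print(e)  # fails while 10.0.0.2 is still marked unreachable

state.removetoken("10.0.0.2")
print(state.truncate())  # truncated
```

This matches the comment's diagnosis: if removetoken completed without error, the endpoint is gone from the unreachable set and truncate succeeds.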

 Nodes removed with removetoken stay around preventing truncation
 

 Key: CASSANDRA-4512
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4512
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.10
 Environment: Ubuntu, EC2
Reporter: Taras Ovsyankin
Priority: Minor

 Removed multiple nodes from the cluster in order to scale down (killed VMs 
 then ran removetoken for every dead node). Nodetool ring looks happy, but 
 cassandra-cli reports removed nodes as UNREACHABLE and truncation doesn't 
 work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436020#comment-13436020
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

I'm not aware of commitlog format changes recently.  What specific versions are 
you saying are incompatible?

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical

 When data is written to the commitlog and I restart the machine, all committed 
 data that has not been flushed to disk is lost. 
 In the startup logs it says that it replays the commitlog successfully, but 
 the data is not available afterwards. 
 When I open the commitlog file in an editor I can see the added data, but 
 after the restart it cannot be fetched from cassandra. 
 {code}
  INFO 09:59:45,362 Replaying 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Finished reading 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Log replay complete, 0 replayed mutations
 {code}





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436024#comment-13436024
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

Well, I spoke a little too fast here. It looks like they are incompatible, and 
this (so far unconfirmed) incompatibility seems to occur between 1.1.1 or 1.1.2 
and 1.1.3. 






[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435597#comment-13435597
 ] 

Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 2:08 AM:
--

I didn't find (or try to find) a way to reproduce this bug, but I found an 
(incomplete) fix. I'm on Debian 6.0.5, still with Cassandra 1.1.3.

The keyspace is named dom2dom (so that I don't have to replace any names).

In my case I removed all the commitlog files that were created prior to 1.1.3:
{code:title=Shell commands|borderStyle=solid}
# 1. Flush cassandra
nodetool flush

# 2. Stop cassandra
service cassandra stop

# 3. Move the sstable files to another directory
mkdir /var/lib/cassandra/toload
mv /var/lib/cassandra/data/dom2dom /var/lib/cassandra/toload/dom2dom

# In my case, I had to create a 127.0.0.2 loopback interface
# and update cassandra.yaml to change the rpc_address and listen_address
# settings to 127.0.0.2 so that sstableloader could work.

# 4. Start cassandra
service cassandra start

# At that point the commitlogs should work again and some new
# sstables should have been created:
du -sh /var/lib/cassandra/data/dom2dom
# Returns: 236K

# You now have the new data but not the old, so load the old
# data using sstableloader:
find /var/lib/cassandra/toload/dom2dom/ -type d -exec sstableloader -d 127.0.0.2 {} \;

# In my case, I had to put localhost back in cassandra.yaml for the
# rpc_address and listen_address settings.

# 5. You can now delete the /var/lib/cassandra/toload folder
{code}

*IMPORTANT NOTE:*
I don't think the old (prior to 1.1.3) commitlog files will work. From what 
I've quickly tested, they don't.

Still, it would be very good to have some kind of error/warning log around 
commitlogs that are skipped for whatever reason, because currently you only 
discover the problem when you restart your cassandra server.
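The warning being asked for could look something like the following sketch. This is a hypothetical replay loop, not Cassandra's actual CommitLogReplayer; the version numbers and helper are invented for illustration:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("replay")

CURRENT_VERSION = 3  # stand-in for the node's commitlog format version

def replay(segments):
    """Replay (segment_name, version, mutations) triples, warning loudly
    when a segment contributes nothing instead of silently reporting
    '0 replayed mutations' as in the log excerpt above."""
    replayed = 0
    for name, version, mutations in segments:
        if version != CURRENT_VERSION:
            log.warning("Skipping %s: commitlog version %d != %d",
                        name, version, CURRENT_VERSION)
            continue
        replayed += len(mutations)
    if replayed == 0 and segments:
        log.warning("Log replay finished with 0 replayed mutations; "
                    "unflushed data may not have been recovered")
    return replayed

# A pre-upgrade segment is skipped with an explicit warning:
print(replay([("CommitLog-83203377067.log", 2, ["m1", "m2"])]))
```

The point is only that a zero-mutation replay of a non-empty segment is surfaced at WARN level rather than discovered later as missing data.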


  was (Author: superfc):
So, I didn't find (or tried to find) a solution to reproduce this bug. But 
I found an (incomplete) fix. I'm on Debian/6.0.5, still with Cassandra/1.1.3:

For a keyspace named dom2dom (so that I don't have to replace any name).

In my case I removed all the commitlog files that were created prior to 1.1.3
{code:title=Shell commands|borderStyle=solid}
# 1. Flush cassandra
nodetool flush

# 2. Stop cassandra
service cassandra stop

# 3. Move the sstable files to an other directory
mkdir /var/lib/cassandra/toload
mv /var/lib/cassandra/data/dom2dom /var/lib/cassandra/toload/m2mp

# In my case, I had to create a 127.0.0.2 loopback interface 
# and update the cassandra.yaml file to change rpc_address and listen_address 
settings 
# to 127.0.0.2 so that sstableloader could work.

# 4. Start cassandra
service cassandra stop

# At that point the commitlogs should work again and you should have some new 
sstable created
du -sh /var/lib/cassandra/dom2dom
# Returns: 236K

# You now have the new data and not the old one, so you need to load the old 
data using sstableloader:
find /var/lib/cassandra/toload/dom2dom/ -type d -exec sstableloader -d 
127.0.0.2 {} \;

# In my case, I had to put back localhost in the cassandra.yaml for the 
rpc_address and listen_address settings

# You can delete the /var/lib/cassandra/toload folder
{code}

*IMPORTANT NOTE:*
I'm don't think the old (prior to 1.1.3) commitlog files will work. From what 
I've quickly tested it doesn't.

Still, it would be very good to have some kind of error/warning logs around 
this commit logs and sstables incompatibility issue. Because you currently only 
discover it when you restart your cassandra server.

  
[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436024#comment-13436024
 ] 

Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 2:08 AM:
--

Well, I spoke a little too fast here. It looks like they are incompatible, and 
this (so far unconfirmed) incompatibility seems to occur between 1.1.1 or 1.1.2 
and 1.1.3. I tried to clarify my previous message.


  was (Author: superfc):
Well, I spoke a little bit too fast here. It looks like they are 
incompatible and it looks like this potential so-called (by me) incompatibility 
occurs between 1.1.1 or 1.1.2 and 1.1.3. 

  




[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436030#comment-13436030
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

If you can reproduce commitlog incompatibility between those versions, please 
let us know.





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436033#comment-13436033
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

As I have something that now works fine, I don't think I will do it.





[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435597#comment-13435597
 ] 

Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 2:21 AM:
--

I didn't find (or try to find) a way to reproduce this bug, but I found an 
(incomplete) fix. I'm on Debian 6.0.5, still with Cassandra 1.1.3.

The keyspace is named dom2dom (so that I don't have to replace any names).

In my case I removed all the commitlog files that were created prior to 1.1.3:
{code:title=Shell commands|borderStyle=solid}
# 1. Flush cassandra
nodetool flush

# 2. Stop cassandra
service cassandra stop

# 3. Move the sstable files to another directory
mkdir /var/lib/cassandra/toload
mv /var/lib/cassandra/data/dom2dom /var/lib/cassandra/toload/dom2dom

# In my case, I had to create a 127.0.0.2 loopback interface
# and update cassandra.yaml to change the rpc_address and listen_address
# settings to 127.0.0.2 so that sstableloader could work.

# 4. Start cassandra
service cassandra start

# At that point the commitlogs should work again and some new
# sstables should have been created:
du -sh /var/lib/cassandra/data/dom2dom
# Returns: 236K

# You now have the new data but not the old, so load the old
# data using sstableloader:
find /var/lib/cassandra/toload/dom2dom/ -type d -exec sstableloader -d 127.0.0.2 {} \;

# In my case, I had to put localhost back in cassandra.yaml for the
# rpc_address and listen_address settings.

# 5. You can now delete the /var/lib/cassandra/toload folder
{code}

*IMPORTANT NOTE:*
I don't think the old (prior to 1.1.3) commitlog files will work. From what 
I've quickly tested, they don't.

Still, it would be very good to have some kind of error/warning log around 
commitlogs that are skipped for whatever reason, because currently you only 
discover the problem when you restart your cassandra server.


  was (Author: superfc):
So, I didn't find (or tried to find) a solution to reproduce this bug. But 
I found an (incomplete) fix. I'm on Debian/6.0.5, still with Cassandra/1.1.3:

For a keyspace named dom2dom (so that I don't have to replace any name).

In my case I removed all the commitlog files that were created prior to 1.1.3
{code:title=Shell commands|borderStyle=solid}
# 1. Flush cassandra
nodetool flush

# 2. Stop cassandra
service cassandra stop

# 3. Move the sstable files to an other directory
mkdir /var/lib/cassandra/toload
mv /var/lib/cassandra/data/dom2dom /var/lib/cassandra/toload/m2mp

# In my case, I had to create a 127.0.0.2 loopback interface 
# and update the cassandra.yaml file to change rpc_address and listen_address 
settings 
# to 127.0.0.2 so that sstableloader could work.

# 4. Start cassandra
service cassandra stop

# At that point the commitlogs should work again and you should have some new 
sstable created
du -sh /var/lib/cassandra/dom2dom
# Returns: 236K

# You now have the new data and not the old one, so you need to load the old 
data using sstableloader:
find /var/lib/cassandra/toload/dom2dom/ -type d -exec sstableloader -d 
127.0.0.2 {} \;

# In my case, I had to put back localhost in the cassandra.yaml for the 
rpc_address and listen_address settings

# You can delete the /var/lib/cassandra/toload folder
{code}

*IMPORTANT NOTE:*
I'm don't think the old (prior to 1.1.3) commitlog files will work. From what 
I've quickly tested it doesn't.

Still, it would be very good to have some kind of error/warning logs around 
these commitlogs that are not taken into account for a reason. Because you 
currently only discover it when you restart your cassandra server.

  

[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436059#comment-13436059
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

Your instincts were better than mine: combining compaction and flush I/O into a 
single executor was a mistake.  We could band-aid it by adding some kind of 
semaphore mechanism to make sure we always leave at least one thread free for 
flushing, but this still won't let us temporarily max out on flushing at the 
expense of compaction without introducing extremely complicated preemption 
logic.

So, color me convinced that we need to keep separate executors for flush and 
compaction.

Additionally, the more I think about it, the less I think the DBT abstraction is 
what we want here.  Or, at a higher level: I don't think we want to be that 
strict about one thread per disk.  That was my fault in the first place, sorry!

If we instead just follow the above disk prioritization logic, we'll still get 
effectively thread-per-disk until disks start to run out of space.  But having 
a (standard) flexible pool of threads means that we generalize much better to 
SSDs, where having substantially more threads than disks makes sense (since 
compaction becomes CPU bound).

So I think we can simplify our approach a lot, perhaps by having a global 
Directory state that tracks space remaining and how many I/O tasks are running 
on each, which we can use when handing out flush and compaction targets.  The 
executor architecture won't need to change.  (We may want to introduce a 
DirectoryBoundRunnable abstraction whose run method encapsulates updating the 
I/O task count and free space after running the flush/compaction, but without 
trying it I'm not sure it actually works as imagined.)
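As a rough illustration of the proposal (all names and the selection policy are hypothetical, not the eventual patch), a global directory state handing out targets by in-flight task count and free space might look like:

```python
import threading

class DirectoryState:
    """Sketch: tracks free space and in-flight I/O tasks per data directory."""
    def __init__(self, dirs):
        # dirs: {path: free_bytes}
        self.free = dict(dirs)
        self.active = {d: 0 for d in dirs}
        self.lock = threading.Lock()

    def pick(self, needed_bytes):
        # Prefer the directory with the fewest running tasks, then the most
        # free space; this behaves like thread-per-disk until disks fill up,
        # but degrades gracefully when they do.
        with self.lock:
            candidates = [d for d, f in self.free.items() if f >= needed_bytes]
            if not candidates:
                raise RuntimeError("no directory has enough space")
            target = min(candidates,
                         key=lambda d: (self.active[d], -self.free[d]))
            self.active[target] += 1
            return target

    def done(self, d, written_bytes):
        # Called when a flush/compaction finishes, in the spirit of the
        # DirectoryBoundRunnable idea: update task count and space used.
        with self.lock:
            self.active[d] -= 1
            self.free[d] -= written_bytes

state = DirectoryState({"/data1": 100, "/data2": 80})
a = state.pick(10)   # idle disk with most free space
b = state.pick(10)   # the other disk, since the first is now busy
state.done(a, 10)
print(a, b)
```

Because the policy only consults shared state at hand-out time, a flexible thread pool (useful for SSDs, where compaction is CPU bound) can sit on top without per-disk threads.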

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292.txt, 4292-v2.txt, 4292-v3.txt


 As noted in CASSANDRA-809, we have a certain number of flush (and compaction) 
 threads, which mix and match disk volumes indiscriminately.  It may be worth 
 creating a tight thread-to-disk affinity, to prevent unnecessary conflict at 
 that level.
 OTOH as SSDs become more prevalent this becomes a non-issue.  Unclear how 
 much pain this actually causes in practice in the meantime.





[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436075#comment-13436075
 ] 

Jonathan Ellis commented on CASSANDRA-2710:
---

What is the use case?  TBH this seems like a misfeature to me, that we only 
support for backwards compatibility.

 Get multiple column ranges
 --

 Key: CASSANDRA-2710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: David Boxenhorn
Assignee: Vijay
  Labels: compositeColumns, cql
 Attachments: 0001-2710-multiple-column-ranges-cql.patch, 
 0001-2710-multiple-column-ranges-thrift.patch


 I have replaced all my super column families with regular column families 
 using composite columns. I have easily been able to support all previous 
 functionality (I don't need range delete) except for one thing: getting 
 multiple super columns with a single access. For this, I would need to get 
 multiple ranges. (I can get multiple columns, or a single range, but not 
 multiple ranges.) 
 For example, I used to have
 [superColumnName1,subColumnName1..N],[superColumnName2,subColumnName1..N]
 and I could get superColumnName1, superColumnName2
 Now I have
 [<len>superColumnName1<0><len>subColumnName1..<len>superColumnName1<0><len>subColumnNameN],[<len>superColumnName2<0><len>subColumnName1..<len>superColumnName2<0><len>subColumnNameN]
 and I need to get superColumnName1..superColumnName1+, 
 superColumnName2..superColumnName2+
 to get the same functionality
 I would like the clients to support this functionality, e.g. Hector to have 
 .setRanges parallel to .setColumnNames 
 and for CQL to support a syntax like 
 SELECT [FIRST N] [REVERSED] name1..nameN1, name2..nameN2... FROM ...
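The request above amounts to filtering one sorted row slice by several (start, end) name bounds in a single call. A toy sketch of that semantics (hypothetical helper, not Cassandra's or Hector's API):

```python
from bisect import bisect_left, bisect_right

def get_ranges(columns, ranges):
    """Return columns whose names fall in any of the inclusive
    (start, end) ranges.

    columns: sorted list of (name, value) pairs, as in a composite-column row.
    ranges:  list of (start, end) name bounds, like the requested
             name1..nameN1, name2..nameN2 syntax.
    """
    names = [n for n, _ in columns]
    out = []
    for start, end in ranges:
        # Binary-search each bound against the sorted column names.
        lo = bisect_left(names, start)
        hi = bisect_right(names, end)
        out.extend(columns[lo:hi])
    return out

# Two "super columns" flattened into composite names under one row:
row = [("a:1", 10), ("a:2", 11), ("b:1", 20), ("c:1", 30)]
print(get_ranges(row, [("a", "a:~"), ("c", "c:~")]))
# [('a:1', 10), ('a:2', 11), ('c:1', 30)]
```

Here `"a:~"` is just a lexicographic upper bound standing in for "everything under super column a"; a real implementation would use the composite type's end-of-component marker instead.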





[jira] [Commented] (CASSANDRA-4487) remove uses of SchemaDisagreementException

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436087#comment-13436087
 ] 

Jonathan Ellis commented on CASSANDRA-4487:
---

LGTM

 remove uses of SchemaDisagreementException
 --

 Key: CASSANDRA-4487
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4487
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.2
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 1.2

 Attachments: 0001-code-changes.patch, 0002-re-generated-thrift.patch, 
 CASSANDRA-4487-v2.patch


 Since we can handle concurrent schema changes now, there's no need to 
 validateSchemaAgreement before modifications.





[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2897:
--

Fix Version/s: 1.2

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt


 Currently, secondary index updates require a read-before-write to maintain 
 the index consistency. Keeping the index consistent at all times is not 
 necessary, however. We could let the (secondary) index get inconsistent on 
 writes and repair it on reads. This would be easy because on reads we make 
 sure to request the indexed columns anyway, so we can just skip the rows 
 that are not needed and repair the index at the same time.
 This does trade work on writes for work on reads. However, read-before-write 
 is sufficiently costly that it will likely be a win overall.
 There are (at least) two small technical difficulties here though:
 # If we repair on read, this will be racy with writes, so we'll probably have 
 to synchronize there.
 # We probably shouldn't rely only on reads to repair; we should also have a 
 task to repair the index for things that are rarely read. It's unclear how to 
 make that low-impact though.
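The scheme in the description can be modeled in a few lines. This is a toy sketch under the ticket's stated idea, with invented names, not the attached patch: writes append index entries without reading the old value, and reads skip and clean up stale entries:

```python
class RepairOnReadIndex:
    """Toy model of a repair-on-read secondary index."""
    def __init__(self):
        self.data = {}    # row_key -> current indexed column value
        self.index = {}   # value -> set of row_keys (may contain stale entries)

    def write(self, key, value):
        # No read-before-write: just add the new index entry; any old
        # entry for this row is left stale, to be repaired on a later read.
        self.data[key] = value
        self.index.setdefault(value, set()).add(key)

    def query(self, value):
        hits = []
        for key in list(self.index.get(value, ())):
            if self.data.get(key) == value:
                hits.append(key)
            else:
                # Stale entry: the row's current value no longer matches,
                # so skip it and repair the index in passing.
                self.index[value].discard(key)
        return sorted(hits)

idx = RepairOnReadIndex()
idx.write("row1", "red")
idx.write("row1", "blue")   # leaves a stale "red" -> row1 entry behind
print(idx.query("red"))     # [] -- and the stale entry has been removed
print(idx.query("blue"))    # ['row1']
```

The race noted in difficulty #1 shows up here too: between the `self.data.get(key)` check and the `discard`, a concurrent write could re-add the entry, which is why the real thing needs synchronization.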





[jira] [Created] (CASSANDRA-4549) Update the pig examples to include more recent pig/cassandra features

2012-08-16 Thread Jeremy Hanna (JIRA)
Jeremy Hanna created CASSANDRA-4549:
---

 Summary: Update the pig examples to include more recent 
pig/cassandra features
 Key: CASSANDRA-4549
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4549
 Project: Cassandra
  Issue Type: Task
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Jeremy Hanna
Priority: Minor


Now that there is support for a variety of Cassandra features from Pig (esp. 
1.1+), it would be great to have some of them in the examples so that people 
can see how to use them.





[jira] [Resolved] (CASSANDRA-1743) Switch to TFastFramedTransport

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-1743.
---

   Resolution: Won't Fix
Fix Version/s: (was: 1.2)
 Assignee: (was: T Jake Luciani)

Now that we have a custom binary protocol, we can leave Thrift in peace.

 Switch to TFastFramedTransport
 --

 Key: CASSANDRA-1743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1743
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
 Attachments: 1743.txt, 1743.txt, 1743_v3.txt, 1743_v4.txt, 1743_v5.txt

   Original Estimate: 16h
  Remaining Estimate: 16h

 Forgot that after THRIFT-831 fast mode is not the default and is a separate 
 transport class.





[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges

2012-08-16 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436124#comment-13436124
 ] 

Vijay commented on CASSANDRA-2710:
--

Hi Jonathan, consider a use case where we have hierarchical data: 
type:latest:1:2, type:v1:1:2, type:v2:1:2, etc. The user might want to query 
all the data most of the time, sometimes only the latest, and sometimes a 
specific version of a given type. 

1) If we model the type as part of the row key, then for 80% or so of the use 
cases I will be doing a multi-get (we don't advise OPP, so sometimes you might 
need an index). 
2) If I have all of them in one row, then I will be doing multiple calls to get 
the data out. 

I am not arguing the need for it; there are other ways to get it done (by 
adding the type and v1 in the super column name, or something like that), but 
it would be a little more flexible. I am fine closing the ticket too :) Let me 
know. Thanks! 

 Get multiple column ranges
 --

 Key: CASSANDRA-2710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: David Boxenhorn
Assignee: Vijay
  Labels: compositeColumns, cql
 Attachments: 0001-2710-multiple-column-ranges-cql.patch, 
 0001-2710-multiple-column-ranges-thrift.patch


 I have replaced all my super column families with regular column families 
 using composite columns. I have easily been able to support all previous 
 functionality (I don't need range delete) except for one thing: getting 
 multiple super columns with a single access. For this, I would need to get 
 multiple ranges. (I can get multiple columns, or a single range, but not 
 multiple ranges.) 
 For example, I used to have
 [superColumnName1,subColumnName1..N],[superColumnName2,subColumnName1..N]
 and I could get superColumnName1, superColumnName2
 Now I have
 [<len>superColumnName1<0><len>subColumnName1..<len>superColumnName1<0><len>subColumnNameN],[<len>superColumnName2<0><len>subColumnName1..<len>superColumnName2<0><len>subColumnNameN]
 and I need to get superColumnName1..superColumnName1+, 
 superColumnName2..superColumnName2+
 to get the same functionality
 I would like the clients to support this functionality, e.g. Hector to have 
 .setRanges parallel to .setColumnNames 
 and for CQL to support a syntax like 
 SELECT [FIRST N] [REVERSED] name1..nameN1, name2..nameN2... FROM ...





[jira] [Created] (CASSANDRA-4550) nodetool ring output should use hex not integers for tokens

2012-08-16 Thread Aaron Turner (JIRA)
Aaron Turner created CASSANDRA-4550:
---

 Summary: nodetool ring output should use hex not integers for 
tokens
 Key: CASSANDRA-4550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4550
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 1.0.9
 Environment: Linux
Reporter: Aaron Turner
Priority: Minor


The current output of nodetool ring prints start token values as base-10 
integers instead of hex.  This is not very user-friendly, for a number of 
reasons:

1. It hides the fact that the values are 128-bit.

2. The values are not of a consistent length, whereas in hex, zero-padding to a 
consistent width is generally accepted.

3. When using the default random partitioner, having the values in hex makes it 
easier for users to determine which node(s) a given key resides on, since md5 
utilities like md5sum output hex.
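As a sketch of the proposal, here is how an MD5-based token could be rendered both ways; the helper names are illustrative, and the [0, 2**127) token range is an assumption matching RandomPartitioner's documented range, not code taken from Cassandra:

```python
import hashlib

# Illustrative sketch (not Cassandra code): derive an MD5-based token roughly
# the way RandomPartitioner does, then render it as decimal and as hex.
def token_for_key(key: bytes) -> int:
    return int.from_bytes(hashlib.md5(key).digest(), "big") % (2 ** 127)

def token_hex(token: int) -> str:
    # Zero-padded to a fixed 32 hex digits, like md5sum output.
    return format(token, "032x")

token = token_for_key(b"mykey")
decimal_form = str(token)    # variable width, as nodetool ring prints today
hex_form = token_hex(token)  # fixed width, as the ticket proposes
```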







[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436224#comment-13436224
 ] 

Jonathan Ellis commented on CASSANDRA-2710:
---

bq. if we model the type to be part of the Row Key then the problem is for 80% 
or so use case i will be doing a multi-get

What's the objection here?  multiget-within-a-single-row still has all the 
problems of multiget-across-rows, with the added problem that it doesn't 
parallelize across machines.





[jira] [Updated] (CASSANDRA-4550) nodetool ring output should use hex not integers for tokens

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4550:
--

 Priority: Trivial  (was: Minor)
Affects Version/s: (was: 1.0.9)
   Labels: lhf  (was: )

 nodetool ring output should use hex not integers for tokens
 ---

 Key: CASSANDRA-4550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4550
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Linux
Reporter: Aaron Turner
Priority: Trivial
  Labels: lhf





[jira] [Updated] (CASSANDRA-4304) Add bytes-limit clause to queries

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4304:
--

Fix Version/s: (was: 1.2.0)

 Add bytes-limit clause to queries
 -

 Key: CASSANDRA-4304
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4304
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Christian Spriegel
 Attachments: TestImplForSlices.patch


 Idea is to add a second limit clause to (slice)queries. This would allow easy 
 loading of batches, even if content is variable sized.
 Imagine the following use case:
 You want to load a batch of XMLs, where each is between 100bytes and 5MB 
 large.
 Currently you can load either
 - a large number of XMLs, but risk OOMs or timeouts
 or
 - a small number of XMLs, and do too many queries where each query usually 
 retrieves very little data.
 With cassandra being able to limit by size and not just count, we could do a 
 single query which would never OOM but always return a decent amount of data 
 -- with no extra overhead for multiple queries.
 A few thoughts from my side:
 - The limit should be a soft limit, not a hard limit: it will always return at 
 least one row/column, even if that one is larger than the limit specifies. 
 - HintedHandoffManager:303 is already doing an 
 InMemoryCompactionLimit/averageColumnSize calculation to avoid OOM. It could 
 then simply use the new limit clause :-) 
 - A bytes-limit on a range- or indexed-query should always return a complete 
 row
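The soft-limit semantics described above can be sketched as follows; representing rows as (key, size) pairs is an assumption for illustration, not the actual API:

```python
# Sketch of the proposed soft bytes-limit: stop before the row that would
# exceed the limit, but always return at least one row even when that row
# alone is larger than the limit. Row shape (key, size_in_bytes) is assumed.
def take_until_bytes(rows, byte_limit):
    batch, total = [], 0
    for row in rows:
        key, size = row
        if batch and total + size > byte_limit:
            break  # soft limit reached; remaining rows wait for the next query
        batch.append(row)
        total += size
    return batch
```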





[jira] [Resolved] (CASSANDRA-3649) Code style changes, aka The Big Reformat

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3649.
---

   Resolution: Won't Fix
Fix Version/s: (was: 1.2.0)

Realistically I guess this is not going to happen.

 Code style changes, aka The Big Reformat
 

 Key: CASSANDRA-3649
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3649
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Brandon Williams

 With a new major release coming soon and not having a ton of huge pending 
 patches that have prevented us from doing this in the past, post-freeze looks 
 like a good time to finally do this.  Mostly this will include the removal of 
 underscores in private variables, and no more brace-on-newline policy.





[jira] [Updated] (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1337:
--

 Priority: Minor  (was: Major)
Fix Version/s: (was: 1.2.0)
   1.2.1

 parallelize fetching rows for low-cardinality indexes
 -

 Key: CASSANDRA-1337
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: David Alves
Priority: Minor
 Fix For: 1.2.1

 Attachments: 
 0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt, 
 1137-bugfix.patch, CASSANDRA-1337.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Currently, we read the indexed rows from the first node (in partitioner 
 order); if that does not have enough matching rows, we read the rows from the 
 next, and so forth.
 We should use the statistics from CASSANDRA-1155 to query multiple nodes in 
 parallel, such that we have a high chance of getting enough rows w/o having 
 to do another round of queries (but, if our estimate is incorrect, we do need 
 to loop and do more rounds until we have enough data or we have fetched from 
 each node).
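The strategy in the description can be sketched like this; every name is hypothetical, and `rows_per_node_estimate` stands in for the CASSANDRA-1155 statistics:

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil

# Sketch of the parallel index scan described above: estimate how many nodes
# are needed to satisfy `limit` matching rows, query that many in parallel,
# and loop over further nodes if the estimate fell short.
def parallel_index_scan(nodes, fetch_from_node, rows_per_node_estimate, limit):
    results, start = [], 0
    with ThreadPoolExecutor() as pool:
        while len(results) < limit and start < len(nodes):
            needed = limit - len(results)
            fanout = max(1, ceil(needed / max(1, rows_per_node_estimate)))
            batch = nodes[start:start + fanout]   # next wave of nodes
            for rows in pool.map(fetch_from_node, batch):
                results.extend(rows)
            start += fanout
    return results[:limit]
```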





[jira] [Updated] (CASSANDRA-4292) Improve JBOD loadbalancing and reduce contention

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4292:
--

Summary: Improve JBOD loadbalancing and reduce contention  (was: Per-disk 
I/O queues)

 Improve JBOD loadbalancing and reduce contention
 

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2.0

 Attachments: 4292.txt, 4292-v2.txt, 4292-v3.txt


 As noted in CASSANDRA-809, we have a certain number of flush (and compaction) 
 threads, which mix and match disk volumes indiscriminately.  It may be worth 
 creating a tight thread-to-disk affinity, to prevent unnecessary conflict at 
 that level.
 OTOH, as SSDs become more prevalent this becomes a non-issue.  Unclear how 
 much pain this actually causes in practice in the meantime.





[jira] [Updated] (CASSANDRA-3783) Add 'null' support to CQL 3.0

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3783:
--

Fix Version/s: (was: 1.2.0)
   1.2.1

 Add 'null' support to CQL 3.0
 -

 Key: CASSANDRA-3783
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3783
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: cql3
 Fix For: 1.2.1


 Dense composites support adding records where only a prefix of the 
 components specifying the key is defined. In other words, with:
 {noformat}
 CREATE TABLE connections (
userid int,
ip text,
port int,
protocol text,
time timestamp,
PRIMARY KEY (userid, ip, port, protocol)
 ) WITH COMPACT STORAGE
 {noformat}
 you can insert
 {noformat}
 INSERT INTO connections (userid, ip, port, time) VALUES (2, '192.168.0.1', 
 80, 123456789);
 {noformat}
 You cannot, however, select that column specifically (i.e., without selecting 
 column (2, '192.168.0.1', 80, 'http') for instance).
 This ticket proposes to allow that through 'null', i.e. to allow
 {noformat}
 SELECT * FROM connections WHERE userid = 2 AND ip = '192.168.0.1' AND port = 
 80 AND protocol = null;
 {noformat}
 It would then also make sense to support:
 {noformat}
 INSERT INTO connections (userid, ip, port, protocol, time) VALUES (2, 
 '192.168.0.1', 80, null, 123456789);
 {noformat}
 as an equivalent to the insert query above.





[jira] [Updated] (CASSANDRA-3920) tests for cqlsh

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3920:
--

Fix Version/s: (was: 1.2.0)

 tests for cqlsh
 ---

 Key: CASSANDRA-3920
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3920
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: paul cannon
Assignee: paul cannon
Priority: Minor
  Labels: cqlsh

 Cqlsh has become big enough and tries to cover enough situations that it's 
 time to start acting like a responsible adult and make this bugger some unit 
 tests to guard against regressions.





[jira] [Updated] (CASSANDRA-4430) optional pluggable o.a.c.metrics reporters

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4430:
--

Fix Version/s: (was: 1.2.0)

 optional pluggable o.a.c.metrics reporters
 --

 Key: CASSANDRA-4430
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4430
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Burroughs
Priority: Minor

 CASSANDRA-4009 expanded the use of the metrics library, which has a set of 
 reporter modules (http://metrics.codahale.com/manual/core/#reporters): you can 
 report to flat files, ganglia, spit everything over http, etc.  The next step 
 is a mechanism for using those reporters with o.a.c.metrics.  To avoid 
 bundling everything, I suggest following the mx4j approach of enabling a 
 reporter only if it is on the classpath, coupled with a reporter configuration file.
 Strawman file:
 {noformat}
 console:
   time: 1
   timeunit: seconds
 csv:
  - time: 1
    timeunit: minutes
    file: foo.csv
  - time: 10
    timeunit: seconds
    file: bar.csv
 ganglia:
  - time: 30
    timeunit: seconds
    host: server-1
    port: 8649
  - time: 30
    timeunit: seconds
    host: server-2
    port: 8649
 {noformat}
  





[jira] [Updated] (CASSANDRA-4316) Compaction Throttle too bursty with large rows

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4316:
--

Fix Version/s: (was: 1.2.0)
   1.2.1

 Compaction Throttle too bursty with large rows
 --

 Key: CASSANDRA-4316
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4316
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Wayne Lewis
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.2.1


 In org.apache.cassandra.db.compaction.CompactionIterable the check for 
 compaction throttling occurs once every 1000 rows. In our workload this is 
 much too large as we have many large rows (16 - 100 MB).
 With a 100 MB row, about 100 GB is read (and possibly written) before the 
 compaction throttle sleeps. This causes bursts of essentially unthrottled 
 compaction IO followed by a long sleep, which yields inconsistent performance 
 and high error rates during the bursts.
 We applied a workaround to check throttle every row which solved our 
 performance and error issues:
 line 116 in org.apache.cassandra.db.compaction.CompactionIterable:
 if ((row++ % 1000) == 0)
 replaced with
 if ((row++ % 1) == 0)
 I think the better solution is to calculate how often throttle should be 
 checked based on the throttle rate to apply sleeps more consistently. E.g. if 
 16MB/sec is the limit then check for sleep after every 16MB is read so sleeps 
 are spaced out about every second.
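The suggested fix, checking the throttle after roughly one second's worth of bytes rather than every 1000 rows, can be sketched as follows (illustrative names only, not the actual CompactionIterable code):

```python
import time

# Sketch of byte-based throttle checking: instead of sleeping only once per
# 1000 rows, consider sleeping after every `rate_bytes_per_sec` bytes read,
# so sleeps land roughly once per second even with 100 MB rows.
class ByteThrottle:
    def __init__(self, rate_bytes_per_sec):
        self.rate = rate_bytes_per_sec
        self.bytes_since_check = 0
        self.last_check = time.monotonic()

    def account(self, nbytes):
        self.bytes_since_check += nbytes
        if self.bytes_since_check < self.rate:
            return  # less than ~1 second of IO since the last check
        elapsed = time.monotonic() - self.last_check
        target = self.bytes_since_check / self.rate  # seconds this IO should take
        if target > elapsed:
            time.sleep(target - elapsed)  # pay back the excess speed
        self.bytes_since_check = 0
        self.last_check = time.monotonic()
```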





[jira] [Updated] (CASSANDRA-2293) Rewrite nodetool help

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2293:
--

Fix Version/s: (was: 1.2.0)

 Rewrite nodetool help
 -

 Key: CASSANDRA-2293
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2293
 Project: Cassandra
  Issue Type: Improvement
  Components: Core, Documentation & website
Affects Versions: 0.8 beta 1
Reporter: Aaron Morton
Assignee: David Alves
Priority: Minor

 Once CASSANDRA-2008 is through and we are happy with the approach, I would 
 like to write similar help for nodetool: both command-line help of the form 
 nodetool help and nodetool help <command>.





[jira] [Updated] (CASSANDRA-4536) Ability for CQL3 to list partition keys

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4536:
--

Fix Version/s: (was: 1.2.0)
   1.2.1

 Ability for CQL3 to list partition keys
 ---

 Key: CASSANDRA-4536
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4536
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Affects Versions: 1.1.0
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: cql3
 Fix For: 1.2.1


 It can be useful to know the set of in-use partition keys (storage engine row 
 keys).  One example given to me was where application data was modeled as a 
 few 10s of 1000s of wide rows, where the app required presenting these rows 
 to the user sorted based on information in the partition key.  The partition 
 count is small enough to do the sort client-side in memory, which is what the 
 app did with the Thrift API--a range slice with an empty columns list.
 This was a problem when migrating to CQL3.  {{SELECT mykey FROM mytable}} 
 includes all the logical rows, which makes the resultset too large to make 
 this a reasonable approach, even with paging.
 One way to add support would be to allow DISTINCT in the special case of 
 {{SELECT DISTINCT mykey FROM mytable}}.





[jira] [Resolved] (CASSANDRA-2109) Improve default window size for DES

2012-08-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-2109.
-

   Resolution: Duplicate
Fix Version/s: (was: 1.3)
   1.2.0

Resolved by CASSANDRA-4038

 Improve default window size for DES
 ---

 Key: CASSANDRA-2109
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2109
 Project: Cassandra
  Issue Type: Improvement
Reporter: Stu Hood
Priority: Minor
  Labels: des
 Fix For: 1.2.0


 The window size for DES is currently hardcoded at 100 requests. A larger 
 window means that it takes longer to react to a suddenly slow node, but that 
 you have a smoother transition for scores.
 An example of bad behaviour: with a window of size 100, we saw a case with a 
 failing node where, if enough requests could be answered quickly out of cache 
 or bloom filters, the window might be momentarily filled with 10 ms requests, 
 pushing out requests that had to go to disk and took 10 seconds.





[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges

2012-08-16 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436274#comment-13436274
 ] 

Vijay commented on CASSANDRA-2710:
--

bq. multiget-within-a-single-row still has all the problems of 
multiget-across-rows, with the added problem that it doesn't parallelize 
across machines.

Well, multiget-within-a-single-row is supposed to be one sequential IO (hence 
more throughput, at least for the best case), and the co-ordinator doesn't need 
to wait for the slowest responding node (more transient memory in the 
co-ordinator), etc.





[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436276#comment-13436276
 ] 

Jonathan Ellis commented on CASSANDRA-2710:
---

bq. multiget-within-a-single-row is supposed to be one sequential IO

Only if the row is small enough, in which case, is there that much benefit over 
just grabbing the whole row?





[jira] [Commented] (CASSANDRA-2710) Get multiple column ranges

2012-08-16 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436284#comment-13436284
 ] 

Vijay commented on CASSANDRA-2710:
--

bq. is there that much benefit over just grabbing the whole row?

Sure, but then we have to stream all the columns to the client, which can be 
wasteful too... I am fine either way; nice to have.





[jira] [Commented] (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436286#comment-13436286
 ] 

Jonathan Ellis commented on CASSANDRA-1337:
---

Reverted the three commits here in 0bfea6f678034c54d64c0c613f758de02d266415 and 
bumped to 1.2.1 since David may not have time to get back to this before 1.2.0 
freeze.  





[jira] [Created] (CASSANDRA-4551) Nodetool getendpoints keys do not work with ASCII, key_validation=ascii value of key = a test no delimiter

2012-08-16 Thread Mark Valdez (JIRA)
Mark Valdez created CASSANDRA-4551:
--

 Summary: Nodetool getendpoints keys do not work with ASCII, 
key_validation=ascii value of key = a test  no delimiter
 Key: CASSANDRA-4551
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4551
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.0.9
Reporter: Mark Valdez


Nodetool getendpoints keys do not work with ASCII, key_validation=ascii, value 
of key = "a test" (no delimiter). Tried to escape key = "a test" with double 
and single quotes. It doesn't work. It just reiterates the format of the tool's 
command: getendpoints requires ks, cf and key args





[1/2] git commit: run local range scans on the read stage patch by jbellis; reviewed by vijay for CASSANDRA-3687

2012-08-16 Thread jbellis
Updated Branches:
  refs/heads/trunk fe784f58e -> 5577ff626


run local range scans on the read stage
patch by jbellis; reviewed by vijay for CASSANDRA-3687


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5577ff62
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5577ff62
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5577ff62

Branch: refs/heads/trunk
Commit: 5577ff626bb38d419a3540e0c0ccb1a9d8b8680f
Parents: 29fed1f
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Aug 16 15:43:02 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Aug 16 15:43:02 2012 -0500

--
 CHANGES.txt|1 +
 .../cassandra/service/AbstractRowResolver.java |   11 --
 .../org/apache/cassandra/service/ReadCallback.java |   27 ++---
 .../org/apache/cassandra/service/StorageProxy.java |   91 ---
 4 files changed, 59 insertions(+), 71 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5577ff62/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 5a2848d..75de54e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.2-dev
+ * run local range scans on the read stage (CASSANDRA-3687)
  * clean up ioexceptions (CASSANDRA-2116)
  * Introduce new json format with row level deletion (CASSANDRA-4054)
  * remove redundant name column from schema_keyspaces (CASSANDRA-4433)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5577ff62/src/java/org/apache/cassandra/service/AbstractRowResolver.java
--
diff --git a/src/java/org/apache/cassandra/service/AbstractRowResolver.java 
b/src/java/org/apache/cassandra/service/AbstractRowResolver.java
index b1647a2..beaf73c 100644
--- a/src/java/org/apache/cassandra/service/AbstractRowResolver.java
+++ b/src/java/org/apache/cassandra/service/AbstractRowResolver.java
@@ -51,17 +51,6 @@ public abstract class AbstractRowResolver implements IResponseResolver<ReadResponse>
         replies.add(message);
     }
 
-    /** hack so local reads don't force de/serialization of an extra real Message */
-    public void injectPreProcessed(ReadResponse result)
-    {
-        MessageIn<ReadResponse> message = MessageIn.create(FBUtilities.getBroadcastAddress(),
-                                                           result,
-                                                           Collections.<String, byte[]>emptyMap(),
-                                                           MessagingService.Verb.INTERNAL_RESPONSE,
-                                                           MessagingService.current_version);
-        replies.add(message);
-    }
-
     public Iterable<MessageIn<ReadResponse>> getMessages()
     {
         return replies;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5577ff62/src/java/org/apache/cassandra/service/ReadCallback.java
--
diff --git a/src/java/org/apache/cassandra/service/ReadCallback.java 
b/src/java/org/apache/cassandra/service/ReadCallback.java
index a3d273c..bfd0044 100644
--- a/src/java/org/apache/cassandra/service/ReadCallback.java
+++ b/src/java/org/apache/cassandra/service/ReadCallback.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.service;
 
 import java.io.IOException;
 import java.net.InetAddress;
+import java.util.Collections;
 import java.util.List;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.TimeoutException;
@@ -165,32 +166,20 @@ public class ReadCallback<TMessage, TResolved> implements IAsyncCallback<TMessage>
 
     /**
      * @return true if the message counts towards the blockfor threshold
-     * TODO turn the Message into a response so we don't need two versions of this method
      */
     protected boolean waitingFor(MessageIn message)
     {
         return true;
     }
 
-    /**
-     * @return true if the response counts towards the blockfor threshold
-     */
-    protected boolean waitingFor(ReadResponse response)
+    public void response(TMessage result)
     {
-        return true;
-    }
-
-    public void response(ReadResponse result)
-    {
-        ((RowDigestResolver) resolver).injectPreProcessed(result);
-        int n = waitingFor(result)
-              ? received.incrementAndGet()
-              : received.get();
-        if (n >= blockfor && resolver.isDataPresent())
-        {
-            condition.signal();
-            maybeResolveForRepair();
-        }
+        MessageIn<TMessage> message = MessageIn.create(FBUtilities.getBroadcastAddress(),
+                                                       result,
[2/2] git commit: revert CASSANDRA-1337 comprising commits ef23335, f17fbac, 9cf915f.

2012-08-16 Thread jbellis
revert CASSANDRA-1337
comprising commits ef23335, f17fbac, 9cf915f.


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/29fed1f1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/29fed1f1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/29fed1f1

Branch: refs/heads/trunk
Commit: 29fed1f18188cfcd71c817db394c1087e0698dbd
Parents: fe784f5
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Aug 16 15:42:30 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Aug 16 15:42:30 2012 -0500

--
 .../org/apache/cassandra/service/StorageProxy.java |   60 --
 1 files changed, 17 insertions(+), 43 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/29fed1f1/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java 
b/src/java/org/apache/cassandra/service/StorageProxy.java
index 8d0e0b3..9d55739 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -853,23 +853,6 @@ public class StorageProxy implements StorageProxyMBean
         int columnsCount = 0;
         rows = new ArrayList<Row>();
         List<AbstractBounds<RowPosition>> ranges = getRestrictedRanges(command.range);
-
-        // get the cardinality of this index based on row count
-        // use this info to decide how many scans to do in parallel
-        Table table = Table.open(command.keyspace);
-        long estimatedKeysPerRange = table.getColumnFamilyStore(command.column_family)
-                                          .estimateKeys() / table.getReplicationStrategy().getReplicationFactor();
-
-        int concurrencyFactor = (int) (command.maxResults / (estimatedKeysPerRange + 1));
-        if (concurrencyFactor <= 0 || command.maxIsColumns)
-            concurrencyFactor = 1;
-        else if (concurrencyFactor > ranges.size())
-            concurrencyFactor = ranges.size();
-
-        // parallel scan handlers
-        List<ReadCallback<RangeSliceReply, Iterable<Row>>> scanHandlers = new ArrayList<ReadCallback<RangeSliceReply, Iterable<Row>>>(concurrencyFactor);
-
-        int parallelHandlers = concurrencyFactor;
         for (AbstractBounds<RowPosition> range : ranges)
         {
             RangeSliceCommand nodeCmd = new RangeSliceCommand(command.keyspace,
@@ -904,7 +887,6 @@ public class StorageProxy implements StorageProxyMBean
             {
                 throw new AssertionError(e);
             }
-            parallelHandlers--;
         }
         else
         {
@@ -921,36 +903,28 @@
                     logger.debug("reading " + nodeCmd + " from " + endpoint);
             }
 
-            scanHandlers.add(handler);
-
-            if (scanHandlers.size() >= parallelHandlers)
+            try
             {
-                for (ReadCallback<RangeSliceReply, Iterable<Row>> scanHandler : scanHandlers)
+                for (Row row : handler.get())
                 {
-                    try
-                    {
-                        for (Row row : scanHandler.get())
-                        {
-                            rows.add(row);
-                            columnsCount += row.getLiveColumnCount();
-                            logger.debug("range slices read {}", row.key);
-                        }
-                        FBUtilities.waitOnFutures(resolver.repairResults, DatabaseDescriptor.getRangeRpcTimeout());
-                    }
-                    catch (TimeoutException ex)
-                    {
-                        if (logger.isDebugEnabled())
-                            logger.debug("Range slice timeout: {}", ex.toString());
-                        throw ex;
-                    }
-                    catch (DigestMismatchException e)
-                    {
-                        throw new AssertionError(e); // no digests in range slices yet
-                    }
+                    rows.add(row);
+                    columnsCount += row.getLiveColumnCount();
+                    logger.debug("range slices read {}", row.key);
                 }
-                scanHandlers.clear(); //go back for more
+                FBUtilities.waitOnFutures(resolver.repairResults,

[jira] [Assigned] (CASSANDRA-3237) refactor super column implementation to use composite column names instead

2012-08-16 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay reassigned CASSANDRA-3237:


Assignee: Vijay

 refactor super column implementation to use composite column names instead
 -

 Key: CASSANDRA-3237
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3237
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matthew F. Dennis
Assignee: Vijay
Priority: Minor
  Labels: ponies
 Fix For: 1.3

 Attachments: cassandra-supercolumn-irc.log


 super columns are annoying.  composite columns offer a better API and 
 performance.  people should use composites over super columns.  some people 
 are already using super columns.  C* should implement the super column API in 
 terms of composites to reduce code, complexity and testing as well as 
 increase performance.
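The proposed mapping can be illustrated with a toy sketch (illustrative Python; a hypothetical separator-based encoding, not Cassandra's actual composite layout): a super column cell addressed by (super column, subcolumn) becomes a regular cell whose composite name carries both components.

```python
def to_composite(super_name, sub_name):
    # A super column cell (super_name, sub_name) -> value becomes a regular
    # cell whose name is the composite of both parts; toy separator encoding.
    return super_name + b"\x00" + sub_name

def from_composite(name):
    super_name, _, sub_name = name.partition(b"\x00")
    return super_name, sub_name

# Round-trip: the super column API can be answered from composite names.
assert from_composite(to_composite(b"user1", b"age")) == (b"user1", b"age")
```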





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435958#comment-13436346
 ] 

Ivo Meißner commented on CASSANDRA-4481:


I have also created the broken keyspace with a version prior to 1.1.2 (I'm 
pretty sure it was 1.1.1). So maybe there is a commitlog incompatibility... 
I also ran into some schema changing issues with that keyspace. Maybe I 
destroyed the keyspace structure. 
But it would be nice to get some kind of error message if something goes wrong 
with the commitlogs. Everything else seems to work with the keyspace. You 
really don't notice until you wonder where the data is...

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical

 When data is written to the commitlog and I restart the machine, all committed 
 data is lost that has not been flushed to disk. 
 In the startup logs it says that it replays the commitlog successfully, but 
 the data is not available then. 
 When I open the commitlog file in an editor I can see the added data, but 
 after the restart it cannot be fetched from cassandra. 
 {code}
  INFO 09:59:45,362 Replaying 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Finished reading 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Log replay complete, 0 replayed mutations
 {code}





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436350#comment-13436350
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

But this is exactly the situation if you dropped the keyspace on purpose: 
commitlog will have data for CFs that don't exist anymore.  Not a good idea to 
panic users when things are working as designed.
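The behavior described here, where replay silently skips mutations whose column family no longer exists, can be sketched as follows (illustrative Python, not the real Cassandra replay code):

```python
def replay(commitlog, live_cf_ids):
    """Replay commitlog mutations, silently skipping those that target
    column families which no longer exist (e.g. after a DROP)."""
    replayed = 0
    for cf_id, mutation in commitlog:
        if cf_id not in live_cf_ids:
            continue  # dropped CF: ignored by design, no warning emitted
        # apply(mutation) would happen here in the real replayer
        replayed += 1
    return replayed

log = [(1, "m1"), (2, "m2"), (1, "m3")]
assert replay(log, {1}) == 2  # mutations for dropped CF 2 are skipped
```

This is why "0 replayed mutations" and data loss look identical from the log output when the schema no longer matches the commitlog.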

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical

 When data is written to the commitlog and I restart the machine, all committed 
 data is lost that has not been flushed to disk. 
 In the startup logs it says that it replays the commitlog successfully, but 
 the data is not available then. 
 When I open the commitlog file in an editor I can see the added data, but 
 after the restart it cannot be fetched from cassandra. 
 {code}
  INFO 09:59:45,362 Replaying 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Finished reading 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Log replay complete, 0 replayed mutations
 {code}





[jira] [Created] (CASSANDRA-4552) cqlsh doesn't handle Int32Type when fully-qualified package name is present

2012-08-16 Thread Kirk True (JIRA)
Kirk True created CASSANDRA-4552:


 Summary: cqlsh doesn't handle Int32Type when fully-qualified 
package name is present
 Key: CASSANDRA-4552
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4552
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.0
 Environment: Today's (08/16/2012) trunk.
Reporter: Kirk True
Assignee: Kirk True
 Fix For: 1.2.0


Steps to reproduce:

1. Start Cassandra
2. Start cqlsh: {{./bin/cqlsh -3 --debug}}
3. Execute these statements:

{noformat}
create keyspace foo with strategy_class = 'SimpleStrategy' and 
strategy_options:replication_factor=1;
use foo;
create table bar (
a int,
b int,
primary key (a)
);
insert into bar (a, b) values (1, 1);
select * from bar;
{noformat}

Expected: to see my row results
Actual: I see this error:

{noformat}
Traceback (most recent call last):
  File ./bin/cqlsh, line 926, in onecmd
self.handle_statement(st, statementtext)
  File ./bin/cqlsh, line 954, in handle_statement
return custom_handler(parsed)
  File ./bin/cqlsh, line 1015, in do_select
self.perform_statement(parsed.extract_orig(), decoder=decoder)
  File ./bin/cqlsh, line 1042, in perform_statement
self.print_result(self.cursor)
  File ./bin/cqlsh, line 1096, in print_result
self.print_static_result(cursor)
  File ./bin/cqlsh, line 1112, in print_static_result
formatted_data = [map(self.myformat_value, row, coltypes) for row in cursor]
  File ./bin/cqlsh, line 622, in myformat_value
float_precision=self.display_float_precision, **kwargs)
  File ./bin/cqlsh, line 504, in format_value
escapedval = val.replace('\\', '')
AttributeError: 'int' object has no attribute 'replace'
{noformat}

This is similar to CASSANDRA-4083 in terms of the error message, but may be of 
a different cause.





[jira] [Updated] (CASSANDRA-4552) cqlsh doesn't handle Int32Type when fully-qualified package name is present

2012-08-16 Thread Kirk True (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated CASSANDRA-4552:
-

Attachment: trunk-4552.txt

The value of the cqlsh Python script's {{casstype}} is in some cases the 
fully-qualified package name. In my case it was 
{{org.apache.cassandra.db.marshal.Int32Type}} while it appears the code is 
expecting it to be simply {{Int32Type}}. 

I don't think this is the right fix, but it's a start and it unblocks me :)
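A minimal sketch of the normalization involved (illustrative Python; `trim_if_present` is named elsewhere in this thread's cqlsh patch, but the implementation and constant here are this sketch's own):

```python
MARSHAL_PREFIX = 'org.apache.cassandra.db.marshal.'

def trim_if_present(s, prefix):
    # Normalize so 'Int32Type' is recognized whether or not the server
    # sent the fully-qualified class name.
    return s[len(prefix):] if s.startswith(prefix) else s

assert trim_if_present(MARSHAL_PREFIX + 'Int32Type', MARSHAL_PREFIX) == 'Int32Type'
assert trim_if_present('Int32Type', MARSHAL_PREFIX) == 'Int32Type'
```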

 cqlsh doesn't handle Int32Type when fully-qualified package name is present
 ---

 Key: CASSANDRA-4552
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4552
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.0
 Environment: Today's (08/16/2012) trunk.
Reporter: Kirk True
Assignee: Kirk True
 Fix For: 1.2.0

 Attachments: trunk-4552.txt


 Steps to reproduce:
 1. Start Cassandra
 2. Start cqlsh: {{./bin/cqlsh -3 --debug}}
 3. Execute these statements:
 {noformat}
 create keyspace foo with strategy_class = 'SimpleStrategy' and 
 strategy_options:replication_factor=1;
 use foo;
 create table bar (
 a int,
 b int,
 primary key (a)
 );
 insert into bar (a, b) values (1, 1);
 select * from bar;
 {noformat}
 Expected: to see my row results
 Actual: I see this error:
 {noformat}
 Traceback (most recent call last):
   File ./bin/cqlsh, line 926, in onecmd
 self.handle_statement(st, statementtext)
   File ./bin/cqlsh, line 954, in handle_statement
 return custom_handler(parsed)
   File ./bin/cqlsh, line 1015, in do_select
 self.perform_statement(parsed.extract_orig(), decoder=decoder)
   File ./bin/cqlsh, line 1042, in perform_statement
 self.print_result(self.cursor)
   File ./bin/cqlsh, line 1096, in print_result
 self.print_static_result(cursor)
   File ./bin/cqlsh, line 1112, in print_static_result
 formatted_data = [map(self.myformat_value, row, coltypes) for row in 
 cursor]
   File ./bin/cqlsh, line 622, in myformat_value
 float_precision=self.display_float_precision, **kwargs)
   File ./bin/cqlsh, line 504, in format_value
 escapedval = val.replace('\\', '')
 AttributeError: 'int' object has no attribute 'replace'
 {noformat}
 This is similar to CASSANDRA-4083 in terms of the error message, but may be 
 of a different cause.





[jira] [Resolved] (CASSANDRA-4552) cqlsh doesn't handle Int32Type when fully-qualified package name is present

2012-08-16 Thread Kirk True (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True resolved CASSANDRA-4552.
--

Resolution: Duplicate

Marking as a duplicate of CASSANDRA-4546.

 cqlsh doesn't handle Int32Type when fully-qualified package name is present
 ---

 Key: CASSANDRA-4552
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4552
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.0
 Environment: Today's (08/16/2012) trunk.
Reporter: Kirk True
Assignee: Kirk True
 Fix For: 1.2.0

 Attachments: trunk-4552.txt


 Steps to reproduce:
 1. Start Cassandra
 2. Start cqlsh: {{./bin/cqlsh -3 --debug}}
 3. Execute these statements:
 {noformat}
 create keyspace foo with strategy_class = 'SimpleStrategy' and 
 strategy_options:replication_factor=1;
 use foo;
 create table bar (
 a int,
 b int,
 primary key (a)
 );
 insert into bar (a, b) values (1, 1);
 select * from bar;
 {noformat}
 Expected: to see my row results
 Actual: I see this error:
 {noformat}
 Traceback (most recent call last):
   File ./bin/cqlsh, line 926, in onecmd
 self.handle_statement(st, statementtext)
   File ./bin/cqlsh, line 954, in handle_statement
 return custom_handler(parsed)
   File ./bin/cqlsh, line 1015, in do_select
 self.perform_statement(parsed.extract_orig(), decoder=decoder)
   File ./bin/cqlsh, line 1042, in perform_statement
 self.print_result(self.cursor)
   File ./bin/cqlsh, line 1096, in print_result
 self.print_static_result(cursor)
   File ./bin/cqlsh, line 1112, in print_static_result
 formatted_data = [map(self.myformat_value, row, coltypes) for row in 
 cursor]
   File ./bin/cqlsh, line 622, in myformat_value
 float_precision=self.display_float_precision, **kwargs)
   File ./bin/cqlsh, line 504, in format_value
 escapedval = val.replace('\\', '')
 AttributeError: 'int' object has no attribute 'replace'
 {noformat}
 This is similar to CASSANDRA-4083 in terms of the error message, but may be 
 of a different cause.





[jira] [Commented] (CASSANDRA-4546) cqlsh: handle when full cassandra type class names are given

2012-08-16 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436359#comment-13436359
 ] 

Kirk True commented on CASSANDRA-4546:
--

+1 on the first part of the patch.

The third change in the patch _appears_ unrelated to me. Please clarify for my 
own edification.

Thanks.

 cqlsh: handle when full cassandra type class names are given
 

 Key: CASSANDRA-4546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4546
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.0
Reporter: paul cannon
Assignee: paul cannon
  Labels: cqlsh
 Fix For: 1.2.0

 Attachments: 4546.patch.txt


 When a builtin Cassandra type was being used for data in previous versions of 
 Cassandra, only the short name was sent: UTF8Type, TimeUUIDType, etc. 
 Starting with 1.2, as of CASSANDRA-4453, the full class names are sent.
 Cqlsh doesn't know how to handle this, and is currently treating all data as 
 if it were an unknown type. This goes as far as to cause an exception when 
 the type is actually a number, because the driver deserializes it right, and 
 then cqlsh tries to use it as a string.
 Here for googlage:
 {noformat}
 AttributeError: 'int' object has no attribute 'replace'
 {noformat}
 Fixeries are in order.





[jira] [Commented] (CASSANDRA-4546) cqlsh: handle when full cassandra type class names are given

2012-08-16 Thread paul cannon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436362#comment-13436362
 ] 

paul cannon commented on CASSANDRA-4546:


It's not directly related. Just a minor problem with error reporting that came 
up while I was testing this.

 cqlsh: handle when full cassandra type class names are given
 

 Key: CASSANDRA-4546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4546
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.0
Reporter: paul cannon
Assignee: paul cannon
  Labels: cqlsh
 Fix For: 1.2.0

 Attachments: 4546.patch.txt


 When a builtin Cassandra type was being used for data in previous versions of 
 Cassandra, only the short name was sent: UTF8Type, TimeUUIDType, etc. 
 Starting with 1.2, as of CASSANDRA-4453, the full class names are sent.
 Cqlsh doesn't know how to handle this, and is currently treating all data as 
 if it were an unknown type. This goes as far as to cause an exception when 
 the type is actually a number, because the driver deserializes it right, and 
 then cqlsh tries to use it as a string.
 Here for googlage:
 {noformat}
 AttributeError: 'int' object has no attribute 'replace'
 {noformat}
 Fixeries are in order.





git commit: cqlsh: handle fully qualified class names Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4546

2012-08-16 Thread brandonwilliams
Updated Branches:
  refs/heads/trunk 5577ff626 -> 7ddb5c7a4


cqlsh: handle fully qualified class names
Patch by paul cannon, reviewed by brandonwilliams for CASSANDRA-4546


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7ddb5c7a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7ddb5c7a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7ddb5c7a

Branch: refs/heads/trunk
Commit: 7ddb5c7a477361f1a1dd7e4b7e9613b921e50b5b
Parents: 5577ff6
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu Aug 16 17:26:24 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu Aug 16 17:26:24 2012 -0500

--
 bin/cqlsh |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7ddb5c7a/bin/cqlsh
--
diff --git a/bin/cqlsh b/bin/cqlsh
index 6b61364..7ea0128 100755
--- a/bin/cqlsh
+++ b/bin/cqlsh
@@ -457,6 +457,7 @@ def unix_time_from_uuid1(u):
 
 def format_value(val, casstype, output_encoding, addcolor=False, time_format='',
                  float_precision=3, colormap=DEFAULT_VALUE_COLORS, nullval='null'):
+    casstype = trim_if_present(casstype, 'org.apache.cassandra.db.marshal.')
     color = colormap['default']
     coloredval = None
     displaywidth = None
@@ -498,6 +499,7 @@ def format_value(val, casstype, output_encoding, addcolor=False, time_format='',
         color = colormap['hex']
     else:
         # AsciiType is the only other one known right now, but handle others
+        val = str(val)
         escapedval = val.replace('\\', '')
         bval = controlchars_re.sub(_show_control_chars, escapedval)
         if addcolor:
@@ -775,8 +777,8 @@ class Shell(cmd.Cmd):
     def get_keyspace(self, ksname):
         try:
             return self.make_hacktastic_thrift_call('describe_keyspace', ksname)
-        except cql.cassandra.ttypes.NotFoundException, e:
-            raise KeyspaceNotFound('Keyspace %s not found.' % e)
+        except cql.cassandra.ttypes.NotFoundException:
+            raise KeyspaceNotFound('Keyspace %r not found.' % ksname)
 
 def get_keyspaces(self):
 return self.make_hacktastic_thrift_call('describe_keyspaces')



[jira] [Updated] (CASSANDRA-4553) NPE while loading Saved KeyCache

2012-08-16 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-4553:
-

Attachment: 0001-CASSANDRA-4553.patch

Simple fix to handle null in ASC
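The shape of such a fix, skipping cache keys that deserialize to null instead of dereferencing them, can be sketched as (illustrative Python, not the actual AutoSavingCache code):

```python
def load_saved(entries):
    """Rebuild a cache from saved (key, value) pairs, skipping entries
    whose key failed to deserialize (None) instead of raising."""
    cache = {}
    for key, value in entries:
        if key is None:
            continue  # previously this dereference caused the NPE
        cache[key] = value
    return cache

assert load_saved([("k1", 1), (None, 2), ("k2", 3)]) == {"k1": 1, "k2": 3}
```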

 NPE while loading Saved KeyCache
 

 Key: CASSANDRA-4553
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4553
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.2.0

 Attachments: 0001-CASSANDRA-4553.patch


 WARN [main] 2012-08-16 15:31:13,896 AutoSavingCache.java (line 146) error 
 reading saved cache /var/lib/cassandra/saved_caches/system-local-KeyCache-b.db
 java.lang.NullPointerException
   at 
 org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:140)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:251)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:354)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:326)
   at org.apache.cassandra.db.Table.initCf(Table.java:312)
   at org.apache.cassandra.db.Table.<init>(Table.java:252)
   at org.apache.cassandra.db.Table.open(Table.java:97)
   at org.apache.cassandra.db.Table.open(Table.java:75)
   at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:285)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168)
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:318)
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:361)





[jira] [Commented] (CASSANDRA-4553) NPE while loading Saved KeyCache

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436384#comment-13436384
 ] 

Jonathan Ellis commented on CASSANDRA-4553:
---

+1, although would be even better w/ comment as to why we expect keycache 
deserialize to return nulls sometimes (but not rowcache)

 NPE while loading Saved KeyCache
 

 Key: CASSANDRA-4553
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4553
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.2.0

 Attachments: 0001-CASSANDRA-4553.patch


 WARN [main] 2012-08-16 15:31:13,896 AutoSavingCache.java (line 146) error 
 reading saved cache /var/lib/cassandra/saved_caches/system-local-KeyCache-b.db
 java.lang.NullPointerException
   at 
 org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:140)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:251)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:354)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:326)
   at org.apache.cassandra.db.Table.initCf(Table.java:312)
   at org.apache.cassandra.db.Table.<init>(Table.java:252)
   at org.apache.cassandra.db.Table.open(Table.java:97)
   at org.apache.cassandra.db.Table.open(Table.java:75)
   at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:285)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168)
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:318)
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:361)





[jira] [Commented] (CASSANDRA-4553) NPE while loading Saved KeyCache

2012-08-16 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436386#comment-13436386
 ] 

Pavel Yaskevich commented on CASSANDRA-4553:


+1 with Jonathan

 NPE while loading Saved KeyCache
 

 Key: CASSANDRA-4553
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4553
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.2.0

 Attachments: 0001-CASSANDRA-4553.patch


 WARN [main] 2012-08-16 15:31:13,896 AutoSavingCache.java (line 146) error 
 reading saved cache /var/lib/cassandra/saved_caches/system-local-KeyCache-b.db
 java.lang.NullPointerException
   at 
 org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:140)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:251)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:354)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:326)
   at org.apache.cassandra.db.Table.initCf(Table.java:312)
   at org.apache.cassandra.db.Table.init(Table.java:252)
   at org.apache.cassandra.db.Table.open(Table.java:97)
   at org.apache.cassandra.db.Table.open(Table.java:75)
   at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:285)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168)
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:318)
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:361)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




git commit: remove schema agreement checking from all external APIs (Thrift, CQL and CQL3) patch by Pavel Yaskevich; reviewed by Jonathan Ellis for CASSANDRA-4487

2012-08-16 Thread xedin
Updated Branches:
  refs/heads/trunk 7ddb5c7a4 -> 71f5d91ab


remove schema agreement checking from all external APIs (Thrift, CQL and CQL3)
patch by Pavel Yaskevich; reviewed by Jonathan Ellis for CASSANDRA-4487


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/71f5d91a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/71f5d91a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/71f5d91a

Branch: refs/heads/trunk
Commit: 71f5d91ab7825196990a2744cf3e40e654917d33
Parents: 7ddb5c7
Author: Pavel Yaskevich xe...@apache.org
Authored: Wed Aug 15 14:00:28 2012 +0300
Committer: Pavel Yaskevich xe...@apache.org
Committed: Fri Aug 17 01:54:13 2012 +0300

--
 CHANGES.txt|1 +
 interface/cassandra.thrift |7 ++-
 src/java/org/apache/cassandra/cli/CliClient.java   |   49 +--
 src/java/org/apache/cassandra/cli/CliMain.java |2 +-
 .../org/apache/cassandra/cql/QueryProcessor.java   |   50 +--
 .../org/apache/cassandra/cql3/CQLStatement.java|5 +-
 .../org/apache/cassandra/cql3/QueryProcessor.java  |   10 +--
 .../cql3/statements/CreateKeyspaceStatement.java   |3 +-
 .../cql3/statements/DropKeyspaceStatement.java |3 +-
 .../cql3/statements/SchemaAlteringStatement.java   |   46 +
 .../apache/cassandra/thrift/CassandraServer.java   |   16 -
 .../cassandra/transport/messages/ErrorMessage.java |4 -
 .../cassandra/transport/messages/QueryMessage.java |4 +-
 13 files changed, 24 insertions(+), 176 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/71f5d91a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 75de54e..39c92b1 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -37,6 +37,7 @@
  * improve DynamicEndpointSnitch by using reservoir sampling (CASSANDRA-4038)
  * (cql3) Add support for 2ndary indexes (CASSANDRA-3680)
  * (cql3) fix defining more than one PK to be invalid (CASSANDRA-4477)
+ * remove schema agreement checking from all external APIs (Thrift, CQL and 
CQL3) (CASSANDRA-4487)
 
 
 1.1.4

http://git-wip-us.apache.org/repos/asf/cassandra/blob/71f5d91a/interface/cassandra.thrift
--
diff --git a/interface/cassandra.thrift b/interface/cassandra.thrift
index 5e933d7..1f735e6 100644
--- a/interface/cassandra.thrift
+++ b/interface/cassandra.thrift
@@ -158,7 +158,12 @@ exception AuthorizationException {
 1: required string why
 }
 
-/** schemas are not in agreement across all nodes */
+/**
+ * NOTE: This is an outdated exception left for backward compatibility reasons,
+ * no actual schema agreement validation is done starting from Cassandra 1.2
+ *
+ * schemas are not in agreement across all nodes
+ */
 exception SchemaDisagreementException {
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/71f5d91a/src/java/org/apache/cassandra/cli/CliClient.java
--
diff --git a/src/java/org/apache/cassandra/cli/CliClient.java 
b/src/java/org/apache/cassandra/cli/CliClient.java
index f2f492a..176f70a 100644
--- a/src/java/org/apache/cassandra/cli/CliClient.java
+++ b/src/java/org/apache/cassandra/cli/CliClient.java
@@ -198,7 +198,7 @@ public class CliClient
 }
 
 // Execute a CLI Statement
-public void executeCLIStatement(String statement) throws 
CharacterCodingException, TException, TimedOutException, NotFoundException, 
NoSuchFieldException, InvalidRequestException, UnavailableException, 
InstantiationException, IllegalAccessException, ClassNotFoundException, 
SchemaDisagreementException
+public void executeCLIStatement(String statement) throws 
CharacterCodingException, TException, TimedOutException, NotFoundException, 
NoSuchFieldException, InvalidRequestException, UnavailableException, 
InstantiationException, IllegalAccessException, ClassNotFoundException
 {
 Tree tree = CliCompiler.compileQuery(statement);
 try
@@ -1006,7 +1006,6 @@ public class CliClient
 {
 String mySchemaVersion = 
thriftClient.system_add_keyspace(updateKsDefAttributes(statement, ksDef));
 sessionState.out.println(mySchemaVersion);
-validateSchemaIsSettled(mySchemaVersion);
 
 keyspacesMap.put(keyspaceName, 
thriftClient.describe_keyspace(keyspaceName));
 }
@@ -1037,7 +1036,6 @@ public class CliClient
 {
 String mySchemaVersion = 
thriftClient.system_add_column_family(updateCfDefAttributes(statement, cfDef));
 sessionState.out.println(mySchemaVersion);
-validateSchemaIsSettled(mySchemaVersion);
 

[jira] [Resolved] (CASSANDRA-4538) Strange CorruptedBlockException when massive insert binary data

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-4538.
---

Resolution: Cannot Reproduce

does sound like a problem with that specific machine given that neither you nor 
cathy can reproduce elsewhere

 Strange CorruptedBlockException when massive insert binary data
 ---

 Key: CASSANDRA-4538
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4538
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.3
 Environment: Debian sequeeze 32bit
Reporter: Tommy Cheng
Priority: Critical
  Labels: CorruptedBlockException, binary, insert
 Attachments: cassandra-stresstest.zip


 After inserting ~ 1 records, here is the error log
  INFO 10:53:33,543 Compacted to 
 [/var/lib/cassandra/data/ST/company/ST-company.company_acct_no_idx-he-13-Data.db,].
   407,681 to 409,133 (~100% of original) bytes for 9,250 keys at 
 0.715926MB/s.  Time: 545ms.
 ERROR 10:53:35,445 Exception in thread Thread[CompactionExecutor:3,1,main]
 java.io.IOError: org.apache.cassandra.io.compress.CorruptedBlockException: 
 (/var/lib/cassandra/data/ST/company/ST-company-he-9-Data.db): corruption 
 detected, chunk at 7530128 of length 19575.
 at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
 at 
 org.apache.cassandra.db.compaction.PrecompactedRow.init(PrecompactedRow.java:99)
 at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:176)
 at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:83)
 at 
 org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:68)
 at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:118)
 at 
 org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:101)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 at 
 com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.cassandra.io.compress.CorruptedBlockException: 
 (/var/lib/cassandra/data/ST/company/ST-company-he-9-Data.db): corruption 
 detected, chunk at 7530128 of length 19575.
 at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:98)
 at 
 org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:77)
 at 
 org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:302)
 at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
 at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
 at 
 org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
 at 
 org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:401)
 at 
 org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
 at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:119)
 at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:36)
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:144)
 at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
 at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:112)
 ... 20 more
 Here is the startup of cassandra
 

[jira] [Resolved] (CASSANDRA-875) Performance regression tests, take 2

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-875.
--

Resolution: Won't Fix

working on this out of tree, similar to dtests

 Performance regression tests, take 2
 

 Key: CASSANDRA-875
 URL: https://issues.apache.org/jira/browse/CASSANDRA-875
 Project: Cassandra
  Issue Type: Test
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Tyler Patterson
  Labels: gsoc, gsoc2010

 We have a  stress test in contrib/py_stress, and Paul has a tool using 
 libcloud to automate running it against an ephemeral cluster of rackspace 
 cloud servers, but to really qualify as performance regression tests we 
 need to
  - test a wide variety of data types (skinny rows, wide rows, different 
 comparator types, different value byte[] sizes, etc)
  - produce pretty graphs.  seriously.
  - archive historical data somewhere for comparison (rackspace can provide a 
 VM to host a db for this, if the ASF doesn't have something in place for this 
 kind of thing already)





[jira] [Resolved] (CASSANDRA-2397) Improve or remove replicate-on-write setting

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2397.
---

Resolution: Won't Fix

more discussion in CASSANDRA-3868

 Improve or remove replicate-on-write setting
 

 Key: CASSANDRA-2397
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2397
 Project: Cassandra
  Issue Type: Bug
Reporter: Stu Hood

 The replicate on write setting breaks assumptions in various places in the 
 codebase dealing with whether data will be replicated in a timely fashion. 
 It's worthwhile to discuss whether we should go all-the-way on 
 replicate-on-write, such that it is a fully supported feature, or whether we 
 should remove it entirely.
 On one hand, ROW could be considered to be just another replication tunable 
 like HH, RR and AES. On the other hand, a lazily replicating store is very 
 rarely what you actually wanted.
 Open issues related to ROW are linked, but additionally, we'd need to:
  * Make the setting have an effect for standard column families
  * Change the default for ROW to enabled and properly warn of the effects





[jira] [Resolved] (CASSANDRA-2636) Import/Export of Schema Migrations

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2636.
---

Resolution: Invalid

We no longer store migrations, only the merged schema.

 Import/Export of Schema Migrations
 --

 Key: CASSANDRA-2636
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2636
 Project: Cassandra
  Issue Type: Improvement
Reporter: David Boxenhorn

 My use case is like this: I have a development cluster, a staging cluster and 
 a production cluster. When I finish a set of migrations on the development 
 cluster, I want to apply them to the staging cluster, and eventually the 
 production cluster. I don't want to do it by hand, because it's a painful and 
 error-prone process. What I would like to do is export the last N migrations 
 from the development cluster as a text file, with exactly the same format as 
 the original text commands, and import them to the staging and production 
 clusters. 
 I think the best place to do this might be the CLI, since you would probably 
 want to view your migrations before exporting them. Something like this:
 show migrations N;              Shows the last N migrations.
 export migrations N fileName;   Exports the last N migrations to file fileName.
 import migrations fileName;     Imports migrations from fileName.
 The import process would apply the migrations one at a time, giving you 
 feedback like applying migration: update column family If a migration 
 fails, the process should give an appropriate message and stop. 





[jira] [Resolved] (CASSANDRA-3360) Read data inconsistancy in Cassandra 1.0.0-rc2

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3360.
---

Resolution: Cannot Reproduce

 Read data inconsistancy in Cassandra 1.0.0-rc2
 --

 Key: CASSANDRA-3360
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3360
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.0
Reporter: Gopalakrishnan Rajagopal

 When a super column for a particular key is being queried
 using hector-core-0.8.0-2,
 the data retrieved is inconsistent. I mean, for the key that I use to fetch 
 data, there are 7 sub columns actually. But the query returns 1 or 3 sub 
 columns depending on which nodes respond to it. (I tested by bringing down 
 each one of the three nodes in turn).  
 When I tried to fetch the data for the same key using cassandra-cli tool, I 
 get all the 7 sub columns for both the consistancy levels ONE and QUORUM. 
 Below is the code that I used to fetch data:
 superColumnQuery = HFactory.createSuperColumnQuery(keyspaceOperator,
 stringSerializer, stringSerializer, stringSerializer, stringSerializer);
 superColumnQuery.setColumnFamily(cfName).setKey(key).setSuperName(scName);
 result = superColumnQuery.execute();
 superColumn = result.get();
 columnList = superColumn.getColumns();





[jira] [Resolved] (CASSANDRA-3731) bad data size in compactionstats

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3731.
---

Resolution: Cannot Reproduce

 bad data size in compactionstats
 

 Key: CASSANDRA-3731
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3731
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.6
 Environment: debian, oracle java 1.6.26
 LeveledCompaction with 4096M (file size limit is 4096MB)
 count of sstables > 500
Reporter: Zenek Kraweznik

 pending tasks: -2147483648
 compaction type keyspace column family bytes compacted bytes total progress
 Compaction Archive Messages 35050352366 *0* n/a
 0 bytes total is visible on this node (nodetool ring is reporting 37.18GB).
 After every compaction, bytes total is about 73x (I guess it is not 
 compressed data size), but this value isn't saved anywhere.
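The pending tasks figure of -2147483648 reported above is exactly `Integer.MIN_VALUE`, the usual fingerprint of a signed 32-bit counter wrapping past its maximum. A self-contained sketch of that effect (a hypothetical counter, not Cassandra's actual bookkeeping code):

```java
public class OverflowDemo {
    public static void main(String[] args) {
        // Java int arithmetic wraps silently: incrementing the largest
        // 32-bit value produces the most negative representable value.
        int pendingTasks = Integer.MAX_VALUE; // 2147483647
        pendingTasks += 1;                    // wraps around
        System.out.println(pendingTasks);     // prints -2147483648
        System.out.println(pendingTasks == Integer.MIN_VALUE); // prints true
    }
}
```

Seeing this exact value in a stat is therefore a hint that the underlying counter went out of range rather than that there are genuinely negative tasks.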





[jira] [Resolved] (CASSANDRA-3739) Cassandra setInputSplitSize is not working properly

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3739.
---

Resolution: Cannot Reproduce

Let us know if you can reproduce on 1.0.12.

 Cassandra setInputSplitSize is not working properly
 ---

 Key: CASSANDRA-3739
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3739
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.0.6
Reporter: manu

 I am using Hadoop 0.20.205 and Cassandra 1.0.6. I use setInputSplitSize(1000) 
 and I expect that every split should be ~1000 rows. The problem is that each 
 mapper still receives much more than 1000 rows. 





[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436450#comment-13436450
 ] 

Florent Clairambault commented on CASSANDRA-4481:
-

This bug is marked as resolved, so we're just documenting something that never 
happened. We're not scaring anyone here, we're making sure we have all the 
documentation to prove that I we were wrong.

So just to make things clear, I didn't make any kind of change or deletion on 
my keyspaces. The two keyspaces were created by code (one with pelops and one 
with hector) once and never changed. I know I told I did it with cassandra-cli 
earlier but it turns out that it was entirely by code.

While doing some tests, I did delete the keyspaces and in that cases it gives 
an error that looks like: Commit logs for non-existing Column Family 1036 were 
ignored (I can't find the exact error in my logs). 

When I deleted the keyspace files, they were recreated by reading the commit 
logs (this is step 4 in my previous report). So I think they were in accordance 
with the schema stored in cassandra.

--- 

I wanted to actually test it.

The only last versions I could find were 1.0.1, 1.1.2 and 1.1.3. I created a 
small testscript and it definitely works with them. But it would be good if we 
could have access to 1.1.1 to do the same data upgrade we did in the past.

{code}
#!/bin/sh
apt-get remove --purge cassandra -y
rm -Rf /var/log/cassandra /var/lib/cassandra

if [ ! -f cassandra_1.0.11_all.deb ]; then
  wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.0.11_all.deb
fi

if [ ! -f cassandra_1.1.2_all.deb ]; then
   wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.2_all.deb
fi

if [ ! -f cassandra_1.1.3_all.deb ]; then
  wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.3_all.deb
fi

wait_for_server() {
  while ! echo exit | nc localhost 9160; do sleep 1; done
}

dpkg -i cassandra_1.0.11_all.deb
tail -f /var/log/cassandra/output.log &

wait_for_server;

cassandra-cli -h localhost <<EOF

create keyspace m2mp;
use m2mp;

create column family Registry
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType';

set Registry['/user/florent']['first']='Florent';
set Registry['/user/florent']['country']='France';
set Registry['/version']['1.0.11']='done';
EOF

cassandra-cli -h localhost -k m2mp <<EOF
list Registry;
exit;
EOF

dpkg -i cassandra_1.1.2_all.deb

wait_for_server;

cassandra-cli -h localhost -k m2mp <<EOF
set Registry['/version']['1.1.2']='done';
list Registry;
exit;
EOF

dpkg -i cassandra_1.1.3_all.deb

wait_for_server;

cassandra-cli -h localhost -k m2mp <<EOF
set Registry['/version']['1.1.3']='done';
list Registry;
exit;
EOF

service cassandra restart

wait_for_server;

cassandra-cli -h localhost -k m2mp <<EOF
list Registry;
exit;
EOF
{code}

In the end I do have:
{quote}
---
RowKey: /user/florent
= (column=country, value=France, timestamp=1345161343036000)
= (column=first, value=Florent, timestamp=1345161342992000)
---
RowKey: /version
= (column=1.0.11, value=done, timestamp=1345161343039000)
= (column=1.1.2, value=done, timestamp=1345161366935000)
= (column=1.1.3, value=done, timestamp=134516138976)
{quote}

So it's ok. But I would be pretty interested to see if we get the same result 
if we don't skip any version.
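The CLI invocations in the script above feed their statements to `cassandra-cli` through shell here-documents. A minimal runnable sketch of that pattern, with `grep -c ''` (which just counts the lines it receives) standing in for `cassandra-cli`:

```shell
# Here-document: everything between <<EOF and the terminating EOF line
# is delivered to the command's standard input.
stmts=$(grep -c '' <<EOF
list Registry;
exit;
EOF
)
echo "received $stmts lines"   # prints: received 2 lines
```

Note the terminator word (EOF) must appear alone, unindented, on its own line, exactly as in the script above.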

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical

 When data is written to the commitlog and I restart the machine, all commited 
 data is lost that has not been flushed to disk. 
 In the startup logs it says that it replays the commitlog successfully, but 
 the data is not available then. 
 When I open the commitlog file in an editor I can see the added data, but 
 after the restart it cannot be fetched from cassandra. 
 {code}
  INFO 09:59:45,362 Replaying 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Finished reading 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Log replay complete, 0 replayed mutations
 {code}





[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436450#comment-13436450
 ] 

Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 11:11 AM:
---

This bug is marked as resolved, so we're just documenting something that never 
happened. We're not scaring anyone here, we're making sure we have all the 
documentation to prove that I we were wrong.

So just to make things clear, I didn't make any kind of change or deletion on 
my keyspaces. The two keyspaces were created by code (one with pelops and one 
with hector) once and never changed. I know I told I did it with cassandra-cli 
earlier but it turns out that it was entirely by code.

While doing some tests, I did delete the keyspaces and in that cases it gives 
an error that looks like: Commit logs for non-existing Column Family 1036 were 
ignored (I can't find the exact error in my logs). 

When I deleted the keyspace files, they were recreated by reading the commit 
logs (this is step 4 in my previous report). So I think they were in accordance 
with the schema stored in cassandra.

--- 

I wanted to actually test it.

The only last versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a 
small testscript and it definitely works with them. But it would be good to 
test it with 1.1.1 (which I didn't find) also.

{code}
#!/bin/sh
apt-get remove --purge cassandra -y
rm -Rf /var/log/cassandra /var/lib/cassandra

if [ ! -f cassandra_1.0.11_all.deb ]; then
  wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.0.11_all.deb
fi

if [ ! -f cassandra_1.1.2_all.deb ]; then
   wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.2_all.deb
fi

if [ ! -f cassandra_1.1.3_all.deb ]; then
  wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.3_all.deb
fi

wait_for_server() {
  while ! echo exit | nc localhost 9160; do sleep 1; done
}

dpkg -i cassandra_1.0.11_all.deb
tail -f /var/log/cassandra/output.log &

wait_for_server;

cassandra-cli -h localhost <<EOF

create keyspace m2mp;
use m2mp;

create column family Registry
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType';

set Registry['/user/florent']['first']='Florent';
set Registry['/user/florent']['country']='France';
set Registry['/version']['1.0.11']='done';
EOF

cassandra-cli -h localhost -k m2mp <<EOF
list Registry;
exit;
EOF

dpkg -i cassandra_1.1.2_all.deb

wait_for_server;

cassandra-cli -h localhost -k m2mp <<EOF
set Registry['/version']['1.1.2']='done';
list Registry;
exit;
EOF

dpkg -i cassandra_1.1.3_all.deb

wait_for_server;

cassandra-cli -h localhost -k m2mp <<EOF
set Registry['/version']['1.1.3']='done';
list Registry;
exit;
EOF

service cassandra restart

wait_for_server;

cassandra-cli -h localhost -k m2mp <<EOF
list Registry;
exit;
EOF
{code}

In the end I do have:
{quote}
---
RowKey: /user/florent
= (column=country, value=France, timestamp=1345161343036000)
= (column=first, value=Florent, timestamp=1345161342992000)
---
RowKey: /version
= (column=1.0.11, value=done, timestamp=1345161343039000)
= (column=1.1.2, value=done, timestamp=1345161366935000)
= (column=1.1.3, value=done, timestamp=134516138976)
{quote}

So it's ok. But I would be pretty interested to see if we get the same result 
if we don't skip any version.

  was (Author: superfc):
This bug is marked as resolved, so we're just documenting something that 
never happened. We're not scaring anyone here, we're making sure we have all 
the documentation to prove that I we were wrong.

So just to make things clear, I didn't make any kind of change or deletion on 
my keyspaces. The two keyspaces were created by code (one with pelops and one 
with hector) once and never changed. I know I told I did it with cassandra-cli 
earlier but it turns out that it was entirely by code.

While doing some tests, I did delete the keyspaces and in that cases it gives 
an error that looks like: Commit logs for non-existing Column Family 1036 were 
ignored (I can't find the exact error in my logs). 

When I deleted the keyspace files, they were recreated by reading the commit 
logs (this is step 4 in my previous report). So I think they were in accordance 
with the schema stored in cassandra.

--- 

I wanted to actually test it.

The only last versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a 
small testscript and it definitely works with them. But it would be good to 
test it with 1.1.1 (which I didn't have good access to) also.

{code}
#!/bin/sh
apt-get remove --purge cassandra -y
rm -Rf /var/log/cassandra /var/lib/cassandra

if [ ! -f cassandra_1.0.11_all.deb ]; then
  wget 

[jira] [Comment Edited] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Florent Clairambault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436450#comment-13436450
 ] 

Florent Clairambault edited comment on CASSANDRA-4481 at 8/17/12 11:17 AM:
---

This bug is marked as resolved, so we're just documenting something that never 
happened. We're not scaring anyone here, we're making sure we have all the 
documentation to prove that we were wrong.

So just to make things clear, I didn't make any kind of change or deletion on 
my keyspaces. The two keyspaces were created by code (one with pelops and one 
with hector) once and never changed. I know I told I did it with cassandra-cli 
earlier but it turns out that it was entirely by code.

While doing some tests, I did delete the keyspaces and in that cases it gives 
an error that looks like: Commit logs for non-existing Column Family 1036 were 
ignored (I can't find the exact error in my logs). 

When I deleted the keyspace files, they were recreated by reading the commit 
logs (this is step 4 in my previous report). So I think they were in accordance 
with the schema stored in cassandra.

--- 

I wanted to actually test it.

The only last versions I could find were 1.0.11, 1.1.2 and 1.1.3. I created a 
small testscript and it definitely works with them. But it would be good to 
test it with 1.1.1 (which I didn't find) also.

{code}
#!/bin/sh
apt-get remove --purge cassandra -y
rm -Rf /var/log/cassandra /var/lib/cassandra

if [ ! -f cassandra_1.0.11_all.deb ]; then
  wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.0.11_all.deb
fi

if [ ! -f cassandra_1.1.2_all.deb ]; then
   wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.2_all.deb
fi

if [ ! -f cassandra_1.1.3_all.deb ]; then
  wget 
http://apache.mirrors.multidist.eu/cassandra/debian/pool/main/c/cassandra/cassandra_1.1.3_all.deb
fi

wait_for_server() {
  while ! echo exit | nc localhost 9160; do sleep 1; done
}

dpkg -i cassandra_1.0.11_all.deb
tail -f /var/log/cassandra/output.log &

wait_for_server;

cassandra-cli -h localhost <<EOF

create keyspace m2mp;
use m2mp;

create column family Registry
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType';

set Registry['/user/florent']['first']='Florent';
set Registry['/user/florent']['country']='France';
set Registry['/version']['1.0.11']='done';
EOF

cassandra-cli -h localhost -k m2mp EOF
list Registry;
exit;
EOF

dpkg -i cassandra_1.1.2_all.deb

wait_for_server;

cassandra-cli -h localhost -k m2mp EOF
set Registry['/version']['1.1.2']='done';
list Registry;
exit;
EOF

dpkg -i cassandra_1.1.3_all.deb

wait_for_server;

cassandra-cli -h localhost -k m2mp EOF
set Registry['/version']['1.1.3']='done';
list Registry;
exit;
EOF

service cassandra restart

wait_for_server;

cassandra-cli -h localhost -k m2mp EOF
list Registry;
exit;
EOF
{code}

In the end I get:
{quote}
---
RowKey: /user/florent
=> (column=country, value=France, timestamp=1345161343036000)
=> (column=first, value=Florent, timestamp=1345161342992000)
---
RowKey: /version
=> (column=1.0.11, value=done, timestamp=1345161343039000)
=> (column=1.1.2, value=done, timestamp=1345161366935000)
=> (column=1.1.3, value=done, timestamp=134516138976)
{quote}

So it works. But I would be pretty interested to see whether we get the same 
result without skipping any versions.

  was (Author: superfc):
This bug is marked as resolved, so we're just documenting something that 
never happened. We're not scaring anyone here; we're making sure we have all 
the documentation to prove that we were wrong.

[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-16 Thread Philip Jenvey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Jenvey updated CASSANDRA-2897:
-

Attachment: 41ec9fc-2897.txt

Here's an alternative patch that also tackles just the non-compaction changes 
(it's a little stale, against 41ec9fc)

Briefly looking at Sam's version, I'll note that:

o Mine handles entire row deletions in Memtable

o but it lacks changes to CompositesSearcher/SchemaLoader/CFMetaDataTest 
(though I'm not familiar with these code paths, either)

o in KeysSearcher, I very likely should be using the compare method from 
getValueValidator to check for staleness (instead of naively just calling 
equals)

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 41ec9fc-2897.txt


 Currently, secondary index updates require a read-before-write to maintain 
 the index consistency. Keeping the index consistent at all times is not 
 necessary, however. We could let the (secondary) index get inconsistent on 
 writes and repair it on reads. This would be easy because on reads we make 
 sure to request the indexed columns anyway, so we can just skip the rows 
 that are not needed and repair the index at the same time.
 This does trade work on writes for work on reads. However, read-before-write 
 is sufficiently costly that it will likely be a win overall.
 There are (at least) two small technical difficulties here though:
 # If we repair on read, this will be racy with writes, so we'll probably have 
 to synchronize there.
 # We probably shouldn't rely only on reads to repair; we should also have a 
 task to repair the index for things that are rarely read. It's unclear how to 
 make that low-impact though.
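The repair-on-read idea above can be sketched with a toy model (the class and method names are mine, not Cassandra's actual code): writes update the index blindly with no read-before-write, so stale entries accumulate; queries, which fetch the indexed column anyway, skip stale entries and lazily delete them.

```python
class IndexedTable:
    """Toy model: rows map row_key -> {column: value}; the secondary
    index maps an indexed value -> set of row keys."""

    def __init__(self, indexed_column):
        self.rows = {}     # row_key -> {column: value}
        self.index = {}    # indexed value -> set of row_keys
        self.indexed_column = indexed_column

    def write(self, row_key, columns):
        # No read-before-write: just add the new index entry. An old
        # entry for the previous value (if any) is left behind, stale.
        self.rows.setdefault(row_key, {}).update(columns)
        if self.indexed_column in columns:
            value = columns[self.indexed_column]
            self.index.setdefault(value, set()).add(row_key)

    def query(self, value):
        # Repair-on-read: the indexed column is fetched for each
        # candidate row anyway, so stale entries are detected here,
        # skipped, and removed from the index.
        hits = []
        for row_key in sorted(self.index.get(value, set())):
            row = self.rows.get(row_key, {})
            if row.get(self.indexed_column) == value:
                hits.append(row_key)
            else:
                self.index[value].discard(row_key)  # lazy repair
        return hits
```

As the ticket notes, this model ignores the race between concurrent repairs and writes, and entries for values that are never queried are never cleaned up; those are exactly the two difficulties listed above.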

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4550) nodetool ring output should use hex not integers for tokens

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436511#comment-13436511
 ] 

Jonathan Ellis commented on CASSANDRA-4550:
---

(Removed affects=1.09 because we use affects as the earliest version 
affected, which is all versions in this case.)

 nodetool ring output should use hex not integers for tokens
 ---

 Key: CASSANDRA-4550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4550
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Linux
Reporter: Aaron Turner
Priority: Trivial
  Labels: lhf

 The current output of nodetool ring prints start token values as base-10 
 integers instead of hex.  This is not very user-friendly, for a number of 
 reasons:
 1. It hides the fact that the values are 128-bit.
 2. The values are not of consistent length, while in hex, padding with zeros 
 is generally accepted.
 3. When using the default random partitioner, having the values in hex makes 
 it easier for users to determine which node(s) a given key resides on, since 
 md5 utilities like md5sum output hex.
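For illustration, a small Python sketch (function names are mine) of the proposed rendering: a base-10 token is zero-padded to 32 hex digits, and since RandomPartitioner tokens are derived from the MD5 of the key, a key's md5sum-style digest lines up directly with hex tokens.

```python
import hashlib

def token_to_hex(token):
    """Render a base-10 128-bit token as zero-padded 32-digit hex."""
    return format(int(token), "032x")

def key_digest_hex(key):
    """MD5 of the key in hex, exactly what md5sum prints."""
    return hashlib.md5(key).hexdigest()

# 2**126 as nodetool would print it today, versus padded hex:
print(token_to_hex("85070591730234615865843651857942052864"))
# -> 40000000000000000000000000000000
```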

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4481) Commitlog not replayed after restart - data lost

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436515#comment-13436515
 ] 

Jonathan Ellis commented on CASSANDRA-4481:
---

1.1.1 is available here: http://archive.apache.org/dist/cassandra/1.1.1/

 Commitlog not replayed after restart - data lost
 

 Key: CASSANDRA-4481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4481
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Single node cluster on 64Bit CentOS
Reporter: Ivo Meißner
Priority: Critical

 When data is written to the commitlog and I restart the machine, all committed 
 data that has not been flushed to disk is lost. 
 In the startup logs it says that it replays the commitlog successfully, but 
 the data is not available afterwards. 
 When I open the commitlog file in an editor I can see the added data, but 
 after the restart it cannot be fetched from Cassandra. 
 {code}
  INFO 09:59:45,362 Replaying 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Finished reading 
 /var/myproject/cassandra/commitlog/CommitLog-83203377067.log
  INFO 09:59:45,476 Log replay complete, 0 replayed mutations
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-4554) Log when a node is down longer than the hint window and we stop saving hints

2012-08-16 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-4554:
-

 Summary: Log when a node is down longer than the hint window and 
we stop saving hints
 Key: CASSANDRA-4554
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4554
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.1.5


We know that we need to repair whenever we lose a node or disk permanently 
(since it may have had undelivered hints on it), but without exposing this we 
don't know when nodes stop saving hints for a temporarily dead node, unless 
we're paying very close attention to external monitoring.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4554) Log when a node is down longer than the hint window and we stop saving hints

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436537#comment-13436537
 ] 

Jonathan Ellis commented on CASSANDRA-4554:
---

Should probably log this in a system table so it's easily queried.

 Log when a node is down longer than the hint window and we stop saving hints
 

 Key: CASSANDRA-4554
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4554
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.1.5


 We know that we need to repair whenever we lose a node or disk permanently 
 (since it may have had undelivered hints on it), but without exposing this we 
 don't know when nodes stop saving hints for a temporarily dead node, unless 
 we're paying very close attention to external monitoring.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira