date:20110828


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3086.
---

Resolution: Invalid
  Assignee: (was: Benjamin Coverston)

This was done in 1608 after all

 Use interval tree to narrow down sstables on range scans
 

 Key: CASSANDRA-3086
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3086
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Priority: Minor

 CASSANDRA-1608 added interval tree optimization for single-row queries but 
 not range scans (CFS.getRangeSlice).
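
The narrowing that the issue title refers to can be sketched roughly as below. This is only an illustration under stated assumptions: `KeyRange` and `RangeScanFilter` are hypothetical names, keys are simplified to `long` values with no ring wrap-around, and the lookup is a linear scan rather than a real interval tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sstable descriptor: the first and last key it contains.
final class KeyRange {
    final String name;
    final long first;
    final long last;

    KeyRange(String name, long first, long last) {
        this.name = name;
        this.first = first;
        this.last = last;
    }

    // Two closed intervals overlap when each starts before the other ends.
    boolean overlaps(long start, long end) {
        return first <= end && start <= last;
    }
}

final class RangeScanFilter {
    // Linear scan for clarity; an interval tree answers the same query
    // without visiting every entry.
    static List<KeyRange> candidates(List<KeyRange> sstables, long start, long end) {
        List<KeyRange> out = new ArrayList<>();
        for (KeyRange r : sstables) {
            if (r.overlaps(start, end)) {
                out.add(r);
            }
        }
        return out;
    }
}
```

Only the sstables returned by `candidates` would need to be consulted for a scan over `[start, end]`.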

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3025) PHP/PDO driver for Cassandra CQL

2011-08-28 Thread Mikko Koppanen (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikko Koppanen updated CASSANDRA-3025:
--

Attachment: pdo_cassandra-0.1.3.tgz

Update to the latest package

 PHP/PDO driver for Cassandra CQL
 

 Key: CASSANDRA-3025
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3025
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Mikko Koppanen
  Labels: php
 Attachments: pdo_cassandra-0.1.0.tgz, pdo_cassandra-0.1.1.tgz, 
 pdo_cassandra-0.1.2.tgz, pdo_cassandra-0.1.3.tgz, 
 php_test_results_20110818_2317.txt


 Hello,
 attached is the initial version of the PDO driver for the Cassandra CQL language. 
 This is a native PHP extension written in what I would call a combination of 
 C and C++, because PHP itself is written in C. The Thrift API used is the C++ one.
 The API looks roughly as follows:
 {code}
 <?php
 // Connect over Thrift to a local node.
 $db = new PDO('cassandra:host=127.0.0.1;port=9160');
 $db->exec ("CREATE KEYSPACE mytest with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor=1;");
 $db->exec ("USE mytest");
 $db->exec ("CREATE COLUMNFAMILY users (
   my_key varchar PRIMARY KEY,
   full_name varchar );");
   
 // Named placeholders are bound on the client side.
 $stmt = $db->prepare ("INSERT INTO users (my_key, full_name) VALUES (:key, :full_name);");
 $stmt->execute (array (':key' => 'mikko', ':full_name' => 'Mikko K' ));
 {code}
 Currently prepared statements are emulated on the client side, but I 
 understand that there is a plan to add prepared statements to the Cassandra CQL 
 API as well. I will add this feature to the extension as soon as they are 
 implemented.
 Additional documentation can be found on GitHub at 
 https://github.com/mkoppanen/php-pdo_cassandra, in the form of a rendered 
 Markdown file. Tests are not yet included in the package file; for now they 
 can be found on GitHub as well.
 I have created documentation in DocBook format as well, but have not yet 
 rendered it.
 Comments and feedback are welcome.
 Thanks,
 Mikko


[jira] [Created] (CASSANDRA-3095) java.lang.NegativeArraySizeException during compacting large row

2011-08-28 Thread Pas (JIRA)

java.lang.NegativeArraySizeException during compacting large row


 Key: CASSANDRA-3095
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3095
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.4
 Environment: Linux 2.6.26-2-amd64 #1 SMP Thu Feb 11 00:59:32 UTC 2010 
x86_64 GNU/Linux
JDK 1.6.0_27 (Java 6 update 27), with JNA.
Reporter: Pas


Hello,

It's a 4-node ring with 3 nodes on 0.7.4; I've upgraded one to 0.8.4. This 
particular node was having issues with compaction, which is why I tried the 
upgrade (it looks likely that this solved the compaction issues).

Here's the stack trace from system.log.

 INFO [CompactionExecutor:22] 2011-08-28 18:12:46,566 CompactionController.java (line 136) Compacting large row  (36028797018963968 bytes) incrementally
ERROR [CompactionExecutor:22] 2011-08-28 18:12:46,609 AbstractCassandraDaemon.java (line 134) Fatal exception in thread Thread[CompactionExecutor:22,1,main]
java.lang.NegativeArraySizeException
    at org.apache.cassandra.utils.obs.OpenBitSet.<init>(OpenBitSet.java:85)
    at org.apache.cassandra.utils.BloomFilter.bucketsFor(BloomFilter.java:56)
    at org.apache.cassandra.utils.BloomFilter.getFilter(BloomFilter.java:73)
    at org.apache.cassandra.db.ColumnIndexer.serializeInternal(ColumnIndexer.java:62)
    at org.apache.cassandra.db.ColumnIndexer.serialize(ColumnIndexer.java:50)
    at org.apache.cassandra.db.compaction.LazilyCompactedRow.<init>(LazilyCompactedRow.java:89)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:138)
    at org.apache.cassandra.db.compaction.CompactionIterator.getReduced(CompactionIterator.java:123)
    at org.apache.cassandra.db.compaction.CompactionIterator.getReduced(CompactionIterator.java:43)
    at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
    at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
    at org.apache.cassandra.db.compaction.CompactionManager.doCompactionWithoutSizeEstimation(CompactionManager.java:569)
    at org.apache.cassandra.db.compaction.CompactionManager.doCompaction(CompactionManager.java:506)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:141)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:107)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)


We still have ~70 files in the f format and 80 in the g format. We have 
~100 GB of data on this node.

Thanks.


[jira] [Updated] (CASSANDRA-3091) Move the caching of KS and CF metadata in the JDBC suite from Connection to Statement

2011-08-28 Thread Rick Shaw (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Shaw updated CASSANDRA-3091:
-

Attachment: move-metadata-for-decoder-to-statement-level-v2.txt

v2 of patch adds better clarity to the non-interface methods of 
{{CassandraConnection}} by making them package {{protected}}.

 Move the caching of KS and CF metadata in the JDBC suite from Connection to 
 Statement
 -

 Key: CASSANDRA-3091
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3091
 Project: Cassandra
  Issue Type: Improvement
  Components: Drivers
Affects Versions: 0.8.4
Reporter: Rick Shaw
Assignee: Rick Shaw
Priority: Minor
  Labels: JDBC
 Fix For: 0.8.5

 Attachments: move-metadata-for decoder-to-statement-level-v1.txt, 
 move-metadata-for-decoder-to-statement-level-v2.txt


 Currently, all metadata used by JDBC's {{ColumnDecoder}} class is loaded and 
 cached in the {{CassandraConnection}} class. The implication is that any 
 schema activity on the connected server after the connection is established 
 is not reflected in the KSs and CFs that can be accessed by 
 {{ResultSet}}, {{Statement}} and {{PreparedStatement}}.
 By moving the cached metadata to the {{Statement}} level, the currency of the 
 metadata can be checked within the {{Statement}} and reloaded if it is seen 
 to be absent. And by instantiating a new {{Statement}} (on any existing 
 connection) you are assured of getting the most current copy of the metadata 
 known to the server at the time of instantiation.
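
A minimal sketch of the statement-level caching described above, assuming hypothetical names (`StatementMetadataCache`, a `Supplier` standing in for the server round trip) that are not the driver's actual classes: the cache is loaded when the statement is created and reloaded once when a lookup misses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical per-statement cache; not the actual JDBC driver code.
final class StatementMetadataCache {
    // Assumed to fetch "column family name -> comparator type" from the server.
    private final Supplier<Map<String, String>> loader;
    private Map<String, String> cache;

    StatementMetadataCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
        // Snapshot taken when the statement is instantiated.
        this.cache = new HashMap<>(loader.get());
    }

    // Returns the cached entry, reloading once if the column family is unknown.
    String comparatorFor(String columnFamily) {
        String type = cache.get(columnFamily);
        if (type == null) {
            // Metadata seen to be absent: refresh from the server.
            cache = new HashMap<>(loader.get());
            type = cache.get(columnFamily);
        }
        return type; // still null if the server does not know the CF either
    }
}
```

A column family created after the connection was opened would then be picked up either by a new statement or by the first miss on an existing one.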


[jira] [Commented] (CASSANDRA-2664) JDBC driver for CQL works only with Strings

2011-08-28 Thread Rick Shaw (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092517#comment-13092517
 ] 

Rick Shaw commented on CASSANDRA-2664:
--

This appears closable... There is no such method in the current code and all 
current methods in the {{PreparedStatement}} unit test succeed.

 JDBC driver for CQL works only with Strings
 ---

 Key: CASSANDRA-2664
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2664
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 0.8.0 beta 2
 Environment: It happens to JDBC driver for both: 0.8.0 beta version 
 and 0.8.0-rc1
Reporter: Roman Kuzmin
  Labels: cql, jdbc
   Original Estimate: 4h
  Remaining Estimate: 4h

 CassandraPreparedStatement.java
 Line 141:
 String stringParam = makeCqlString(type.toString(param));
 It crashes with ClassCastException for all parameters that are not Strings. 
 This is because, when the method applyDualBindings is called from makeUpdate, 
 it ALWAYS gets one and the same type as its parameter. In fact, it is the 
 comparator of the column family itself.
 In my case it is UTF8Type, and the UTF8Type.toString() method expects only 
 Strings.
 I think it must be column-dependent.


[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction

2011-08-28 Thread Benjamin Coverston (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092518#comment-13092518
]

Benjamin Coverston commented on CASSANDRA-1608:
---

.bq Additional note: test suite runs about 20% slower for me w/ Leveled
compactions. Unsure if that should be expected.

That's not entirely expected. It's probably due in part to the amount of
flushing that we force during the tests. Flushes and compactions both
trigger interval tree builds.

Other than that the codepaths are the same.

Redesigned Compaction
-

Key: CASSANDRA-1608
URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Chris Goffinet
Assignee: Benjamin Coverston
Fix For: 1.0

Attachments: 1608-22082011.txt, 1608-v2.txt, 1608-v4.txt, 1608-v5.txt

After seeing the I/O issues in CASSANDRA-1470, I've been doing some more
thinking on this subject that I wanted to lay out.
I propose we redo the concept of how compaction works in Cassandra. At the
moment, compaction is kicked off based on the write access pattern, not the
read access pattern. In most cases, you want the opposite. You want to be able
to track how well each SSTable is performing in the system. If we were to keep
in-memory statistics for each SSTable and prioritize them by access frequency
and bloom filter hit/miss ratios, we could intelligently group the SSTables
that are read most often and schedule them for compaction. We could also
schedule lower-priority maintenance on SSTables that are not often accessed.
I also propose we limit each SSTable to a fixed size, which gives
us the ability to better utilize our bloom filters in a predictable manner.
At the moment, after a certain size, the bloom filters become less reliable.
This would also allow us to group the most-accessed data. Currently an
SSTable can grow to a point where large portions of its data might not
actually be accessed very often.
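
The prioritization idea in the proposal could be sketched as follows. This is an illustration only, assuming hypothetical names (`SSTableStats`, `CompactionPlanner`) and ranking by read count alone; the proposal also mentions bloom filter hit/miss ratios, which are left out here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory statistics kept per SSTable.
final class SSTableStats {
    final String name;
    final long reads;

    SSTableStats(String name, long reads) {
        this.name = name;
        this.reads = reads;
    }
}

final class CompactionPlanner {
    // Most-read SSTables first, so they are considered for compaction earliest.
    static List<SSTableStats> byReadCount(List<SSTableStats> stats) {
        List<SSTableStats> sorted = new ArrayList<>(stats);
        sorted.sort(Comparator.comparingLong((SSTableStats s) -> s.reads).reversed());
        return sorted;
    }
}
```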


[jira] [Issue Comment Edited] (CASSANDRA-1608) Redesigned Compaction

2011-08-28 Thread Benjamin Coverston (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092518#comment-13092518
]

Benjamin Coverston edited comment on CASSANDRA-1608 at 8/28/11 5:25 PM:

bq. Additional note: test suite runs about 20% slower for me w/ Leveled
compactions. Unsure if that should be expected.

That's not entirely expected. It's probably due in part to the amount of
flushing that we force during the tests. Flushes and compactions both
trigger interval tree builds.

Other than that the codepaths are the same.

was (Author: bcoverston):
.bq Additional note: test suite runs about 20% slower for me w/ Leveled
compactions. Unsure if that should be expected.

That's not entirely expected. It's probably due in part to the amount of
flushing that we force during the tests. Flushes and compactions both
trigger interval tree builds.

Other than that the codepaths are the same.

Redesigned Compaction
-

Attachments: 1608-22082011.txt, 1608-v2.txt, 1608-v4.txt, 1608-v5.txt


[jira] [Resolved] (CASSANDRA-3092) Delete columns using range without specifying the column names


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3092.
---

Resolution: Duplicate

see CASSANDRA-494

 Delete columns using range without specifying the column names
 --

 Key: CASSANDRA-3092
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3092
 Project: Cassandra
  Issue Type: Improvement
Reporter: Tongguo Pang

 When we delete columns, especially ones whose names are timestamps (obtained 
 from the System.currentTimeMillis() method), it's very hard to get the column 
 names. If the delete could take a range of column names (using start and 
 end), that would make this operation much easier.


[jira] [Resolved] (CASSANDRA-2664) JDBC driver for CQL works only with Strings


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2664.
---

Resolution: Invalid

 JDBC driver for CQL works only with Strings
 ---

 Key: CASSANDRA-2664
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2664
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 0.8.0 beta 2
 Environment: It happens to JDBC driver for both: 0.8.0 beta version 
 and 0.8.0-rc1
Reporter: Roman Kuzmin
  Labels: cql, jdbc
   Original Estimate: 4h
  Remaining Estimate: 4h

 CassandraPreparedStatement.java
 Line 141:
 String stringParam = makeCqlString(type.toString(param));
 It crashes with ClassCastException for all parameters that are not Strings. 
 This is because, when the method applyDualBindings is called from makeUpdate, 
 it ALWAYS gets one and the same type as its parameter. In fact, it is the 
 comparator of the column family itself.
 In my case it is UTF8Type, and the UTF8Type.toString() method expects only 
 Strings.
 I think it must be column-dependent.


svn commit: r1162598 - /cassandra/trunk/src/java/org/apache/cassandra/security/streaming/SSLIncomingStreamReader.java

2011-08-28 Thread xedin

Author: xedin
Date: Sun Aug 28 21:37:49 2011
New Revision: 1162598

URL: http://svn.apache.org/viewvc?rev=1162598&view=rev
Log:
Deleted empty file 
src/java/org/apache/cassandra/security/streaming/SSLIncomingStreamReader.java

Modified:

cassandra/trunk/src/java/org/apache/cassandra/security/streaming/SSLIncomingStreamReader.java

[jira] [Updated] (CASSANDRA-3085) Race condition in sstable reference counting


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3085:
--

Attachment: (was: 3085-v2.txt)

 Race condition in sstable reference counting
 

 Key: CASSANDRA-3085
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3085
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Critical
 Fix For: 1.0

 Attachments: 3085-v2.txt, 3085.txt


 DataTracker gives us an atomic View of memtable/sstables, but acquiring 
 references is not atomic.  So it is possible to acquire references to an 
 SSTableReader object that is no longer valid, as in this example:
 View V contains sstables {A, B}.  We attempt a read in thread T using this 
 View.
 Meanwhile, A and B are compacted to {C}, yielding View W.  No references 
 exist to A or B so they are cleaned up.
 Back in thread T we acquire references to A and B.  This does not cause an 
 error, but it will when we attempt to read from them next.


[jira] [Updated] (CASSANDRA-3085) Race condition in sstable reference counting

[
https://issues.apache.org/jira/browse/CASSANDRA-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jonathan Ellis updated CASSANDRA-3085:
--

Attachment: 3085-v2.txt

v2 encapsulates the lockless atomic acquisition in CFS.markReferenced(Interval).

Not 100% sure how important the changes to the getRangeSlice tokens were, that
I took out. :)

If we need those, we might need to make getRangeSlice loop manually w/o the
encapsulation, since we need the view to compute the Interval, but we need the
Interval to search for sstables.
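
The race and a lockless way around it can be sketched roughly as below. This is not the code from the attached patch: `RefCounted`, `Tracker` and `acquireAll` are hypothetical names, and the sketch assumes the tracker publishes the new view before the old sstables are released, so the retry loop eventually sees a view it can fully acquire.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for an SSTableReader with a reference count.
final class RefCounted {
    // Starts at 1: the reference held by the live view.
    private final AtomicInteger refs = new AtomicInteger(1);

    // Succeeds only while the count is still positive, i.e. not yet cleaned up.
    boolean tryAcquire() {
        while (true) {
            int n = refs.get();
            if (n <= 0) {
                return false;
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    void release() {
        refs.decrementAndGet();
    }
}

final class Tracker {
    final AtomicReference<List<RefCounted>> view =
        new AtomicReference<>(new ArrayList<>());

    // Retry against a fresh view until every sstable in one snapshot is acquired.
    List<RefCounted> acquireAll() {
        while (true) {
            List<RefCounted> snapshot = view.get();
            List<RefCounted> acquired = new ArrayList<>();
            boolean ok = true;
            for (RefCounted r : snapshot) {
                if (r.tryAcquire()) {
                    acquired.add(r);
                } else {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                return acquired;
            }
            // Undo the partial acquisition before looking at a newer view.
            for (RefCounted r : acquired) {
                r.release();
            }
        }
    }
}
```

In terms of the example in the issue, thread T would fail to acquire A or B after they were cleaned up, drop what it had, and retry against View W containing C.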

Race condition in sstable reference counting

Key: CASSANDRA-3085
URL: https://issues.apache.org/jira/browse/CASSANDRA-3085
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Critical
Fix For: 1.0

Attachments: 3085-v2.txt, 3085.txt

DataTracker gives us an atomic View of memtable/sstables, but acquiring
references is not atomic. So it is possible to acquire references to an
SSTableReader object that is no longer valid, as in this example:
View V contains sstables {A, B}. We attempt a read in thread T using this
View.
Meanwhile, A and B are compacted to {C}, yielding View W. No references
exist to A or B so they are cleaned up.
Back in thread T we acquire references to A and B. This does not cause an
error, but it will when we attempt to read from them next.


[jira] [Updated] (CASSANDRA-3085) Race condition in sstable reference counting


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3085:
--

Attachment: 3085-v2.txt

 Race condition in sstable reference counting
 

 Key: CASSANDRA-3085
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3085
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Critical
 Fix For: 1.0

 Attachments: 3085-v2.txt, 3085.txt


 DataTracker gives us an atomic View of memtable/sstables, but acquiring 
 references is not atomic.  So it is possible to acquire references to an 
 SSTableReader object that is no longer valid, as in this example:
 View V contains sstables {A, B}.  We attempt a read in thread T using this 
 View.
 Meanwhile, A and B are compacted to {C}, yielding View W.  No references 
 exist to A or B so they are cleaned up.
 Back in thread T we acquire references to A and B.  This does not cause an 
 error, but it will when we attempt to read from them next.


[jira] [Created] (CASSANDRA-3096) Test RoundRobinScheduler timeouts

Test RoundRobinScheduler timeouts
-

 Key: CASSANDRA-3096
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3096
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Stu Hood
Assignee: Stu Hood


CASSANDRA-3079 was very hasty, and introduced two bugs that would: 1) cause the 
scheduler to busywait after a timeout, 2) never actually throw timeouts. This 
calls for a test.


[jira] [Updated] (CASSANDRA-3096) Test RoundRobinScheduler timeouts


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-3096:


Attachment: 0001-Properly-throw-timeouts-decrement-the-count-of-waiters.txt

0001 - Properly throw timeouts from WeightedQueue, decrement the count of 
waiters on timeout, fix off-by-one in taskCount, and test all of it.
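
The two fixes named in the patch note (actually throwing on timeout, and decrementing the waiter count when that happens) can be illustrated with a small sketch. `TimedGate` is a hypothetical class built on `java.util.concurrent.Semaphore`; it is not the `WeightedQueue` from the patch.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical gate that tracks how many callers are currently waiting.
final class TimedGate {
    private final Semaphore permits;
    final AtomicInteger waiters = new AtomicInteger();

    TimedGate(int initialPermits) {
        permits = new Semaphore(initialPermits);
    }

    void acquire(long timeoutMs) throws TimeoutException, InterruptedException {
        waiters.incrementAndGet();
        try {
            // tryAcquire blocks up to the timeout instead of spinning.
            if (!permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new TimeoutException("no permit within " + timeoutMs + " ms");
            }
        } finally {
            // Runs on success, timeout and interrupt alike.
            waiters.decrementAndGet();
        }
    }

    void release() {
        permits.release();
    }
}
```

A test in this spirit would exhaust the permits, assert that the next `acquire` throws `TimeoutException`, and then check that `waiters` is back to zero.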

 Test RoundRobinScheduler timeouts
 -

 Key: CASSANDRA-3096
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3096
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 1.0

 Attachments: 
 0001-Properly-throw-timeouts-decrement-the-count-of-waiters.txt


 CASSANDRA-3079 was very hasty, and introduced two bugs that would: 1) cause 
 the scheduler to busywait after a timeout, 2) never actually throw timeouts. 
 This calls for a test.


[jira] [Updated] (CASSANDRA-3096) Test RoundRobinScheduler timeouts


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-3096:


Attachment: (was: 
0001-Properly-throw-timeouts-decrement-the-count-of-waiters.txt)

 Test RoundRobinScheduler timeouts
 -

 Key: CASSANDRA-3096
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3096
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 1.0


 CASSANDRA-3079 was very hasty, and introduced two bugs that would: 1) cause 
 the scheduler to busywait after a timeout, 2) never actually throw timeouts. 
 This calls for a test.


[jira] [Updated] (CASSANDRA-3096) Test RoundRobinScheduler timeouts


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-3096:


Attachment: 0001-Properly-throw-timeouts-decrement-the-count-of-waiters.txt

 Test RoundRobinScheduler timeouts
 -

 Key: CASSANDRA-3096
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3096
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 1.0

 Attachments: 
 0001-Properly-throw-timeouts-decrement-the-count-of-waiters.txt


 CASSANDRA-3079 was very hasty, and introduced two bugs that would: 1) cause 
 the scheduler to busywait after a timeout, 2) never actually throw timeouts. 
 This calls for a test.


[jira] [Updated] (CASSANDRA-2630) CLI - 'describe column family' would be nice

2011-08-28 Thread satish babu krishnamoorthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

satish babu krishnamoorthy updated CASSANDRA-2630:
--

Attachment: cassandra-0.8.2-2630-2.txt

updated comments from pavel :)

 CLI - 'describe column family' would be nice
 

 Key: CASSANDRA-2630
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2630
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.8.4
Reporter: Jeremy Hanna
Assignee: satish babu krishnamoorthy
Priority: Minor
  Labels: cli, lhf
 Fix For: 1.0

 Attachments: cassandra-0.8.2-2630-1.txt, cassandra-0.8.2-2630-2.txt, 
 cassandra-0.8.2-2630.txt


 I end up verifying column families a lot and using 'describe keyspace 
 <keyspace>;' spits out a whole bunch of data since our keyspace has a lot of 
 metadata.  It would be really useful to have a 'describe column family;' 
 for a given column family in the currently authenticated keyspace.


[jira] [Commented] (CASSANDRA-2630) CLI - 'describe column family' would be nice

2011-08-28 Thread Pavel Yaskevich (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092569#comment-13092569
 ] 

Pavel Yaskevich commented on CASSANDRA-2630:


One last thing: can you please re-attach the v2 version and check the "Grant 
license to ASF for inclusion in ASF works" checkbox? Thanks!

 CLI - 'describe column family' would be nice
 

 Key: CASSANDRA-2630
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2630
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.8.4
Reporter: Jeremy Hanna
Assignee: satish babu krishnamoorthy
Priority: Minor
  Labels: cli, lhf
 Fix For: 1.0

 Attachments: cassandra-0.8.2-2630-1.txt, cassandra-0.8.2-2630-2.txt, 
 cassandra-0.8.2-2630.txt


 I end up verifying column families a lot and using 'describe keyspace 
 <keyspace>;' spits out a whole bunch of data since our keyspace has a lot of 
 metadata.  It would be really useful to have a 'describe column family;' 
 for a given column family in the currently authenticated keyspace.


[jira] [Issue Comment Edited] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092574#comment-13092574
 ] 

Yang Yang edited comment on CASSANDRA-2252 at 8/29/11 12:17 AM:


Hi Jonathan,

I checked the counters code; it currently uses HeapAllocator. What is the 
reason we don't yet use SlabAllocator for counters?

Thanks


Also, I put the idea of using 2 SlabAllocators (one for long-lived buffers, 
one for short-lived ones) in 
https://github.com/yangyangyyy/cassandra/commit/bc017835c64240e58c0c51b2d5f8793f3c7f3a76

https://github.com/yangyangyyy/cassandra/commit/8431ca1b9586086073e6b81d346a06e8172a97e7
Maybe it is useful.



  was (Author: yangyangyyy):
hi Jonathan:

I checked the counters code, they currently use HeapAllocator .  what is the 
reason we don't yet use SlabAllocator for Counters?

Thanks


also I put the idea to use 2 SlabAllocators (one for those buffers with long 
life, one for those short-lived) in 
https://github.com/yangyangyyy/cassandra/commit/bc017835c64240e58c0c51b2d5f8793f3c7f3a76
maybe it is useful


  
 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Updated] (CASSANDRA-1599) Add sort/order support for secondary indexing


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Nine updated CASSANDRA-1599:
-

Issue Type: Sub-task  (was: New Feature)
Parent: CASSANDRA-2915

 Add sort/order support for secondary indexing
 -

 Key: CASSANDRA-1599
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1599
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Reporter: Todd Nine
Assignee: Jonathan Ellis
   Original Estimate: 32h
  Remaining Estimate: 32h

 For a lot of users paging is a standard use case on many web applications.  
 It would be nice to allow paging as part of a Boolean Expression.
 Page - start index
      - end index
      - page timestamp
      - sort order
 When sorting, is it possible to sort both ASC and DESC? 
 


[jira] [Updated] (CASSANDRA-1598) Add Boolean Expression to secondary querying


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Nine updated CASSANDRA-1598:
-

Issue Type: Sub-task  (was: New Feature)
Parent: CASSANDRA-2915

 Add Boolean Expression to secondary querying
 

 Key: CASSANDRA-1598
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1598
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.7 beta 3
Reporter: Todd Nine

 Add boolean operators similar to Lucene style searches.  Currently there is 
 implicit support for the && operator.  It would be helpful to also add 
 support for ||/Union operators.  I envision that the client would be 
 required to construct the expression tree and pass it via the thrift 
 interface.
 BooleanExpression --> BooleanOrIndexExpression
                   --> BooleanOperator
                   --> BooleanOrIndexExpression
 I'd like to take a crack at this since it will greatly improve my Datanucleus 
 plugin
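
The expression tree described above might look roughly like the following on the client side. This is a sketch under stated assumptions: `Expr`, `IndexExpr`, `BoolOp` and `BoolExpr` are hypothetical names, index expressions are reduced to simple equality on string values, and nothing here reflects the actual Thrift interface.

```java
import java.util.Map;

// Hypothetical node type: either an index expression or a boolean combination.
interface Expr {
    boolean matches(Map<String, String> row);
}

// Leaf: equality test on one indexed column.
final class IndexExpr implements Expr {
    private final String column;
    private final String value;

    IndexExpr(String column, String value) {
        this.column = column;
        this.value = value;
    }

    public boolean matches(Map<String, String> row) {
        return value.equals(row.get(column));
    }
}

enum BoolOp { AND, OR }

// Inner node: left operand, operator, right operand.
final class BoolExpr implements Expr {
    private final Expr left;
    private final BoolOp op;
    private final Expr right;

    BoolExpr(Expr left, BoolOp op, Expr right) {
        this.left = left;
        this.op = op;
        this.right = right;
    }

    public boolean matches(Map<String, String> row) {
        return op == BoolOp.AND
            ? left.matches(row) && right.matches(row)
            : left.matches(row) || right.matches(row);
    }
}
```

Because each operand is itself an `Expr`, a client could nest `BoolExpr` nodes to any depth before handing the tree to the server.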


[jira] [Updated] (CASSANDRA-1598) Add Boolean Expression to secondary querying


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Nine updated CASSANDRA-1598:
-

Issue Type: New Feature  (was: Sub-task)
Parent: (was: CASSANDRA-2915)

 Add Boolean Expression to secondary querying
 

 Key: CASSANDRA-1598
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1598
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Affects Versions: 0.7 beta 3
Reporter: Todd Nine

 Add boolean operators similar to Lucene style searches.  Currently there is 
 implicit support for the && operator.  It would be helpful to also add 
 support for ||/Union operators.  I envision that the client would be 
 required to construct the expression tree and pass it via the thrift 
 interface.
 BooleanExpression --> BooleanOrIndexExpression
                   --> BooleanOperator
                   --> BooleanOrIndexExpression
 I'd like to take a crack at this since it will greatly improve my Datanucleus 
 plugin


[jira] [Commented] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092600#comment-13092600
 ] 

Yang Yang commented on CASSANDRA-2252:
--

if it's a memtable-related operation  aren't the CounterContext's finally 
inserted into the CounterColumns, hence the Memtable too ?

for example:

CounterMutation.computeShardMerger() ==> CounterColumn.computeOldShardMerger()
===> ByteBuffer contextManager.computeOldShardMerger {
  ..
   ContextState merger = ContextState.allocate(2, nbDelta, HeapAllocator.instance);
 
 return merger.context;
}

the merger.context is a ByteBuffer that is inserted into CounterColumn by 
CounterColumn.computeOldShardMerger()



Thanks
Yang


 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Issue Comment Edited] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092600#comment-13092600
 ] 

Yang Yang edited comment on CASSANDRA-2252 at 8/29/11 2:33 AM:
---

"if it's a memtable-related operation": CounterContext.allocate produces a 
ByteBuffer, some of which goes into CounterColumn and hence the Memtable, it seems.

For example:

CounterMutation.computeShardMerger() ==> CounterColumn.computeOldShardMerger()
===> ByteBuffer contextManager.computeOldShardMerger {
    ..
    ContextState merger = ContextState.allocate(2, nbDelta, HeapAllocator.instance);

    return merger.context;
}

the merger.context is a ByteBuffer that is inserted into CounterColumn by 
CounterColumn.computeOldShardMerger()



Thanks
Yang


  was (Author: yangyangyyy):
"if it's a memtable-related operation"  aren't the CounterContext's 
finally inserted into the CounterColumns, hence the Memtable too ?

for example:

CounterMutation.computeShardMerger() ==> CounterColumn.computeOldShardMerger()
===> ByteBuffer contextManager.computeOldShardMerger {
    ..
    ContextState merger = ContextState.allocate(2, nbDelta, HeapAllocator.instance);

    return merger.context;
}

the merger.context is a ByteBuffer that is inserted into CounterColumn by 
CounterColumn.computeOldShardMerger()



Thanks
Yang

  
 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Issue Comment Edited] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092600#comment-13092600
 ] 

Yang Yang edited comment on CASSANDRA-2252 at 8/29/11 2:33 AM:
---

"if it's a memtable-related operation", but CounterContext.allocate 
produces a ByteBuffer, some of which goes into CounterColumn and hence the 
Memtable, it seems.

For example:

CounterMutation.computeShardMerger() ==> CounterColumn.computeOldShardMerger()
===> ByteBuffer contextManager.computeOldShardMerger {
    ..
    ContextState merger = ContextState.allocate(2, nbDelta, HeapAllocator.instance);

    return merger.context;
}

the merger.context is a ByteBuffer that is inserted into CounterColumn by 
CounterColumn.computeOldShardMerger()



Thanks
Yang


  was (Author: yangyangyyy):
"if it's a memtable-related operation"  CounterContext.allocate 
produces a ByteBuffer , some of which goes into CounterColumn, hence Memtable, 
it seems.

for example:

CounterMutation.computeShardMerger() ==> CounterColumn.computeOldShardMerger()
===> ByteBuffer contextManager.computeOldShardMerger {
    ..
    ContextState merger = ContextState.allocate(2, nbDelta, HeapAllocator.instance);

    return merger.context;
}

the merger.context is a ByteBuffer that is inserted into CounterColumn by 
CounterColumn.computeOldShardMerger()



Thanks
Yang

  
 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Commented] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092602#comment-13092602
 ] 

Jonathan Ellis commented on CASSANDRA-2252:
---

Happy to look at a patch to fix that.  Please open a new ticket.

 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Commented] (CASSANDRA-2915) Lucene based Secondary Indexes

[
https://issues.apache.org/jira/browse/CASSANDRA-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092606#comment-13092606
]

Todd Nine commented on CASSANDRA-2915:
--

I don't necessarily think there is a 1 to 1 relationship between a column and a
Lucene document field. In our case we have the need to index fields in more
than one manner. For instance, we index users as straight strings (lowercased)
with email, first name and last name columns. However we also want to tokenize
the email, first and last name columns to allow our customer support people to
perform partial name matching. I think a 1 to N mapping is required for column
to document field to allow this sort of functionality.
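
As a rough illustration of the 1 to N idea, the sketch below fans one column value out into an exact-match field and several token fields. Everything here is an assumption made for the example: the IndexField type, the "_exact" and "_tokens" suffixes, and the simple split-on-punctuation tokenizer are hypothetical and do not use the Lucene or Cassandra APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical (field name, term) pair; not a Lucene class.
record IndexField(String name, String term) {}

class ColumnFieldMapper {
    // Maps one column value to N index fields.
    static List<IndexField> fieldsFor(String column, String value) {
        List<IndexField> fields = new ArrayList<>();
        String lower = value.toLowerCase(Locale.ROOT);
        // Field 1: the whole value, lowercased, for straight string matching.
        fields.add(new IndexField(column + "_exact", lower));
        // Remaining fields: one term per token, for partial matching.
        for (String token : lower.split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                fields.add(new IndexField(column + "_tokens", token));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        fieldsFor("email", "Jane.Doe@example.com").forEach(System.out::println);
    }
}
```

A support query for "doe" could then hit the token field while an exact lookup keeps using the lowercased whole value.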

As far as expiration on columns, is there a system event that we can hook into
to just force a document reindex when a column expires rather than add an
additional field that will need to be sorted from?

As per Jason's previous post, I think supporting ORDER BY, GROUP BY, COUNT,
LIKE etc are a must. Most users have become accustomed to this functionality
with RDBMS. If they cause potential performance problems, I think this should
be documented so that users have enough information to determine if they can
rely on the Lucene index or should build their own index directly.

Lastly, this is a huge feature for the hector-jpa plugin, what can I do to help?

Lucene based Secondary Indexes
--

Key: CASSANDRA-2915
URL: https://issues.apache.org/jira/browse/CASSANDRA-2915
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: T Jake Luciani
Assignee: Jason Rutherglen
Labels: secondary_index

Secondary indexes (of type KEYS) suffer from a number of limitations in their
current form:
- Multiple IndexClauses only work when there is a subset of rows under the
highest clause
- One new column family is created per index; this means 10 new CFs for 10
secondary indexes
This ticket will use the Lucene library to implement secondary indexes as one
index per CF, and utilize the Lucene query engine to handle multiple index
clauses. Also, by using Lucene we get a highly optimized file format.
There are a few parallels we can draw between Cassandra and Lucene.
Lucene indexes segments in memory then flushes them to disk, so we can sync
our memtable flushes to Lucene flushes. Lucene also has optimize(), which
correlates to our compaction process, so these can be synced as well.
We will also need to correlate column validators to Lucene tokenizers, so the
data can be stored properly. The big win is that once this is done we can
perform complex queries within a column, like wildcard searches.
The downside of this approach is that we will need to read before write, since
documents in Lucene are written as complete documents. For random workloads
with lots of indexed columns this means we need to read the document from
the index, update it and write it back.
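
The read-before-write cost can be sketched with an in-memory stand-in for the index. This is a hypothetical model, not Lucene code: DocumentIndexSketch and its methods are invented for illustration, and a HashMap plays the role of the index. The point is only that changing one column still reads and rewrites the whole document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a document index keyed by row key.
class DocumentIndexSketch {
    private final Map<String, Map<String, String>> docsByRowKey = new HashMap<>();
    int reads = 0; // counts the lookups forced by whole-document writes

    // Updating a single column: read the document, modify it, write it back.
    void updateColumn(String rowKey, String column, String value) {
        Map<String, String> existing = docsByRowKey.get(rowKey); // read
        reads++;
        Map<String, String> doc =
                existing == null ? new HashMap<>() : new HashMap<>(existing);
        doc.put(column, value);                                  // update
        docsByRowKey.put(rowKey, doc);                           // write back whole document
    }

    Map<String, String> get(String rowKey) {
        return docsByRowKey.get(rowKey);
    }

    public static void main(String[] args) {
        DocumentIndexSketch index = new DocumentIndexSketch();
        index.updateColumn("user1", "first", "Ann");
        index.updateColumn("user1", "last", "Lee");
        System.out.println(index.get("user1") + " after " + index.reads + " reads");
    }
}
```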


[jira] [Commented] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092608#comment-13092608
 ] 

Yang Yang commented on CASSANDRA-2252:
--

Cool. Actually, after some thought, I think we need to take more care in 
using SlabAllocator for counters.

I realized this when you said "only if it's a memtable-related operation". That 
is very true for temporary ByteBuffers, which are thrown away immediately and 
hence get reclaimed in the new-gen GC and never go into old gen.

For counters, the column values (which contain the CounterContext) change a 
lot. If we assume that the value of each counter is updated 1000 times during 
the lifetime of a memtable before it is flushed, then in a typical 2MB slab 
99.9% of the buffers it contains are going to be unreachable (and would 
otherwise have been GC'ed) before flushing. So when the remaining 0.1% of 
buffers is promoted, it occupies 2MB of space instead of its actual size, which 
would be more waste than the fragmentation that heap allocation might cause.

So in this case (or, more generally, in all cases where updates are frequent), 
using HeapAllocator may be better.
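
The arithmetic behind that estimate can be checked with a small sketch. It uses the 2MB slab and 1000 updates per counter from the comment, and it assumes equal-sized buffers where only the latest version of each value stays reachable. The class and method names are hypothetical.

```java
class SlabWasteSketch {
    // Bytes still reachable in one slab if only the newest of
    // `updatesPerCounter` versions of each value survives (assumed model).
    static long liveBytes(long slabBytes, int updatesPerCounter) {
        return slabBytes / updatesPerCounter;
    }

    public static void main(String[] args) {
        long slab = 2L * 1024 * 1024; // 2MB slab, figure from the comment
        int updates = 1000;           // updates per counter, figure from the comment
        long live = liveBytes(slab, updates);
        // The whole slab stays pinned for as long as any buffer in it is reachable.
        System.out.printf("live=%d bytes, pinned=%d bytes, overhead=%dx%n",
                live, slab, slab / live);
    }
}
```

Under these assumptions roughly 2KB of live data keeps a full 2MB slab alive, which is the 0.1% figure in the comment.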


Thanks
Yang

 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Updated] (CASSANDRA-2630) CLI - 'describe column family' would be nice

2011-08-28 Thread satish babu krishnamoorthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

satish babu krishnamoorthy updated CASSANDRA-2630:
--

Attachment: cassandra-0.8.2-2630-2.txt

Re-attach the v2 version and check the "Grant license to ASF for inclusion in 
ASF works" checkbox

 CLI - 'describe column family' would be nice
 

 Key: CASSANDRA-2630
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2630
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.8.4
Reporter: Jeremy Hanna
Assignee: satish babu krishnamoorthy
Priority: Minor
  Labels: cli, lhf
 Fix For: 1.0

 Attachments: cassandra-0.8.2-2630-1.txt, cassandra-0.8.2-2630-2.txt, 
 cassandra-0.8.2-2630-2.txt, cassandra-0.8.2-2630.txt


 I end up verifying column families a lot and using 'describe keyspace 
 <keyspace>;' spits out a whole bunch of data since our keyspace has a lot of 
 metadata.  It would be really useful to have a 'describe column family;' 
 for a given column family in the currently authenticated keyspace.


[jira] [Issue Comment Edited] (CASSANDRA-2915) Lucene based Secondary Indexes

[
https://issues.apache.org/jira/browse/CASSANDRA-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092606#comment-13092606
]

Todd Nine edited comment on CASSANDRA-2915 at 8/29/11 4:29 AM:
---

Has anyone looked at existing code in ElasticSearch to avoid some of the
pitfalls they have already experienced in building something similar?

http://www.elasticsearch.org/

Lastly, this is a huge feature for the hector-jpa plugin, what can I do to
help?

was (Author: tnine):

Lastly, this is a huge feature for the hector-jpa plugin, what can I do to help?

Lucene based Secondary Indexes
--


[jira] [Issue Comment Edited] (CASSANDRA-2915) Lucene based Secondary Indexes