[jira] [Created] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query

2011-03-29 Thread Tey Kar Shiang (JIRA)
getColumnFamily() return null, which is not checked in ColumnFamilyStore.java 
scan() method, causing Timeout Exception in query
---

 Key: CASSANDRA-2401
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.4
 Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
Reporter: Tey Kar Shiang


In ColumnFamilyStore.java, near line 1680, ColumnFamily data = 
getColumnFamily(new QueryFilter(dk, path, firstFilter)) returns null, causing 
a NullPointerException in satisfies(data, clause, primary) which is not 
caught. The callback times out and returns a TimeoutException to Hector.

The data is empty: as I traced it, the column count is 0 in 
removeDeletedCF(), which returns the null there. (I am new and still trying 
to understand the logic.) Instead of crashing on null, could we skip the data?
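
The bypass being asked for amounts to a null guard before satisfies() is called. A minimal, self-contained sketch of that shape (stand-in names only: getColumns and scan here are hypothetical simplifications of getColumnFamily and ColumnFamilyStore.scan, not the actual code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullGuardSketch {
    // Stand-in for getColumnFamily(): returns null when the row has no live
    // columns left (as removeDeletedCF() does when the column count is 0).
    static List<String> getColumns(String row) {
        return row.equals("emptyRow") ? null : Arrays.asList("userID", "fileType");
    }

    // Stand-in for the scan() loop: skip null rows instead of letting
    // satisfies() hit a NullPointerException.
    static List<String> scan(List<String> rows) {
        List<String> matches = new ArrayList<>();
        for (String row : rows) {
            List<String> data = getColumns(row);
            if (data == null) {
                continue; // the proposed guard
            }
            matches.add(row); // real code would test satisfies(data, ...) here
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(scan(Arrays.asList("file1", "emptyRow", "file2")));
    }
}
```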

About my test:
A stress-test program adds, modifies and deletes data in a keyspace. 30 
threads simulate concurrent users performing those actions, and a query of 
all rows runs periodically. The column family has rows (as files) and 
columns as indexes (e.g. userID, fileType).

There was no issue on the first day of testing; the test was then stopped for 
3 days. When I restarted it on the 4th day, 1 of the users failed to query 
its files (a timeout exception was received). Most of the users can still 
run the query fine.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2358) CLI doesn't handle inserting negative integers

2011-03-29 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2358:
---

Attachment: CASSANDRA-2358-trunk.patch

branch: trunk (latest commit e6c5a28da940a086d0e786f1ad0288c0b0efa27d) 

 CLI doesn't handle inserting negative integers
 --

 Key: CASSANDRA-2358
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2358
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 0.7.0
Reporter: Tyler Hobbs
Assignee: Pavel Yaskevich
Priority: Trivial
 Fix For: 0.7.5, 0.8

 Attachments: CASSANDRA-2358-trunk.patch, CASSANDRA-2358.patch

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 The CLI raises a syntax error when trying to insert negative integers:
 {noformat}
 [default@Keyspace1] set StandardInteger['key'][-12] = 'val';
 Syntax error at position 28: mismatched character '1' expecting '-'
 {noformat}
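
The mismatch suggests the grammar treats '-' as its own token rather than as part of an integer literal. A hedged sketch of the literal shape that would accept negative values (illustrative only; the real fix lives in the CLI's grammar, and SignedIntegerLexer is a made-up name):

```java
import java.util.regex.Pattern;

public class SignedIntegerLexer {
    // A signed integer literal: optional leading '-', then one or more digits.
    private static final Pattern INTEGER = Pattern.compile("-?\\d+");

    public static boolean isIntegerLiteral(String token) {
        return INTEGER.matcher(token).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIntegerLiteral("-12")); // accepted once '-' is part of the literal
        System.out.println(isIntegerLiteral("12"));
        System.out.println(isIntegerLiteral("-"));   // a bare '-' is still rejected
    }
}
```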



[jira] [Updated] (CASSANDRA-2358) CLI doesn't handle inserting negative integers

2011-03-29 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2358:
---

Fix Version/s: 0.8



[jira] [Commented] (CASSANDRA-2397) Improve or remove replicate-on-write setting

2011-03-29 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012459#comment-13012459
 ] 

Chris Burroughs commented on CASSANDRA-2397:


For what it's worth I *think* lazily replicated counters is what I want.

 Improve or remove replicate-on-write setting
 

 Key: CASSANDRA-2397
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2397
 Project: Cassandra
  Issue Type: Bug
Reporter: Stu Hood

 The replicate on write setting breaks assumptions in various places in the 
 codebase dealing with whether data will be replicated in a timely fashion. 
 It's worthwhile to discuss whether we should go all-the-way on 
 replicate-on-write, such that it is a fully supported feature, or whether we 
 should remove it entirely.
 On one hand, ROW could be considered just another replication tunable 
 like HH, RR and AES. On the other hand, a lazily replicating store is very 
 rarely what you actually want.
 Open issues related to ROW are linked, but additionally, we'd need to:
  * Make the setting have an effect for standard column families
  * Change the default for ROW to enabled and properly warn of the effects



[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query

2011-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012475#comment-13012475
 ] 

Jonathan Ellis commented on CASSANDRA-2401:
---

Are you querying for zero columns?



[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction

2011-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012496#comment-13012496
 ] 

Jonathan Ellis commented on CASSANDRA-1902:
---

why do we need normalmappedsegment as well as native?  could we get rid of 
normal?

 Migrate cached pages during compaction 
 ---

 Key: CASSANDRA-1902
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.1
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 0.7.5, 0.8

 Attachments: 
 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt, 
 1902-formatted.txt, 1902-per-column-migration-rebase2.txt, 
 1902-per-column-migration.txt, CASSANDRA-1902-v3.patch, 
 CASSANDRA-1902-v4.patch, CASSANDRA-1902-v5.patch

   Original Estimate: 32h
  Time Spent: 56h
  Remaining Estimate: 0h

 Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
 pre-compacted CF during the compaction process.  This is now important since 
 CASSANDRA-1470 caches effectively nothing.  
 For example an active CF being compacted hurts reads since nothing is cached 
 in the new SSTable. 
 The purpose of this ticket then is to make sure SOME data is cached from 
 active CFs. This can be done by monitoring which old SSTables are in the page 
 cache and caching active rows in the new SSTable.
 A simpler yet similar approach is described here: 
 http://insights.oetiker.ch/linux/fadvise/



[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction

2011-03-29 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012506#comment-13012506
 ] 

Pavel Yaskevich commented on CASSANDRA-1902:


NativeMappedSegment is used to give us better control over the memory-mapped 
region of the file: especially skipping the page cache (using 
madvise(MADV_DONTNEED)), unmapping the region natively via munmap, and doing 
page cache migration. NormalMappedSegment is used in cases when we don't have 
a page size (JNA is not installed, or we are not on Linux). I added those 
methods because it's not possible to clean the page cache for the region of 
an MBB; also, using NativeMappedSegment we don't need to duplicate buffers on 
slice calls.



[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction

2011-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012511#comment-13012511
 ] 

Jonathan Ellis commented on CASSANDRA-1902:
---

bq. NormalMappedSegment is used in cases when we don't have a page size (JNA is 
not installed or not on Linux)

Makes sense.

bq. using NativeMappedSegment we don't need to duplicate buffers on slice calls

Don't we still need to duplicate, in case we unmap the sstable we read from 
before we return the data to the requester?

Similarly, if we manually munmap, isn't there a race condition where we say 
give me the list of sstables and then while reading one gets compacted and 
unmapped?
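
The unmap race described here is the classic reason explicit munmap is usually guarded by reference counting: a reader takes a reference before touching the mapping, and the actual unmap happens only when the count drops to zero. A hedged sketch of that pattern (not Cassandra's actual code; names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedSegment {
    // Starts at 1: the reference held by the owning sstable list.
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean unmapped = false;

    // A reader takes a reference before touching the mapping; returns false
    // if the segment has already been released (the caller must retry on the
    // current sstable list instead).
    public boolean acquire() {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                return false; // already released and unmapped
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    public void release() {
        if (refs.decrementAndGet() == 0) {
            unmapped = true; // stand-in for the actual munmap call
        }
    }

    public boolean isUnmapped() {
        return unmapped;
    }
}
```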



[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction

2011-03-29 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012513#comment-13012513
 ] 

Pavel Yaskevich commented on CASSANDRA-1902:


Pointer.getByteArray gives us the same performance as the current version's 
zero-copy reads. I have tried Pointer.getByteBuffer - it's slower than 
getByteArray, and its result needs to be re-ordered to big-endian.



[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction

2011-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012517#comment-13012517
 ] 

Jonathan Ellis commented on CASSANDRA-1902:
---

Okay, so we *are* adding a copy by using getByteArray.  I guess that takes care 
of the race conditions but it baffles me that copy + wrap can be as fast as 
duplicate + set position.  (duplicate != copy.)

All the order(nativeOrder) call does is set byte order for things like the get* 
methods, which all look like this:

{code}
private long getLong(long a) {
    if (unaligned) {
        long x = unsafe.getLong(a);
        return (nativeByteOrder ? x : Bits.swap(x));
    }
    return Bits.getLong(a, bigEndian);
}
{code}

And here is where unaligned gets set:
{code}
static boolean unaligned() {
    if (unalignedKnown)
        return unaligned;
    PrivilegedAction pa
        = new sun.security.action.GetPropertyAction("os.arch");
    String arch = (String) AccessController.doPrivileged(pa);
    unaligned = arch.equals("i386") || arch.equals("x86")
        || arch.equals("x86_64") || arch.equals("amd64")
        || arch.equals("ppc"); // Mac OS X / PPC: see Radar #3253257
    unalignedKnown = true;
    return unaligned;
}
{code}

... so byte ordering is basically a no-op on any architecture we are likely to 
run on.
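
That claim is easy to check with a small standalone demo: order() only changes how the get* methods interpret the underlying bytes (ByteOrderDemo is an illustrative name, not anything in Cassandra):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderDemo {
    // Read the same 8 bytes as a long under big-endian order.
    static long readBigEndian(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.BIG_ENDIAN).getLong();
    }

    // Read the same 8 bytes as a long under little-endian order.
    static long readLittleEndian(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        byte[] raw = {0, 0, 0, 0, 0, 0, 0, 42}; // big-endian encoding of 42L
        System.out.println(readBigEndian(raw));    // 42
        System.out.println(readLittleEndian(raw)); // 42L with its bytes swapped
    }
}
```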



[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction

2011-03-29 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012520#comment-13012520
 ] 

Pavel Yaskevich commented on CASSANDRA-1902:


The story behind the ordering I mentioned is that Pointer.getByteBuffer 
always ensures native byte order, while MappedByteBuffer always has 
big-endian order, so we need to set the byte order of the 
Pointer.getByteBuffer result to big-endian. But that is not the issue here: 
in my tests, both a 2 GB RAM server hosted on Rackspace and high-memory and 
medium servers hosted on EC2 gave better performance using 
Pointer.getByteArray plus wrap than Pointer.getByteBuffer (and the 
getByteArray performance is almost identical to our current version).



[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal

2011-03-29 Thread Ed Anuff (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012531#comment-13012531
 ] 

Ed Anuff commented on CASSANDRA-2231:
-

Sylvain, in the JPA implementation, we're seeing that we'd like to have a 
little more flexibility with the trailing end-of-component, specifically, that 
it be able to have values of -1,0,1 rather than just 0,1.  The comparison logic 
would look like this:

{noformat}
byte b1 = bb1.get();
byte b2 = bb2.get();
if (b1 < 0) {
    if (b2 >= 0) {
        return -1;
    }
}

if (b1 > 0) {
    if (b2 <= 0) {
        return 1;
    }
}

if ((b1 == 0) && (b2 != 0)) {
    return -b2;
}
{noformat}
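
Made self-contained, the comparison above can be exercised directly (class and method names are illustrative, not Cassandra's API):

```java
public class EocComparison {
    // End-of-component markers take the values -1, 0 or 1; a component with
    // eoc -1 sorts before the same component with eoc 0, which sorts before
    // eoc 1. Equal markers (or both strictly negative/positive) compare equal.
    static int compareEoc(byte b1, byte b2) {
        if (b1 < 0 && b2 >= 0) {
            return -1;
        }
        if (b1 > 0 && b2 <= 0) {
            return 1;
        }
        if (b1 == 0 && b2 != 0) {
            return -b2;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(compareEoc((byte) -1, (byte) 0)); // -1
        System.out.println(compareEoc((byte) 1, (byte) -1)); // 1
        System.out.println(compareEoc((byte) 0, (byte) 1));  // -1
    }
}
```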



 Add CompositeType comparer to the comparers provided in 
 org.apache.cassandra.db.marshal
 ---

 Key: CASSANDRA-2231
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2231
 Project: Cassandra
  Issue Type: Improvement
  Components: Contrib
Affects Versions: 0.7.3
Reporter: Ed Anuff
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.5

 Attachments: CompositeType-and-DynamicCompositeType.patch, 
 edanuff-CassandraCompositeType-1e253c4.zip


 CompositeType is a custom comparer that makes it possible to create 
 comparable composite values out of the basic types that Cassandra currently 
 supports, such as Long, UUID, etc.  This is very useful in both the creation 
 of custom inverted indexes using columns in a skinny row, where each column 
 name is a composite value, and also when using Cassandra's built-in secondary 
 index support, where it can be used to encode the values in the columns that 
 Cassandra indexes.  One scenario for the usage of these is documented here: 
 http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html.  Source for 
 contribution is attached and has been previously maintained on github here: 
 https://github.com/edanuff/CassandraCompositeType



[jira] [Created] (CASSANDRA-2402) Python dbapi driver for CQL

2011-03-29 Thread Jon Hermes (JIRA)
Python dbapi driver for CQL
---

 Key: CASSANDRA-2402
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2402
 Project: Cassandra
  Issue Type: Task
Reporter: Jon Hermes
Assignee: Jon Hermes
 Fix For: 0.8


Create a driver that emulates python's dbapi.



[jira] [Commented] (CASSANDRA-2400) Resolve Maven Ant Tasks from local Maven repository if present

2011-03-29 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012536#comment-13012536
 ] 

Jeremy Hanna commented on CASSANDRA-2400:
-

Tried to clarify a couple of things with Stephen in IRC.  This would probably 
speed up the build a little since it checks the local repo first.  It also 
doesn't change the build process at all - it just checks the cache first.  So I 
would be +1 on this as it is generally useful as it speeds things up for others 
(and helps for testing).

 Resolve Maven Ant Tasks from local Maven repository if present
 --

 Key: CASSANDRA-2400
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2400
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging
Affects Versions: 0.7.4
Reporter: Stephen Connolly
Priority: Minor
 Fix For: 0.7.5

 Attachments: MCASSANDRA-tweaks.patch


 To aid with testing using newer versions of Maven Ant Tasks, it can be helpful 
 to copy the version from the local repository, if present, rather than always 
 downloading it.



[jira] [Issue Comment Edited] (CASSANDRA-2400) Resolve Maven Ant Tasks from local Maven repository if present

2011-03-29 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012536#comment-13012536
 ] 

Jeremy Hanna edited comment on CASSANDRA-2400 at 3/29/11 4:12 PM:
--

Tried to clarify a couple of things with Stephen in IRC.  This would probably 
speed up the build a little since it checks the local repo first.  It also 
doesn't change the build process at all - it just checks the cache first.  So I 
would be +1 on this as it is generally useful in speeding up the build a bit.  
It also helps for testing.

  was (Author: jeromatron):
Tried to clarify a couple of things with Stephen in IRC.  This would 
probably speed up the build a little since it checks the local repo first.  It 
also doesn't change the build process at all - it just checks the cache first.  
So I would be +1 on this as it is generally useful as it speeds things up for 
others (and helps for testing).
  


[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query

2011-03-29 Thread Tey Kar Shiang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012542#comment-13012542
 ] 

Tey Kar Shiang commented on CASSANDRA-2401:
---

Hi, nope.

It is a query for 4 columns.

I checked that only 1 row has this problem (no columns found) out of the 948 
records returned; I skipped the row with zero columns.

In my stress test, all rows have 4 columns; i.e. the row is the file, and the 
4 columns (indexes) are things like its version, modified time, type, etc. I 
added all the columns when adding each file. The addition should be working, 
since there was no such exception on day 1, and I started and stopped the 
stress tests until each user had around 1500 files. The row with 0 columns 
was only found on the 4th day, after I continued running the test.

I will keep picking up the Cassandra logic, as I have little understanding of 
how data is loaded, stored and deleted. Any suggestion / guide on how I 
should go on with my study is greatly appreciated. Thank you!

Btw, for this test I have not yet gone to 2 or 3 nodes. It is only a 
single-node Cassandra running on my localhost.




[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query

2011-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012547#comment-13012547
 ] 

Jonathan Ellis commented on CASSANDRA-2401:
---

Is there any data from earlier than 0.7.4?



svn commit: r1086640 - /cassandra/trunk/CHANGES.txt

2011-03-29 Thread jbellis
Author: jbellis
Date: Tue Mar 29 16:47:19 2011
New Revision: 1086640

URL: http://svn.apache.org/viewvc?rev=1086640&view=rev
Log:
add 1669 and 924 to CHANGES

Modified:
cassandra/trunk/CHANGES.txt

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1086640&r1=1086639&r2=1086640&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Tue Mar 29 16:47:19 2011
@@ -1,4 +1,5 @@
 0.8-dev
+ * remove Avro RPC support (CASSANDRA-926)
  * avoid double RowMutation serialization on write path (CASSANDRA-1800)
  * adds support for columns that act as incr/decr counters 
(CASSANDRA-1072, 1937, 1944, 1936, 2101, 2093, 2288, 2105)
@@ -11,6 +12,7 @@
  * Fix for Cli to support updating replicate_on_write (CASSANDRA-2236)
  * JDBC driver for CQL (CASSANDRA-2124, 2302)
  * atomic switch of memtables and sstables (CASSANDRA-2284)
+ * add pluggable SeedProvider (CASSANDRA-1669)
 
 
 0.7.5




svn commit: r1086696 - /cassandra/branches/cassandra-0.7/CHANGES.txt

2011-03-29 Thread jbellis
Author: jbellis
Date: Tue Mar 29 19:30:55 2011
New Revision: 1086696

URL: http://svn.apache.org/viewvc?rev=1086696&view=rev
Log:
update CHANGES

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1086696&r1=1086695&r2=1086696&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Tue Mar 29 19:30:55 2011
@@ -63,8 +63,7 @@
  * fixes for cache save/load (CASSANDRA-2172, -2174)
  * Handle whole-row deletions in CFOutputFormat (CASSANDRA-2014)
  * Make memtable_flush_writers flush in parallel (CASSANDRA-2178)
- * make key cache preheating default to false; enable with
-   -Dcompaction_preheat_key_cache=true (CASSANDRA-2175)
+ * Add compaction_preheat_key_cache option (CASSANDRA-2175)
  * refactor stress.py to have only one copy of the format string 
used for creating row keys (CASSANDRA-2108)
  * validate index names for \w+ (CASSANDRA-2196)




[jira] [Updated] (CASSANDRA-2281) keep a count of errors

2011-03-29 Thread Ryan King (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan King updated CASSANDRA-2281:
-

Attachment: patch

update to trunk

 keep a count of errors
 --

 Key: CASSANDRA-2281
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2281
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Ryan King
Assignee: Ryan King
Priority: Minor
 Fix For: 0.7.5

 Attachments: patch, textmate stdin Vrj9Xa.txt


 I have a patch that keeps a counter (exposed via JMX) of errors. This is quite 
 useful for operators to keep track of the quality of cassandra without having 
 to tail and parse logs across a cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2281) keep a count of errors

2011-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012661#comment-13012661
 ] 

Jonathan Ellis commented on CASSANDRA-2281:
---

bq. Also you could add a log4j socket appender for error level that would 
collect the actual errors for operators to be notified of. Another option would 
be to write a log4j FileAppender that also has a JMX bean.

Well, you can specify multiple appenders, so all you really need is to stack 
the JMX one in w/ the existing RollingFileAppender.

Looks like JBoss had the same idea: 
http://community.jboss.org/wiki/JMXNotificationAppender
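
The stacked-appender idea can be sketched like this (transposed from log4j to JDK java.util.logging so the example is self-contained; the actual CASSANDRA-2281 patch targets log4j and exposes the count over JMX):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hedged sketch: a counting handler attached *alongside* the existing
// file/console handlers, so errors are counted without parsing logs.
public class ErrorCountingHandler extends Handler
{
    private final AtomicLong errors = new AtomicLong();

    @Override
    public void publish(LogRecord record)
    {
        // count only error-level records; everything else passes through
        if (record.getLevel().intValue() >= Level.SEVERE.intValue())
            errors.incrementAndGet();
    }

    @Override public void flush() {}
    @Override public void close() {}

    // in the real patch this value would be read through a JMX MBean
    public long errorCount() { return errors.get(); }

    public static void main(String[] args)
    {
        Logger log = Logger.getLogger("example");
        ErrorCountingHandler counter = new ErrorCountingHandler();
        log.addHandler(counter); // stacked next to whatever handlers exist
        log.severe("boom");
        log.info("fine");
        log.severe("boom again");
        System.out.println(counter.errorCount()); // prints 2
    }
}
```

The design point is the same one made above: because multiple appenders/handlers can be registered on one logger, the counter is purely additive and the existing RollingFileAppender keeps working unchanged.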

 keep a count of errors
 --

 Key: CASSANDRA-2281
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2281
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Ryan King
Assignee: Ryan King
Priority: Minor
 Fix For: 0.7.5

 Attachments: patch, textmate stdin Vrj9Xa.txt


 I have a patch that keeps a counter (exposed via JMX) of errors. This is quite 
 useful for operators to keep track of the quality of cassandra without having 
 to tail and parse logs across a cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2311) type validated row keys

2011-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012668#comment-13012668
 ] 

Jonathan Ellis commented on CASSANDRA-2311:
---

I'm getting this running nosetests -x test/system/test_thrift_server.py w/ 
Eric's patch applied:

{noformat}
java.lang.NullPointerException
at 
org.apache.cassandra.thrift.ThriftValidation.validateKeyType(ThriftValidation.java:68)
at 
org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:391)
at 
org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:418)
{noformat}

Also: shouldn't we move validateKeyType into validateKey?

 type validated row keys
 ---

 Key: CASSANDRA-2311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2311
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Eric Evans
Assignee: Jon Hermes
  Labels: cql
 Fix For: 0.8

 Attachments: 2311.txt, 
 v1-0001-CASSANDRA-2311-missed-CFM-conversion.txt


 The idea here is to allow the assignment of a column-family-wide key type 
 used to perform validation, (similar to how default_validation_class does for 
 column values).
 This should be as straightforward as extending the column family schema to 
 include the new attribute, and updating {{ThriftValidation.validateKey}} to 
 validate the key ({{AbstractType.validate}}).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2321) disallow to querying a counter CF with non-counter operation

2011-03-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2321:
--

Summary: disallow to querying a counter CF with non-counter operation  
(was: Counter column values shows in hex values. Need to show it in string 
value.)

 disallow to querying a counter CF with non-counter operation
 

 Key: CASSANDRA-2321
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2321
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8
 Environment: Linux
Reporter: Mubarak Seyed
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.8

 Attachments: 0001-Don-t-allow-normal-query-on-counter-CF.patch


 CounterColumnType.getString() returns hexString.
 {code}
 public String getString(ByteBuffer bytes)
 { 
return ByteBufferUtil.bytesToHex(bytes);
 }
 {code}
 and python stress.py reader returns
 [ColumnOrSuperColumn(column=None, super_column=SuperColumn(name='19', 
 columns=[Column(timestamp=1299984960277, name='56', 
 value='\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00,',
  ttl=None), Column(timestamp=1299985019923, name='57', 
 value='\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00;\x00\x00\x00\x00\x00\x00\x08\xfd',
  ttl=None))]

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


svn commit: r1086729 - in /cassandra/trunk: src/java/org/apache/cassandra/config/ src/java/org/apache/cassandra/cql/ src/java/org/apache/cassandra/thrift/ test/system/ test/unit/org/apache/cassandra/t

2011-03-29 Thread jbellis
Author: jbellis
Date: Tue Mar 29 20:39:37 2011
New Revision: 1086729

URL: http://svn.apache.org/viewvc?rev=1086729&view=rev
Log:
disallow querying a counter CF with non-counter operation
patch by slebresne; reviewed by jbellis for CASSANDRA-2321

Modified:
cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java
cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java
cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java
cassandra/trunk/test/system/test_thrift_server.py

cassandra/trunk/test/unit/org/apache/cassandra/thrift/ThriftValidationTest.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java?rev=1086729&r1=1086728&r2=1086729&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java Tue 
Mar 29 20:39:37 2011
@@ -468,6 +468,11 @@ public final class CFMetaData
 {
 return Collections.unmodifiableMap(column_metadata);
 }
+
+public AbstractType getComparatorFor(ByteBuffer superColumnName)
+{
+return superColumnName == null ? comparator : subcolumnComparator;
+}
 
 public boolean equals(Object obj) 
 {

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java?rev=1086729&r1=1086728&r2=1086729&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java Tue 
Mar 29 20:39:37 2011
@@ -236,7 +236,7 @@ public class QueryProcessor
 // FIXME: keys as ascii is not a Real Solution
 ByteBuffer key = update.getKey().getByteBuffer(AsciiType.instance);
 validateKey(key);
-validateColumnFamily(keyspace, cfname);
+validateColumnFamily(keyspace, update.getColumnFamily(), false);
 validateKeyType(key, keyspace, cfname);
 AbstractType<?> comparator = update.getComparator(keyspace);
 
@@ -460,7 +460,7 @@ public class QueryProcessor
 case SELECT:
 SelectStatement select = (SelectStatement)statement.statement;
 clientState.hasColumnFamilyAccess(select.getColumnFamily(), 
Permission.READ);
-validateColumnFamily(keyspace, select.getColumnFamily());
+validateColumnFamily(keyspace, select.getColumnFamily(), 
false);
 validateSelect(keyspace, select);
 
 List<org.apache.cassandra.db.Row> rows = null;

Modified: 
cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java?rev=1086729&r1=1086728&r2=1086729&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java 
Tue Mar 29 20:39:37 2011
@@ -241,7 +241,7 @@ public class CassandraServer implements 
 logger.debug("get_slice");
 
 state().hasColumnFamilyAccess(column_parent.column_family, 
Permission.READ);
-return multigetSliceInternal(state().getKeyspace(), 
Collections.singletonList(key), column_parent, predicate, 
consistency_level).get(key);
+return multigetSliceInternal(state().getKeyspace(), 
Collections.singletonList(key), column_parent, predicate, consistency_level, 
false).get(key);
 }
 
 public Map<ByteBuffer, List<ColumnOrSuperColumn>> 
multiget_slice(List<ByteBuffer> keys, ColumnParent column_parent, 
SlicePredicate predicate, ConsistencyLevel consistency_level)
@@ -250,14 +250,15 @@ public class CassandraServer implements 
 logger.debug("multiget_slice");
 
 state().hasColumnFamilyAccess(column_parent.column_family, 
Permission.READ);
-return multigetSliceInternal(state().getKeyspace(), keys, 
column_parent, predicate, consistency_level);
+return multigetSliceInternal(state().getKeyspace(), keys, 
column_parent, predicate, consistency_level, false);
 }
 
-private Map<ByteBuffer, List<ColumnOrSuperColumn>> 
multigetSliceInternal(String keyspace, List<ByteBuffer> keys, ColumnParent 
column_parent, SlicePredicate predicate, ConsistencyLevel consistency_level)
+private Map<ByteBuffer, List<ColumnOrSuperColumn>> 
multigetSliceInternal(String keyspace, List<ByteBuffer> 

[jira] [Commented] (CASSANDRA-2311) type validated row keys

2011-03-29 Thread Jon Hermes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012684#comment-13012684
 ] 

Jon Hermes commented on CASSANDRA-2311:
---

I'm getting different test failures right now (the system tests are in a state 
of flux), but the NPE definitely means the CFMetaData didn't get loaded 
correctly or we're asking for it incorrectly.

As for moving vKT into vK, they have different args. 
Assume key K validates correctly. K may validate correctly in CF Foo (because 
the keyValidator is a no-op BytesType), but fail to validate in CF Bar (because 
the keyValidator is something restrictive such as UTF8, and K is random bytes).
Ideally we'd like to just ask for the CFMD.keyValidator in every vK, but we 
don't know that the CF they're asking for is valid either. We happen to have a 
nice validateCF already, and it's a waste to duplicate work.

That's why my original function was named validateKeyInCF: first you validate 
the key on its own (it's nonzero bytes, etc.), then you validate the CF (it's a 
valid CF in the system), and lastly you validate that the key IN that CF is 
altogether valid.
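
That three-step ordering can be sketched as follows (method names, the schema Map, and the "UTF8Type" check are illustrative stand-ins, not the actual ThriftValidation signatures):

```java
import java.util.Map;

// Illustrative sketch of key validation in three steps; names and the
// "UTF8Type" check are stand-ins, not the real ThriftValidation code.
public class KeyValidationSketch
{
    public static class InvalidRequestException extends Exception
    {
        InvalidRequestException(String why) { super(why); }
    }

    // Step 1: the key is valid on its own (nonzero bytes, etc.).
    public static void validateKey(byte[] key) throws InvalidRequestException
    {
        if (key == null || key.length == 0)
            throw new InvalidRequestException("Key may not be empty");
    }

    // Step 2: the CF exists; returns its key validator ("BytesType" = no-op).
    public static String validateColumnFamily(Map<String, String> schema, String cf)
            throws InvalidRequestException
    {
        String validator = schema.get(cf);
        if (validator == null)
            throw new InvalidRequestException("unconfigured columnfamily " + cf);
        return validator;
    }

    // Step 3: the key is valid *in* that CF, per the CF's key validator.
    public static void validateKeyInCF(Map<String, String> schema, String cf, byte[] key)
            throws InvalidRequestException
    {
        validateKey(key);
        String validator = validateColumnFamily(schema, cf);
        if (validator.equals("UTF8Type"))
        {
            // crude stand-in for AbstractType.validate(): reject high bytes
            for (byte b : key)
                if ((b & 0x80) != 0)
                    throw new InvalidRequestException("key not valid for " + cf);
        }
        // "BytesType" accepts any key
    }

    public static void main(String[] args) throws Exception
    {
        Map<String, String> schema = Map.of("Foo", "BytesType", "Bar", "UTF8Type");
        byte[] randomBytes = { (byte) 0xFF, 0x01 };
        validateKeyInCF(schema, "Foo", randomBytes); // passes: no-op validator
        try
        {
            validateKeyInCF(schema, "Bar", randomBytes); // rejected
        }
        catch (InvalidRequestException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```

The same random-bytes key passes in CF Foo but fails in CF Bar, matching the example in the comment above.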

 type validated row keys
 ---

 Key: CASSANDRA-2311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2311
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Eric Evans
Assignee: Jon Hermes
  Labels: cql
 Fix For: 0.8

 Attachments: 2311.txt, 
 v1-0001-CASSANDRA-2311-missed-CFM-conversion.txt


 The idea here is to allow the assignment of a column-family-wide key type 
 used to perform validation, (similar to how default_validation_class does for 
 column values).
 This should be as straightforward as extending the column family schema to 
 include the new attribute, and updating {{ThriftValidation.validateKey}} to 
 validate the key ({{AbstractType.validate}}).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-1263) Push replication factor down to the replication strategy

2011-03-29 Thread Jon Hermes (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Hermes updated CASSANDRA-1263:
--

Attachment: 1263-3.txt

rebased.
NOTE: system tests are currently failing. If the system tests in trunk run 
fine and these continue to fail, please re-open.

 Push replication factor down to the replication strategy
 

 Key: CASSANDRA-1263
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1263
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Jeremy Hanna
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.8

 Attachments: 1263-2.txt, 1263-3.txt, 1263-incomplete.txt, 1263.txt

   Original Estimate: 8h
  Remaining Estimate: 8h

 Currently the replication factor is in the keyspace metadata.  As we've added 
 the datacenter shard strategy, the replication factor becomes more computed 
 by the replication strategy.  It seems reasonable to therefore push the 
 replication factor for the keyspace down to the replication strategy so that 
 it can be handled in one place.
 This adds on the work being done in CASSANDRA-1066 since that ticket will 
 make the replication strategy a member variable of keyspace metadata instead 
 of just a quasi singleton giving the replication strategy state for each 
 keyspace.  That makes it able to have the replication factor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2311) type validated row keys

2011-03-29 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012709#comment-13012709
 ] 

Eric Evans commented on CASSANDRA-2311:
---

bq. Also: shouldn't we move validateKeyType into validateKey?

FWIW, this is not used in CQL (because the ByteBuffer is created by 
{{AT.fromString}}, there is no need to invoke {{validate()}}).  So 
{{validateKeyType}}'s use is limited to those two invocations in 
{{CassandraServer}}.  

 type validated row keys
 ---

 Key: CASSANDRA-2311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2311
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Eric Evans
Assignee: Jon Hermes
  Labels: cql
 Fix For: 0.8

 Attachments: 2311.txt, 
 v1-0001-CASSANDRA-2311-missed-CFM-conversion.txt


 The idea here is to allow the assignment of a column-family-wide key type 
 used to perform validation, (similar to how default_validation_class does for 
 column values).
 This should be as straightforward as extending the column family schema to 
 include the new attribute, and updating {{ThriftValidation.validateKey}} to 
 validate the key ({{AbstractType.validate}}).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


svn commit: r1086755 - in /cassandra/trunk/src/java/org/apache/cassandra: config/CFMetaData.java cql/QueryProcessor.java thrift/CassandraServer.java thrift/ThriftValidation.java

2011-03-29 Thread jbellis
Author: jbellis
Date: Tue Mar 29 21:26:48 2011
New Revision: 1086755

URL: http://svn.apache.org/viewvc?rev=1086755&view=rev
Log:
merge validateKey/validateKeyType, add CF validation to cql, add comparator to 
cql name validation.  fixes test NPE.
patch by jbellis

Modified:
cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java
cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java
cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java?rev=1086755&r1=1086754&r2=1086755&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java Tue 
Mar 29 21:26:48 2011
@@ -730,6 +730,7 @@ public final class CFMetaData
 def.memtable_throughput_in_mb = cfm.memtableThroughputInMb;
 def.memtable_operations_in_millions = cfm.memtableOperationsInMillions;
 def.merge_shards_chance = cfm.mergeShardsChance;
+def.key_validation_class = cfm.keyValidator.getClass().getName();
 List<org.apache.cassandra.db.migration.avro.ColumnDef> column_meta = 
new 
ArrayList<org.apache.cassandra.db.migration.avro.ColumnDef>(cfm.column_metadata.size());
 for (ColumnDefinition cd : cfm.column_metadata.values())
 {

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java?rev=1086755&r1=1086754&r2=1086755&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java Tue 
Mar 29 21:26:48 2011
@@ -69,7 +69,7 @@ import com.google.common.collect.Maps;
 
 import static org.apache.cassandra.thrift.ThriftValidation.validateKey;
 import static 
org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily;
-import static org.apache.cassandra.thrift.ThriftValidation.validateKeyType;
+import static org.apache.cassandra.thrift.ThriftValidation.validateColumnNames;
 
 public class QueryProcessor
 {
@@ -86,8 +86,9 @@ public class QueryProcessor
 assert select.getKeys().size() == 1;
 
 ByteBuffer key = 
select.getKeys().get(0).getByteBuffer(AsciiType.instance);
-validateKey(key);
-
+CFMetaData metadata = validateColumnFamily(keyspace, 
select.getColumnFamily(), false);
+validateKey(metadata, key);
+
 // ...of a list of column names
 if (!select.isColumnRange())
 {
@@ -95,7 +96,7 @@ public class QueryProcessor
 for (Term column : select.getColumnNames())
 columnNames.add(column.getByteBuffer(comparator));
 
-validateColumnNames(keyspace, select.getColumnFamily(), 
columnNames);
+validateColumnNames(metadata, null, columnNames);
 commands.add(new SliceByNamesReadCommand(keyspace, key, queryPath, 
columnNames));
 }
 // ...a range (slice) of column names
@@ -104,7 +105,7 @@ public class QueryProcessor
 ByteBuffer start = 
select.getColumnStart().getByteBuffer(comparator);
 ByteBuffer finish = 
select.getColumnFinish().getByteBuffer(comparator);
 
-validateSliceRange(keyspace, select.getColumnFamily(), start, 
finish, select.isColumnsReversed());
+validateSliceRange(metadata, start, finish, 
select.isColumnsReversed());
 commands.add(new SliceFromReadCommand(keyspace,
   key,
   queryPath,
@@ -140,10 +141,11 @@ public class QueryProcessor
 IPartitioner<?> p = StorageService.getPartitioner();
 AbstractBounds bounds = new Bounds(p.getToken(startKey), 
p.getToken(finishKey));
 
-AbstractType<?> comparator = select.getComparator(keyspace);
+CFMetaData metadata = validateColumnFamily(keyspace, 
select.getColumnFamily(), false);
+AbstractType<?> comparator = metadata.getComparatorFor(null);
 // XXX: Our use of Thrift structs internally makes me Sad. :(
 SlicePredicate thriftSlicePredicate = slicePredicateFromSelect(select, 
comparator);
-validateSlicePredicate(keyspace, select.getColumnFamily(), 
thriftSlicePredicate);
+validateSlicePredicate(metadata, thriftSlicePredicate);
 
 try
 {
@@ -174,10 +176,11 @@ public class QueryProcessor
 private static List<org.apache.cassandra.db.Row> 

buildbot failure in ASF Buildbot on cassandra-trunk

2011-03-29 Thread buildbot
The Buildbot has detected a new failure on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1193

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1086755
Blamelist: jbellis

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



svn commit: r1086759 - /cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java

2011-03-29 Thread jbellis
Author: jbellis
Date: Tue Mar 29 21:37:14 2011
New Revision: 1086759

URL: http://svn.apache.org/viewvc?rev=1086759&view=rev
Log:
avoid unnecessary type validation in cql
patch by jbellis

Modified:
cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java?rev=1086759&r1=1086758&r2=1086759&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java Tue 
Mar 29 21:37:14 2011
@@ -63,13 +63,12 @@ import org.apache.cassandra.service.Stor
 import org.apache.cassandra.thrift.Column;
 import org.apache.cassandra.thrift.*;
 import org.apache.cassandra.utils.ByteBufferUtil;
+import org.apache.cassandra.utils.FBUtilities;
 
 import com.google.common.base.Predicates;
 import com.google.common.collect.Maps;
 
-import static org.apache.cassandra.thrift.ThriftValidation.validateKey;
 import static 
org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily;
-import static org.apache.cassandra.thrift.ThriftValidation.validateColumnNames;
 
 public class QueryProcessor
 {
@@ -87,7 +86,7 @@ public class QueryProcessor
 
 ByteBuffer key = 
select.getKeys().get(0).getByteBuffer(AsciiType.instance);
 CFMetaData metadata = validateColumnFamily(keyspace, 
select.getColumnFamily(), false);
-validateKey(metadata, key);
+validateKey(key);
 
 // ...of a list of column names
 if (!select.isColumnRange())
@@ -96,7 +95,7 @@ public class QueryProcessor
 for (Term column : select.getColumnNames())
 columnNames.add(column.getByteBuffer(comparator));
 
-validateColumnNames(metadata, null, columnNames);
+validateColumnNames(columnNames);
 commands.add(new SliceByNamesReadCommand(keyspace, key, queryPath, 
columnNames));
 }
 // ...a range (slice) of column names
@@ -238,7 +237,7 @@ public class QueryProcessor
 
 // FIXME: keys as ascii is not a Real Solution
 ByteBuffer key = update.getKey().getByteBuffer(AsciiType.instance);
-validateKey(metadata, key);
+validateKey(key);
 AbstractType<?> comparator = update.getComparator(keyspace);
 
 RowMutation rm = new RowMutation(keyspace, key);
@@ -367,16 +366,45 @@ public class QueryProcessor
 }
 }
 
-private static void validateColumnName(CFMetaData metadata, ByteBuffer 
column)
+private static void validateKey(ByteBuffer key) throws 
InvalidRequestException
+{
+if (key == null || key.remaining() == 0)
+{
+throw new InvalidRequestException("Key may not be empty");
+}
+
+// check that key can be handled by FBUtilities.writeShortByteArray
+if (key.remaining() > FBUtilities.MAX_UNSIGNED_SHORT)
+{
+throw new InvalidRequestException("Key length of " + 
key.remaining() +
+   " is longer than maximum of " + 
FBUtilities.MAX_UNSIGNED_SHORT);
+}
+}
+
+private static void validateColumnNames(Iterable<ByteBuffer> columns)
+throws InvalidRequestException
+{
+for (ByteBuffer name : columns)
+{
+if (name.remaining() > IColumn.MAX_NAME_LENGTH)
+throw new InvalidRequestException(String.format("column name 
is too long (%s > %s)",
+
name.remaining(),
+
IColumn.MAX_NAME_LENGTH));
+if (name.remaining() == 0)
+throw new InvalidRequestException("zero-length column name");
+}
+}
+
+private static void validateColumnName(ByteBuffer column)
 throws InvalidRequestException
 {
-validateColumnNames(metadata, null, Arrays.asList(column));
+validateColumnNames(Arrays.asList(column));
 }
 
 private static void validateColumn(CFMetaData metadata, ByteBuffer name, 
ByteBuffer value)
 throws InvalidRequestException
 {
-validateColumnName(metadata, name);
+validateColumnName(name);
 AbstractType? validator = metadata.getValueValidator(name);
 
 try
@@ -398,7 +426,7 @@ public class QueryProcessor
 if (predicate.slice_range != null)
 validateSliceRange(metadata, predicate.slice_range);
 else
-validateColumnNames(metadata, null, predicate.column_names);
+validateColumnNames(predicate.column_names);
 }
 
 private static void validateSliceRange(CFMetaData metadata, SliceRange 
range)
@@ -578,7 +606,7 @@ public 

[jira] [Resolved] (CASSANDRA-2311) type validated row keys

2011-03-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2311.
---

Resolution: Fixed
  Reviewer: urandom

Committed Eric's fix and some other Thrift fixes.  Merged TV type validation 
into validateKey; added QP.validateKey that only checks length.

 type validated row keys
 ---

 Key: CASSANDRA-2311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2311
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Eric Evans
Assignee: Jon Hermes
  Labels: cql
 Fix For: 0.8

 Attachments: 2311.txt, 
 v1-0001-CASSANDRA-2311-missed-CFM-conversion.txt


 The idea here is to allow the assignment of a column-family-wide key type 
 used to perform validation, (similar to how default_validation_class does for 
 column values).
 This should be as straightforward as extending the column family schema to 
 include the new attribute, and updating {{ThriftValidation.validateKey}} to 
 validate the key ({{AbstractType.validate}}).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


buildbot success in ASF Buildbot on cassandra-trunk

2011-03-29 Thread buildbot
The Buildbot has detected a restored build on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1194

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1086759
Blamelist: jbellis

Build succeeded!

sincerely,
 -The Buildbot



[jira] [Commented] (CASSANDRA-1263) Push replication factor down to the replication strategy

2011-03-29 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012753#comment-13012753
 ] 

Jeremy Hanna commented on CASSANDRA-1263:
-

Overall this looks like a clean way of pushing the RF down to the strategy.  A 
few minor points:

In KSMetaData it has the following:
{code}
StringBuilder sb = new StringBuilder();
sb.append(name)
  .append("rep factor:")
  .append("rep strategy:")
  .append(strategyClass.getSimpleName())
  .append("{")
  .append(StringUtils.join(cfMetaData.values(), ", "))
  .append("}");
return sb.toString();
{code}
Shouldn't the "rep factor" string be removed, along with the variable it used to output?

The end of SimpleStrategy - the curly brace should be on its own line.

I see an instance of replication_factor in Cli.g - not sure if that matters.  
Seems that's just for typing generally.

Looks like CQL still has some references to the way things were with RF - that 
could be a separate issue I would think.

 Push replication factor down to the replication strategy
 

 Key: CASSANDRA-1263
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1263
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Jeremy Hanna
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.8

 Attachments: 1263-2.txt, 1263-3.txt, 1263-incomplete.txt, 1263.txt

   Original Estimate: 8h
  Remaining Estimate: 8h

 Currently the replication factor is in the keyspace metadata.  As we've added 
 the datacenter shard strategy, the replication factor becomes more computed 
 by the replication strategy.  It seems reasonable to therefore push the 
 replication factor for the keyspace down to the replication strategy so that 
 it can be handled in one place.
 This adds on the work being done in CASSANDRA-1066 since that ticket will 
 make the replication strategy a member variable of keyspace metadata instead 
 of just a quasi singleton giving the replication strategy state for each 
 keyspace.  That makes it able to have the replication factor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-1263) Push replication factor down to the replication strategy

2011-03-29 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012757#comment-13012757
 ] 

Jeremy Hanna commented on CASSANDRA-1263:
-

Btw - Jon - how extensively did you test this?  I wonder whether something like 
RF + strategy could be covered by a distributed test.

 Push replication factor down to the replication strategy
 

 Key: CASSANDRA-1263
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1263
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Jeremy Hanna
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.8

 Attachments: 1263-2.txt, 1263-3.txt, 1263-incomplete.txt, 1263.txt

   Original Estimate: 8h
  Remaining Estimate: 8h

 Currently the replication factor is in the keyspace metadata.  As we've added 
 the datacenter shard strategy, the replication factor becomes more computed 
 by the replication strategy.  It seems reasonable to therefore push the 
 replication factor for the keyspace down to the replication strategy so that 
 it can be handled in one place.
 This adds on the work being done in CASSANDRA-1066 since that ticket will 
 make the replication strategy a member variable of keyspace metadata instead 
 of just a quasi singleton giving the replication strategy state for each 
 keyspace.  That makes it able to have the replication factor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-1263) Push replication factor down to the replication strategy

2011-03-29 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012766#comment-13012766
 ] 

Jeremy Hanna commented on CASSANDRA-1263:
-

One other small thing that would be easy to fix while you're in there - in 
CliUserHelp:
{code}
state.out.println("update keyspace foo with");
state.out.println("placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';");
state.out.println("and strategy_options=[{replication_factor:4}];");
{code}
That second line shouldn't have a ';' at the end of it - it's the middle of the 
update keyspace statement.

 Push replication factor down to the replication strategy
 

 Key: CASSANDRA-1263
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1263
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Jeremy Hanna
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.8

 Attachments: 1263-2.txt, 1263-3.txt, 1263-incomplete.txt, 1263.txt

   Original Estimate: 8h
  Remaining Estimate: 8h

 Currently the replication factor is in the keyspace metadata.  As we've added 
 the datacenter shard strategy, the replication factor is increasingly computed 
 by the replication strategy.  It therefore seems reasonable to push the 
 replication factor for the keyspace down to the replication strategy so that 
 it can be handled in one place.
 This builds on the work being done in CASSANDRA-1066, since that ticket will 
 make the replication strategy a member variable of the keyspace metadata 
 instead of just a quasi-singleton, giving the replication strategy state for 
 each keyspace.  That makes it the natural place to hold the replication factor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


svn commit: r1086806 - in /cassandra/trunk: src/java/org/apache/cassandra/cql/ test/system/

2011-03-29 Thread eevans
Author: eevans
Date: Tue Mar 29 23:41:43 2011
New Revision: 1086806

URL: http://svn.apache.org/viewvc?rev=1086806&view=rev
Log:
CQL support for typed keys

Patch by eevans

Modified:
cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g

cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
cassandra/trunk/src/java/org/apache/cassandra/cql/UpdateStatement.java
cassandra/trunk/test/system/test_cql.py

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g?rev=1086806&r1=1086805&r2=1086806&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g Tue Mar 29 23:41:43 
2011
@@ -270,16 +270,18 @@ createKeyspaceStatement returns [CreateK
  */
 createColumnFamilyStatement returns [CreateColumnFamilyStatement expr]
 : K_CREATE K_COLUMNFAMILY name=( IDENT | STRING_LITERAL | INTEGER ) { 
$expr = new CreateColumnFamilyStatement($name.text); }
-  ( '('
-  col1=term v1=createCfamColumnValidator { $expr.addColumn(col1, 
$v1.validator); } ( ','
-  colN=term vN=createCfamColumnValidator { $expr.addColumn(colN, 
$vN.validator); } )*
-  ')' )?
+  ( '(' createCfamColumns[expr] ( ',' createCfamColumns[expr] )* ')' )?
   ( K_WITH prop1=IDENT '=' arg1=createCfamKeywordArgument { 
$expr.addProperty($prop1.text, $arg1.arg); }
   ( K_AND propN=IDENT '=' argN=createCfamKeywordArgument { 
$expr.addProperty($propN.text, $argN.arg); } )*
   )?
   endStmnt
 ;
 
+createCfamColumns[CreateColumnFamilyStatement expr]
+: n=term v=createCfamColumnValidator { $expr.addColumn(n, $v.validator); }
+| K_KEY v=createCfamColumnValidator K_PRIMARY K_KEY { 
$expr.setKeyType($v.validator); }
+;
+
 createCfamColumnValidator returns [String validator]
 : comparatorType { $validator = $comparatorType.text; }
 | STRING_LITERAL { $validator = $STRING_LITERAL.text; }
@@ -378,6 +380,7 @@ K_COLUMNFAMILY: C O L U M N F A M I L Y;
 K_INDEX:   I N D E X;
 K_ON:  O N;
 K_DROP:D R O P;
+K_PRIMARY: P R I M A R Y;
 
 // Case-insensitive alpha characters
 fragment A: ('a'|'A');

Modified: 
cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java?rev=1086806&r1=1086805&r2=1086806&view=diff
==
--- 
cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
 (original)
+++ 
cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
 Tue Mar 29 23:41:43 2011
@@ -87,6 +87,7 @@ public class CreateColumnFamilyStatement
 private final String name;
 private final Map<Term, String> columns = new HashMap<Term, String>();
 private final Map<String, String> properties = new HashMap<String, String>();
+private String keyValidator;
 
 public CreateColumnFamilyStatement(String name)
 {
@@ -157,6 +158,11 @@ public class CreateColumnFamilyStatement
 columns.put(term, comparator);
 }
 
+public void setKeyType(String validator)
+{
+this.keyValidator = validator;
+}
+
 /** Map a keyword to the corresponding value */
 public void addProperty(String name, String value)
 {
@@ -180,7 +186,7 @@ public class CreateColumnFamilyStatement
 {
 ByteBuffer columnName = col.getKey().getByteBuffer(comparator);
 String validatorClassName = 
comparators.containsKey(col.getValue()) ? comparators.get(col.getValue()) : 
col.getValue();
-AbstractType validator = DatabaseDescriptor.getComparator(validatorClassName);
+AbstractType<?> validator = DatabaseDescriptor.getComparator(validatorClassName);
 columnDefs.put(columnName, new ColumnDefinition(columnName, 
validator, null, null));
 }
 catch (ConfigurationException e)
@@ -212,6 +218,7 @@ public class CreateColumnFamilyStatement
 // RPC uses BytesType as the default validator/comparator but 
BytesType expects hex for string terms, (not convenient).
 AbstractType<?> comparator = DatabaseDescriptor.getComparator(comparators.get(getPropertyString(KW_COMPARATOR, "utf8")));
 String validator = getPropertyString(KW_DEFAULTVALIDATION, "utf8");
+AbstractType<?> keyType = DatabaseDescriptor.getComparator(comparators.get((keyValidator != null) ? keyValidator : "utf8"));
 
 newCFMD = new CFMetaData(keyspace,
  name,
@@ -234,7 +241,8 @@ public class 

svn commit: r1086807 - in /cassandra/trunk: src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java test/system/test_cql.py

2011-03-29 Thread eevans
Author: eevans
Date: Tue Mar 29 23:41:49 2011
New Revision: 1086807

URL: http://svn.apache.org/viewvc?rev=1086807&view=rev
Log:
allow exactly one PRIMARY KEY definition

Patch by eevans

Modified:

cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
cassandra/trunk/test/system/test_cql.py

Modified: 
cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java?rev=1086807&r1=1086806&r2=1086807&view=diff
==
--- 
cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
 (original)
+++ 
cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
 Tue Mar 29 23:41:49 2011
@@ -21,8 +21,10 @@
 package org.apache.cassandra.cql;
 
 import java.nio.ByteBuffer;
+import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.HashSet;
+import java.util.List;
 import java.util.Map;
 import java.util.Set;
 
@@ -87,7 +89,7 @@ public class CreateColumnFamilyStatement
 private final String name;
 private final Map<Term, String> columns = new HashMap<Term, String>();
 private final Map<String, String> properties = new HashMap<String, String>();
-private String keyValidator;
+private List<String> keyValidator = new ArrayList<String>();
 
 public CreateColumnFamilyStatement(String name)
 {
@@ -150,6 +152,12 @@ public class CreateColumnFamilyStatement
 if ((memOps != null) && (memOps <= 0))
 throw new InvalidRequestException(String.format("%s must be non-negative and greater than zero",
 KW_MEMTABLEOPSINMILLIONS));
+
+// Ensure that exactly one key has been specified.
+if (keyValidator.size() < 1)
+    throw new InvalidRequestException("You must specify a PRIMARY KEY");
+else if (keyValidator.size() > 1)
+    throw new InvalidRequestException("You may only specify one PRIMARY KEY");
 }
 
 /** Map a column name to a validator for its value */
@@ -160,7 +168,12 @@ public class CreateColumnFamilyStatement
 
 public void setKeyType(String validator)
 {
-this.keyValidator = validator;
+keyValidator.add(validator);
+}
+
+public String getKeyType()
+{
+return keyValidator.get(0);
 }
 
 /** Map a keyword to the corresponding value */
@@ -218,7 +231,6 @@ public class CreateColumnFamilyStatement
 // RPC uses BytesType as the default validator/comparator but 
BytesType expects hex for string terms, (not convenient).
 AbstractType<?> comparator = DatabaseDescriptor.getComparator(comparators.get(getPropertyString(KW_COMPARATOR, "utf8")));
 String validator = getPropertyString(KW_DEFAULTVALIDATION, "utf8");
-AbstractType<?> keyType = DatabaseDescriptor.getComparator(comparators.get((keyValidator != null) ? keyValidator : "utf8"));
 
 newCFMD = new CFMetaData(keyspace,
  name,
@@ -242,7 +254,7 @@ public class CreateColumnFamilyStatement
.memOps(getPropertyDouble(KW_MEMTABLEOPSINMILLIONS, 
CFMetaData.DEFAULT_MEMTABLE_OPERATIONS_IN_MILLIONS))
.mergeShardsChance(0.0)
.columnMetadata(getColumns(comparator))
-   .keyValidator(keyType);
+   .keyValidator(DatabaseDescriptor.getComparator(comparators.get(getKeyType())));
 }
 catch (ConfigurationException e)
 {

Modified: cassandra/trunk/test/system/test_cql.py
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/test/system/test_cql.py?rev=1086807&r1=1086806&r2=1086807&view=diff
==
--- cassandra/trunk/test/system/test_cql.py (original)
+++ cassandra/trunk/test/system/test_cql.py Tue Mar 29 23:41:49 2011
@@ -365,6 +365,7 @@ class TestCql(ThriftTester):
 
 conn.execute(
 CREATE COLUMNFAMILY NewCf1 (
+KEY int PRIMARY KEY,
 'username' utf8,
 'age' int,
 'birthdate' long,
@@ -383,25 +384,31 @@ class TestCql(ThriftTester):
 assert cfam.comment == "shiny, new, cf"
 assert cfam.default_validation_class == "org.apache.cassandra.db.marshal.AsciiType"
 assert cfam.comparator_type == "org.apache.cassandra.db.marshal.UTF8Type"
+assert cfam.key_validation_class == "org.apache.cassandra.db.marshal.IntegerType"
 
-# No column defs, defaults all-around
-conn.execute("CREATE COLUMNFAMILY NewCf2")
-ksdef = thrift_client.describe_keyspace("CreateCFKeyspace")
-assert len(ksdef.cf_defs) == 2, \
-    "expected 2 column families total, found %d" % len(ksdef.cf_defs)
+   

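The exactly-one-PRIMARY-KEY check added in r1086807 can be sketched standalone. The Python below is a hypothetical mirror of the Java logic (the class name and error messages follow the patch; everything else is simplified):

```python
class CreateColumnFamilyStatement:
    """Hypothetical, simplified mirror of the Java class's key validation."""

    def __init__(self, name):
        self.name = name
        self.key_validators = []  # mirrors List<String> keyValidator

    def set_key_type(self, validator):
        # The grammar may invoke this once per PRIMARY KEY clause seen.
        self.key_validators.append(validator)

    def validate(self):
        # Ensure that exactly one key has been specified.
        if len(self.key_validators) < 1:
            raise ValueError("You must specify a PRIMARY KEY")
        if len(self.key_validators) > 1:
            raise ValueError("You may only specify one PRIMARY KEY")
        return self.key_validators[0]
```

Collecting validators into a list (instead of a single field) is what lets the statement distinguish "none" from "more than one" at validation time.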
buildbot failure in ASF Buildbot on cassandra-trunk

2011-03-29 Thread buildbot
The Buildbot has detected a new failure on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1195

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1086806
Blamelist: eevans

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



[jira] [Commented] (CASSANDRA-2321) disallow to querying a counter CF with non-counter operation

2011-03-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012781#comment-13012781
 ] 

Hudson commented on CASSANDRA-2321:
---

Integrated in Cassandra #817 (See 
[https://hudson.apache.org/hudson/job/Cassandra/817/])


 disallow to querying a counter CF with non-counter operation
 

 Key: CASSANDRA-2321
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2321
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8
 Environment: Linux
Reporter: Mubarak Seyed
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.8

 Attachments: 0001-Don-t-allow-normal-query-on-counter-CF.patch


 CounterColumnType.getString() returns a hex string.
 {code}
 public String getString(ByteBuffer bytes)
 {
     return ByteBufferUtil.bytesToHex(bytes);
 }
 {code}
 and python stress.py reader returns
 [ColumnOrSuperColumn(column=None, super_column=SuperColumn(name='19', 
 columns=[Column(timestamp=1299984960277, name='56', 
 value='\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00,',
  ttl=None), Column(timestamp=1299985019923, name='57', 
 value='\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00;\x00\x00\x00\x00\x00\x00\x08\xfd',
  ttl=None))]

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-2403) Backport AbstractType.compose from trunk

2011-03-29 Thread Brandon Williams (JIRA)
Backport AbstractType.compose from trunk


 Key: CASSANDRA-2403
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2403
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Brandon Williams
Priority: Minor
 Fix For: 0.7.5


It was added in CASSANDRA-2262, but is also useful for 0.7.x.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2387) Make it possible for pig to understand packed data

2011-03-29 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2387:


Attachment: 2387-v2.txt

v2 builds upon this by utilizing AbstractType to handle any type, also handles 
column names the same way, and only deserializes the cfdef once per row instead 
of for every column.  Still has some string roundtrip casting lameness until 
CASSANDRA-2403 is resolved.

 Make it possible for pig to understand packed data
 --

 Key: CASSANDRA-2387
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2387
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jeremy Hanna
Assignee: Jeremy Hanna
  Labels: contrib, hadoop, pig
 Attachments: 2387-1.txt, 2387-v2.txt, loadstorecaster-patch.txt


 Packed values are throwing off pig. This ticket is to make it so pig can 
 interpret packed values. Originally we thought we could just use a 
 loadcaster.  However, the only way we know how we can do it now is to get the 
 schema through thrift and essentially perform the function of the loadcaster 
 in the getNext method.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction

2011-03-29 Thread Peter Schuller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012783#comment-13012783
 ] 

Peter Schuller commented on CASSANDRA-1902:
---

Catching up with ticket history and the latest version of the patch, a few 
things based on the history+patch themselves (I have not tested or benchmarked 
anything):

With respect to avoiding waiting on GC: the munmap() is still in finalize(), so 
we're still waiting on GC, right? Just not on every possible ByteBuffer 
(instead only on the MappedFileSegment itself).

BufferedSegmentedFile.tryPreserveFilePageCache() is doing a 
tryPreserveCacheRegion() for every page considered hot. The first thing to be 
aware of, then, is that this will translate into a posix_fadvise() syscall for 
every page, even when all or almost all pages are in fact in memory. This may 
be acceptable, but keep in mind that use-cases where all or almost all pages 
are in cache are likely to be the ones that are CPU-bound rather than disk-bound.

The bigger issue with the same thing is that in the case of the large column 
families that we're trying to optimize for, unless I am missing something, the 
preservation process is expected to be entirely seek-bound for sparsely hot 
sstables. In the best case for mostly-hot sstables it might not be seek-bound, 
provided that pre-fetching and/or read-ahead and/or linear access detection is 
working well, but that seems very dependent on system details and the type of 
load the system is under (probably less likely to work well under high live 
read I/O loads). In the non-best case (sparsely hot), it should most definitely 
be entirely seek-bound.

fadvising entire regions at once instead of once per page might improve that, 
but I still think the better solution is to just not DONTNEED hot data to begin 
with (subject to potential limitations to avoid too frequent DONTNEEDs).

Note: The original motivation for avoiding frequent DONTNEED was performance in 
relation to the syscall. But in this case we're taking a one syscall per page 
hit anyway with the WILLNEED:s. In fact in the case of a very hot sstable 
(where CPU efficiency is more important than a cold sstable where disk I/O is 
more important) the WILLNEED:s should be more numerous than the DONTNEED:s 
would have been had they been fragmented according to a hotness map.

Disregarding the CPU efficiency concerns though, the primary concern I'd have 
is the WILLNEED calls. Again, I haven't tested to make sure I'm not mis-reading 
it, but this should mean that all compactions of actively used sstables will 
end, after the streaming I/O, with lots of seek-bound reads to fulfill the 
WILLNEED:s. This can take a lot of time and be expensive in terms of the amount 
of disk time being spent (relative to a rate-limited compaction process), and 
it also violates the otherwise preserved rule that the only seek-bound I/O is 
live reads; all other I/O is sequential.

Also: If WILLNEED blocks until it's been read, the impact on live traffic 
should be limited but on the other hand latency should be high under read load. 
If WILLNEED doesn't block throughput should have a chance of being reasonable 
by maintaining some queue depth, but on the other hand would potentially 
severely affect live reads. (I don't know which is true, I should check, but I 
haven't yet.)

Minor nit: Seemingly truncated doc string for SegmentedFile.complete().

Minor suggestion: Should isRangeInCache() be renamed to wasRangeInCache() to 
reflect the fact that it does not represent current status? It is not an 
implementation detail because if it did reflect current reality, the caller 
would be incorrect (the test on a per-column basis would constantly give false 
positives as being in cache due to (1) the column just having been serialized, 
which would be easily fixable, but also because (2) previous columns on the 
same page, which is more difficult to fix than moving a line of code).



 Migrate cached pages during compaction 
 ---

 Key: CASSANDRA-1902
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.1
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 0.7.5, 0.8

 Attachments: 
 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt, 
 1902-formatted.txt, 1902-per-column-migration-rebase2.txt, 
 1902-per-column-migration.txt, CASSANDRA-1902-v3.patch, 
 CASSANDRA-1902-v4.patch, CASSANDRA-1902-v5.patch

   Original Estimate: 32h
  Time Spent: 56h
  Remaining Estimate: 0h

 Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
 pre-compacted CF during the compaction process.  This is now important since 
 CASSANDRA-1470 caches effectively 

[jira] [Issue Comment Edited] (CASSANDRA-2387) Make it possible for pig to understand packed data

2011-03-29 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012782#comment-13012782
 ] 

Brandon Williams edited comment on CASSANDRA-2387 at 3/30/11 12:02 AM:
---

v2 builds upon this by utilizing AbstractType to handle any type, also handles 
column names the same way, and only deserializes the cfdef once per row instead 
of for every column.  Still has some string roundtrip casting lameness until 
CASSANDRA-2403 is resolved.  Also handles serialization when storing.

  was (Author: brandon.williams):
v2 builds upon this by utilizing AbstractType to handle any type, also 
handles column names the same way, and only deserializes the cfdef once per row 
instead of for every column.  Still has some string roundtrip casting lameness 
until CASSANDRA-2403 is resolved.
  
 Make it possible for pig to understand packed data
 --

 Key: CASSANDRA-2387
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2387
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jeremy Hanna
Assignee: Jeremy Hanna
  Labels: contrib, hadoop, pig
 Attachments: 2387-1.txt, 2387-v2.txt, loadstorecaster-patch.txt


 Packed values are throwing off pig. This ticket is to make it so pig can 
 interpret packed values. Originally we thought we could just use a 
 loadcaster.  However, the only way we know how we can do it now is to get the 
 schema through thrift and essentially perform the function of the loadcaster 
 in the getNext method.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


svn commit: r1086812 - /cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g

2011-03-29 Thread eevans
Author: eevans
Date: Wed Mar 30 00:26:32 2011
New Revision: 1086812

URL: http://svn.apache.org/viewvc?rev=1086812&view=rev
Log:
allow but do not require semicolon in batch updates

Patch by eevans

Modified:
cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g?rev=1086812&r1=1086811&r2=1086812&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g Wed Mar 30 00:26:32 
2011
@@ -100,7 +100,7 @@ options {
 
 query returns [CQLStatement stmnt]
 : selectStatement   { $stmnt = new CQLStatement(StatementType.SELECT, 
$selectStatement.expr); }
-| updateStatement   { $stmnt = new CQLStatement(StatementType.UPDATE, 
$updateStatement.expr); }
+| updateStatement endStmnt { $stmnt = new 
CQLStatement(StatementType.UPDATE, $updateStatement.expr); }
 | batchUpdateStatement { $stmnt = new 
CQLStatement(StatementType.BATCH_UPDATE, $batchUpdateStatement.expr); }
 | useStatement  { $stmnt = new CQLStatement(StatementType.USE, 
$useStatement.keyspace); }
 | truncateStatement { $stmnt = new CQLStatement(StatementType.TRUNCATE, 
$truncateStatement.cfam); }
@@ -188,7 +188,7 @@ batchUpdateStatement returns [BatchUpdat
  List<UpdateStatement> updates = new ArrayList<UpdateStatement>();
   }
   K_BEGIN K_BATCH ( K_USING K_CONSISTENCY K_LEVEL { cLevel = 
ConsistencyLevel.valueOf($K_LEVEL.text); } )?
-  u1=updateStatement { updates.add(u1); } ( uN=updateStatement { 
updates.add(uN); } )*
+  u1=updateStatement ';'? { updates.add(u1); } ( uN=updateStatement 
';'? { updates.add(uN); } )*
   K_APPLY K_BATCH EOF
   {
   return new BatchUpdateStatement(updates, cLevel);
@@ -214,7 +214,7 @@ updateStatement returns [UpdateStatement
   K_UPDATE columnFamily=( IDENT | STRING_LITERAL | INTEGER )
   (K_USING K_CONSISTENCY K_LEVEL { cLevel = 
ConsistencyLevel.valueOf($K_LEVEL.text); })?
   K_SET termPair[columns] (',' termPair[columns])*
-  K_WHERE K_KEY '=' key=term endStmnt
+  K_WHERE K_KEY '=' key=term
   {
   return new UpdateStatement($columnFamily.text, cLevel, columns, key);
   }
@@ -241,7 +241,7 @@ deleteStatement returns [DeleteStatement
   K_FROM columnFamily=( IDENT | STRING_LITERAL | INTEGER ) ( K_USING 
K_CONSISTENCY K_LEVEL )?
   K_WHERE ( K_KEY '=' key=term   { keyList = 
Collections.singletonList(key); }
   | K_KEY K_IN '(' keys=termList { keyList = $keys.items; } ')'
-  )?
+  )? endStmnt
   {
   return new DeleteStatement(columnsList, $columnFamily.text, cLevel, 
keyList);
   }
@@ -339,7 +339,7 @@ truncateStatement returns [String cfam]
 ;
 
 endStmnt
-: (EOF | ';')
+: ';'?  EOF
 ;
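The effect of this grammar change is that each UPDATE inside a batch may, but need not, be terminated by a semicolon. A hypothetical, much-simplified recognizer for that shape (a regex stands in for the real ANTLR grammar; it handles none of CQL's actual lexical rules):

```python
import re

# Simplified UPDATE shape: UPDATE <cf> SET <pairs> WHERE KEY = '<key>'
UPDATE = r"UPDATE\s+\w+\s+SET\s+[^;]+?\s+WHERE\s+KEY\s*=\s*'[^']*'"

# Each update may optionally end with ';', mirroring the `';'?` added above,
# and so may the whole batch.
BATCH = re.compile(
    r"^\s*BEGIN\s+BATCH\s+(?:%s\s*;?\s*)+APPLY\s+BATCH\s*;?\s*$" % UPDATE,
    re.IGNORECASE,
)

def is_valid_batch(stmt):
    """Return True if stmt looks like a BEGIN BATCH ... APPLY BATCH block."""
    return bool(BATCH.match(stmt))
```

This mirrors the intent of the commit message only: semicolons between batched updates are allowed but not required.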
 
 




[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction

2011-03-29 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012793#comment-13012793
 ] 

Pavel Yaskevich commented on CASSANDRA-1902:


bq. With respect to avoiding waiting on GC: the munmap() is still in finalize(), 
so we're still waiting on GC, right? Just not on every possible ByteBuffer 
(instead only on the MappedFileSegment itself).

This is correct, but on the other hand we can unmap a segment only when we no 
longer need the reader. I could add an explicit method for it, but my tests 
show that unmapping in finalize is pretty useful.

bq. BufferedSegmentedFile.tryPreserveFilePageCache() is doing a 
tryPreserveCacheRegion() for every page considered hot. The first thing to be 
aware of then is that this will translate into a posix_fadvise() syscall for 
every page, even when all or almost all pages are in fact in memory. This may 
be acceptable, but keep in mind that use-cases where all or almost all pages 
are in cache, are likely to be the ones CPU-bound rather than disk bound.

Documentation for the posix_fadvise/madvise calls suggests making more frequent 
small requests instead of big ones - the kernel is highly likely to ignore 
advice given on a big region.

bq. fadvising entire regions at once instead of once per page might improve 
that, but I still think the better solution is to just not DONTNEED hot data to 
begin with (subject to potential limitations to avoid too frequent DONTNEEDs).

We can't stop using DONTNEED while writing the compacted file, because the 
write would push pages of the sstables currently in use out of the cache. And 
we do WILLNEED:s only once the SSTableReader for the compacted file is ready - 
right before the old sstables are replaced with the new ones - so this is not 
going to have a big performance impact on reads. Note that WILLNEED is a 
non-blocking call.

bq. Minor nit: Seemingly truncated doc string for SegmentedFile.complete().

Yes, I will fix that, thanks!

bq. Minor suggestion: Should isRangeInCache() be renamed to wasRangeInCache() 
to reflect the fact that it does not represent current status? It is not an 
implementation detail because if it did reflect current reality, the caller 
would be incorrect (the test on a per-column basis would constantly give false 
positives as being in cache due to (1) the column just having been serialized, 
which would be easily fixable, but also because (2) previous columns on the 
same page, which is more difficult to fix than moving a line of code).

Sounds reasonable to me, I will rename a method.
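One way to reduce the per-page syscall overhead discussed above is to coalesce adjacent hot pages into contiguous regions first, so each run of pages costs one advise call instead of many. A hypothetical sketch; the page size and the (offset, length) region representation are illustrative, not Cassandra's:

```python
PAGE = 4096  # assumed page size for illustration

def coalesce_hot_pages(hot_pages):
    """Turn a set of hot page indexes into (offset, length) byte regions,
    so a single posix_fadvise(WILLNEED)-style call could cover each
    contiguous run of pages."""
    regions = []
    for page in sorted(hot_pages):
        if regions and regions[-1][0] + regions[-1][1] == page * PAGE:
            # This page directly follows the previous region: extend it.
            offset, length = regions[-1]
            regions[-1] = (offset, length + PAGE)
        else:
            regions.append((page * PAGE, PAGE))
    return regions
```

For a densely hot sstable this collapses thousands of per-page calls into a handful of region-sized ones; for a sparsely hot sstable it degenerates back toward one call per page, which matches the seek-bound concern raised earlier in the thread.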

 Migrate cached pages during compaction 
 ---

 Key: CASSANDRA-1902
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.1
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 0.7.5, 0.8

 Attachments: 
 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt, 
 1902-formatted.txt, 1902-per-column-migration-rebase2.txt, 
 1902-per-column-migration.txt, CASSANDRA-1902-v3.patch, 
 CASSANDRA-1902-v4.patch, CASSANDRA-1902-v5.patch

   Original Estimate: 32h
  Time Spent: 56h
  Remaining Estimate: 0h

 Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
 pre-compacted CF during the compaction process.  This is now important since 
 CASSANDRA-1470 caches effectively nothing.  
 For example an active CF being compacted hurts reads since nothing is cached 
 in the new SSTable. 
 The purpose of this ticket then is to make sure SOME data is cached from 
 active CFs. This can be done my monitoring which Old SSTables are in the page 
 cache and caching active rows in the New SStable.
 A simpler yet similar approach is described here: 
 http://insights.oetiker.ch/linux/fadvise/

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012813#comment-13012813
 ] 

Jonathan Ellis commented on CASSANDRA-2045:
---

You want to use the pointer approach when your ratio of overwrites : row size 
is sufficiently high -- the biggest win there is when you can turn dozens or 
hundreds of mutations, into replay of just the latest version.

Not sure what the best way to estimate that is -- Brandon suggested checking 
SSTable bloom filters on writes, which is probably low-overhead enough, 
especially if we only do it for, say, 10% of writes. I kind of like that idea; 
I think it will be useful in multiple places down the road.

(Sufficiently high depends on SSD vs magnetic -- time to introduce a 
postgresql-like random vs sequential penalty setting?)
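Brandon's suggestion can be sketched as follows. Both the bloom filter and the sampling estimator below are hypothetical illustrations, not Cassandra's implementation:

```python
import hashlib
import random

class BloomFilter:
    """Tiny illustrative bloom filter; not Cassandra's implementation."""

    def __init__(self, m=1024, k=3):
        self.bits = [False] * m
        self.m, self.k = m, k

    def _hashes(self, key):
        for i in range(self.k):
            digest = hashlib.md5(("%d:%s" % (i, key)).encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, key):
        for h in self._hashes(key):
            self.bits[h] = True

    def might_contain(self, key):
        # May return false positives, never false negatives.
        return all(self.bits[h] for h in self._hashes(key))

def estimate_overwrite_ratio(sstable_filters, writes, sample_rate=0.1, rng=random):
    """Sample a fraction of incoming writes and count how many keys probably
    already exist in some sstable, i.e. how many writes are overwrites."""
    sampled = hits = 0
    for key in writes:
        if rng.random() >= sample_rate:
            continue  # only pay the filter-check cost on sampled writes
        sampled += 1
        if any(f.might_contain(key) for f in sstable_filters):
            hits += 1
    return hits / sampled if sampled else 0.0
```

The sampling keeps the steady-state overhead bounded, at the cost of a noisier estimate; bloom false positives bias the ratio slightly upward.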


 Simplify HH to decrease read load when nodes come back
 --

 Key: CASSANDRA-2045
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
 Fix For: 0.8


 Currently when HH is enabled, hints are stored, and when a node comes back, 
 we begin sending that node data. We do a lookup on the local node for the row 
 to send. To help reduce read load (if a node is offline for a long period of 
 time) we should store the data we want to forward to the node locally instead. 
 We wouldn't have to do any lookups, just take the byte[] and send it to the 
 destination.
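A hypothetical sketch of the proposed scheme: hints carry the serialized mutation bytes per dead endpoint, so replay is a blind forward with no local reads (all names here are invented for illustration):

```python
from collections import defaultdict, deque

class HintQueue:
    """Illustrative hint store: keep the serialized mutation payload per
    endpoint so delivery never touches the local read path."""

    def __init__(self):
        self.hints = defaultdict(deque)

    def record(self, endpoint, mutation_bytes):
        # Called at write time when `endpoint` is down.
        self.hints[endpoint].append(mutation_bytes)

    def replay(self, endpoint, send):
        """When `endpoint` comes back, forward every stored payload via
        `send` (a callable taking bytes). Returns the delivery count."""
        delivered = 0
        queue = self.hints.pop(endpoint, deque())
        while queue:
            send(queue.popleft())  # no row lookup involved
            delivered += 1
        return delivered
```

The trade-off versus the current design is extra local storage (full payloads instead of pointers) in exchange for zero read load during replay.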

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query

2011-03-29 Thread Tey Kar Shiang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012821#comment-13012821
 ] 

Tey Kar Shiang commented on CASSANDRA-2401:
---

Hi,

New finding here:
For the 0-column data, it is because the row is never read from the file. As I 
stepped through, getPosition(DecoratedKey decoratedKey, Operator op) in 
org.apache.cassandra.io.sstable.SSTableReader.java returns a -1 position at 
line 448 (bf.isPresent(decoratedKey.key) returns false) - the key is missing.

There seems to be a missing record that is still indexed, or the index itself 
is not updated when the record is removed (?).

As for the data returned with 0 columns: a container is always created (final 
ColumnFamily returnCF = ColumnFamily.create(metadata)) and returned from 
getTopLevelColumns even if no read takes place.

 getColumnFamily() return null, which is not checked in ColumnFamilyStore.java 
 scan() method, causing Timeout Exception in query
 ---

 Key: CASSANDRA-2401
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.4
 Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
Reporter: Tey Kar Shiang

 In ColumnFamilyStore.java, near line 1680 (ColumnFamily data = 
 getColumnFamily(new QueryFilter(dk, path, firstFilter))), the data returned 
 is null, causing a NULL exception in satisfies(data, clause, primary) which 
 is not caught. The callback times out and returns a Timeout exception to 
 Hector.
 The data is empty: as I traced it, the column count is 0 in 
 removeDeletedCF(), which returns the null there. (I am new and still trying 
 to understand the logic.) Instead of crashing on NULL, could we bypass the 
 data?
 About my test:
 A stress-test program to add, modify and delete data in the keyspace. I have 
 30 threads simulating concurrent users performing the actions above, plus a 
 periodic query of all rows. I have a Column Family with rows (as File) and 
 columns as index (e.g. userID, fileType).
 No issue on the first day of the test; it was then stopped for 3 days. I 
 restarted the test on the 4th day, and 1 of the users failed to query the 
 files (timeout exception received). Most of the users can still query fine.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[Cassandra Wiki] Update of FAQ_JP by MakiWatanabe

2011-03-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The FAQ_JP page has been changed by MakiWatanabe.
The comment on this change is: Translate batch_mutate, mmap.
http://wiki.apache.org/cassandra/FAQ_JP?action=diff&rev1=76&rev2=77

--

  Anchor(batch_mutate_atomic)
  
  == Is batch_mutate an atomic operation? ==
- As a special case, mutations against a single key are atomic but not 
isolated. Reads which occur during such a mutation may see part of the write 
before they see the whole thing. More generally, batch_mutate operations are 
not atomic. [[API#batch_mutate|batch_mutate]] allows grouping operations on 
many keys into a single call in order to save on the cost of network 
round-trips. If `batch_mutate` fails in the middle of its list of mutations, no 
rollback occurs and the mutations that have already been applied stay applied. 
The client should typically retry the `batch_mutate` operation.
+ 
+ As a special case, consider a batch_mutation against a single key: each 
mutation is atomic, but not isolated. A read performed during such a mutation 
may see a partially updated state.
+ Generally speaking, batch_mutate is not atomic. To save on the cost of 
network round-trips,
+ [[API#batch_mutate|batch_mutate]]
+ allows grouping operations on multiple keys into a single call. If 
`batch_mutate` fails partway through its list of mutations, no rollback occurs 
and the mutations that have already been applied remain applied.
+ In such a case, the client application generally needs to retry 
`batch_mutate`.
+ 
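
The retry recommendation above can be sketched as a generic client-side loop; 
the Callable below stands in for a Thrift batch_mutate call and is purely 
illustrative (safe only if the mutations are idempotent):

```java
import java.util.concurrent.Callable;

// Sketch of the client-side retry recommended above. batch_mutate has no
// rollback, so already-applied mutations stay applied; for idempotent
// mutations the safe recovery is simply to re-run the whole batch.
public class BatchRetrySketch {
    static <T> T retry(Callable<T> batch, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return batch.call();
            } catch (Exception e) {
                last = e; // partial application is harmless if idempotent
            }
        }
        throw last; // all attempts failed
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2}; // simulate two timeouts, then success
        String result = retry(() -> {
            if (failures[0]-- > 0) throw new RuntimeException("timeout");
            return "applied";
        }, 3);
        System.out.println(result); // prints "applied"
    }
}
```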
  
  Anchor(hadoop_support)
  
@@ -446, +451 @@

  == Why doesn't disk usage decrease after running a compaction? ==
  SSTables that are obsoleted by a compaction are deleted asynchronously when 
the JVM performs a GC. You can force a GC from jconsole if necessary, but 
Cassandra will force one itself if it detects that it is low on space. A 
compaction marker is also added to obsolete sstables so they can be deleted on 
startup if the server does not perform a GC before being restarted. Read more 
on this subject [[http://wiki.apache.org/cassandra/MemtableSSTable|here]].
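
The startup cleanup described above can be sketched as follows; the file-name 
conventions here are illustrative only, not Cassandra's actual on-disk layout:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch of the startup cleanup described above: any sstable with a
// companion "-Compacted" marker file is deleted at startup, covering the
// case where no GC ran before the server was restarted.
public class CompactedCleanupSketch {
    static int deleteObsolete(Path dir) throws IOException {
        List<Path> markers;
        try (Stream<Path> files = Files.list(dir)) {
            markers = files
                .filter(p -> p.getFileName().toString().endsWith("-Compacted"))
                .collect(Collectors.toList());
        }
        for (Path marker : markers) {
            // delete the obsolete data file the marker points at, then the marker
            String dataName = marker.getFileName().toString()
                    .replace("-Compacted", "-Data.db");
            Files.deleteIfExists(marker.resolveSibling(dataName));
            Files.delete(marker);
        }
        return markers.size();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sstables");
        Files.createFile(dir.resolve("cf-1-Data.db"));
        Files.createFile(dir.resolve("cf-1-Compacted")); // marked obsolete
        Files.createFile(dir.resolve("cf-2-Data.db"));   // live, kept
        System.out.println(deleteObsolete(dir)); // prints "1"
    }
}
```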
  
+ 
  Anchor(mmap)
  
  == Why does top show Cassandra using more memory than the maximum Java heap size? ==
- Cassandra uses mmap to do zero-copy reads. That is, we use the operating 
system's virtual memory system to map the sstable data files into the Cassandra 
process' address space. This will use virtual memory; i.e. address space, and 
will be reported by tools like top accordingly, but on 64 bit systems virtual 
address space is effectively unlimited so you should not worry about that.
- 
- What matters from the perspective of memory use in the sense as it is 
normally meant, is the amount of data allocated on brk() or mmap'd /dev/zero, 
which represent real memory used.  The key issue is that for a mmap'd file, 
there is never a need to retain the data resident in physical memory. Thus, 
whatever you do keep resident in physical memory is essentially just there as a 
cache, in the same way as normal I/O will cause the kernel page cache to retain 
data that you read/write.
- 
- The difference between normal I/O and mmap() is that in the mmap() case the 
memory is actually mapped to the process, thus affecting the virtual size as 
reported by top. The main argument for using mmap() instead of standard I/O is 
the fact that reading entails just touching memory - in the case of the memory 
being resident, you just read it - you don't even take a page fault (so no 
overhead in entering the kernel and doing a semi-context switch). This is 
covered in more detail 
[[http://www.varnish-cache.org/trac/wiki/ArchitectNotes|here]].
+ Cassandra uses mmap to do zero-copy reads. That is, it uses the operating 
system's virtual memory system to map the sstable data files into the Cassandra 
process' address space. This is why Cassandra appears to use a large amount of 
virtual memory: what is consumed is virtual address space, though tools such as 
top report it as virtual memory usage. On 64-bit systems the address space is 
effectively unlimited, so there is little need to worry about it. What matters 
from the perspective of memory use in the usual sense is the amount of data 
allocated via brk() or mmap'd from /dev/zero, which represents real memory used.
+ The point to note about mmap'd files is that they never need to be kept 
resident in physical memory: whatever mmap'd data is resident is essentially 
just a cache, in the same way that normal I/O causes the kernel page cache to 
retain data that you read/write.
+ The difference between normal I/O and mmap() is that mmap() maps the memory 
into the process, and therefore affects the virtual size reported by top.
+ The main advantage of using mmap() instead of normal I/O is that a read 
completes just by touching memory: if the region is resident, you simply read 
it, without even taking a page fault (so no overhead of entering the kernel and 
doing a semi-context switch).
+ See the following link for details: 
[[http://www.varnish-cache.org/trac/wiki/ArchitectNotes]]
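
A self-contained illustration of the mmap read path in Java (a demonstration of 
the mechanism only, not Cassandra's code): the file is mapped into the process' 
address space, which raises the virtual size reported by top, and is then read 
simply by touching memory.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Maps a file into the process' address space and reads it by touching
// memory - no explicit read() call on the mapped region.
public class MmapReadDemo {
    static String readMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] out = new byte[buf.remaining()];
            buf.get(out); // a read here is just a memory access
            return new String(out, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("sstable-demo", ".db");
        Files.write(p, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readMapped(p)); // prints "hello"
        Files.deleteIfExists(p);
    }
}
```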
  
  Anchor(jna)
  


[jira] [Issue Comment Edited] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query

2011-03-29 Thread Tey Kar Shiang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012821#comment-13012821
 ] 

Tey Kar Shiang edited comment on CASSANDRA-2401 at 3/30/11 2:19 AM:


Hi,

New finding here:
For the 0-column data: the row is never read from the file. Stepping through, 
org.apache.cassandra.io.sstable.SSTableReader.java::getPosition(DecoratedKey 
decoratedKey, Operator op) returns position -1 at line 448, because 
bf.isPresent(decoratedKey.key) returns false - the key is missing from the 
Bloom filter.

There seems to be a record that is still indexed but missing, or an indexed 
column that is not updated when the record is removed (?). 

As for data being returned with 0 columns: a container is always created 
(final ColumnFamily returnCF = ColumnFamily.create(metadata)) and returned 
from getTopLevelColumns, even when no read takes place.

In this case, Hector receives a Timeout exception because the null exception 
is thrown without being caught.
