[jira] [Commented] (CASSANDRA-4448) CQL3: allow to define a per-cf default consistency level

2012-07-20 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13418996#comment-13418996
 ] 

Sylvain Lebresne commented on CASSANDRA-4448:
-

bq. nobody should write raw JDBC at the app layer

I still disagree with that. Or rather, I don't care about JDBC in particular, 
but I disagree that nobody should use CQL queries directly in their code. 
Among other things, one of the advantages of a query language is that it is 
very readable, and readability is not only good when you use cqlsh. Using 
queries directly doesn't have to mean that your API is low-level: I could 
definitely see a simple yet elegant and modern object mapper where you still 
describe how to retrieve a given type of object using CQL queries. Such an API 
does have some advantages over a pycassa-like API. Not only is it hard to beat 
the clarity and concision of most CQL queries with a programmatic API, but it 
is also convenient for debugging to be able to copy-paste your query into 
cqlsh to test it.
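As a concrete illustration of the readability/copy-paste point (a hedged
sketch only: the table and column names are made up, and any JDBC-style CQL
driver is assumed to provide the java.sql Connection):
{noformat}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserQueries
{
    // Readable on its own, and directly testable by pasting into cqlsh.
    private static final String SELECT_EMAIL = "SELECT email FROM users WHERE name = ?";

    public static String findEmail(Connection conn, String name) throws SQLException
    {
        PreparedStatement stmt = conn.prepareStatement(SELECT_EMAIL);
        try
        {
            stmt.setString(1, name);
            ResultSet rs = stmt.executeQuery();
            return rs.next() ? rs.getString("email") : null;
        }
        finally
        {
            stmt.close();
        }
    }
}
{noformat}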

Besides, even for a relatively low-level JDBC-like API, I'm not comfortable 
saying that all the developers who use JDBC or db-api2 directly in their apps, 
and there are tons of them (even if there are even more who use something 
else), are misguided developers who should know better (and for SQL you cannot 
say it's just that they don't have a higher-level API available yet).

But another important point is that I think adding this has close to no 
downside. It's a few trivial lines of code, and it has no performance impact. 
It hardly complicates the language in any measurable way since 1) these are 
just new options of CREATE TABLE, not some new complex syntactic construction, 
and 2) these options are self-documenting and have straightforward semantics. 
Users that don't care about those options and happen to find them in the doc 
will just move along and never use them, no harm done. But I am convinced that 
for at least some users their existence will be convenient (*I* would find it 
convenient, and having a pycassa-for-java-over-cql wouldn't change that).

bq. But you create CF objects exactly once and then import them elsewhere

What I meant here is that you create a table exactly once, oftentimes not even 
in application code. So I believe many people would find it more convenient to 
define the default CL that makes the most sense for a table at creation time, 
rather than in every piece of application code that accesses the table. And 
there can be more than one such piece of code, if only because a lot of people 
need to access their DB from multiple languages.

I'm not pretending this is a killer feature. But I am saying that it will be 
convenient for at least some users, and adding it costs us pretty much nothing.

 CQL3: allow to define a per-cf default consistency level
 

 Key: CASSANDRA-4448
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4448
 Project: Cassandra
  Issue Type: New Feature
Reporter: Sylvain Lebresne
  Labels: cql3
 Fix For: 1.2


 One of the goals of CQL3 is that client libraries should not have to parse 
 queries to provide a good experience. In particular, that means such clients 
 (that don't want to parse queries) won't be able to let the user define 
 a specific default read/write consistency level per-CF, forcing users to 
 specify the consistency level with every query, which is not very user 
 friendly.
 This ticket suggests the addition of a per-CF default read/write consistency 
 level. Typically the syntax would be:
 {noformat}
 CREATE TABLE foo (...)
 WITH DEFAULT_READ_CONSISTENCY = QUORUM
  AND DEFAULT_WRITE_CONSISTENCY = QUORUM
 {noformat}





[jira] [Commented] (CASSANDRA-3855) RemoveDeleted dominates compaction time for large sstable counts

2012-07-20 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419006#comment-13419006
 ] 

Sylvain Lebresne commented on CASSANDRA-3855:
-

Agreed that it is wrong, but I think more than just the first line is wrong. I 
think that method should be:
{noformat}
public boolean hasIrrelevantData(int gcBefore)
{
    if (deletionInfo().isLive())
        return false;

    // Do we have gcable deletion infos?
    if (!deletionInfo().purge(gcBefore).equals(deletionInfo()))
        return true;

    // Do we have columns that are either deleted by the container or gcable tombstones?
    for (IColumn column : columns)
        if (deletionInfo().isDeleted(column) || column.hasIrrelevantData(gcBefore))
            return true;

    return false;
}
{noformat}

 RemoveDeleted dominates compaction time for large sstable counts
 

 Key: CASSANDRA-3855
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3855
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Stu Hood
Assignee: Yuki Morishita
  Labels: compaction, deletes, leveled
 Attachments: with-cleaning-java.hprof.txt


 With very large numbers of sstables (2000+ generated by a `bin/stress -n 
 100,000,000` run with LeveledCompactionStrategy), 
 PrecompactedRow.removeDeletedAndOldShards dominates compaction runtime, such 
 that commenting it out takes compaction throughput from 200KB/s to 12MB/s.
 Stack attached.





[jira] [Commented] (CASSANDRA-4452) remove RangeKeySample from attributes in jmx

2012-07-20 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419020#comment-13419020
 ] 

Sylvain Lebresne commented on CASSANDRA-4452:
-

Same here: rename it to be an operation, with a comment explaining why it 
shouldn't be renamed back to an attribute.
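For the record, a hedged sketch of what the rename amounts to (interface and 
method names here are illustrative, not the actual StorageServiceMBean 
signature): with standard MBeans, getX()/isX() methods are registered as 
attributes and everything else as operations, so dropping the getter naming is 
what keeps it out of the attribute list.
{noformat}
import java.util.List;

public interface StorageServiceSampleMBean
{
    // A non-getter name makes JMX expose this as an operation, so monitoring
    // tools won't fetch the (potentially huge) sample as part of the attributes.
    // NOTE: do not rename this back to getRangeKeySample() -- see CASSANDRA-4452.
    List<String> sampleRangeKeys();
}
{noformat}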

 remove RangeKeySample from attributes in jmx
 

 Key: CASSANDRA-4452
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4452
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
Reporter: Jan Prach

 RangeKeySample in the org.apache.cassandra.db:type=StorageService MBean can be 
 really huge (over 200MB in our case). That's a problem for monitoring tools, 
 as they're not built for that. The recommended and often used mx4j may be a 
 killer in this situation.
 It would be good enough to make RangeKeySample an operation instead of an 
 attribute in JMX. Looking at how MBeanServer.registerMBean() works, we can 
 do one of the following:
 a) add some dummy parameter to getRangeKeySample
 b) name it differently - not like a getter (otherwise next time somebody will 
 rename it back)
 c) implement an MXBean instead of an MBean (a lot of work)
 Any of those would work. All of them are hacks. Any better idea?
 BTW: It's a blocker for some installations. Our update to 1.1.2 caused 
 downtime, a downgrade back to 1.0.x, repairs, etc.





[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-20 Thread Peter Velas (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419031#comment-13419031
 ] 

Peter Velas commented on CASSANDRA-4436:


Thanks for your interest and time to fix it. We are currently moving to the 
1.1.2 version to avoid some random AWS failures and are patiently waiting for 
the 1.1.3 release.

 Counters in columns don't preserve correct values after cluster restart
 ---

 Key: CASSANDRA-4436
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.10
Reporter: Peter Velas
Assignee: Sylvain Lebresne
 Fix For: 1.1.3

 Attachments: 4436-1.0.txt, 4436-1.1.txt, increments.cql.gz


 Similar to #3821, but affecting normal columns. 
 Set up a 2-node cluster with rf=2.
 1. Create a counter column family and increment 100 keys in a loop 5000 
 times. 
 2. Then make a rolling restart of the cluster. 
 3. Again increment another 5000 times.
 4. Make a rolling restart of the cluster.
 5. Again increment another 5000 times.
 6. Make a rolling restart of the cluster.
 After step 6 we were able to reproduce the bug with bad counter values. 
 Expected values were 15000. Values returned from the cluster are higher: 
 15000 plus some random number.
 Rolling restarts are done with nodetool drain, always waiting until the 
 second node discovers it's down, then killing the java process. 





[2/2] Refactor set/list/map CQL3 code

2012-07-20 Thread slebresne
http://git-wip-us.apache.org/repos/asf/cassandra/blob/2b62df24/src/java/org/apache/cassandra/db/marshal/SetType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/SetType.java b/src/java/org/apache/cassandra/db/marshal/SetType.java
index 7a72fe9..1090d09 100644
--- a/src/java/org/apache/cassandra/db/marshal/SetType.java
+++ b/src/java/org/apache/cassandra/db/marshal/SetType.java
@@ -27,13 +27,8 @@ import java.util.Map;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.cassandra.cql3.ColumnNameBuilder;
-import org.apache.cassandra.cql3.Term;
-import org.apache.cassandra.cql3.UpdateParameters;
-import org.apache.cassandra.db.ColumnFamily;
 import org.apache.cassandra.db.IColumn;
 import org.apache.cassandra.config.ConfigurationException;
-import org.apache.cassandra.thrift.InvalidRequestException;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.Pair;
@@ -73,12 +68,12 @@ public class SetType extends CollectionType
         this.elements = elements;
     }
 
-    protected AbstractType<?> nameComparator()
+    public AbstractType<?> nameComparator()
     {
         return elements;
     }
 
-    protected AbstractType<?> valueComparator()
+    public AbstractType<?> valueComparator()
     {
         return EmptyType.instance;
     }
@@ -88,41 +83,6 @@ public class SetType extends CollectionType
 
         sb.append(getClass().getName()).append(TypeParser.stringifyTypeParameters(Collections.<AbstractType<?>>singletonList(elements)));
     }
-    public void executeFunction(ColumnFamily cf, ColumnNameBuilder fullPath, Function fct, List<Term> args, UpdateParameters params) throws InvalidRequestException
-    {
-        switch (fct)
-        {
-            case ADD:
-                doAdd(cf, fullPath, args, params);
-                break;
-            case DISCARD_SET:
-                doDiscard(cf, fullPath, args, params);
-                break;
-            default:
-                throw new AssertionError("Unsupported function " + fct);
-        }
-    }
-
-    public void doAdd(ColumnFamily cf, ColumnNameBuilder builder, List<Term> values, UpdateParameters params) throws InvalidRequestException
-    {
-        for (int i = 0; i < values.size(); ++i)
-        {
-            ColumnNameBuilder b = i == values.size() - 1 ? builder : builder.copy();
-            ByteBuffer name = b.add(values.get(i).getByteBuffer(elements, params.variables)).build();
-            cf.addColumn(params.makeColumn(name, ByteBufferUtil.EMPTY_BYTE_BUFFER));
-        }
-    }
-
-    public void doDiscard(ColumnFamily cf, ColumnNameBuilder builder, List<Term> values, UpdateParameters params) throws InvalidRequestException
-    {
-        for (int i = 0; i < values.size(); ++i)
-        {
-            ColumnNameBuilder b = i == values.size() - 1 ? builder : builder.copy();
-            ByteBuffer name = b.add(values.get(i).getByteBuffer(elements, params.variables)).build();
-            cf.addColumn(params.makeTombstone(name));
-        }
-    }
-
     public ByteBuffer serializeForThrift(List<Pair<ByteBuffer, IColumn>> columns)
     {
         // We're using a list for now, since json doesn't have maps


[jira] [Created] (CASSANDRA-4453) Better support of collections in the binary protocol

2012-07-20 Thread Sylvain Lebresne (JIRA)
Sylvain Lebresne created CASSANDRA-4453:
---

 Summary: Better support of collections in the binary protocol
 Key: CASSANDRA-4453
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4453
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Priority: Minor


Currently, collection results are serialized to a json string and sent that 
way. This doesn't feel right at all for the binary protocol, and we should use 
a simple binary serialization of the collection instead.

For the thrift protocol, we might want to keep the json serialization or use 
the same binary serialization. I don't really have much of an opinion.
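To make the suggestion concrete, a minimal sketch of such a binary 
serialization (element count, then length-prefixed elements); this is only an 
illustration of the idea, not the format the protocol ends up specifying:
{noformat}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.List;

public final class CollectionSerializerSketch
{
    public static ByteBuffer serialize(List<ByteBuffer> elements) throws IOException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeShort(elements.size());        // number of elements
        for (ByteBuffer element : elements)
        {
            byte[] value = new byte[element.remaining()];
            element.duplicate().get(value);     // copy without moving the caller's position
            out.writeShort(value.length);       // element length
            out.write(value);                   // element bytes
        }
        out.flush();
        return ByteBuffer.wrap(bytes.toByteArray());
    }
}
{noformat}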





[jira] [Commented] (CASSANDRA-3647) Support set and map value types in CQL

2012-07-20 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419036#comment-13419036
 ] 

Sylvain Lebresne commented on CASSANDRA-3647:
-

bq. Attaching updated alternative patch with all problems fixed

lgtm, committed.

bq. schema borrowed from Jonathan's comment from 04/Jun/12

That was due to the fact that the patch hadn't been updated post-CASSANDRA-4329. 
I've added the trivial fix for that with the commit of the refactor (and added 
the relevant test to dtests).

 Support set and map value types in CQL
 --

 Key: CASSANDRA-3647
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3647
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
  Labels: cql
 Fix For: 1.2

 Attachments: CASSANDRA-3647-alternative.patch


 Composite columns introduce the ability to have arbitrarily nested data in a 
 Cassandra row.  We should expose this through CQL.





[jira] [Resolved] (CASSANDRA-3647) Support set and map value types in CQL

2012-07-20 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-3647.
-

Resolution: Fixed

 Support set and map value types in CQL
 --

 Key: CASSANDRA-3647
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3647
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
  Labels: cql
 Fix For: 1.2

 Attachments: CASSANDRA-3647-alternative.patch


 Composite columns introduce the ability to have arbitrarily nested data in a 
 Cassandra row.  We should expose this through CQL.





[jira] [Commented] (CASSANDRA-4448) CQL3: allow to define a per-cf default consistency level

2012-07-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419037#comment-13419037
 ] 

Michaël Figuière commented on CASSANDRA-4448:
-

{quote}If you're writing application code, you should not be writing raw CQL; 
you should be using a higher-level, idiomatic API{quote}

Hibernate lets users execute queries using either its own SQL-ish query 
language (HQL) or its QueryBuilder API (Criteria). Actually, as far as I can 
observe, most heavyweight business applications that rely on Hibernate to 
execute many different kinds of queries mostly use HQL, as a proper String 
query often ends up being more readable and easier to maintain than a chain of 
methods. This is a different case and might lead to different habits, but we 
should still consider the CQL language a major API for applications.

{quote}There can be more than one such piece of code, if only because a lot of 
people need to access their DB from multiple languages.{quote}

I think this is an important point: this feature allows for a central 
enforcement point for the CL for applications that rely on the default, thus 
simplifying the headache of changing the common CL.
Furthermore, I guess some users will wish to decouple their application from 
the CL configuration to allow for some behavior or performance tuning over 
time. Typically, I can imagine a DBA who wants to trade some consistency for 
performance as a graceful degradation strategy being happy to just have to 
push an {{ALTER}} command.
Without it, I guess many developers would follow the 
_parameterize-it-just-in-case_ strategy, and this would lead to some additional 
properties in their {{.properties}} files, in the case of Java apps.

 CQL3: allow to define a per-cf default consistency level
 

 Key: CASSANDRA-4448
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4448
 Project: Cassandra
  Issue Type: New Feature
Reporter: Sylvain Lebresne
  Labels: cql3
 Fix For: 1.2


 One of the goals of CQL3 is that client libraries should not have to parse 
 queries to provide a good experience. In particular, that means such clients 
 (that don't want to parse queries) won't be able to let the user define 
 a specific default read/write consistency level per-CF, forcing users to 
 specify the consistency level with every query, which is not very user 
 friendly.
 This ticket suggests the addition of a per-CF default read/write consistency 
 level. Typically the syntax would be:
 {noformat}
 CREATE TABLE foo (...)
 WITH DEFAULT_READ_CONSISTENCY = QUORUM
  AND DEFAULT_WRITE_CONSISTENCY = QUORUM
 {noformat}





[jira] [Commented] (CASSANDRA-2864) Alternative Row Cache Implementation

2012-07-20 Thread Daniel Doubleday (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419134#comment-13419134
 ] 

Daniel Doubleday commented on CASSANDRA-2864:
-

bq. or to skip cache during the memtable merge for counters

Just a thought: it might be easy enough to only skip the cache if the row is in 
one of the memtables, as in a tryCache. When the controller reads a CF from a 
memtable, it bails out and the read can be re-performed uncached.

 Alternative Row Cache Implementation
 

 Key: CASSANDRA-2864
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2864
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Daniel Doubleday
Assignee: Daniel Doubleday
  Labels: cache
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-2864-w-out-direct-counter-support.patch, 
 rowcache-with-snaptree-sketch.patch


 we have been working on an alternative implementation to the existing row 
 cache(s)
 We have 2 main goals:
 - Decrease memory - get more rows in the cache without suffering a huge 
 performance penalty
 - Reduce gc pressure
 This sounds a lot like we should be using the new serializing cache in 0.8. 
 Unfortunately our workload consists of loads of updates which would 
 invalidate the cache all the time.
 *Note: Updated Patch Description (Please check history if you're interested 
 where this was coming from)*
 h3. Rough Idea
 - Keep serialized row (ByteBuffer) in mem which represents unfiltered but 
 collated columns of all ssts but not memtable columns
 - Writes don't affect the cache at all. They go only to the memtables
 - Reads collect columns from memtables and row cache
 - Serialized Row is re-written (merged) with mem tables when flushed
 h3. Some Implementation Details
 h4. Reads
 - Basically the read logic differs from regular uncached reads only in that a 
 special CollationController deserializes columns from in-memory bytes
 - In the first version of this cache the serialized in-memory format was the 
 same as the fs format, but tests showed that performance suffered because a 
 lot of unnecessary deserialization takes place and column seeks are O(n) 
 within one block
 - To improve on that a different in-memory format was used. It splits length 
 meta info and data of columns so that the names can be binary searched (a 
 sketch of such a binary search follows the read-path notes below). 
 {noformat}
 ===
 Header (24)
 ===
 MaxTimestamp:long  
 LocalDeletionTime:   int   
 MarkedForDeleteAt:   long  
 NumColumns:  int   
 ===
 Column Index (num cols * 12)  
 ===
 NameOffset:  int   
 ValueOffset: int   
 ValueLength: int   
 ===
 Column Data
 ===
 Name:byte[]
 Value:   byte[]
 SerializationFlags:  byte  
 Misc:? 
 Timestamp:   long  
 ---
 Misc Counter Column
 ---
 TSOfLastDelete:  long  
 ---
 Misc Expiring Column   
 ---
 TimeToLive:  int   
 LocalDeletionTime:   int   
 ===
 {noformat}
 - These rows are read by 2 new column iterators which correspond to 
 SSTableNamesIterator and SSTableSliceIterator. During filtering only columns 
 that actually match are constructed. The searching / skipping is performed on 
 the raw ByteBuffer and does not create any objects.
 - A special CollationController is used to access and collate via cache and 
 said new iterators. It also supports skipping the cached row by max update 
 timestamp
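 A hedged sketch of how such a binary search over the index can look (this is 
 illustrative only: it assumes the name bytes run from NameOffset up to 
 ValueOffset and compares raw bytes, whereas the real code would use the 
 column name comparator):
 {noformat}
 import java.nio.ByteBuffer;

 public final class CachedRowIndexSketch
 {
     private static final int HEADER_SIZE = 24;      // see layout above
     private static final int INDEX_ENTRY_SIZE = 12; // NameOffset, ValueOffset, ValueLength

     /** Returns the index of the column with the given name, or -1 if absent. */
     public static int findColumn(ByteBuffer row, ByteBuffer name)
     {
         int numColumns = row.getInt(20); // NumColumns is the last header field
         int low = 0, high = numColumns - 1;
         while (low <= high)
         {
             int mid = (low + high) >>> 1;
             int entry = HEADER_SIZE + mid * INDEX_ENTRY_SIZE;
             int nameOffset = row.getInt(entry);
             int valueOffset = row.getInt(entry + 4);
             ByteBuffer candidate = row.duplicate();
             candidate.limit(valueOffset);        // assumed end of the name bytes
             candidate.position(nameOffset);
             int cmp = candidate.compareTo(name); // raw byte comparison, simplified
             if (cmp == 0)
                 return mid;
             if (cmp < 0)
                 low = mid + 1;
             else
                 high = mid - 1;
         }
         return -1;
     }
 }
 {noformat}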
 h4. Writes
 - Writes don't update or invalidate the cache.
 - In CFS.replaceFlushed memtables are merged before the data view is 
 switched. I fear that this is killing counters because they would be 
 overcounted, but my understanding of counters is somewhere between weak and 
 non-existent. I guess that counters, if one wants to support them here, would 
 need an additional unique local identifier in memory and in the serialized 
 cache to be able to filter duplicates, or something like that.
 {noformat}
 void replaceFlushed(Memtable memtable, SSTableReader sstable)
 {
     if (sstCache.getCapacity() > 0) {
         mergeSSTCache(memtable);
     }
     data.replaceFlushed(memtable, sstable);
     CompactionManager.instance.submitBackground(this);
 }
 {noformat}
 Test Results: See comments below


[jira] [Commented] (CASSANDRA-3855) RemoveDeleted dominates compaction time for large sstable counts

2012-07-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419177#comment-13419177
 ] 

Jonathan Ellis commented on CASSANDRA-3855:
---

We definitely don't want "if row is live, nothing to do here" behavior, 
otherwise we'll never purge column-level tombstones without a full row deletion.

 RemoveDeleted dominates compaction time for large sstable counts
 

 Key: CASSANDRA-3855
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3855
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Stu Hood
Assignee: Yuki Morishita
  Labels: compaction, deletes, leveled
 Attachments: with-cleaning-java.hprof.txt


 With very large numbers of sstables (2000+ generated by a `bin/stress -n 
 100,000,000` run with LeveledCompactionStrategy), 
 PrecompactedRow.removeDeletedAndOldShards dominates compaction runtime, such 
 that commenting it out takes compaction throughput from 200KB/s to 12MB/s.
 Stack attached.





[jira] [Commented] (CASSANDRA-3855) RemoveDeleted dominates compaction time for large sstable counts

2012-07-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419183#comment-13419183
 ] 

Jonathan Ellis commented on CASSANDRA-3855:
---

+1 for proposed method w/ first 2 lines removed

 RemoveDeleted dominates compaction time for large sstable counts
 

 Key: CASSANDRA-3855
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3855
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Stu Hood
Assignee: Yuki Morishita
  Labels: compaction, deletes, leveled
 Attachments: with-cleaning-java.hprof.txt


 With very large numbers of sstables (2000+ generated by a `bin/stress -n 
 100,000,000` run with LeveledCompactionStrategy), 
 PrecompactedRow.removeDeletedAndOldShards dominates compaction runtime, such 
 that commenting it out takes compaction throughput from 200KB/s to 12MB/s.
 Stack attached.





[jira] [Updated] (CASSANDRA-3855) RemoveDeleted dominates compaction time for large sstable counts

2012-07-20 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-3855:
--

Attachment: 3855.txt

So I summarized the above and attached a patch. Tested on trunk and confirmed it is fixed.

 RemoveDeleted dominates compaction time for large sstable counts
 

 Key: CASSANDRA-3855
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3855
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Stu Hood
Assignee: Yuki Morishita
  Labels: compaction, deletes, leveled
 Attachments: 3855.txt, with-cleaning-java.hprof.txt


 With very large numbers of sstables (2000+ generated by a `bin/stress -n 
 100,000,000` run with LeveledCompactionStrategy), 
 PrecompactedRow.removeDeletedAndOldShards dominates compaction runtime, such 
 that commenting it out takes compaction throughput from 200KB/s to 12MB/s.
 Stack attached.





[jira] [Created] (CASSANDRA-4454) Add a notice on cqlsh startup about CQL2/3 switches

2012-07-20 Thread JIRA
Michaël Figuière created CASSANDRA-4454:
---

 Summary: Add a notice on cqlsh startup about CQL2/3 switches
 Key: CASSANDRA-4454
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4454
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 1.1.2
Reporter: Michaël Figuière


Several developers I've talked with did not immediately notice the {{-3}} 
switch required to run in CQL3 mode. Without it, cqlsh can easily appear buggy 
in the way it handles CQL3.
It would be worth adding a notice at startup about this important detail.





[jira] [Updated] (CASSANDRA-4454) Add a notice on cqlsh startup about CQL2/3 switches

2012-07-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4454:
--

 Priority: Minor  (was: Major)
Affects Version/s: (was: 1.1.2)
   1.1.0
Fix Version/s: 1.1.3
 Assignee: paul cannon

 Add a notice on cqlsh startup about CQL2/3 switches
 ---

 Key: CASSANDRA-4454
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4454
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 1.1.0
Reporter: Michaël Figuière
Assignee: paul cannon
Priority: Minor
  Labels: cqlsh
 Fix For: 1.1.3


 Several developers I've talked with did not immediately notice the {{-3}} 
 switch required to run in CQL3 mode. Without it, cqlsh can easily appear 
 buggy in the way it handles CQL3.
 It would be worth adding a notice at startup about this important detail.





[jira] [Commented] (CASSANDRA-4454) Add a notice on cqlsh startup about CQL2/3 switches

2012-07-20 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419432#comment-13419432
 ] 

Brandon Williams commented on CASSANDRA-4454:
-

It does actually tell you this on startup:

{noformat}
cassandra-3:/srv/cassandra# bin/cqlsh cassandra-3
Connected to Test Cluster at cassandra-3:9160.
[cqlsh 2.2.0 | Cassandra unknown | CQL spec 2.0.0 | Thrift protocol 19.33.0]
Use HELP for help.
cqlsh> 
cassandra-3:/srv/cassandra# bin/cqlsh cassandra-3 -3
Connected to Test Cluster at cassandra-3:9160.
[cqlsh 2.2.0 | Cassandra unknown | CQL spec 3.0.0 | Thrift protocol 19.33.0]
Use HELP for help.
cqlsh> 
{noformat}

 Add a notice on cqlsh startup about CQL2/3 switches
 ---

 Key: CASSANDRA-4454
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4454
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 1.1.0
Reporter: Michaël Figuière
Assignee: paul cannon
Priority: Minor
  Labels: cqlsh
 Fix For: 1.1.3


 Several developers I've talked with did not immediately notice the {{-3}} 
 switch required to run in CQL3 mode. Without it, cqlsh can easily appear 
 buggy in the way it handles CQL3.
 It would be worth adding a notice at startup about this important detail.





git commit: Query the JVM for the minimum stack size. Patch by Trevor Robinson, reviewed by brandonwilliams for CASSANDRA-4442

2012-07-20 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.1 e220efa2a -> 5bde2a6d5


Query the JVM for the minimum stack size.
Patch by Trevor Robinson, reviewed by brandonwilliams for CASSANDRA-4442


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5bde2a6d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5bde2a6d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5bde2a6d

Branch: refs/heads/cassandra-1.1
Commit: 5bde2a6d5d6bed3ff15ec6caf20524717a130ecb
Parents: e220efa
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Jul 20 15:01:51 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Jul 20 15:01:51 2012 -0500

--
 conf/cassandra-env.sh |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5bde2a6d/conf/cassandra-env.sh
--
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index cef0efb..17c2fb8 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -151,18 +151,19 @@ if [ "x$CASSANDRA_HEAPDUMP_DIR" != "x" ]; then
     JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=$CASSANDRA_HEAPDUMP_DIR/cassandra-`date +%s`-pid$$.hprof"
 fi
 
+java_version=`${JAVA:-java} -version 2>&1 | awk '/version/ {print $3}' | egrep -o '[0-9]+\.[0-9]+'`
 
 if [ "`uname`" = "Linux" ] ; then
-    java_version=`${JAVA:-java} -version 2>&1 | awk '/version/ {print $3}' | egrep -o '[0-9]+\.[0-9]+'`
+    # try to determine JVM stack minimum by using too-small stack
+    # (note that 16k causes segfault and smaller prints invalid size error)
+    java_min_stack=`${JAVA:-java} -Xss32k 2>&1 | sed -nr 's/The stack size specified is too small, Specify at least ([0-9]+k)/\1/p'`
     # reduce the per-thread stack size to minimize the impact of Thrift
     # thread-per-client.  (Best practice is for client connections to
     # be pooled anyway.) Only do so on Linux where it is known to be
     # supported.
-    if [ "$java_version" = "1.7" ]
+    if [ -n "$java_min_stack" ]
     then
-        JVM_OPTS="$JVM_OPTS -Xss160k"
-    else
-        JVM_OPTS="$JVM_OPTS -Xss128k"
+        JVM_OPTS="$JVM_OPTS -Xss$java_min_stack"
     fi
 fi
 echo "xss = $JVM_OPTS"



[jira] [Commented] (CASSANDRA-3991) Investigate importance of jsvc in debian packages

2012-07-20 Thread Taras Ovsyankin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419526#comment-13419526
 ] 

Taras Ovsyankin commented on CASSANDRA-3991:


We ran into a similar issue with 1.1.2 - nearly 100K open unix sockets got the 
cluster hosed.

 Investigate importance of jsvc in debian packages
 -

 Key: CASSANDRA-3991
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3991
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Brandon Williams
Assignee: Eric Evans
Priority: Minor
 Fix For: 1.2


 jsvc seems to be buggy at best.  For instance, if you set a small heap like 
 128M it seems to completely ignore this and use as much memory as it wants.  
 I don't know what this is buying us over launching /usr/bin/cassandra 
 directly like the redhat scripts do, but I've seen multiple complaints about 
 its memory usage.





git commit: Revert Query the JVM for the minimum stack size.

2012-07-20 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.1 5bde2a6d5 -> bfdfe9014


Revert Query the JVM for the minimum stack size.

This reverts commit 5bde2a6d5d6bed3ff15ec6caf20524717a130ecb.


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bfdfe901
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bfdfe901
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bfdfe901

Branch: refs/heads/cassandra-1.1
Commit: bfdfe9014af788fc9f84ad1283165d9730b999a5
Parents: 5bde2a6
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Jul 20 15:41:57 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Jul 20 15:41:57 2012 -0500

--
 conf/cassandra-env.sh |   11 +--
 1 files changed, 5 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bfdfe901/conf/cassandra-env.sh
--
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index 17c2fb8..cef0efb 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -151,19 +151,18 @@ if [ "x$CASSANDRA_HEAPDUMP_DIR" != "x" ]; then
     JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=$CASSANDRA_HEAPDUMP_DIR/cassandra-`date +%s`-pid$$.hprof"
 fi
 
-java_version=`${JAVA:-java} -version 2>&1 | awk '/version/ {print $3}' | egrep -o '[0-9]+\.[0-9]+'`
 
 if [ "`uname`" = "Linux" ] ; then
-    # try to determine JVM stack minimum by using too-small stack
-    # (note that 16k causes segfault and smaller prints invalid size error)
-    java_min_stack=`${JAVA:-java} -Xss32k 2>&1 | sed -nr 's/The stack size specified is too small, Specify at least ([0-9]+k)/\1/p'`
+    java_version=`${JAVA:-java} -version 2>&1 | awk '/version/ {print $3}' | egrep -o '[0-9]+\.[0-9]+'`
     # reduce the per-thread stack size to minimize the impact of Thrift
     # thread-per-client.  (Best practice is for client connections to
     # be pooled anyway.) Only do so on Linux where it is known to be
     # supported.
-    if [ -n "$java_min_stack" ]
+    if [ "$java_version" = "1.7" ]
     then
-        JVM_OPTS="$JVM_OPTS -Xss$java_min_stack"
+        JVM_OPTS="$JVM_OPTS -Xss160k"
+    else
+        JVM_OPTS="$JVM_OPTS -Xss128k"
     fi
 fi
 echo "xss = $JVM_OPTS"



[jira] [Reopened] (CASSANDRA-4442) Stack size settings in cassandra-env.sh assume 64-bit x86

2012-07-20 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reopened CASSANDRA-4442:
-


Reverted, as this is causing failures:

{noformat}

java.lang.StackOverflowError
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at 
org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:156)
at 
org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:126)
{noformat}

 Stack size settings in cassandra-env.sh assume 64-bit x86
 -

 Key: CASSANDRA-4442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4442
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
Reporter: Trevor Robinson
Assignee: Trevor Robinson
 Fix For: 1.1.3

 Attachments: 
 v1-0001-CASSANDRA-4275-Use-JVM-s-reported-minimum-stack-size-o.txt


 The fix for CASSANDRA-4275 hard-codes a 160 KB stack size when using Java 7 
 on Linux. This assumes the Oracle 7u4 JVM on 64-bit x86. For systems like 
 32-bit ARM, this size is excessive (the minimum for 7u4 on ARM is 60-64 KB). 
 Also, the minimum allowed value is version-dependent and is calculated 
 dynamically by the JVM on startup based on Linux parameters that can also 
 change. A better approach would be to query the JVM for the minimum stack 
 size.





[jira] [Comment Edited] (CASSANDRA-4442) Stack size settings in cassandra-env.sh assume 64-bit x86

2012-07-20 Thread Trevor Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419564#comment-13419564
 ] 

Trevor Robinson edited comment on CASSANDRA-4442 at 7/20/12 8:59 PM:
-

So apparently we don't want the minimum after all. :)

What type of system hit the failure? Java 6 on amd64, I'm guessing, since it 
used to use 128k vs 104k.

Should the patch be changed to add a 25% margin? ((128 - 104) / 104 = ~0.23) 
This would yield settings of 200k, 130k, and 80k for the systems mentioned in 
my initial comment.

  was (Author: scurrilous):
So apparently we don't want the minimum after all. :-

What type of system hit the failure? Java 6 on amd64, I'm guessing, since it 
used to use 128k vs 104k.

Should the patch be changed to add a 25% margin? ((128 - 104) / 104 = ~0.23) 
This would yield settings of 200k, 130k, and 80k for the systems mentioned in 
my initial comment.
  
 Stack size settings in cassandra-env.sh assume 64-bit x86
 -

 Key: CASSANDRA-4442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4442
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
Reporter: Trevor Robinson
Assignee: Trevor Robinson
 Fix For: 1.1.3

 Attachments: 
 v1-0001-CASSANDRA-4275-Use-JVM-s-reported-minimum-stack-size-o.txt


 The fix for CASSANDRA-4275 hard-codes a 160 KB stack size when using Java 7 
 on Linux. This assumes the Oracle 7u4 JVM on 64-bit x86. For systems like 
 32-bit ARM, this size is excessive (the minimum for 7u4 on ARM is 60-64 KB). 
 Also, the minimum allowed value is version-dependent and is calculated 
 dynamically by the JVM on startup based on Linux parameters that can also 
 change. A better approach would be to query the JVM for the minimum stack 
 size.





[jira] [Updated] (CASSANDRA-4442) Stack size settings in cassandra-env.sh assume 64-bit x86

2012-07-20 Thread Trevor Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Robinson updated CASSANDRA-4442:
---

Attachment: 
v1-0001-CASSANDRA-4442-Use-JVM-s-reported-minimum-stack-size-o.txt

 Stack size settings in cassandra-env.sh assume 64-bit x86
 -

 Key: CASSANDRA-4442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4442
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
Reporter: Trevor Robinson
Assignee: Trevor Robinson
 Fix For: 1.1.3

 Attachments: 
 v1-0001-CASSANDRA-4275-Use-JVM-s-reported-minimum-stack-size-o.txt, 
 v1-0001-CASSANDRA-4442-Use-JVM-s-reported-minimum-stack-size-o.txt


 The fix for CASSANDRA-4275 hard-codes a 160 KB stack size when using Java 7 
 on Linux. This assumes the Oracle 7u4 JVM on 64-bit x86. For systems like 
 32-bit ARM, this size is excessive (the minimum for 7u4 on ARM is 60-64 KB). 
 Also, the minimum allowed value is version-dependent and is calculated 
 dynamically by the JVM on startup based on Linux parameters that can also 
 change. A better approach would be to query the JVM for the minimum stack 
 size.





[jira] [Commented] (CASSANDRA-4275) Oracle Java 1.7 u4 does not allow Xss128k

2012-07-20 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419602#comment-13419602
 ] 

Jeremy Hanna commented on CASSANDRA-4275:
-

Should this be reopened to set it to -Xss256k?  Both Ed Capriolo and another 
person posting in IRC have found 160 insufficient.  Ed is running in production 
with 256 and the other person is changing to 256.
<Rav|2> how should I set -Xss for oracle java 7? 160k causes 
java.lang.StackOverflowError :(
<ecapriolo> 256 or higher
<Rav|2> ecapriolo: everything is back to normal with 256. big thanks :)

 Oracle Java 1.7 u4 does not allow Xss128k
 -

 Key: CASSANDRA-4275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.9, 1.1.0
Reporter: Edward Capriolo
Assignee: Sylvain Lebresne
 Fix For: 1.1.2

 Attachments: 4275.txt, trunk-cassandra-4275.1.patch.txt, 
 v1-0001-CASSANDRA-4275-Use-JVM-s-reported-minimum-stack-size-o.txt


 Problem: This happens when you try to start it with default Xss setting of 
 128k
 ===
 The stack size specified is too small, Specify at least 160k
 Error: Could not create the Java Virtual Machine.
 Error: A fatal exception has occurred. Program will exit.
 Solution
 ===
 Set -Xss to 256k
 Problem: This happens when you try to start it with Xss = 160k
 
 ERROR [Thrift:14] 2012-05-22 14:42:40,479 AbstractCassandraDaemon.java (line 
 139) Fatal exception in thread Thread[Thrift:14,5,main]
 java.lang.StackOverflowError
 Solution
 ===
 Set -Xss to 256k





[jira] [Commented] (CASSANDRA-4275) Oracle Java 1.7 u4 does not allow Xss128k

2012-07-20 Thread Trevor Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419606#comment-13419606
 ] 

Trevor Robinson commented on CASSANDRA-4275:


My proposed fix for CASSANDRA-4442 would set it to 200k (160k * 1.25). Would 
that be large enough?

 Oracle Java 1.7 u4 does not allow Xss128k
 -

 Key: CASSANDRA-4275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.9, 1.1.0
Reporter: Edward Capriolo
Assignee: Sylvain Lebresne
 Fix For: 1.1.2

 Attachments: 4275.txt, trunk-cassandra-4275.1.patch.txt, 
 v1-0001-CASSANDRA-4275-Use-JVM-s-reported-minimum-stack-size-o.txt


 Problem: This happens when you try to start it with default Xss setting of 
 128k
 ===
 The stack size specified is too small, Specify at least 160k
 Error: Could not create the Java Virtual Machine.
 Error: A fatal exception has occurred. Program will exit.
 Solution
 ===
 Set -Xss to 256k
 Problem: This happens when you try to start it with Xss = 160k
 
 ERROR [Thrift:14] 2012-05-22 14:42:40,479 AbstractCassandraDaemon.java (line 
 139) Fatal exception in thread Thread[Thrift:14,5,main]
 java.lang.StackOverflowError
 Solution
 ===
 Set -Xss to 256k





git commit: fix incorrect hasIrrelevantData result for live CF; patch by yukim, reviewed by jbellis/slebresne for CASSANDRA-3855

2012-07-20 Thread yukim
Updated Branches:
  refs/heads/trunk 2b62df244 -> d74103735


fix incorrect hasIrrelevantData result for live CF; patch by yukim, reviewed by 
jbellis/slebresne for CASSANDRA-3855


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d7410373
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d7410373
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d7410373

Branch: refs/heads/trunk
Commit: d74103735126658d64cb92a16f4bb40f63d5e2e8
Parents: 2b62df2
Author: Yuki Morishita yu...@apache.org
Authored: Fri Jul 20 17:33:05 2012 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Fri Jul 20 17:33:05 2012 -0500

--
 .../cassandra/db/AbstractColumnContainer.java  |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d7410373/src/java/org/apache/cassandra/db/AbstractColumnContainer.java
--
diff --git a/src/java/org/apache/cassandra/db/AbstractColumnContainer.java 
b/src/java/org/apache/cassandra/db/AbstractColumnContainer.java
index f58f6ab..2c86070 100644
--- a/src/java/org/apache/cassandra/db/AbstractColumnContainer.java
+++ b/src/java/org/apache/cassandra/db/AbstractColumnContainer.java
@@ -197,12 +197,13 @@ public abstract class AbstractColumnContainer implements 
IColumnContainer, IIter
 
     public boolean hasIrrelevantData(int gcBefore)
     {
-        if (deletionInfo().purge(gcBefore) == DeletionInfo.LIVE)
+        // Do we have gcable deletion infos?
+        if (!deletionInfo().purge(gcBefore).equals(deletionInfo()))
             return true;
 
-        long deletedAt = deletionInfo().maxTimestamp();
+        // Do we have columns that are either deleted by the container or gcable tombstones?
         for (IColumn column : columns)
-            if (column.mostRecentLiveChangeAt() >= deletedAt || column.hasIrrelevantData(gcBefore))
+            if (deletionInfo().isDeleted(column) || column.hasIrrelevantData(gcBefore))
                 return true;
 
         return false;



[jira] [Resolved] (CASSANDRA-3855) RemoveDeleted dominates compaction time for large sstable counts

2012-07-20 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita resolved CASSANDRA-3855.
---

   Resolution: Fixed
Fix Version/s: 1.2
 Reviewer: jbellis

Committed to trunk.

 RemoveDeleted dominates compaction time for large sstable counts
 

 Key: CASSANDRA-3855
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3855
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Stu Hood
Assignee: Yuki Morishita
  Labels: compaction, deletes, leveled
 Fix For: 1.2

 Attachments: 3855.txt, with-cleaning-java.hprof.txt


 With very large numbers of sstables (2000+ generated by a `bin/stress -n 
 100,000,000` run with LeveledCompactionStrategy), 
 PrecompactedRow.removeDeletedAndOldShards dominates compaction runtime, such 
 that commenting it out takes compaction throughput from 200KB/s to 12MB/s.
 Stack attached.





[jira] [Commented] (CASSANDRA-4454) Add a notice on cqlsh startup about CQL2/3 switches

2012-07-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419641#comment-13419641
 ] 

Michaël Figuière commented on CASSANDRA-4454:
-

Indeed, the way it's currently mentioned is enough for experienced users. On 
the other hand, newcomers might not even be familiar with the fact that we're 
in the middle of a CQL grammar switch; they run cqlsh with default settings, 
copy-paste an example of CQL3 DDL with a composite column from a web page, and 
end up with something like this:

{noformat}
cqlsh:mykeyspace> CREATE TABLE timeline (
  ...  user_id varchar,
  ...  tweet_id uuid,
  ...  author varchar,
  ...  body varchar,
  ...  PRIMARY KEY (user_id, tweet_id));
Bad Request: line 6:40 mismatched input ')' expecting EOF
{noformat}

This is an example of the confusing errors you can get pushing CQL3 commands. 
The fact that several people had difficulties with this situation tends to show 
that a message like {{Consider using the -3 switch to enable CQL3}} would be 
useful, either at startup or when a CQL2/3 grammar mismatch occurs. As the 
former is trivial to implement, I was suggesting it. 

 Add a notice on cqlsh startup about CQL2/3 switches
 ---

 Key: CASSANDRA-4454
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4454
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 1.1.0
Reporter: Michaël Figuière
Assignee: paul cannon
Priority: Minor
  Labels: cqlsh
 Fix For: 1.1.3


 Several developers I've talked with did not immediately notice the {{-3}} 
 switch required to run in CQL3 mode. Without it, cqlsh can easily appear 
 buggy in the way it handles CQL3.
 It would be worth adding a notice at startup about this important detail.





[jira] [Commented] (CASSANDRA-3855) RemoveDeleted dominates compaction time for large sstable counts

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419677#comment-13419677
 ] 

Hudson commented on CASSANDRA-3855:
---

Integrated in Cassandra #1734 (See 
[https://builds.apache.org/job/Cassandra/1734/])
fix incorrect hasIrrelevantData result for live CF; patch by yukim, 
reviewed by jbellis/slebresne for CASSANDRA-3855 (Revision 
d74103735126658d64cb92a16f4bb40f63d5e2e8)

 Result = ABORTED
yukim : 
Files : 
* src/java/org/apache/cassandra/db/AbstractColumnContainer.java


 RemoveDeleted dominates compaction time for large sstable counts
 

 Key: CASSANDRA-3855
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3855
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Stu Hood
Assignee: Yuki Morishita
  Labels: compaction, deletes, leveled
 Fix For: 1.2

 Attachments: 3855.txt, with-cleaning-java.hprof.txt


 With very large numbers of sstables (2000+ generated by a `bin/stress -n 
 100,000,000` run with LeveledCompactionStrategy), 
 PrecompactedRow.removeDeletedAndOldShards dominates compaction runtime, such 
 that commenting it out takes compaction throughput from 200KB/s to 12MB/s.
 Stack attached.





[jira] [Commented] (CASSANDRA-2116) Separate out filesystem errors from generic IOErrors

2012-07-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419731#comment-13419731
 ] 

Jonathan Ellis commented on CASSANDRA-2116:
---

DD.createAllDirectories will stop trying to create as soon as the first 
directory fails, so it's not going to be appropriate for generic FSWriteError 
handling.  Suggest logging an error and explicitly shutting down instead.  
(This should only be called on startup.)

Looks like we should drop the throws IOException declaration from 
applyIndexUpdates (and have that chain throw FSWE as needed).

BatchCommitLogExecutorService.processWithSyncBatch should throw FSWE instead of 
RTE.  

CommitLogSegment.sync should turn IOE into FSWE.  The rest of the sync 
hierarchy won't need a throws declaration.
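For concreteness, a minimal sketch of that IOE-to-FSWE translation (FSWriteError 
here is a stand-in for the unchecked exception this ticket introduces; the real 
constructor and fields may differ):
{noformat}
import java.io.IOException;
import java.nio.channels.FileChannel;

final class CommitLogSegmentSyncSketch
{
    private final FileChannel channel;
    private final String path;

    CommitLogSegmentSyncSketch(FileChannel channel, String path)
    {
        this.channel = channel;
        this.path = path;
    }

    void sync()
    {
        try
        {
            channel.force(true); // flush data and metadata to disk
        }
        catch (IOException e)
        {
            // unchecked, so the rest of the sync hierarchy needs no throws declaration
            throw new FSWriteError(e, path);
        }
    }

    // stand-in for the FSWriteError proposed in this ticket
    static final class FSWriteError extends RuntimeException
    {
        final String path;
        FSWriteError(Throwable cause, String path)
        {
            super(cause);
            this.path = path;
        }
    }
}
{noformat}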

Note for CASSANDRA-2118: will need to unwrap exceptions looking for FSWE since 
CLES/PCLES can wrap in ExecutionException.  (Others might as well.  Easier to 
do unwrap check in 2118 than to audit all possible executors.)  On the other 
hand, this makes trying to catch the error before it hits the exception hook 
more of a pain, as in the next item...

CollationController needs to retain its try/catch, since we want to allow the 
read to succeed, even if the defragmenting write fails.  Since it could error 
w/ either FSWE or EE (from the commitlog add), probably need to catch generic 
Exception.  For 2118 we can add some way to submit this to the disk blacklister 
without re-throwing.

Looks like it would be worth adding a constructor for FSRW taking a Descriptor.

SSTR.createLinks should throw FSWE.

Methods called by SSTW constructor should throw FSWE.

SSTW methods should throw FSWE. (callers of append will want to catch + 
re-throw after cleanup.)

TruncateVerbHandler (and anyone else) shouldn't swallow potential FSWE by 
logging, need to rethrow.  (FBUtilities.unchecked is handy in such cases.)

I agree with your use of AssertionError in LCR.  Would prefer to use RTE in 
SSTableReader though, since we do some tricky reference counting around that 
and I wouldn't want to ignore problems there b/c someone turned off assertions. 
 (Surprisingly common...)

SSTII should throw IOException when it doesn't know what DataInput is.  Callers 
can transform to FSRE.  (Other constructors, or in the last case, 
IncomingStreamReader.)

Corrupt sstables (sstablescanner + others?) shouldn't be turned into FSRE, 
since it's usually bad memory or a bug and not the disk's fault.

FileUtils should throw FSWE.

BTW: congratulations on getting import ordering (almost) correct on the first 
try.  The only thing missing is that com.google.common goes above org.slf4j 
instead of being lumped in with everything else.


 Separate out filesystem errors from generic IOErrors
 

 Key: CASSANDRA-2116
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2116
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Aleksey Yeschenko
 Fix For: 1.2

 Attachments: 
 0001-Issue-2116-Replace-some-IOErrors-with-more-informati.patch, 
 0001-Separate-out-filesystem-errors-from-generic-IOErrors.patch, 
 CASSANDRA-2116-v3.patch


 We throw IOErrors everywhere today in the codebase. We should separate out 
 filesystem-specific errors (reading, writing) into FSReadError and 
 FSWriteError. This makes it possible in the next ticket to allow certain 
 failure modes (kill the server if reads or writes to disk fail).





[jira] [Commented] (CASSANDRA-1123) Allow tracing query details

2012-07-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419733#comment-13419733
 ] 

Jonathan Ellis commented on CASSANDRA-1123:
---

Nit: rest of the code base places static imports after non-static.

Inclined to think we should include the parameters along w/ the String request 
type on session start.  (Object... + toString would be adequate.)  Maybe even 
use the new List type to store the arguments (CASSANDRA-3647).

get_slice uses startSession instead of startSessionIfRequested.

A session named execute_cql_query is not very useful.  Should use queryString 
instead.  May want to just push the CQL tracing into the (cql3) QueryProcessor.  
This will mean less code to duplicate in the native CQL protocol handler.

Tracing should be asynchronous.  StorageProxy.mutate waits for a response, 
which is not what we want.  Suggest a simple ExecutorService + queue.  (If the 
queue gets full, throw out the tracing events and log a WARN.)
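
Something along these lines is what I have in mind -- a sketch only, with 
made-up class names and a made-up queue bound:

{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch only: a single background thread with a bounded queue for trace
// events.  If the queue fills up, the event is dropped and a WARN is logged,
// so tracing never blocks or fails the request path.
public class TraceEventExecutor
{
    private static final Logger logger = LoggerFactory.getLogger(TraceEventExecutor.class);
    private static final int MAX_PENDING_EVENTS = 1000; // made-up bound

    // The default rejection policy throws RejectedExecutionException when the
    // queue is full, which we turn into a dropped event + WARN below.
    private static final ThreadPoolExecutor executor = new ThreadPoolExecutor(
            1, 1, 1, TimeUnit.MINUTES,
            new LinkedBlockingQueue<Runnable>(MAX_PENDING_EVENTS));

    public static void submit(Runnable writeTraceEvent)
    {
        try
        {
            executor.execute(writeTraceEvent);
        }
        catch (RejectedExecutionException e)
        {
            logger.warn("Trace event queue is full; dropping trace event");
        }
    }
}
{code}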

Would like tracing to log.debug the event as well.  This will cut down on 
duplicate debug/trace code, but also give us a fallback if we can't log it 
remotely.  It will also cut down on log spam when we enable debug level 
globally -- only logging requests at debug where tracing was explicitly enabled 
will be a huge improvement.

CFMetaData definitions should be with the other hardcoded ones in CFMetaData.

Let's move helpers that are only used by test code like EVENT_TYPE into the 
Test class.

There's a no-op initialization of trace context in StorageService.

Still think threadlocals are not the way to go, and this will become more clear 
as you try to add useful trace entries.  I think you'll end up w/ a trace 
session registry like we have for MessagingService that we'll look up by 
session id.  In that vein, I'm not sure what the afterExecute business is 
supposed to be doing.  That stuff runs on the executor's thread, not the 
submitter's.
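
As a sketch of the registry idea (names hypothetical; the real shape would 
come out of the patch):

{code}
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch only: a registry keyed by session id, along the lines of how
// MessagingService keeps its callback map, instead of relying on thread-locals.
// TraceSessionContext here is just a placeholder for the per-session state.
public class TraceSessionRegistry
{
    public static class TraceSessionContext
    {
        public final UUID sessionId;

        public TraceSessionContext(UUID sessionId)
        {
            this.sessionId = sessionId;
        }
    }

    private static final ConcurrentMap<UUID, TraceSessionContext> sessions =
            new ConcurrentHashMap<UUID, TraceSessionContext>();

    public static TraceSessionContext startSession(UUID sessionId)
    {
        TraceSessionContext context = new TraceSessionContext(sessionId);
        sessions.put(sessionId, context);
        return context;
    }

    public static TraceSessionContext get(UUID sessionId)
    {
        return sessions.get(sessionId);
    }

    public static void stopSession(UUID sessionId)
    {
        sessions.remove(sessionId);
    }
}
{code}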

Naming: system_enable_query_details -> system_enable_query_tracing.  
TraceSession, TraceSessionState -> TraceSessionContext, 
TraceSessionContextThreadLocalState.  endSession -> stopSession.  
getQueryDetails -> isTracingEnabled.

Finally, a more generic keyspace name like dsystem (?) would be nice for all 
distributed system tables.  (We're thinking of using one for CASSANDRA-3706, 
for instance.)


 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2

 Attachments: 1123-3.patch.gz, 1123.patch


 In the spirit of CASSANDRA-511, it would be useful to tracing on queries to 
 see where latency is coming from: how long did row cache lookup take?  key 
 search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html





[jira] [Commented] (CASSANDRA-4275) Oracle Java 1.7 u4 does not allow Xss128k

2012-07-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419734#comment-13419734
 ] 

Jonathan Ellis commented on CASSANDRA-4275:
---

I'm -1 on just shrugging and throwing numbers at it until it appears to work.  
I want to find out *why* 160k isn't enough, because there isn't supposed to be 
anything allocated on the stack anywhere near that size.  Let's either find the 
bug that is the root cause or advance our understanding of where that memory is 
going.

 Oracle Java 1.7 u4 does not allow Xss128k
 -

 Key: CASSANDRA-4275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.9, 1.1.0
Reporter: Edward Capriolo
Assignee: Sylvain Lebresne
 Fix For: 1.1.2

 Attachments: 4275.txt, trunk-cassandra-4275.1.patch.txt, 
 v1-0001-CASSANDRA-4275-Use-JVM-s-reported-minimum-stack-size-o.txt


 Problem: This happens when you try to start it with default Xss setting of 
 128k
 ===
 The stack size specified is too small, Specify at least 160k
 Error: Could not create the Java Virtual Machine.
 Error: A fatal exception has occurred. Program will exit.
 Solution
 ===
 Set -Xss to 256k
 Problem: This happens when you try to start it with Xss = 160k
 
 ERROR [Thrift:14] 2012-05-22 14:42:40,479 AbstractCassandraDaemon.java (line 
 139) Fatal exception in thread Thread[Thrift:14,5,main]
 java.lang.StackOverflowError
 Solution
 ===
 Set -Xss to 256k





[jira] [Commented] (CASSANDRA-1123) Allow tracing query details

2012-07-20 Thread David Alves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419737#comment-13419737
 ] 

David Alves commented on CASSANDRA-1123:


Thank you for the thorough review.

bq. Inclined to think we should include the parameters along w/ the String 
request type on session start. (Object... + toString would be adequate.) Maybe 
even use the new List type to store the arguments (CASSANDRA-3647).

In one of the papers this same issue raised privacy concerns, which are 
probably even more valid for Cassandra since it's an open source project.  IMO 
we should at the very least make this optional, if not drop it altogether.

bq. Tracing should be asynchronous. StorageProxy.mutate waits for a response, 
this is not what we want. Suggest a simple ExecutorService + queue. (If queue 
gets full, throw out the tracing events and log a WARN.). 

bq. Would like tracing to log.debug the event as well. This will cut down on 
duplicate debug/trace code, but also give us a fallback if we can't log it 
remotely. This will also cut down on log spam for when we enable debug level 
globally – only logging requests at debug where tracing was explicitly enabled 
will be a huge improvement.

+1, nice idea, was looking into doing something similar.

bq. get_slice uses startSession instead of startSessionIfRequested.
bq. There's a no-op initialization of trace context in StorageService.
bq. Naming: system_enable_query_details -> system_enable_query_tracing. 
TraceSession, TraceSessionState -> TraceSessionContext, 
TraceSessionContextThreadLocalState. endSession -> stopSession. getQueryDetails 
-> isTracingEnabled.

Most of these are already corrected in the current version.

bq. CFMetaData definitions should be with the other hardcoded ones in 
CFMetaData.
bq. Let's move helpers that are only used by test code like EVENT_TYPE into the 
Test class.

will do!

bq. Still think threadlocals are not the way to go, and this will become more 
clear as you try to add useful trace entries. I think you'll end up w/ a trace 
session registry like we have for MessagingService that we'll look up by 
session id. In that vein, I'm not sure what the afterExecute business is 
supposed to be doing. That stuff runs on the executor's thread, not the 
submitter's.

The current version uses this to trace pre- and post-stage execution (which 
are the only trace events at the moment).  I've just finished updating the CLI 
and I'd like a chance to do some runs as-is to get a sense of the usefulness of 
the current setup.

bq. Finally, a more generic keyspace name like dsystem  would be nice for all 
distributed system tables. (We're thinking of using one for CASSANDRA-3706, for 
instance.)

Will look into it and synchronize.





 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2

 Attachments: 1123-3.patch.gz, 1123.patch


 In the spirit of CASSANDRA-511, it would be useful to tracing on queries to 
 see where latency is coming from: how long did row cache lookup take?  key 
 search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html





[jira] [Commented] (CASSANDRA-1123) Allow tracing query details

2012-07-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419739#comment-13419739
 ] 

Jonathan Ellis commented on CASSANDRA-1123:
---

bq. In one of the papers this same issue raised privacy concerns

There must be some other context there, because that doesn't make sense here: 
these are parameters to requests *operating on the database that we're 
proposing to store them in*.  In other words: given that Cassandra already has 
your data, storing it a second time is not a privacy concern.

Think of it as a subset of auditing, if you prefer.

 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2

 Attachments: 1123-3.patch.gz, 1123.patch


 In the spirit of CASSANDRA-511, it would be useful to tracing on queries to 
 see where latency is coming from: how long did row cache lookup take?  key 
 search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html





[jira] [Commented] (CASSANDRA-1123) Allow tracing query details

2012-07-20 Thread David Alves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419741#comment-13419741
 ] 

David Alves commented on CASSANDRA-1123:


I get the point; the original paper mentioned that was the case because the 
data was stored outside of the cluster.

Still, there is the question of authentication: even though access control is 
not completely implemented, the principle behind it is per-keyspace access, 
correct?  That would mean we're storing data belonging to a keyspace (that 
might have access control) in another keyspace (that must be accessible from 
outside).  Am I wrong?




 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2

 Attachments: 1123-3.patch.gz, 1123.patch


 In the spirit of CASSANDRA-511, it would be useful to tracing on queries to 
 see where latency is coming from: how long did row cache lookup take?  key 
 search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html
