[jira] [Commented] (CASSANDRA-47) SSTable compression

2011-05-14 Thread Terje Marthinussen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033477#comment-13033477
 ] 

Terje Marthinussen commented on CASSANDRA-47:
-

And yes, traversing the row twice to calculate the serialized size 
before writing the row is reasonably expensive with compression.

Is there any good reason we absolutely have to do this in two passes instead of one?
Or is it just the way it is for historical reasons?

Yes, we would need to write indexes etc. after the row or in a separate file if 
we don't traverse it twice, but this would seem to be a relatively small change at 
first sight?
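
One possible single-pass shape, sketched under the assumption that a serialized row fits in memory: serialize the columns once into a buffer, then emit the length prefix followed by the buffered bytes. The class and method names below are hypothetical and are not Cassandra's actual writer code.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical sketch; not the SSTable writer used by Cassandra.
final class SinglePassRowWriter {
    /** Serializes every column exactly once and returns [8-byte length][payload]. */
    static byte[] serializeRow(List<byte[]> columns) {
        try {
            // Single traversal: write each column as [4-byte length][bytes].
            ByteArrayOutputStream payload = new ByteArrayOutputStream();
            DataOutputStream row = new DataOutputStream(payload);
            for (byte[] column : columns) {
                row.writeInt(column.length);
                row.write(column);
            }
            row.flush();

            // The size is known from the buffer, so no second pass is needed.
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(result);
            out.writeLong(payload.size());
            payload.writeTo(out);
            out.flush();
            return result.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is memory: buffering avoids the second traversal but holds the whole row, so very wide rows would still need the index written after the row or in a separate file.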


 SSTable compression
 ---

 Key: CASSANDRA-47
 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
  Labels: compression
 Fix For: 1.0


 We should be able to do SSTable compression which would trade CPU for I/O 
 (almost always a good trade).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-2651) Inferred Rack and DC Values Should be Unsigned

2011-05-14 Thread Jerry Pisk (JIRA)
Inferred Rack and DC Values Should be Unsigned
--

 Key: CASSANDRA-2651
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2651
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.5
Reporter: Jerry Pisk
Priority: Minor


RackInferringSnitch formats IP address octets as signed byte values when 
inferring rack and data center values.
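
A minimal illustration of the symptom, assuming the snitch formats individual bytes of the address array as returned by InetAddress.getAddress(). The helper names are hypothetical; masking with 0xFF is the usual way to read a Java byte as an unsigned octet.

```java
// Hypothetical helpers; they only illustrate signed vs. unsigned octet formatting.
final class OctetFormatting {
    /** Formats the octet as a signed Java byte: octets above 127 come out negative. */
    static String signedOctet(byte[] address, int index) {
        return Byte.toString(address[index]);
    }

    /** Masks with 0xFF so the octet is formatted in the range 0..255. */
    static String unsignedOctet(byte[] address, int index) {
        return Integer.toString(address[index] & 0xFF);
    }
}
```

For the address 10.200.1.5, the second octet formats as -56 with the signed helper and as 200 with the unsigned one.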



[jira] [Updated] (CASSANDRA-2651) Inferred Rack and DC Values Should be Unsigned

2011-05-14 Thread Jerry Pisk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Pisk updated CASSANDRA-2651:
--

Attachment: trunk-2651.txt

RackInferringSnitch patch

 Inferred Rack and DC Values Should be Unsigned
 --

 Key: CASSANDRA-2651
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2651
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.5
Reporter: Jerry Pisk
Priority: Minor
 Attachments: trunk-2651.txt


 RackInferringSnitch formats IP address octets as signed byte values when 
 inferring rack and data center values.



[jira] [Commented] (CASSANDRA-1610) Pluggable Compaction

2011-05-14 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033534#comment-13033534
 ] 

Jonathan Ellis commented on CASSANDRA-1610:
---

The suggested alternative compaction strategies don't sound very generally 
useful. We shouldn't maintain them in-tree.

So what we should do here is provide a CompactionStrategyProvider interface 
that will give us back CompactionStrategy objects implementing doCompaction, 
doCleanupCompaction, probably submitMinorIfNeeded, etc.  We shouldn't have any 
provider-specific logic left in CompactionManager itself; it should all be 
based on the pluggable Strategy.
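
A rough sketch of what such interfaces could look like. Only the names CompactionStrategyProvider, CompactionStrategy, doCompaction, doCleanupCompaction and submitMinorIfNeeded come from the comment above; the signatures, the use of plain strings for sstables, and the two implementing classes are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Signatures are hypothetical; sstables are represented by name only.
interface CompactionStrategy {
    List<String> doCompaction(List<String> sstables);
    List<String> doCleanupCompaction(List<String> sstables);
    boolean submitMinorIfNeeded(List<String> sstables);
}

interface CompactionStrategyProvider {
    CompactionStrategy getStrategy(String columnFamily);
}

// Illustrative strategy: asks for a minor compaction once enough sstables exist.
final class ThresholdStrategy implements CompactionStrategy {
    private final int minThreshold;

    ThresholdStrategy(int minThreshold) {
        this.minThreshold = minThreshold;
    }

    @Override
    public List<String> doCompaction(List<String> sstables) {
        // Pretend all inputs are merged into one output sstable.
        return sstables.isEmpty()
                ? Collections.<String>emptyList()
                : Collections.singletonList("merged-" + sstables.size());
    }

    @Override
    public List<String> doCleanupCompaction(List<String> sstables) {
        // Cleanup rewrites each sstable individually, so the count is unchanged.
        return new ArrayList<>(sstables);
    }

    @Override
    public boolean submitMinorIfNeeded(List<String> sstables) {
        return sstables.size() >= minThreshold;
    }
}

// Illustrative provider: the manager would only ever talk to this interface.
final class FixedThresholdProvider implements CompactionStrategyProvider {
    @Override
    public CompactionStrategy getStrategy(String columnFamily) {
        return new ThresholdStrategy(4);
    }
}
```

With this shape, a manager class would hold a CompactionStrategyProvider and contain no strategy-specific branches of its own.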

 Pluggable Compaction
 

 Key: CASSANDRA-1610
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1610
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Fix For: 1.0

 Attachments: 0001-move-compaction-code-into-own-package.patch, 
 0002-Pluggable-Compaction-and-Expiration.patch


 In CASSANDRA-1608, I proposed some changes on how compaction works. I think 
 it also makes sense to allow the ability to have pluggable compaction per CF. 
 There could be many types of workloads where this makes sense. One example we 
 had at Digg was to completely throw away certain SSTables after N days.
 The goal of this ticket is to make compaction pluggable enough to support 
 compaction based on max timestamp ordering of the sstables while satisfying 
 max sstable size, min and max compaction thresholds. Another goal is to allow 
 expiration of sstables based on a timestamp.



[jira] [Created] (CASSANDRA-2652) Hinted handoff needs to adjust page size for large columns to avoid OOM

2011-05-14 Thread Jonathan Ellis (JIRA)
Hinted handoff needs to adjust page size for large columns to avoid OOM
--

 Key: CASSANDRA-2652
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2652
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.7
 Attachments: 2652.txt

Example OOM:
{noformat}
java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:269)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:356)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:318)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:99)
    at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:248)
    at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:268)
    at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:227)
    at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493)
    at java.util.concurrent.ConcurrentSkipListMap.<init>(ConcurrentSkipListMap.java:1443)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:379)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:362)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:322)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:179)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:121)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:49)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
    at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:283)
    at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
    at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
    at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:116)
    at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1390)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1267)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1195)
    at org.apache.cassandra.db.HintedHandOffManager.sendMessage(HintedHandOffManager.java:138)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:331)
    at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:88)
{noformat}



[jira] [Updated] (CASSANDRA-2652) Hinted handoff needs to adjust page size for large columns to avoid OOM

2011-05-14 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2652:
--

Attachment: 2652.txt

(also renames sendMessage to sendRow.)

 Hinted handoff needs to adjust page size for large columns to avoid OOM
 --

 Key: CASSANDRA-2652
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2652
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.7

 Attachments: 2652.txt





[jira] [Assigned] (CASSANDRA-2034) Make Read Repair unnecessary when Hinted Handoff is enabled

2011-05-14 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-2034:
-

Assignee: Sylvain Lebresne  (was: Jonathan Ellis)

We should probably make this aggressive HH optional to start with, in case of 
bugs if nothing else.

Or maybe change hinted_handoff_enabled from a bool to a string -- off, old, 
aggressive.
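
A small sketch of how a three-valued setting might be parsed. The values off, old and aggressive come from the comment above; the enum name, the method name, and the mapping of legacy true/false values onto old/off are assumptions added for backward compatibility, not anything taken from the patch.

```java
import java.util.Locale;

// Hypothetical enum; only the three mode names are taken from the comment.
enum HintedHandoffMode {
    OFF, OLD, AGGRESSIVE;

    /** Parses the config string; legacy booleans are accepted (assumed mapping). */
    static HintedHandoffMode fromConfig(String value) {
        String normalized = value.trim().toLowerCase(Locale.ROOT);
        switch (normalized) {
            case "false":
            case "off":
                return OFF;
            case "true":
            case "old":
                return OLD;
            case "aggressive":
                return AGGRESSIVE;
            default:
                throw new IllegalArgumentException(
                        "Unknown hinted_handoff_enabled value: " + value);
        }
    }
}
```

Accepting the old boolean spellings would let existing configuration files keep working while the aggressive mode stays opt-in.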

 Make Read Repair unnecessary when Hinted Handoff is enabled
 ---

 Key: CASSANDRA-2034
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
 Fix For: 1.0

   Original Estimate: 8h
  Remaining Estimate: 8h

 Currently, HH is purely an optimization -- if a machine goes down, enabling 
 HH means RR/AES will have less work to do, but you can't disable RR entirely 
 in most situations since HH doesn't kick in until the FailureDetector does.
 Let's add a scheduled task to the mutate path, such that we return to the 
 client normally after ConsistencyLevel is achieved, but after RpcTimeout we 
 check the responseHandler write acks and write local hints for any missing 
 targets.
 This would make disabling RR when HH is enabled a much more reasonable 
 option, which has a huge impact on read throughput.
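
The scheduled check described above could be sketched roughly as follows, using the standard ScheduledExecutorService. The class, the string endpoints, and the writeLocalHint callback are hypothetical stand-ins; the real mutate path and response handler are not shown in this thread.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of the "hint after RpcTimeout" idea; endpoints are plain strings.
final class HintScheduler {
    /** Returns the write targets that have not acknowledged the mutation. */
    static Set<String> missingTargets(Set<String> targets, Set<String> acked) {
        Set<String> missing = new TreeSet<>(targets);
        missing.removeAll(acked);
        return missing;
    }

    /**
     * Schedules a check that runs once after rpcTimeoutMillis and writes a local
     * hint for every target that is still missing at that moment. The acked set
     * is read when the task runs, so it should be a thread-safe set that the
     * response handler updates as acks arrive.
     */
    static ScheduledFuture<?> scheduleHintCheck(ScheduledExecutorService executor,
                                                long rpcTimeoutMillis,
                                                Set<String> targets,
                                                Set<String> acked,
                                                Consumer<String> writeLocalHint) {
        return executor.schedule(
                () -> missingTargets(targets, acked).forEach(writeLocalHint),
                rpcTimeoutMillis,
                TimeUnit.MILLISECONDS);
    }
}
```

The client response is unaffected in this shape: the caller returns as soon as the consistency level is met, and the scheduled task only handles the stragglers.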



[jira] [Created] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-05-14 Thread Jonathan Ellis (JIRA)
index scan errors out when zero columns are requested
-

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.7


As reported by Tyler Hobbs as an addendum to CASSANDRA-2401,

{noformat}
ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
java.lang.AssertionError: No data found for 
SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0] 
in DecoratedKey(81509516161424251288255223397843705139, 
6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', 
columnName='null') (original filter 
SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, count=0]) 
from expression 'cf.626972746864617465 EQ 1'
    at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
    at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}



[jira] [Resolved] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query

2011-05-14 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2401.
---

Resolution: Fixed

Created CASSANDRA-2653 to address this, since it will probably be in a 
different release than the original 2401 fix.

 getColumnFamily() return null, which is not checked in ColumnFamilyStore.java 
 scan() method, causing Timeout Exception in query
 ---

 Key: CASSANDRA-2401
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
 Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
Reporter: Tey Kar Shiang
Assignee: Jonathan Ellis
 Fix For: 0.7.6

 Attachments: 2401-v2.txt, 2401-v3.txt, 2401.txt


 In ColumnFamilyStore.java, near line 1680, ColumnFamily data = 
 getColumnFamily(new QueryFilter(dk, path, firstFilter)) returns null, which 
 causes an uncaught null pointer exception in satisfies(data, clause, primary). 
 The callback then times out and returns a Timeout exception to Hector.
 The data is empty: as far as I traced, the column count is 0 in 
 removeDeletedCF(), which returns null there. (I am new and still trying to 
 understand the logic around this.) Instead of crashing on the null, could we 
 skip the data?
 About my test:
 A stress-test program adds, modifies and deletes data in the keyspace. 30 
 threads simulate concurrent users performing the actions above and 
 periodically query all rows. The column family has rows (as File) and 
 columns as index (e.g. userID, fileType).
 There was no issue on the first day of the test, which was then stopped for 3 
 days. When I restarted the test on the 4th day, 1 of the users failed to query 
 the files (timeout exception received). Most of the users can still query.



[jira] [Commented] (CASSANDRA-2648) Failing while starting Cassandra server : Corrupt (negative) value length encountered

2011-05-14 Thread Olivier Smadja (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033584#comment-13033584
 ] 

Olivier Smadja commented on CASSANDRA-2648:
---

Yes, I removed the file and everything is running OK.

About the file: I renamed it with a .old extension, but it is not there anymore
:-( It seems Cassandra deleted it.

Thanks,
Olivier








 Failing while starting Cassandra server : Corrupt (negative) value length 
 encountered
 -

 Key: CASSANDRA-2648
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2648
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.4
 Environment: Linux
Reporter: Olivier Smadja

  INFO 16:50:35,307 reading saved cache /home/tapix/data/cassandra/data/saved_caches/tapix_prod-StreamPostsStatistics-KeyCache
  INFO 16:50:35,316 Opening /home/tapix/data/cassandra/data/tapix_prod/StreamPostsStatistics-f-121
  INFO 16:50:35,320 Opening /home/tapix/data/cassandra/data/tapix_prod/StreamPostsStatistics-f-120
  INFO 16:50:35,329 Opening /home/tapix/data/cassandra/data/tapix_prod/StreamLine-f-9
  INFO 16:50:35,352 Creating new commitlog segment /home/tapix/data/cassandra/data/commitlog/CommitLog-1305316235352.log
  INFO 16:50:35,362 Replaying 
 /home/tapix/data/cassandra/data/commitlog/CommitLog-1303829569725.log, 
 /home/tapix/data/cassandra/data/commitlog/CommitLog-1303948043185.log, 
 /home/tapix/data/cassandra/data/commitlog/CommitLog-1304361402015.log, 
 /home/tapix/data/cassandra/data/commitlog/CommitLog-1304728796807.log, 
 /home/tapix/data/cassandra/data/commitlog/CommitLog-1305208776962.log, 
 /home/tapix/data/cassandra/data/commitlog/CommitLog-1305316204091.log
  INFO 16:50:35,536 Finished reading /home/tapix/data/cassandra/data/commitlog/CommitLog-1303829569725.log
 ERROR 16:50:35,537 Exception encountered during startup.
 java.io.IOException: Corrupt (negative) value length encountered
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:269)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
    at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:380)
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:253)
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:156)
    at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:173)
    at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
    at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)



[jira] [Commented] (CASSANDRA-1610) Pluggable Compaction

2011-05-14 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033607#comment-13033607
 ] 

Stu Hood commented on CASSANDRA-1610:
-

bq. The suggested alternative compaction strategies don't sound very generally 
useful. We shouldn't maintain them in-tree.
Time bucketing and expiration (as implemented here) are very, very useful for 
timeseries data, and are in fact a blocker for production use of our timeseries 
systems. The requirement is that column families which store events at varying 
resolutions need to decay at different rates: there is no point keeping minute 
level resolution data indefinitely. Additionally, using TTLs is much, much too 
fine grained, and requires extra storage for each value.

bq. We shouldn't have any provider-specific logic left in CompactionManager 
itself, it should all be based on the pluggable Strategy.
Agreed, but one approach that I think would be good middle ground would be to 
move doCompaction and doExpiration onto implementations of an 
AbstractCompactionTask, to be returned by the strategies that Alan has 
implemented. The 'task' concept already exists in this patch as an enum that 
CompactionManager switches on.

The interesting methods on CompactionStrategy are 
selectFor(Minor|Major)Compaction, which calculate the possible tasks to perform 
during a minor or major compaction. For the SizeTieredStrategy (aka, the 
strategy implemented in trunk), selectForMinor is the same as the previous 
getBuckets method.



The configuration changes in the patch are distracting, but it boils down to:
# Record the max client timestamp for sstables (useful for CASSANDRA-2498) 
# Allow for a compaction strategy to choose which files to perform a particular 
task on (bucketing)
# Implement a task for expiration (N files in, 0 files out)
# Add a strategy that uses the max client timestamp to expire old files
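
As a rough illustration of the last two points, a strategy could select expired files by comparing each sstable's recorded max client timestamp against a cutoff. The class names, fields, and the unit-free timestamps below are assumptions for the sketch and are not taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical metadata holder; the patch records the max client timestamp per sstable.
final class SSTableInfo {
    final String name;
    final long maxClientTimestamp;

    SSTableInfo(String name, long maxClientTimestamp) {
        this.name = name;
        this.maxClientTimestamp = maxClientTimestamp;
    }
}

// Hypothetical strategy: picks the inputs for an expiration task (N files in, 0 out).
final class TimestampExpirationStrategy {
    private final long maxAge;

    TimestampExpirationStrategy(long maxAge) {
        this.maxAge = maxAge;
    }

    /** Returns the sstables whose newest write is older than maxAge relative to now. */
    List<SSTableInfo> selectForExpiration(List<SSTableInfo> sstables, long now) {
        List<SSTableInfo> expired = new ArrayList<>();
        for (SSTableInfo sstable : sstables) {
            if (now - sstable.maxClientTimestamp > maxAge) {
                expired.add(sstable);
            }
        }
        return expired;
    }
}
```

Because only the newest timestamp in each file is checked, a file is dropped whole only when every value in it is past the cutoff, which avoids per-value TTL storage.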

 Pluggable Compaction
 

 Key: CASSANDRA-1610
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1610
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Fix For: 1.0

 Attachments: 0001-move-compaction-code-into-own-package.patch, 
 0002-Pluggable-Compaction-and-Expiration.patch


 In CASSANDRA-1608, I proposed some changes on how compaction works. I think 
 it also makes sense to allow the ability to have pluggable compaction per CF. 
 There could be many types of workloads where this makes sense. One example we 
 had at Digg was to completely throw away certain SSTables after N days.
 The goal of this ticket is to make compaction pluggable enough to support 
 compaction based on max timestamp ordering of the sstables while satisfying 
 max sstable size, min and max compaction thresholds. Another goal is to allow 
 expiration of sstables based on a timestamp.



[Cassandra Wiki] Update of FrontPage_PT-BR by RodrigoHjort

2011-05-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The FrontPage_PT-BR page has been changed by RodrigoHjort.
http://wiki.apache.org/cassandra/FrontPage_PT-BR

--

New page:
## Please edit system and help pages ONLY in the moinmaster wiki! For more
## information, please see MoinMaster:MoinPagesEditorGroup.
##master-page:FrontPage
#format wiki
#language en
#pragma section-numbers off
= Cassandra Wiki =
Cassandra is a highly scalable, eventually consistent, distributed, structured 
key-value store. Cassandra brings together the distributed systems technologies 
from 
[[http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf|Amazon's Dynamo]] 
and the data model from 
[[http://labs.google.com/papers/bigtable-osdi06.pdf|Google's BigTable]]. Like 
Dynamo, Cassandra is 
[[http://www.allthingsdistributed.com/2008/12/eventually_consistent.html|eventually consistent]]. 
And like !BigTable, Cassandra provides a data model based on column families 
that is richer than typical key/value systems.

Cassandra was open sourced by Facebook in 2008, where it had been designed by 
Avinash Lakshman (one of the authors of Amazon's Dynamo) and Prashant Malik 
(Facebook engineer). In any case, you can think of Cassandra as a Dynamo 2.0 
or a marriage of Dynamo and !BigTable. Cassandra runs in production at 
Facebook, but it is still under heavy development.

== General Information ==
 * [[http://cassandra.apache.org/|Official Cassandra Website]] (download, 
bug-tracking, mailing-lists, etc)
 * [[ArticlesAndPresentations|Articles and Presentations]] about Cassandra.
 * [[DataModel|A description of the Cassandra data model]]
 * [[CassandraLimitations|Cassandra Limitations]]: where Cassandra is not a 
good fit

== Application developer and operator documentation ==
 * [[GettingStarted|Getting Started]]
 * [[http://www.datastax.com/docs|Datastax's Cassandra documentation]]
 * [[ClientOptions|Client options: ways to access Cassandra]] -- interfaces for 
Ruby, Python, Scala and more
 * [[IntegrationPoints]] -- list of ways Cassandra is integrated with other 
projects/products
 * [[RunningCassandra|Running Cassandra]]
 * [[ArchitectureOverview|Architecture Overview]]
 * [[UseCases|Simple Use Cases and Solutions]] -- please help complete
 * [[FAQ]]
 * [[Counters]]
 * [[SecondaryIndexes]]

== Advanced Setup and Tuning ==
 * [[StorageConfiguration|Storage Configuration]]
 * [[MultinodeCluster|Creating a multi-node cluster]]
 * [[Operations]]
 * [[Embedding]]
 * [[MemtableThresholds|Memtable Thresholds]] and other 
[[PerformanceTuning|Performance Tuning]]
 * [[CassandraHardware|Cassandra Hardware]]
 * [[CloudConfig|Configuration on Rackspace or Amazon Web Services]]
 * [[LargeDataSetConsiderations|Large data set considerations]]

== Client library developer information ==
 * [[API|Thrift API Documentation]] (In progress)

== Cassandra developer Documentation ==
 * [[HowToBuild|How To Build]]
 * ArchitectureInternals
 * [[CLI Design]]
 * [[HowToContribute|How To Contribute?]]
 * [[HowToCommit|How To Commit?]]
 * [[HowToPublishReleases|How To Release]] (Note: currently a work in progress) 
(Note: only relevant to Cassandra Committers)

== Mailing lists ==
 * Users: u...@cassandra.apache.org 
[[mailto:user-subscr...@cassandra.apache.org|(subscribe)]] 
[[http://www.mail-archive.com/user@cassandra.apache.org/|(archives)]] 
[[http://www.mail-archive.com/cassandra-user@incubator.apache.org/|(incubator 
archives)]]
 * Developers: d...@cassandra.apache.org 
[[mailto:dev-subscr...@cassandra.apache.org|(subscribe)]] 
[[http://www.mail-archive.com/dev@cassandra.apache.org/|(archives)]] 
[[http://www.mail-archive.com/cassandra-dev@incubator.apache.org/|(incubator 
archives)]]
 * Commits: commits@cassandra.apache.org 
[[mailto:commits-subscr...@cassandra.apache.org|(subscribe)]]

== Related Information ==
 * [[http://incubator.apache.org/thrift|Thrift]], used by Cassandra for client 
access
 * RelatedProjects: Projects using or extending Cassandra

== Google SoC 2010 Page ==
 * [[GoogleSoc2010|Google SoC]]

This wiki is powered by MoinMoin.  With the exception of a few immutable pages, 
anyone can edit it. Try SyntaxReference if you need help on wiki markup, and 
FindPage or SiteNavigation to search for existing pages before creating a new 
one. If you aren't sure where to begin, check out RecentChanges to see what 
others have been working on, or RandomPage if you are feeling lucky.

== Other Languages ==
 * [[首页|SimpleChinese 简体中文]]
 * [[FrontPage_JP|Japanese 日本語]]


[Cassandra Wiki] Update of FrontPage by RodrigoHjort

2011-05-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The FrontPage page has been changed by RodrigoHjort.
The comment on this change is: Added link to FrontPage_PT-BR.
http://wiki.apache.org/cassandra/FrontPage?action=diff&rev1=62&rev2=63

--

  == Other Languages ==
   * [[首页|SimpleChinese 简体中文]]
   * [[FrontPage_JP|Japanese 日本語]]
+  * [[FrontPage_PT-BR|BrazilianPortuguese Português do Brasil]]
  


[Cassandra Wiki] Update of FrontPage_PT-BR by RodrigoHjort

2011-05-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The FrontPage_PT-BR page has been changed by RodrigoHjort.
http://wiki.apache.org/cassandra/FrontPage_PT-BR?action=diff&rev1=1&rev2=2

--

  ## information, please see MoinMaster:MoinPagesEditorGroup.
  ##master-page:FrontPage
  #format wiki
- #language en
+ #language pt-br
  #pragma section-numbers off
  = Cassandra Wiki =
  Cassandra é um armazém de dados estruturados do tipo chave-valor que é 
altamente escalável, futuramente consistente e distribuído. O Cassandra reune 
as tecnologias de sistemas distribuídos do 
[[http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf|Dynamo
 da Amazon]] e o modelo de dados do 
[[http://labs.google.com/papers/bigtable-osdi06.pdf|BigTable da Google]]. Tal 
como Dynamo, o Cassandra é 
[[http://www.allthingsdistributed.com/2008/12/eventually_consistent.html|futuramente
 consistente (''eventually consistent'')]]. E assim como !BigTable, o Cassandra 
fornece um modelo de dados baseado em famílias de colunas mais rico que nos 
típicos sistemas de chave/valor.
  
  Cassandra teve seu código fonte aberto pelo Facebook em 2008, onde havia sido 
projetado por Avinash Lakshman (um dos autores do Dynamo da Amazon) e Prashant 
Malik (engenheiro do Facebook). De qualquer forma, você pode pensar no 
Cassandra como um Dynamo 2.0 ou um casamento do Dynamo com o !BigTable. 
Cassandra roda em produção no Facebook, porém ainda está sob pesado 
desenvolvimento.
  
- == General Information ==
-  * [[http://cassandra.apache.org/|Official Cassandra Website]] (download, 
bug-tracking, mailing-lists, etc)
+ == Informações Gerais ==
+  * [[http://cassandra.apache.org/|Website Oficial do Cassandra]] (download, 
rastreamento de bugs, listas de discussão, etc)
-  * [[ArticlesAndPresentations|Articles and Presentations]] about Cassandra.
+  * [[ArticlesAndPresentations|Artigos e Apresentações]] sobre Cassandra.
-  * [[DataModel|A description of the Cassandra data model]]
-  * [[CassandraLimitations|Cassandra Limitations]]: where Cassandra is not a 
good fit
+  * [[DataModel|Uma descrição do modelo de dados do Cassandra]]
+  * [[CassandraLimitations|Limitações do Cassandra]]: onde o Cassandra não é 
bom de ser usado
  
- == Application developer and operator documentation ==
-  * [[GettingStarted|Getting Started]]
-  * [[http://www.datastax.com/docs|Datastax's Cassandra documentation]]
+ == Documentação para o desenvolvedor de aplicação e para o operador ==
+  * [[GettingStarted|Iniciando]]
+  * [[http://www.datastax.com/docs|Documentação do Cassandra pela Datastax]]
-  * [[ClientOptions|Client options: ways to access Cassandra]] -- interfaces 
for Ruby, Python, Scala and more
+  * [[ClientOptions|Opções de cliente: maneiras de acessar o Cassandra]] -- 
interfaces para Ruby, Python, Scala e mais
-  * [[IntegrationPoints]] -- list of ways Cassandra is integrated with other 
projects/products
+  * [[IntegrationPoints]] -- lista de maneiras em que o Cassandra é integrado 
a outros projetos e produtos
-  * [[RunningCassandra|Running Cassandra]]
+  * [[RunningCassandra|Executando o Cassandra]]
-  * [[ArchitectureOverview|Architecture Overview]]
-  * [[UseCases|Simple Use Cases and Solutions]] -- please help complete
+  * [[ArchitectureOverview|Visão Geral da Arquitetura]]
+  * [[UseCases|Simples Casos de Uso e Soluções]] -- por favor ajudem a 
completá-lo
   * [[FAQ]]
   * [[Counters]]
   * [[SecondaryIndexes]]


[Cassandra Wiki] Update of FrontPage_PT-BR by RodrigoHjort

2011-05-14 Thread Apache Wiki

The FrontPage_PT-BR page has been changed by RodrigoHjort.
http://wiki.apache.org/cassandra/FrontPage_PT-BR?action=diff&rev1=2&rev2=3

--

  
  == Informações Gerais ==
   * [[http://cassandra.apache.org/|Website Oficial do Cassandra]] (download, 
rastreamento de bugs, listas de discussão, etc)
-  * [[ArticlesAndPresentations|Artigos e Apresentações]] sobre Cassandra.
+  * [[ArticlesAndPresentations|Artigos e Apresentações]] sobre Cassandra
   * [[DataModel|Uma descrição do modelo de dados do Cassandra]]
   * [[CassandraLimitations|Limitações do Cassandra]]: onde o Cassandra não é 
bom de ser usado
  
@@ -29, +29 @@

   * [[Counters]]
   * [[SecondaryIndexes]]
  
- == Advanced Setup and Tuning ==
-  * [[StorageConfiguration|Storage Configuration]]
-  * [[MultinodeCluster|Creating a multi-node cluster]]
+ == Configurações Avançadas e Tuning ==
+  * [[StorageConfiguration|Configuração do Armazenamento]]
+  * [[MultinodeCluster|Criando um cluster com vários nós]]
   * [[Operations]]
   * [[Embedding]]
-  * [[MemtableThresholds|Memtable Thresholds]] and other 
[[PerformanceTuning|Performance Tuning]]
+  * [[MemtableThresholds|Limiares da Memtable]] e outras aspectos de 
[[PerformanceTuning|Performance Tuning]]
-  * [[CassandraHardware|Cassandra Hardware]]
+  * [[CassandraHardware|Hardware para o Cassandra]]
-  * [[CloudConfig|Configuration on Rackspace or Amazon Web Services]]
+  * [[CloudConfig|Configuração no Rackspace ou Amazon Web Services]]
-  * [[LargeDataSetConsiderations|Large data set considerations]]
+  * [[LargeDataSetConsiderations|Considerações para grandes volumes de dados]]
  
- == Client library developer information ==
-  * [[API|Thrift API Documentation]] (In progress)
+ == Informações para o desenvolvedor da biblioteca cliente ==
+  * [[API|Documentação da API Thrift]] (em andamento)
  
- == Cassandra developer Documentation ==
-  * [[HowToBuild|How To Build]]
-  * ArchitectureInternals
+ == Documentação para o desenvolvedor do Cassandra ==
+  * [[HowToBuild|Como Construir]]
+  * [[ArchitectureInternals|Detalhes internos da arquitetura]]
   * [[CLI Design]]
-  * [[HowToContribute|How To Contribute?]]
+  * [[HowToContribute|Como Contribuir?]]
-  * [[HowToCommit|How To Commit?]]
+  * [[HowToCommit|Como Fazer Commit?]]
-  * [[HowToPublishReleases|How To Release]] (Note: currently a work in 
progress) (Note: only relevant to Cassandra Committers)
+  * [[HowToPublishReleases|Como Fazer Release]] (Nota: atualmente em 
andamento) (Nota: relevante apenas aos committers do Cassandra)
  
- == Mailing lists ==
+ == Listas de Discussão ==
-  * Users: u...@cassandra.apache.org 
[[mailto:user-subscr...@cassandra.apache.org|(subscribe)]] 
[[http://www.mail-archive.com/user@cassandra.apache.org/|(archives)]] 
[[http://www.mail-archive.com/cassandra-user@incubator.apache.org/|(incubator 
archives)]]
+  * Usuários: u...@cassandra.apache.org 
[[mailto:user-subscr...@cassandra.apache.org|(inscrever-se)]] 
[[http://www.mail-archive.com/user@cassandra.apache.org/|(histórico)]] 
[[http://www.mail-archive.com/cassandra-user@incubator.apache.org/|(histórico 
da incubadora)]]
-  * Developers: d...@cassandra.apache.org 
[[mailto:dev-subscr...@cassandra.apache.org|(subscribe)]] 
[[http://www.mail-archive.com/dev@cassandra.apache.org/|(archives)]] 
[[http://www.mail-archive.com/cassandra-dev@incubator.apache.org/|(incubator 
archives)]]
+  * Desenvolvedores: d...@cassandra.apache.org 
[[mailto:dev-subscr...@cassandra.apache.org|(inscrever-se)]] 
[[http://www.mail-archive.com/dev@cassandra.apache.org/|(histórico)]] 
[[http://www.mail-archive.com/cassandra-dev@incubator.apache.org/|(histórico da 
incubadora)]]
-  * Commits: commits@cassandra.apache.org 
[[mailto:commits-subscr...@cassandra.apache.org|(subscribe)]]
+  * Commits: commits@cassandra.apache.org 
[[mailto:commits-subscr...@cassandra.apache.org|(inscrever-se)]]
  
- == Related Information ==
+ == Informação Relacionada ==
-  * [[http://incubator.apache.org/thrift|Thrift]], used by Cassandra for 
client access
+  * [[http://incubator.apache.org/thrift|Thrift]], usado pelo Cassandra para 
acesso ao cliente
-  * RelatedProjects: Projects using or extending Cassandra
+  * RelatedProjects: Projetos usando ou estendendo o Cassandra
  
  == Google SoC 2010 Page ==
   * [[GoogleSoc2010|Google SoC]]
  
- This wiki is powered by MoinMoin.  With the exception of a few immutable 
pages, anyone can edit it. Try SyntaxReference if you need help on wiki markup, 
and FindPage or SiteNavigation to search for existing pages before creating a 
new one. If you aren't sure where to begin, checkout RecentChanges to see what 
others have been working on, or RandomPage if you are feeling lucky.
+ Esta página wiki é baseada no MoinMoin. Com a exceção de algumas poucas 
páginas imutáveis, qualquer um pode editá-la. Acesse 

[Cassandra Wiki] Update of ClientOptions by RodrigoHjort

2011-05-14 Thread Apache Wiki

The ClientOptions page has been changed by RodrigoHjort.
The comment on this change is: Added link to Cassandrelle.
http://wiki.apache.org/cassandra/ClientOptions?action=diff&rev1=126&rev2=127

--

    * Hector: http://github.com/rantav/hector (Examples 
https://github.com/zznate/hector-examples )
    * Kundera http://code.google.com/p/kundera/
    * Pelops: http://github.com/s7/scale7-pelops
+   * Cassandrelle (Demoiselle Cassandra): 
http://demoiselle.sf.net/component/demoiselle-cassandra/ (User guide: 
http://demoiselle.sourceforge.net/docs/guide-cassandra/)
   * Grails:
    * grails-cassandra: https://github.com/wolpert/grails-cassandra
   * .NET: