Re: Confused about consistency
- A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors are the bane of my existence) Following up on this bit: OOM should not be the status quo. Have you tweaked the JVM heap size to reflect your memtable sizes, etc.? http://wiki.apache.org/cassandra/MemtableThresholds -- / Peter Schuller
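For reference, the rule of thumb on that wiki page is roughly memtable_throughput_in_mb * 3 * (number of actively written CFs) plus about 1 GB of overhead; the same arithmetic shows up later in this digest in the Lucandra OOM thread. A minimal sketch of the calculation in Java, with assumed example numbers (not project code):

// Back-of-the-envelope heap estimate from the memtable settings (illustrative only).
public class HeapEstimate {
    public static void main(String[] args) {
        int memtableThroughputMb = 64; // assumed per-CF memtable threshold
        int hotColumnFamilies = 2;     // assumed number of actively written CFs
        int overheadMb = 1024;         // roughly 1 GB for internals, caches, etc.
        int estimateMb = memtableThroughputMb * 3 * hotColumnFamilies + overheadMb;
        System.out.println("Suggested minimum heap: ~" + estimateMb + " MB"); // 64*3*2 + 1024 = 1408
    }
}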
Re: Replacing nodes of the cluster in 0.7.0-RC1
As far as I remember, cassandra had been causing problems when there was an IP change back in version 0.6? I know you already proceeded, but FWIW I think the complications with IP addresses are limited to changing the address of an existing node. It sounded like you were going to add 4 nodes and then simply decommission the rest. I don't think there's an issue with that having to do with IP addresses (someone correct me if I'm wrong). -- / Peter Schuller
Re: Get CF with where condition supplied by cassandra-cli
thx for the answers! 2010/12/3 Jonathan Ellis jbel...@gmail.com: http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes On Thu, Dec 2, 2010 at 9:34 AM, Yann Perchec, Novapost yann.perc...@novapost.fr wrote: Hello everybody, I've been playing for a couple of days with the cassandra-cli supplied with the Cassandra 0.7 RC. It seems that new requests are possible: I'm very interested in the get cf where column = value. Does it mean that requests with conditions on columns other than keys are now possible? Is it possible to have more information on that feature? Thank you very much! -- Yann PERCHEC Project Manager 32 rue de Paradis, 75010 Paris Tel: +33 (0)1 83 62 46 81 Mail: yann.perc...@novapost.fr silvere.dup...@novapost.fr Web: www.novapost.fr / www.novapost-rh.fr -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
In our insert-tests the average heap usage is slowly growing up to the 3 GB border (jconsole monitor over 50 min http://oi51.tinypic.com/k12gzd.jpg) and the CompactionManager queue is also constantly growing up to about 50 jobs pending. Since you're obviously bottlenecking on compaction: are you building up lots of memtable flushes that don't complete? (I don't remember the name of the stage off hand, but it should be visible in cfstats.) Also, if you simply stop writing suddenly and wait for the nodes to finish their background activities, does memory usage go down again? (You may want to force a full GC before/after in order to do a proper test that is not affected by GC scheduling.) I don't remember the switches for JRockit, but you can definitely enable GC logging there, which should tell you in more detail what's happening. IIRC, though possibly not for all GC modes, you should see periodic completions of concurrent GCs that should collect all garbage that existed at the beginning of the GC cycle. Assuming you're not under so much load that this takes a very long time, that should give you a pretty good idea of the actual live set (which is probably going to be the low dips in your graphs, but it doesn't hurt to confirm). -- / Peter Schuller
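If you want to script Peter's "stop writing, force a full GC, compare heap usage" check instead of clicking the jconsole button, something like the sketch below works against the standard JMX memory bean (run it in-process or adapt it to a remote JMX connection; MemoryMXBean.gc() is only a request to the JVM, like System.gc()):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class LiveSetCheck {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage before = mem.getHeapMemoryUsage();
        mem.gc(); // request a full collection, same as jconsole's "Perform GC"
        MemoryUsage after = mem.getHeapMemoryUsage();
        System.out.println("Heap used before GC: " + before.getUsed() / (1024 * 1024) + " MB");
        System.out.println("Heap used after GC (approx. live set): " + after.getUsed() / (1024 * 1024) + " MB");
    }
}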
Re: Re: probability of node receiving (not be responsible for) the request
Thank you, but I mean the probability of a node receiving the request, not eventually processing it. At 2010-12-06 00:56:58, Brandon Williams dri...@gmail.com wrote: 2010/12/5 魏金仙 sei_...@126.com: If a particular client sends 5 requests to a 6-node cluster, then the probability of each node receiving (not being responsible for) the first request is 1/6. Assuming RF=1 and RandomPartitioner. Assume that node1 received the 1st request; will node1 receive the 2nd request, the 3rd one, the 4th one and the 5th one with high probability or 1/6? Thanks for your time. 1/6th with RandomPartitioner, something much higher with OrderPreservingPartitioner. -Brandon
Re: Re: index file and KeysCached
So when will index files be in memory? At 2010-12-06 00:54:48, Brandon Williams dri...@gmail.com wrote: 2010/12/5 魏金仙 sei_...@126.com: For each sstable, there is an index file, which is loaded in memory to locate a particular key's offset efficiently. Index files are not held in memory. And for each CF, KeysCached can be set to cache keys' locations. Could you please tell me the difference between the two? KeysCached are held in memory. I'm wondering whether it's necessary to set KeysCached for a CF or not. Not necessary, but advantageous. -Brandon
If one seed node crashes, how can I add one seed node?
After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and have the node migrate data from other nodes? Thanks, LiuLei
Re: Re: index file and KeysCached
2010/12/6 魏金仙 sei_...@126.com: so when will index files be in the memory? The index files are never fully in memory (because they would quickly be too big). Hence, only a sample of each file is in memory (1 of every 128 entries by default). When Cassandra needs to know where a (row) key is on disk (for a given SSTable), it checks this sample (an offset in the index file to a block of key locations), reads the corresponding block of (128) entries from the index file on disk, and thus finds the actual offset in the sstable where the corresponding row starts. The goal of the KeyCache is to avoid this on a cache hit (thus basically avoiding a seek). At 2010-12-06 00:54:48, Brandon Williams dri...@gmail.com wrote: 2010/12/5 魏金仙 sei_...@126.com: for each sstable, there is an index file, which is loaded in memory to locate a particular key's offset efficiently. Index files are not held in memory. and for each CF, KeysCached can be set to cache keys' location. could you please tell me the difference between the two? KeysCached are held in memory. I'm wondering whether it's necessary to set KeysCached for a CF or not. Not necessary, but advantageous. -Brandon
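A rough sketch of the lookup path described above, in illustrative Java (this is not Cassandra's code; the names and types are invented for the example):

import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative only: one sampled index entry (every 128th key) is kept in memory;
// the rest of the index is read from disk to find the row's offset in the SSTable.
class IndexSampleLookup {
    // sampled row key -> offset of its index block in the on-disk index file
    private final NavigableMap<String, Long> sample = new TreeMap<>();

    long findRowOffset(String rowKey) {
        // 1. Nearest preceding sampled entry (in memory).
        Map.Entry<String, Long> entry = sample.floorEntry(rowKey);
        long blockOffset = (entry == null) ? 0L : entry.getValue();
        // 2. Read the block of ~128 (key, data-offset) entries from the index file (disk seek #1)
        //    and scan it for rowKey.
        long dataOffset = scanIndexBlockOnDisk(blockOffset, rowKey);
        // 3. Seek to dataOffset in the SSTable data file and read the row (disk seek #2).
        //    A key-cache hit would skip steps 1-2 and go straight to dataOffset.
        return dataOffset;
    }

    private long scanIndexBlockOnDisk(long blockOffset, String rowKey) {
        return -1L; // placeholder for the on-disk scan
    }
}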
0.7.0beta3 Cassandra frequently log GC for ConcurrentMarkSweep
I have a two-node cluster running Cassandra; each node's machine has 4 GB of memory. First I set the heap size to 2 GB and both ran normally. Then I set the heap size to 1 GB, and the client that inserts data into and reads data from Cassandra began throwing Read/Write Unavailable exceptions. One Cassandra node began logging GC for ConcurrentMarkSweep frequently; every ConcurrentMarkSweep's time is over 5000 ms. How does this happen? Why frequent ConcurrentMarkSweep GC, not ParNew GC? -- Best regards, Ivy Tang
Re: 0.7.0beta3 Cassandra frequently log GC for ConcurrentMarkSweep
And after checking the node that GCs with ConcurrentMarkSweep frequently, its OC (current old space capacity, KB) is 1006016.0 and its OU (old space utilization, KB) is also 1006016.0, i.e. almost all of the memory. Does this situation imply the heap size is set too low? On Mon, Dec 6, 2010 at 8:07 PM, Ying Tang ivytang0...@gmail.com wrote: I have a two-node cluster running Cassandra; each node's machine has 4 GB of memory. First I set the heap size to 2 GB and both ran normally. Then I set the heap size to 1 GB, and the client that inserts data into and reads data from Cassandra began throwing Read/Write Unavailable exceptions. One Cassandra node began logging GC for ConcurrentMarkSweep frequently; every ConcurrentMarkSweep's time is over 5000 ms. How does this happen? Why frequent ConcurrentMarkSweep GC, not ParNew GC? -- Best regards, Ivy Tang -- Best regards, Ivy Tang
Re: 0.7.0beta3 Cassandra frequently log GC for ConcurrentMarkSweep
If it's GCing frequently and each CMS is only collecting a small fraction of the old gen, then your heap is probably too small. (GCInspector only logs collections that take over 1s, which should never include ParNew.) On Mon, Dec 6, 2010 at 7:11 AM, Ying Tang ivytang0...@gmail.com wrote: And after checking node who gc concurrentMarkSweep frequently ,it's OC(Current old space capacity (KB)) is 1006016.0 ,it's OU(Old space utilization (KB).) is also 1006016.0 ,almost all memory. Dose this situation imply this heap size is set too low? On Mon, Dec 6, 2010 at 8:07 PM, Ying Tang ivytang0...@gmail.com wrote: I hava a two node , running cassandra ,both's memory is 4G. First i set heap size to 2G ,both run normal . The i set heap size to 1G , the client who insert data to and read data from cassandra began throw Read\Write Unavailable Exception . And one cassandra node began logging GC for ConcurrentMarkSweep frequently ,every ConcurrentMarkSweep 's time is over 5000ms. How this happen ? Why frequently ConcurrentMarkSweep GC ,not ParNew GC? -- Best regards, Ivy Tang -- Best regards, Ivy Tang -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Confused about consistency
You're right, they should be the same. Next time this happens, set the log level to debug (from StorageService jmx) on the surviving nodes and let a couple of queries fail, before restarting the 3rd (and setting the level back to info). On Sat, Dec 4, 2010 at 12:01 AM, Dan Hendry dan.hendry.j...@gmail.com wrote: Doesn't consistency level ALL=QUORUM at RF=2? I have not had a chance to test your fix but I don't THINK this is the issue. If it is the issue, how do consistency levels ALL and QUORUM differ at this replication factor? On Sat, Dec 4, 2010 at 12:03 AM, Jonathan Ellis jbel...@gmail.com wrote: I think you are running into https://issues.apache.org/jira/browse/CASSANDRA-1316, where when an inconsistency on QUORUM/ALL is discovered it always performed the repair at QUORUM instead of the original CL. Thus, reading at ALL you would see the correct answer on the 2nd read but you weren't guaranteed to see it on the first. This was fixed in 0.6.4 but apparently I botched the merge to the 0.7 branch. I corrected that just now, so when you update, you should be good to go. On Fri, Dec 3, 2010 at 9:19 PM, Dan Hendry dan.hendry.j...@gmail.com wrote: I am seeing fairly strange behavior in my Cassandra cluster. Setup - 3 nodes (let's call them nodes 1, 2 and 3) - RF=2 - A set of servers (producers) which write data to the cluster at consistency level ONE - A set of servers (consumers/processors) which read data from the cluster at consistency level ALL - Cassandra 0.7 (recent out of the svn branch, post beta 3) - Clients use the pelops library Situation: - Everything is humming along nicely - A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors are the bane of my existence) - Producers continue to happily write to the cluster but consumers start complaining by throwing TimeOutExceptions and UnavailableExceptions. - I stagger out of bed in the middle of the night and restart Cassandra on node 3. - The consumers stop complaining and get back to business but generate garbage data for the period node 3 was down. It's almost like half the data is missing half the time. (Again, I am reading at consistency level ALL.) - I force the consumers to reprocess data for the period node 3 was down. They generate accurate output which is different from the first time round. To be explicit, what seems to be happening is the first read at consistency ALL gives A,C,E (for example) and the second read at consistency level ALL gives A,B,C,D,E. Is this a Cassandra bug? Is my knowledge of consistency levels flawed? My understanding is that you could achieve strongly consistent behavior by writing at ONE and reading at ALL. After this experience, my theory (uneducated, untested, and under-researched) is that strong consistency applies only to column values, not the set of columns (or super-columns in this case) which make up a row. Any thoughts? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
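For reference, the rule behind "write at ONE, read at ALL is strongly consistent" is just that the read and write replica sets must overlap: R + W > RF. A trivial illustration (not Cassandra API):

class ConsistencyCheck {
    // Reads are guaranteed to see the latest write when replica sets overlap.
    static boolean stronglyConsistent(int writeReplicas, int readReplicas, int replicationFactor) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 2;
        System.out.println(stronglyConsistent(1, 2, rf)); // W=ONE, R=ALL -> true (1 + 2 > 2)
        System.out.println(stronglyConsistent(1, 1, rf)); // W=ONE, R=ONE -> false
        System.out.println(stronglyConsistent(2, 2, rf)); // W=QUORUM (= ALL at RF=2), R=QUORUM -> true
    }
}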
Re: If one seed node crashes, how can I add one seed node?
set it as a seed _after_ bootstrapping it into the cluster. On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote: After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and have the node migrate data from other nodes? Thanks, LiuLei -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: If one seed node crashes, how can I add one seed node?
Thanks, Jonathan, for your reply. How can I bootstrap the node into the cluster? I know that if the node is a seed node, I can't set AutoBootstrap to true. 2010/12/6 Jonathan Ellis jbel...@gmail.com: set it as a seed _after_ bootstrapping it into the cluster. On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote: After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and have the node migrate data from other nodes? Thanks, LiuLei -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Sorting problem on supercolumns names using OPP on 0.6.2
Hi, I have the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } <ColumnFamily ColumnType="Super" CompareWith="LongType" CompareSubcolumnsWith="BytesType" Name="EventsByUserDate" /> Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), and I'm using OrderPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) is not sorting correctly by supercolumn (the supercolumn names come out unsorted). This is a sample output for the previous query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
RE: Various exceptions on 0.7
bleeding edge code you are running (did you try rc1?) or you do have nodes on different versions All nodes are running code from https://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.7 which I thought was essentially RC1 with fixes, but I will give the actual release a try. you have a hardware problem Hard to say. I don't think so; everything else seems to be working fine. I will try and run some diagnostics on the two nodes which seem to be acting up. Now for some new developments; the plot thickens. I am fairly sure there is a corrupt ColumnFamily/SSTable. After a restart, two adjacent nodes both show the following error, after which the CompactionManager pending task count never returns to zero. I am fairly sure this cf is not getting compacted but compaction for other column families seems to continue. In order to get rid of all these errors I have to perform a truncate operation using the cli, after which I get the same IndexOutOfBounds exception. Can I just shut down the node (draining first) and delete all data files related to this column family on the two problematic nodes? The data they contain is reasonably unimportant and I don't mind losing it. ERROR [CompactionExecutor:1] 2010-12-06 05:07:56,736 AbstractCassandraDaemon.java (line 90) Fatal exception in thread Thread[CompactionExecutor:1,1,main] java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:520) at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:340) at org.apache.cassandra.db.DeletedColumn.getLocalDeletionTime(DeletedColumn.java:57) at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedSuper(ColumnFamilyStore.java:818) at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(ColumnFamilyStore.java:781) at org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(ColumnFamilyStore.java:774) at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:93) at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:138) at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107) at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:42) at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183) at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94) at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:321) at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124) at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:97) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Dan -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: December-04-10 22:45 To: user Subject: Re: Various exceptions on 0.7 At least one of your nodes is sending garbage to the others. Either there's a bug in the bleeding edge code you are running (did you try rc1?) 
or you do have nodes on different versions or you have a hardware problem. On Sat, Dec 4, 2010 at 5:51 PM, Dan Hendry dan.hendry.j...@gmail.com wrote: Here are two other errors which appear frequently: ERROR [MutationStage:29] 2010-12-04 17:47:46,931 RowMutationVerbHandler.java (line 83) Error in row mutation java.io.IOException: Invalid localDeleteTime read: 0 at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:355) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:312) at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129) at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120) at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383) at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393) at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351) at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63) at
Re: If one seed node crashes, how can I add one seed node?
The node can be set as a seed node at any time. It does not need to be a seed node when it joins the cluster. You should remove it as a seed, set auto_bootstrap to true and let it join the cluster. Once it has joined the cluster you should add it as a seed node in the configuration for all of your nodes. On Mon, Dec 6, 2010 at 9:59 AM, lei liu liulei...@gmail.com wrote: Thanks, Jonathan, for your reply. How can I bootstrap the node into the cluster? I know that if the node is a seed node, I can't set AutoBootstrap to true. 2010/12/6 Jonathan Ellis jbel...@gmail.com: set it as a seed _after_ bootstrapping it into the cluster. On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote: After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and have the node migrate data from other nodes? Thanks, LiuLei -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Sorting problem on supercolumns names using OPP on 0.6.2
What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
LA, Tokyo Cassandra trainings this week
There are a few seats open for each: LA training Wednesday: http://www.eventbrite.com/event/1002369113 Tokyo training Thursday: http://nosqlfd.eventbrite.com/ I will be teaching the LA class. Tokyo will be taught by Nate McCall (with pauseless translation to Japanese) and hosted by our friends at Gemini Mobile. See you there! -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Re: probability of node receiving (not be responsible for) the request
2010/12/6 魏金仙 sei_...@126.com: Thank you, but I mean the probability of a node receiving the request, not eventually processing it. I see. That depends on how the client-side load balancing is written. -Brandon
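To make the point concrete: which node acts as coordinator for a request is purely a client-side (or load-balancer) decision, independent of which nodes own the data. A toy illustration with invented names, not any particular client library:

import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

class NodeSelection {
    private final List<String> nodes;              // e.g. the 6 node addresses
    private final Random random = new Random();
    private final AtomicInteger counter = new AtomicInteger();

    NodeSelection(List<String> nodes) { this.nodes = nodes; }

    // Random policy: each request lands on any given node with probability 1/N.
    String pickRandom() { return nodes.get(random.nextInt(nodes.size())); }

    // Round-robin policy: the receiving node is determined by the request index, not 1/N per request.
    String pickRoundRobin() { return nodes.get(Math.floorMod(counter.getAndIncrement(), nodes.size())); }

    // A "sticky" policy (one pooled connection reused) would send every request to the same node.
}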
Re: LA, Tokyo Cassandra trainings this week
We've also got Jake Luciani (@tjake) giving a talk at Cassandra London this Wednesday - this is a great opportunity to meet with other Cassandra users. There will be some free beer and food available. http://www.meetup.com/Cassandra-London/calendar/15351291/ Dave On 6 December 2010 17:05, Jonathan Ellis jbel...@gmail.com wrote: There are a few seats open for each: LA training Wednesday: http://www.eventbrite.com/event/1002369113 Tokyo training Thursday: http://nosqlfd.eventbrite.com/ I will be teaching the LA class. Tokyo will be taught by Nate McCall (with pauseless translation to Japanese) and hosted by our friends at Gemini Mobile. See you there! -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Sorting problem on supercolumns names using OPP on 0.6.2
I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Testathon at Twitter on December 13th
We're going to be hosting people at the Twitter offices the evening of December 13th to focus on testing 0.7. If you're interested please contact me offlist and I'll add you to the invite. Note that we're trying to keep the group small and focused. -ryan
Re: Sorting problem on supercolumns names using OPP on 0.6.2
How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
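(An aside for readers on Java rather than C++: java.nio.ByteBuffer is big-endian by default, so the equivalent packing is a one-liner. Illustrative only.)

import java.nio.ByteBuffer;

class LongKeyPacking {
    // Pack a long into the 8 big-endian bytes that the LongType comparator expects.
    static byte[] pack(long value) {
        return ByteBuffer.allocate(8).putLong(value).array(); // ByteBuffer defaults to big-endian
    }
}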
Re: Sorting problem on supercolumns names using OPP on 0.6.2
That should be big-endian. On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Re: Sorting problem on supercolumns names using OPP on 0.6.2
Also, thought I should mention: When you make a std::string out of the char[], make sure to use the constructor with the size_t parameter (size 8). - Tyler On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: That should be big-endian. On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Re: Sorting problem on supercolumns names using OPP on 0.6.2
+1 I'm doing this in my C++ client so contact me offlist if you need code David Sent from my iPhone On Dec 6, 2010, at 1:33 PM, Tyler Hobbs ty...@riptano.com wrote: Also, thought I should mention: When you make a std::string out of the char[], make sure to use the constructor with the size_t parameter (size 8). - Tyler On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: That should be big-endian. On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Newbie question about connecting to a cassandra server from another server using Fauna
Hi, I'm trying to create a connection to a server running Cassandra by doing this: compass = Cassandra.new('Compas', servers="223.798.456.123:9160") But once I try to get some data I realize that there's no connection. Any ideas? Am I missing something? Thanks
Re: Sorting problem on supercolumns names using OPP on 0.6.2
Uh, ok, I was just copying :P string result; result.resize(sizeof(long long)); memcpy(&result[0], &l, sizeof(long long)); I'll try and let you know, many thanks! On Mon, Dec 6, 2010 at 4:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } <ColumnFamily ColumnType="Super" CompareWith="LongType" CompareSubcolumnsWith="BytesType" Name="EventsByUserDate" /> Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Re: Newbie question about connecting to a cassandra server from another server using Fauna
What function are you calling to get data and what is the error? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected. If there is still no joy, check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin. Aaron On 07 Dec, 2010, at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi I'm trying to create a connection to a server running cassandra doing this: compass = Cassandra.new('Compas', servers="223.798.456.123:9160") But once I try to get some data I realize that there's no connection, any ideas?? Am I missing something? Thanks
Re: [RELEASE] 0.7.0 rc1
On Thu, 2010-12-02 at 10:30 -0800, Clint Byrum wrote: On Wed, 2010-12-01 at 17:00 +0100, Olivier Rosello wrote: FYI, 0.7.0~rc1 debs are available in a new PPA for experimental releases: http://launchpad.net/~cassandra-ubuntu/+archive/experimental It seems there is a dependency on libjets3t-java. Is it really needed? This dependency cannot be resolved on Ubuntu Lucid :-( I'll be working on getting any of the packaged dependencies from natty added to the lucid PPA very soon. Please stay tuned! Ok, jets3t has been uploaded and cassandra copied to the lucid and maverick releases, so this PPA should work for users on 10.04 and later now.
Re: Newbie question about connecting to a cassandra server from another server using Fauna
I've tried the keyspaces() function and got this on return: compass.keyspaces() CassandraThrift::Cassandra::Client::TransportException: CassandraThrift::Cassandra::Client::TransportException from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:53:in `rescue in open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:36:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/buffered_transport.rb:37:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/connection/socket.rb:11:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:82:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:110:in `handled_proxy' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:57:in `get_string_property' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:302:in `all_nodes' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:285:in `reconnect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:280:in `client' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:86:in `keyspaces' from (irb):4 from /home/compass/.rvm/rubies/ruby-1.9.2-p0/bin/irb:16:in `main' about the cassandra-cli, should I run the command on the server from which I'm trying to connect? Thanks for the help On Dec 6, 2010, at 2:52 PM, Aaron Morton wrote: What function are you calling to get data and what is the error ? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected. If there is still no joy check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin Aaron On 07 Dec, 2010,at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi I'm trying to create a connection to a server running cassandra doing this: compass = Cassandra.new('Compas', servers=223.798.456.123:9160) But once I try to get some data I realize that there's no connection, any ideas?? I'm I missing something ? Thanks
Re: Newbie question about connecting to a cassandra server from another server using Fauna
You can run the cassandra-cli from any machine. If you run it from the same machine as your ruby code it's a reliable way to check you can connect to the cluster. Ok, next set of questions: - what version of cassandra are you using? Is it 0.7? - what require did you run? Was it require 'cassandra/0.7'? See the Usage section at https://github.com/fauna/cassandra for details. Cassandra 0.7 defaults to framed transport, 0.6 does not. Aaron On 07 Dec, 2010, at 09:07 AM, Alberto Velandia betovelan...@gmail.com wrote: I've tried the keyspaces() function and got this on return: compass.keyspaces() CassandraThrift::Cassandra::Client::TransportException: CassandraThrift::Cassandra::Client::TransportException from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:53:in `rescue in open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:36:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/buffered_transport.rb:37:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/connection/socket.rb:11:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:82:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:110:in `handled_proxy' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:57:in `get_string_property' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:302:in `all_nodes' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:285:in `reconnect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:280:in `client' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:86:in `keyspaces' from (irb):4 from /home/compass/.rvm/rubies/ruby-1.9.2-p0/bin/irb:16:in `main' about the cassandra-cli, should I run the command on the server from which I'm trying to connect? Thanks for the help On Dec 6, 2010, at 2:52 PM, Aaron Morton wrote: What function are you calling to get data and what is the error? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected. If there is still no joy check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin Aaron On 07 Dec, 2010, at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi I'm trying to create a connection to a server running cassandra doing this: compass = Cassandra.new('Compas', servers="223.798.456.123:9160") But once I try to get some data I realize that there's no connection, any ideas?? Am I missing something? Thanks
Re: Newbie question about connecting to a cassandra server from another server using Fauna
Hi I've successfully managed to connect to the server through the cassandra-cli command but still no luck on doing it from Fauna, I'm running cassandra 0.6.8 and I did the usual require 'cassandra' I've changed the ThriftAddress on the storage-conf.xml to the IP address of the server itself, do I need to change anything else? Thanks once again On Dec 6, 2010, at 3:15 PM, Aaron Morton wrote: You can run the cassandra-cli from any machine. If you run it from the same machine as your ruby code it's a reliable way to check you can connect to the cluster. ok, next set of questions - what version of cassandra are you using ? Is it 0.7? - what require did you run ? was it require 'cassandra/0.7' ? see the Usage section https://github.com/fauna/cassandra for details. Cassandra 0.7 defaults to frames transport, 0.6 does not. Aaron On 07 Dec, 2010,at 09:07 AM, Alberto Velandia betovelan...@gmail.com wrote: I've tried the keyspaces() function and got this on return: compass.keyspaces() CassandraThrift::Cassandra::Client::TransportException: CassandraThrift::Cassandra::Client::TransportException from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:53:in `rescue in open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:36:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/buffered_transport.rb:37:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/connection/socket.rb:11:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:82:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:110:in `handled_proxy' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:57:in `get_string_property' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:302:in `all_nodes' from /home/compass/.rvm/gems/ruby-19.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:285:in `reconnect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:280:in `client' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:86:in `keyspaces' from (irb):4 from /home/compass/.rvm/rubies/ruby-1.9.2-p0/bin/irb:16:in `main' about the cassandra-cli, should I run the command on the server from which I'm trying to connect? Thanks for the help On Dec 6, 2010, at 2:52 PM, Aaron Morton wrote: What function are you calling to get data and what is the error ? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected. If there is still no joy check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin Aaron On 07 Dec, 2010,at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi I'm trying to create a connection to a server running cassandra doing this: compass = Cassandra.new('Compas', servers=223.798.456.123:9160) But once I try to get some data I realize that there's no connection, any ideas?? I'm I missing something ? Thanks
Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
Jake or anyone else got experience bulk loading into Lucandra? Or does anyone have experience with JRockit? Max, are you sending one document at a time into lucene? Can you send them in batches (like solr), and if so does it reduce the amount of requests going to cassandra? Also, cassandra.bat is configured with -XX:+HeapDumpOnOutOfMemoryError so you should be able to take a look at where all the memory is going. The Riptano blog points to http://www.eclipse.org/mat/ also see http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr Hope that helps. Aaron On 07 Dec, 2010, at 09:17 AM, Aaron Morton aa...@thelastpickle.com wrote: Accidentally sent to me. Begin forwarded message: From: Max cassan...@ajowa.de Date: 07 December 2010 6:00:36 AM To: Aaron Morton aa...@thelastpickle.com Subject: Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM) Thank you both for your answer! After several tests with different parameters we came to the conclusion that it must be a bug. It looks very similar to: https://issues.apache.org/jira/browse/CASSANDRA-1014 For both CFs we reduced thresholds: - memtable_flush_after_mins = 60 (both CFs are used permanently, therefore other thresholds should trigger first) - memtable_throughput_in_mb = 40 - memtable_operations_in_millions = 0.3 - keys_cached = 0 - rows_cached = 0 - in_memory_compaction_limit_in_mb = 64 First we disabled caching, later we disabled compacting and after that we set commitlog_sync: batch commitlog_sync_batch_window_in_ms: 1 But our problem still appears: during inserting files with Lucandra, memory usage is slowly growing until an OOM crash after about 50 min. @Peter: In our latest test we stopped writing suddenly but cassandra didn't relax and remains even after minutes at ~90% heap usage. http://oi54.tinypic.com/2dueeix.jpg With our heap calculation we should need: 64 MB * 2 * 3 + 1 GB = 1.4 GB All recent tests we ran with 3 GB. I think that should be ok for a test machine. Also consistency level is one. But Aaron is right, Lucandra produces even more than 200 inserts/s. My 200 documents per second are about 200 operations (writecount) on the first CF and about 3000 on the second CF. But even with about 120 documents/s cassandra crashes. Disk I/O monitored with Windows performance admin tools is moderate on both discs (commitlog is on a separate hard disc). Any ideas? If it's really a bug, in my opinion it's very critical. Aaron Morton aa...@thelastpickle.com wrote: I remember you have 2 CFs but what are the settings for: - memtable_flush_after_mins - memtable_throughput_in_mb - memtable_operations_in_millions - keys_cached - rows_cached - in_memory_compaction_limit_in_mb Can you do the JVM Heap Calculation here and see what it says: http://wiki.apache.org/cassandra/MemtableThresholds What Consistency Level are you writing at? (Checking it's not Zero.) When you talk about 200 inserts per second, is that storing 200 documents through lucandra or 200 requests to cassandra? If it's the first option I would assume that would generate a lot more actual requests into cassandra. Open up jconsole and take a look at the WriteCount settings for the CFs: http://wiki.apache.org/cassandra/MemtableThresholds You could also try setting the compaction thresholds to 0 to disable compaction while you are pushing this data in. Then use nodetool to compact and turn the settings back to normal. See cassandra.yaml for more info. I would have thought you could get the writes through with the setup you've described so far (even though a single 32bit node is unusual). 
The best advice is to turn all the settings down (e.g. caches off, memtable flush 64 MB, compaction disabled) and if it still fails try:
- checking your IO stats, not sure on windows but JConsole has some IO stats. If your IO cannot keep up then your server is not fast enough for your client load.
- reducing the client load
Hope that helps. Aaron

On 04 Dec, 2010, at 05:23 AM, Max cassan...@ajowa.de wrote: Hi, we increased heap space to 3 GB (with the JRockit VM under 32-bit Windows with 4 GB RAM) but under "heavy" inserts Cassandra is still crashing with an OutOfMemory error after a GC storm. It sounds very similar to https://issues.apache.org/jira/browse/CASSANDRA-1177 In our insert-tests the average heap usage is slowly growing up to the 3 GB border (jconsole monitor over 50 min http://oi51.tinypic.com/k12gzd.jpg) and the CompactionManager queue is also constantly growing, up to about 50 jobs pending. We tried to decrease the CF memtable thresholds but after about half a million inserts it's over.
- Cassandra 0.7.0 beta 3
- Single node
- about 200 inserts/s, ~500 byte - 1 kB each
Is there no other possibility besides slowing down the insert rate? What could be an indicator that a node works stably with this amount of inserts? Thank you for your answer, Max

Aaron Morton aa...@thelastpickle.com: Sounds like you need to increase the Heap size and/or reduce the
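For reference, the heap estimate Max quotes above (per-CF memtable threshold * 3 * number of hot CFs, plus roughly 1 GB for everything else) works out as below. This is only a minimal sketch of that arithmetic; the 64 MB threshold and 2 column families are the figures from this thread, not defaults.

  #include <cstdio>

  int main() {
      const double memtable_throughput_mb = 64.0;   // per-CF memtable threshold used in the test
      const int    hot_column_families    = 2;      // CFs being written to continuously
      const double base_overhead_mb       = 1024.0; // rough ~1 GB allowance for everything else

      // threshold * 3 * hot CFs + base overhead, as in the calculation above
      const double heap_estimate_mb =
          memtable_throughput_mb * 3 * hot_column_families + base_overhead_mb;

      std::printf("estimated heap: %.0f MB (~%.1f GB)\n",
                  heap_estimate_mb, heap_estimate_mb / 1024.0);
      return 0;
  }

With those inputs it prints 1408 MB (~1.4 GB), matching the number quoted in the thread.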
Re: Sorting problem on supercolumns names using OPP on 0.6.2
So now it's behaving :)

#define ntohll(x) (((__int64)(ntohl((int)((x << 32) >> 32))) << 32) | (unsigned int)ntohl(((int)(x >> 32))))

string result;
result.resize(sizeof(long long));
long long bigendian = htonll(l);
memcpy(&result[0], &bigendian, sizeof(long long));

= (super_column=1291668233,
     (column=35646130653133632d333766642d343231312d386138382d393936383966326462643364, value=2010-12-06 20:43:53.000, timestamp=1291668233034754)
     (column=61323432323262622d353734342d346133322d393530312d626238343365346363376335, value=2010-12-06 20:43:53.000, timestamp=1291668233169771)
     (column=66633136333166382d373733622d343734652d393265362d376162633364316564383964, value=2010-12-06 20:43:53.000, timestamp=1291668233302288))
= (super_column=1291668232,
     (column=61343765353432352d613066392d343334392d613761392d336635313631633261303161, value=2010-12-06 20:43:52.000, timestamp=1291668232563694)
     (column=64343635396433382d316166302d343732662d623737392d336634303931323961373364, value=2010-12-06 20:43:52.000, timestamp=1291668232889235))

Thanks again! Guille

On Mon, Dec 6, 2010 at 5:45 PM, Guillermo Winkler gwink...@inconcertcc.com wrote: uh, ok I was just copying :P

string result;
result.resize(sizeof(long long));
memcpy(&result[0], &l, sizeof(long long));

I'll try and let you know, many thanks!

On Mon, Dec 6, 2010 at 4:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8], then stringifying the char[], is the best way to go. Cassandra expects big-endian longs, as well. - Tyler

On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as the super column name (in this case the same date converted to string), and when queried using the cassandra cli it is unsorted too:

cassandra> get Events.EventsByUserDate['guille']
= (super_column=9088542550893002752,
     (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732))
= (super_column=5990347482238812160,
     (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039))
= (super_column=-3089190841516818432,
     (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738))
= (super_column=-4026221038986592256,
     (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981))

On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order-preserving dictionary? - Tyler

On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I have the following schema defined:

EventsByUserDate : {
    UserId : {
        epoch: { // SC
            IID, IID, IID, IID
        },
        // and the other events in time
        epoch: {
            IID, IID, IID
        }
    }
}

<ColumnFamily ColumnType="Super" CompareWith="LongType" CompareSubcolumnsWith="BytesType" Name="EventsByUserDate" />

Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), and I'm using OrderPreservingPartitioner.
But a call to:

GetSuperRangeSlices(
    EventsByUserDate,  --column family
    ,                  --supercolumn
    userId,            --startkey
    userId,            --endkey
    { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20 } },
    1                  --total keys
)

is not sorting correctly by supercolumn (the supercolumn names come out unsorted). This is a sample output for the previous query using thrift directly:

SC 1291648883
SC 1291588465
SC 1291588453
SC 1291586385
SC 1291587408
SC 1291588174
SC 1291585331
SC 1291587116
SC 1291651116
SC 1291586332
SC 1291588548
SC 1291588036
SC 1291648703
SC 1291583651
SC 1291583650
SC 1291583649
SC 1291583648
SC 1291583647
SC 1291583646
SC 1291587485

Anything I'm missing regarding sorting schemes? Thanks, Guille
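For reference, Tyler's suggestion (shift each byte of the long into an 8-byte buffer, most significant byte first, and stringify it) could look roughly like the sketch below. The function name pack_long_be is made up for illustration and is not part of any client library.

  #include <string>

  std::string pack_long_be(long long value) {
      // Most significant byte first, so the 8-byte string is big-endian
      // regardless of the host's byte order (what a LongType comparator expects).
      std::string packed(8, '\0');
      for (int i = 0; i < 8; ++i) {
          packed[i] = static_cast<char>(
              (static_cast<unsigned long long>(value) >> (56 - 8 * i)) & 0xFF);
      }
      return packed;
  }

  // e.g. use the packed string as the supercolumn name: pack_long_be(1291668233)

This avoids relying on ntohll/htonll being defined on the platform at all.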
questions about cassandra-1072/1546
Can we get an update? After reading through the comments on 1072, it looks like this is getting close to finished, but it's hard for someone not knee-deep in the project to tell. I'm primarily interested in the timeline you foresee for getting the increment support into trunk for 0.7, and some documentation around how counters will be supported from the user's perspective - chiefly what a Column and SuperColumn will look like with counters and what the thrift API will be. Some documentation about the remaining issues and concerns we should be aware of when using counters would be good, too, since it looks like there were some in the comments. Again, as someone not knee-deep in the project, it's hard to tell how severe they are or how or when they would apply in general use.
Re: Newbie question about connecting to a cassandra server from another server using Fauna
It would help if you gave us more context. The code snippet you've given us is incomplete and not very helpful. -ryan

On Mon, Dec 6, 2010 at 12:33 PM, Alberto Velandia betovelan...@gmail.com wrote: Hi, I've successfully managed to connect to the server through the cassandra-cli command but still no luck on doing it from Fauna. I'm running cassandra 0.6.8 and I did the usual require 'cassandra'. I've changed the ThriftAddress in the storage-conf.xml to the IP address of the server itself, do I need to change anything else? Thanks once again

On Dec 6, 2010, at 3:15 PM, Aaron Morton wrote: You can run the cassandra-cli from any machine. If you run it from the same machine as your ruby code it's a reliable way to check you can connect to the cluster. Ok, next set of questions:
- what version of cassandra are you using? Is it 0.7?
- what require did you run? Was it require 'cassandra/0.7'? See the Usage section https://github.com/fauna/cassandra for details. Cassandra 0.7 defaults to framed transport, 0.6 does not.
Aaron

On 07 Dec, 2010, at 09:07 AM, Alberto Velandia betovelan...@gmail.com wrote: I've tried the keyspaces() function and got this in return:

compass.keyspaces()
CassandraThrift::Cassandra::Client::TransportException: CassandraThrift::Cassandra::Client::TransportException
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:53:in `rescue in open'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:36:in `open'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/buffered_transport.rb:37:in `open'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/connection/socket.rb:11:in `connect!'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:82:in `connect!'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:110:in `handled_proxy'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:57:in `get_string_property'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:302:in `all_nodes'
from /home/compass/.rvm/gems/ruby-19.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:285:in `reconnect!'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:280:in `client'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:86:in `keyspaces'
from (irb):4
from /home/compass/.rvm/rubies/ruby-1.9.2-p0/bin/irb:16:in `main'

About the cassandra-cli, should I run the command on the server from which I'm trying to connect? Thanks for the help

On Dec 6, 2010, at 2:52 PM, Aaron Morton wrote: What function are you calling to get data and what is the error? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected.
If there is still no joy, check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin. Aaron

On 07 Dec, 2010, at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi, I'm trying to create a connection to a server running cassandra doing this:

compass = Cassandra.new('Compas', servers='223.798.456.123:9160')

But once I try to get some data I realize that there's no connection, any ideas? Am I missing something? Thanks
Re: Newbie question about connecting to a cassandra server from another server using Fauna
I've found the solution, thanks for the help. I needed to change the addresses in storage-conf.xml, both ListenAddress and ThriftAddress, to the address of the server itself. Sorry about the snippet being incomplete btw.

On Dec 6, 2010, at 4:18 PM, Ryan King wrote: It would help if you gave us more context. The code snippet you've given us is incomplete and not very helpful. -ryan
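For anyone landing on the same problem, the 0.6 storage-conf.xml elements Alberto mentions look roughly like this; 10.0.0.5 is just a placeholder for the server's own address, not a value from this thread:

  <!-- storage-conf.xml (0.6.x): bind gossip and Thrift to the server's address, not localhost -->
  <ListenAddress>10.0.0.5</ListenAddress>
  <ThriftAddress>10.0.0.5</ThriftAddress>
  <ThriftPort>9160</ThriftPort>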
Pagination
How is pagination accomplished when you don't know a start key? For example, how can I jump to page 10?
Re: Pagination
Short answer: that's a bad idea; don't do it. Long answer: you could count 10 pages of results and jump there manually, which is what offset 10 * page_size is doing for you under the hood, but that gets slow quickly as your offset grows. Which is why you shouldn't do it with a SQL db either. On Mon, Dec 6, 2010 at 3:35 PM, Mark static.void@gmail.com wrote: How is pagination accomplished when you don't know a start key? For example, how can I jump to page 10? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
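If you do need to page, the usual pattern is forward-only: remember the last key of the page you just rendered and use it as the start key of the next range query. The sketch below only illustrates that loop; fetch_keys() is a hypothetical stand-in (here stubbed with an in-memory list) for whatever range query your client exposes, not a real API.

  #include <cstdio>
  #include <string>
  #include <vector>

  // Hypothetical stand-in for a real range query: returns up to 'count' row keys
  // starting at 'start_key' (inclusive). In practice this would be a range query
  // against the cluster rather than a local list.
  static std::vector<std::string> fetch_keys(const std::string& start_key, int count) {
      static const std::vector<std::string> all = {"a", "b", "c", "d", "e", "f", "g"};
      std::vector<std::string> out;
      for (std::size_t i = 0; i < all.size() && (int)out.size() < count; ++i) {
          if (all[i] >= start_key) out.push_back(all[i]);
      }
      return out;
  }

  int main() {
      const int page_size = 3;
      std::string start_key = "";            // empty key = start of the range
      while (true) {
          // Ask for one extra key so we know where the next page begins.
          std::vector<std::string> keys = fetch_keys(start_key, page_size + 1);
          std::size_t shown = keys.size() < (std::size_t)page_size
                                  ? keys.size() : (std::size_t)page_size;
          for (std::size_t i = 0; i < shown; ++i)
              std::printf("%s\n", keys[i].c_str());   // render this page
          if (keys.size() <= (std::size_t)page_size)
              break;                                  // short result: no further pages
          start_key = keys[page_size];                // first key of the next page
          std::printf("-- next page --\n");
      }
      return 0;
  }

Jumping straight to an arbitrary page number still isn't possible this way; you only ever move forward one page at a time, which is exactly the trade-off being described above.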
Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL
I'm running a big test -- ten nodes with 3T disk each. I'm using 0.7.0rc1. After some tuning help (thanks Tyler) lots of this is working as it should. However, a serious event occurred as well -- the server froze up -- and though mutations were dropped, no error was reported to the client. Here's what the log said on host X.19:

WARN [ScheduledTasks:1] 2010-12-06 14:04:11,125 MessagingService.java (line 527) Dropped 76 MUTATION messages in the last 5000ms

Meanwhile, on the OTHER nodes, gossip decided the node was not available for a while:

INFO [ScheduledTasks:1] 2010-12-06 14:04:02,396 Gossiper.java (line 195) InetAddress /X.19 is now dead.
INFO [GossipStage:1] 2010-12-06 14:04:06,127 Gossiper.java (line 569) InetAddress /X.19 is now UP

And despite the fact that I was writing with consistency=ALL, none of my clients reported any errors on their mutations. Tyler has this information but I would like to know if anyone has seen this before, and/or has a diagnosis.
Re: If one seed node crash, how can I add one seed node?
Thanks Nick. After I add the new node as a seed node in the configuration for all of my nodes, do I need to restart all of my nodes?

2010/12/7 Nick Bailey n...@riptano.com: The node can be set as a seed node at any time. It does not need to be a seed node when it joins the cluster. You should remove it as a seed node, set autobootstrap to true and let it join the cluster. Once it has joined the cluster you should add it as a seed node in the configuration for all of your nodes.

On Mon, Dec 6, 2010 at 9:59 AM, lei liu liulei...@gmail.com wrote: Thanks Jonathan for your reply. How can I bootstrap the node into the cluster? I know that if the node is a seed node, I can't set AutoBootstrap to true.

2010/12/6 Jonathan Ellis jbel...@gmail.com: set it as a seed _after_ bootstrapping it into the cluster.

On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote: After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and let the node migrate data from other nodes? Thanks, LiuLei

-- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
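A rough sketch of the cassandra.yaml (0.7) settings involved in the two steps Nick describes; the addresses are placeholders, not values from this thread:

  # While the new node is joining: do NOT list it in its own seed list,
  # and let it bootstrap.
  auto_bootstrap: true
  seeds:
      - 10.0.0.1      # an existing, live seed

  # After it has joined: add it to 'seeds' on every node, e.g.
  #   seeds:
  #       - 10.0.0.1
  #       - 10.0.0.9  # the newly joined node
  # As far as I know the yaml is only read at startup, so each node picks the
  # change up on its next restart; a rolling restart should be enough rather
  # than bouncing them all at once.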