Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-23 Thread David Boxenhorn
Having a physical location encoded in the UUID *increases* the chance of a
collision, because it means fewer random bits. There definitely will be more
than one UUID created in the same clock unit on the same machine! The same
bits that you use to encode your few servers can be used for over 100
trillion random numbers!
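
As a rough check on the numbers above: the node field of a time-based (version 1) UUID is 48 bits wide, and 2^48 is about 281 trillion, which is presumably where "over 100 trillion" comes from. The sketch below also applies the standard birthday approximation to the 122 random bits of a version 4 UUID. The function name and the sample count of one billion are illustrative assumptions, not something taken from this thread.

```python
import math

NODE_BITS = 48         # width of the node field in a version 1 UUID (RFC 4122)
V4_RANDOM_BITS = 122   # random bits in a version 4 UUID (128 minus version/variant bits)

def collision_probability(n, bits):
    """Birthday approximation: chance that n uniformly random values of the
    given bit width contain at least one duplicate."""
    space = 2 ** bits
    return -math.expm1(-n * (n - 1) / (2 * space))

node_values = 2 ** NODE_BITS
print(f"{node_values:,}")  # 281,474,976,710,656

# Chance of any collision among one billion version 4 UUIDs (illustrative count).
print(collision_probability(10 ** 9, V4_RANDOM_BITS))
```

The approximation assumes a good random source, which is the same proviso raised later in the thread.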

 As to ordering, if you wanted to use time-uuids, comparators that do
 give time-based ordering are trivial, and no slower than lexical
 sorting.

No slower isn't a good reason to use it! I am willing to take a
(reasonable) time *penalty* to use lexically ordered UUIDs that will work
both in Cassandra and Oracle (and which are human-readable - always good for
debugging)!

I am also willing to take a reasonable penalty to avoid using weird
third-party code for generating UUIDs in the first place.

On Tue, Jun 22, 2010 at 10:05 PM, Tatu Saloranta tsalora...@gmail.com wrote:

 On Tue, Jun 22, 2010 at 9:12 AM, David Boxenhorn da...@lookin2.com wrote:
  A little bit of time fuzziness on the order of a few milliseconds is fine
  with me. This is user-generated data, so it only has to be time-ordered at
  the level that a user can perceive.

 Ok, so mostly ordered. :-)

  I have no worries about my solution working - I'm sure it will work. I just
  wonder if TimeUUIDType isn't superior for some reason that I don't know
  about. (TimeUUIDType seems so bad in so many ways that I wonder why anyone
  uses it. There must be some reason!)

 I think that, rationally, a random-number-based UUID is the best,
 provided one has a good random number generator.
 But there is something intuitive about using a location +
 time-based alternative instead, given the tiny chance of collision that any
 (pseudo) random number based system has.
 So it just seems intuitively safer to use time-uuids, I think -- it
 isn't, it just feels that way. :-)

 The secondary reason is probably the ordering, and the desire to stay
 standards compliant.
 As to ordering, if you wanted to use time-uuids, comparators that do
 give time-based ordering are trivial, and no slower than lexical
 sorting.
 Java Uuid Generator (2.0) defaults to such a comparator, as I agree that
 this makes more sense than whatever sorting you would otherwise get.
 It is unfortunate that the clock chunks are ordered in a weird way by the
 uuid specification; there is no reason it couldn't have been done the right
 way so that the hex representation would sort nicely.

 -+ Tatu +-
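
To make the ordering point concrete, here is a minimal Python sketch (a stand-in for a java.util.Comparator, not code from Java Uuid Generator). It builds two version 1 UUIDs from explicit timestamps and shows that sorting by string puts them in the wrong chronological order, because the low 32 bits of the timestamp come first in the text form, while sorting by the decoded timestamp gives the right order. The helper `make_v1` and the node value are hypothetical and exist only for this illustration.

```python
import uuid

def make_v1(timestamp, clock_seq=0, node=0x0123456789AB):
    """Build a version 1 UUID from a 60-bit timestamp (illustrative helper)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000   # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80   # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))

earlier = make_v1(0x0FFFFFFFF)   # time_low = ffffffff, time_mid = 0000
later = make_v1(0x100000000)     # time_low = 00000000, time_mid = 0001

by_string = sorted([earlier, later], key=str)             # lexical order
by_time = sorted([earlier, later], key=lambda u: u.time)  # time-based "comparator"

print([str(u) for u in by_string])
print(by_time == [earlier, later])  # True
```

The time-based key is a one-liner here because Python's `uuid.UUID.time` already reassembles the three timestamp fields; a Java comparator would do the same reassembly by hand.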



Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-23 Thread David Boxenhorn
 Secondary reason is probably the ordering, and desire to stay
 standards compliant.

My UUIDs are standards-compliant. They are of type 4. The type is encoded in
the format: xxxxxxxx-xxxx-4xxx-8xxx-xxxxxxxxxxxx.
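
For readers who want to check this claim on a given string, a small Python sketch follows. It only verifies the version digit and the variant bits that RFC 4122 defines; the function name `looks_like_v4` is made up for this example.

```python
import uuid

def looks_like_v4(text):
    """True if text is a canonical 8-4-4-4-12 UUID string with version 4
    and the RFC 4122 variant."""
    try:
        u = uuid.UUID(text)
    except ValueError:
        return False
    return (str(u) == text.lower()
            and u.version == 4
            and u.variant == uuid.RFC_4122)

sample = str(uuid.uuid4())
print(sample[14], sample[19])  # version digit, then variant digit (8, 9, a or b)
print(looks_like_v4(sample))   # True
```

Note that the check says nothing about how the remaining bits were produced, only that the type markers are in the right place.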







Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-23 Thread David Boxenhorn
Tatu, I did read your comments - and I appreciate them very much!

I want someone to argue with me (using good arguments) since what I'm doing
*does* seem weird to me - because no one else is doing it.

What I mean by readable is that the sort order of my UUIDs is obvious to
humans.
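
The thread does not show David's generator, so the following is only a guess at the general shape of such a scheme: a 48-bit millisecond timestamp in the leading hex digits, random bits after it, and the version 4 and variant markers stamped on top. The function name, the 48-bit width, and the millisecond resolution are all assumptions made for illustration.

```python
import os
import time
import uuid

def time_prefixed_uuid(millis=None, random_bytes=None):
    """Hypothetical layout: 48-bit millisecond timestamp, then random bits,
    stamped with version 4 and the RFC 4122 variant."""
    if millis is None:
        millis = time.time_ns() // 1_000_000
    if random_bytes is None:
        random_bytes = os.urandom(10)  # must be 10 bytes; 6 bits are overwritten below
    value = ((millis & 0xFFFFFFFFFFFF) << 80) | int.from_bytes(random_bytes, "big")
    # Passing version=4 makes uuid.UUID overwrite the version and variant bits.
    return uuid.UUID(int=value, version=4)

a = time_prefixed_uuid(millis=1000, random_bytes=b"\xff" * 10)
b = time_prefixed_uuid(millis=1001, random_bytes=b"\x00" * 10)
print(a)                 # 00000000-03e8-4fff-bfff-ffffffffffff
print(b)                 # 00000000-03e9-4000-8000-000000000000
print(str(a) < str(b))   # True
```

With this layout, plain string comparison orders IDs by millisecond, and two IDs created in the same millisecond fall in random order, which matches the "mostly ordered" behavior accepted earlier in the thread.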

What I mean by weird code is mostly that it doesn't come with enough
authority that I would trust it as a black-box more than my own code. For
example, what happens when I want to port it to different kinds of machines?
But another thing weird about it is the complexity (and I think low speed)
of the algorithms I need in my *own* code to use it. Just look at it
http://wiki.apache.org/cassandra/FAQ#working_with_timeuuid_in_java !



On Wed, Jun 23, 2010 at 10:03 AM, Tatu Saloranta tsalora...@gmail.com wrote:

 On Tue, Jun 22, 2010 at 11:54 PM, David Boxenhorn da...@lookin2.com wrote:
  Having a physical location encoded in the UUID *increases* the chance of a
  collision, because it means fewer random bits. There definitely will be more
  than one UUID created in the same clock unit on the same machine! The same
  bits that you use to encode your few servers can be used for over 100
  trillion random numbers!

 You did not read what I wrote... I did not say it does, just that
 people feel as if it does.

  As to ordering, if you wanted to use time-uuids, comparators that do
  give time-based ordering are trivial, and no slower than lexical
  sorting.

  No slower isn't a good reason to use it! I am willing to take a
  (reasonable) time *penalty* to use lexically ordered UUIDs that will work
  both in Cassandra and Oracle (and which are human-readable - always good for
  debugging)!

 Huh? These are plain old UUIDs, as readable (or not) as any.
 Comparator refers to java.util.Comparator (or Comparable for the class
 itself).

 But fear not, I am not trying to change your mind, just pointing out
 that there is nothing magical about getting things to sort.
 Just that sorting by the standard String representation is not the only
 collation order there is.

  I am also willing to take a reasonable penalty to avoid using weird
  third-party code for generating UUIDs in the first place.

 To each his own -- lots of people use weird code, and generally use
 slightly less derogatory and patronizing terms when referring to such
 libraries.

 And it seems to me that you are perfectly happy writing your own
 unweird code to generate them instead. :-)

 -+ Tatu +-



Re: nodetool loadbalance : Streams Continue on Non Acceptance of New Token

2010-06-23 Thread Gary Dusbabek
On Tue, Jun 22, 2010 at 20:16, Arya Goudarzi agouda...@gaiaonline.com wrote:
 Hi,

 Please confirm if this is an issue and should be reported or I am doing 
 something wrong. I could not find anything relevant on JIRA:

 Playing with 0.7 nightly (today's build), I set up a 3-node cluster this way:

  - Added one node;
  - Loaded default schema with RF 1 from YAML using JMX;
  - Loaded 2M keys using py_stress;
  - Bootstrapped a second node;
  - Cleaned up the first node;
  - Bootstrapped a third node;
  - Cleaned up the second node;

 I got the following ring:

 Address       Status     Load          Range                                      Ring
                                        154293670372423273273390365393543806425
 10.50.26.132  Up         518.63 MB     69164917636305877859094619660693892452     |--|
 10.50.26.134  Up         234.8 MB      111685517405103688771527967027648896391    |   |
 10.50.26.133  Up         235.26 MB     154293670372423273273390365393543806425    |--|
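
The uneven load in this ring follows from the token spacing. Assuming the cluster uses RandomPartitioner with a token space of 0 to 2^127 (an assumption; the message does not name the partitioner), each node owns the range from the previous token up to its own, and the sketch below computes each node's share. The function name `ownership` is made up for this example.

```python
from fractions import Fraction

RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token space

tokens = {
    "10.50.26.132": 69164917636305877859094619660693892452,
    "10.50.26.134": 111685517405103688771527967027648896391,
    "10.50.26.133": 154293670372423273273390365393543806425,
}

def ownership(tokens, ring_size=RING_SIZE):
    """Fraction of the ring owned by each node: (previous token, own token]."""
    ordered = sorted(tokens.items(), key=lambda kv: kv[1])
    shares = {}
    for i, (node, token) in enumerate(ordered):
        previous = ordered[i - 1][1]  # index -1 wraps around to the last token
        shares[node] = Fraction((token - previous) % ring_size, ring_size)
    return shares

for node, share in ownership(tokens).items():
    print(node, f"{float(share):.2%}")
```

Under that assumption the first node owns roughly half of the ring and the other two about a quarter each, which is consistent with the reported loads of 518.63 MB, 234.8 MB and 235.26 MB.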

 Now I ran:

 nodetool --host 10.50.26.132 loadbalance

 It's been going for a while. I checked the streams

 nodetool --host 10.50.26.134 streams
 Mode: Normal
 Not sending any streams.
 Streaming from: /10.50.26.132
   Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-3-Data.db/[(0,22206096), (22206096,27271682)]
   Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-4-Data.db/[(0,15180462), (15180462,18656982)]
   Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-5-Data.db/[(0,353139829), (353139829,433883659)]
   Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-6-Data.db/[(0,366336059), (366336059,450095320)]

 nodetool --host 10.50.26.132 streams
 Mode: Leaving: streaming data to other nodes
 Streaming to: /10.50.26.134
   /var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), (366336059,450095320)]
 Not receiving any streams.

 These have been going for the past 2 hours.

 I looked in the logs of the node with the .134 IP address and saw this:

 INFO [GOSSIP_STAGE:1] 2010-06-22 16:30:54,679 StorageService.java (line 603) Will not change my token ownership to /10.50.26.132

A node will give this message when it sees another node (usually for
the first time) that is trying to claim the same token but whose
startup time is much earlier (i.e., this isn't a token replacement).
It would follow that you would see this during a rebalance.


 So, as I understand it from the wiki, loadbalance is supposed to decommission
 the node, sending its tokens to other nodes, and then bootstrap it again.
 It's been stuck in streaming for the past 2 hours and the size of the ring
 has not changed. The log on the first node says it has been streaming for
 the past few hours:

 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 72) Beginning transfer process to /10.50.26.134 for ranges (154293670372423273273390365393543806425,69164917636305877859094619660693892452]
 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 82) Flushing memtables for Keyspace1...
 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,266 StreamOut.java (line 128) Stream context metadata [/var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), (366336059,450095320)]] 1 sstables.
 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 135) Sending a stream initiate message to /10.50.26.134 ...
 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 140) Waiting for transfer to /10.50.26.134 to complete
 INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 359) LocationInfo has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1277249454413.log', position=720)
 INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 622) Enqueuing flush of Memtable(LocationInfo)@1637794189
 INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,370 Memtable.java (line 149) Writing Memtable(LocationInfo)@1637794189
 INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,528 Memtable.java (line 163) Completed flushing /var/lib/cassandra/data/system/LocationInfo-d-9-Data.db
 INFO [MEMTABLE-POST-FLUSHER:1] 2010-06-22 17:36:53,529 ColumnFamilyStore.java (line 374) Discarding 1000


 Nothing more after this line.

 Am I doing something wrong?

If the output you get from `nodetool streams` isn't changing, then I'd
say we have a bug.  Your data sizes weren't that large--I'd expect 2
hrs would be more than enough time.

I've created https://issues.apache.org/jira/browse/CASSANDRA-1221 to
track this problem.

Gary.



 Best Regards,
 -Arya




Re: Uneven distribution using RP

2010-06-23 Thread Eric Evans
On Tue, 2010-06-22 at 17:47 -0400, James Golick wrote:
 It's also flushing memtables really quickly for a particular CF. Like,
 really quickly. Like, one every minute. I increased the thresholds by
 10x and it's still going fast.

What is MemtableFlushAfterMinutes set to?

-- 
Eric Evans
eev...@rackspace.com



Timeout when cluster node fails/restarts

2010-06-23 Thread Wouter de Bie
Hi,

I currently have a cluster of 11 nodes set up. When I run a small application
that uses Hector to read and write keys and restart one of the nodes (not
the one the application is connected to), the application stalls, times out and
reconnects. This takes roughly 10 seconds. Once the node is marked as dead, the
application seems to continue again. The application itself only connects
to localhost on one of the nodes.
It may be worth mentioning that every node in the cluster is configured as a
seed and lists all other nodes as seeds as well. I'm not sure whether this is
causing the problem, or whether it's even related.

I'm using cassandra 0.6.2 and Hector 0.6.0-15 (latest github branch)

What am I doing wrong here?

Regards,

Wouter

Re: hector or pelops

2010-06-23 Thread Maxim Kramarenko
I've switched to Pelops recently. No problems with it so far, and the code
became a little more compact.


cassandra_browser not in contrib

2010-06-23 Thread Eben Hewitt
The python cassandra_browser is not in the contrib directory if I clone from
git, but it is present if I checkout with svn. Is there typically a lag
between svn trunk and git? Or is this intentional because
the cassandra_browser is not going to be included going forward?
Thanks
Eben

-- 
In science there are no 'depths'; there is surface everywhere.
--Rudolph Carnap


Re: hector or pelops

2010-06-23 Thread Ran Tavory
As the developer of hector I can only speak in favor of my child of love, and
I haven't tried pelops, so take the following with a grain of salt...
Hector sees wide adoption and has been called the de-facto java client. It's
been in use in production-critical systems since version 0.5.0 by a few
companies.
The development team is responsive, accepts patches from the community,
and is busy with new features and improvements all the time. There's a bug
tracking system and all bugs are fixed very fast.
There are two active mailing lists, one for the developers and one for the
users: http://wiki.github.com/rantav/hector/mailing-lists (85 members)
The project is maintained on github (http://github.com/rantav/hector) and
the process as a whole is very transparent and open to the community.
The code is well tested with an embedded version of cassandra, which I
contributed back to the main cassandra repository. It has both a mvn and an
ant build, and all release versions are available at
http://github.com/rantav/hector/downloads including source code. We love
contributions and want to make it as easy as possible to contribute back.
I myself have made a few contributions to cassandra core, so I'm quite
familiar with its internals, which doesn't hurt when you write a client...
...and finally the features (just the high level):
- connection pooling
- datacenter friendly
- high level API
- all public cassandra versions in the last 6 months
- failover
- simple LB
- extensive JMX
- well documented, many examples, wiki, mailing list, team of developers and
contributors.

... and of course there's also thrift if you're into hacking on it...

On Wed, Jun 23, 2010 at 5:38 PM, Serdar Irmak sir...@protel.com.tr wrote:

  Hi

 Which java client library do you recommend, hector or pelops, and why?



 Best Regards,






Re: 10 minute cassandra pause

2010-06-23 Thread Benjamin Black
Are you seeing any sort of log messages from Cassandra at all?

On Wed, Jun 23, 2010 at 2:26 PM, Sean Bridges sean.brid...@gmail.com wrote:
 We were running a load test against a single 0.6.2 cassandra node.  24
 hours into the test,  Cassandra appeared to be nearly frozen for 10
 minutes.  Our write rate went to almost 0, and we had a large number
 of write timeouts.  We weren't swapping or gc'ing at the time.

 It looks like the problems were caused by our memtables flushing after
 24 hours (we have MemtableFlushAfterMinutes=1440).  Some of our column
 families are written to infrequently so that they don't hit the flush
 thresholds in MemtableOperationsInMillions and MemtableThroughputInMB.
  After 24 hours we had ~3000 commit log files.
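
The reasoning in this paragraph can be modelled with a short sketch: a memtable is flushed when any one of the three limits is reached, so column families that never hit the operation or size limits all reach the age limit at about the same moment. This is a simplified model written for illustration, not Cassandra's code; the class, the function, and the operation and size limits are hypothetical, and only the 1440-minute value comes from the message.

```python
from dataclasses import dataclass

@dataclass
class Memtable:
    name: str
    operations: int      # writes applied so far
    size_mb: float       # data currently held in memory
    age_minutes: float   # time since the memtable was created

def should_flush(mt, max_ops_millions, max_throughput_mb, flush_after_minutes):
    """Simplified model: flush when any one of the three limits is reached."""
    return (mt.operations >= max_ops_millions * 1_000_000
            or mt.size_mb >= max_throughput_mb
            or mt.age_minutes >= flush_after_minutes)

# Hypothetical limits, except flush_after_minutes=1440, which is from the message.
limits = dict(max_ops_millions=0.3, max_throughput_mb=64, flush_after_minutes=1440)

# Rarely written column families, all created when the node started 24 hours ago.
quiet = [Memtable(f"cf{i}", operations=5_000, size_mb=1.5, age_minutes=1440)
         for i in range(3)]
busy = Memtable("busy", operations=400_000, size_mb=20.0, age_minutes=12)

print([mt.name for mt in quiet if should_flush(mt, **limits)])  # ['cf0', 'cf1', 'cf2']
print(should_flush(busy, **limits))                             # True
```

In this model the quiet column families all become due for a flush in the same minute, which is the pile-up the message suspects.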

 Is this flushing causing Cassandra to become unresponsive?  I would
 have thought Cassandra could flush in the background without blocking
 new writes.

 Thanks,

 Sean



Call for input of cassandra, thrift , hector, pelops example / sample / test code snippets

2010-06-23 Thread Gavan Hood
Hi all,

I have been researching the samples with some success, but it's taken a while.
I am very keen on Cassandra and love the work that's been done. Well done,
everyone involved.

I would like to get as many of the samples as I can organized into
something that makes it easier to kick off with for people taking the road I
am on.

If people on this list have code snippets, full example apps, test apps, API
test functions etc., I would like to hear about them please. My work is in
Java so I really want to see those; the others are still of high interest, as
I will post them all as I mention below.

Ideally I would like to get a small test container set up to allow people to
poke and prod APIs and see what happens, but as for most of us, time is the
challenge. If I do not get that far I would at least post the findings to
page(s) that people can continue to add to; if successful, maybe they could
then be absorbed back into the Apache wiki...

If someone has already done this I would love to see the site.

Let me know your thoughts,  and better yet show me the code :-)

Regards
Gavan


Re: 10 minute cassandra pause

2010-06-23 Thread Sean Bridges
I see about 3000 lines of,

INFO [COMMIT-LOG-WRITER] 2010-06-23 16:40:29,107 CommitLog.java (line
412) Discarding obsolete commit
log:CommitLogSegment(/data1/cass/commitlog/CommitLog-1277302220723.log)

Then, http://pastebin.com/YQA0mpRG

It's around 16:50 that cassandra writes stop timing out.  Some writes
are getting through during this 10 minutes, but they shouldn't be
enough to cause the index memtables to flush.

Thanks,

Sean




Re: forum application data model conversion

2010-06-23 Thread S Ahmed
Any thoughts?

On Tue, Jun 22, 2010 at 2:13 PM, S Ahmed sahmed1...@gmail.com wrote:

 Converting a Forum application to cassandra's data model.

 Tables:

 Posts [postID, threadID, userID, subject, body, created, lastmodified]

 So this table contains the actual question subject and body.

 When a user logs in, they want to see a list of their questions, ordered by
 the last-modified date (to see if people responded to their question).

 How would you best do this in Cassandra, seeing as the question/answer text
 is stored in another table?

 I know you could make a CF like:

 userID { postID1, postID2, ...}

 and somehow order by last-modified, but then on the actual web page you
 would have to first query for the postIDs owned by the user, ordered by
 last-modified.

 THEN you would have to fetch the post data from the posts collection.

 Is this the only way?  I mean other than repeating the post subject+body in
 the user-to-postID index CF.
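
The two-step read described above can be sketched with plain Python dictionaries standing in for the two column families. This is an in-memory model only, not Cassandra or Thrift client code; the names `posts`, `user_posts`, `save_post` and `recent_posts` are assumptions made for the example. The index keeps (lastmodified, postID) pairs sorted per user, and an update removes the stale pair before inserting the new one.

```python
import bisect

posts = {}        # postID -> post fields (stands in for the Posts data)
user_posts = {}   # userID -> sorted list of (lastmodified, postID) pairs

def save_post(post_id, user_id, subject, body, lastmodified):
    """Insert or update a post and keep the per-user index in step."""
    index = user_posts.setdefault(user_id, [])
    old = posts.get(post_id)
    if old is not None:
        index.remove((old["lastmodified"], post_id))  # drop the stale index entry
    posts[post_id] = {"userID": user_id, "subject": subject,
                      "body": body, "lastmodified": lastmodified}
    bisect.insort(index, (lastmodified, post_id))

def recent_posts(user_id, limit=10):
    """Step 1: read post IDs from the index, newest first. Step 2: fetch the posts."""
    entries = user_posts.get(user_id, [])[-limit:][::-1]
    return [posts[post_id] for _, post_id in entries]

save_post("p1", "u1", "First question", "...", lastmodified=100)
save_post("p2", "u1", "Second question", "...", lastmodified=200)
save_post("p1", "u1", "First question", "...", lastmodified=300)  # someone replied
print([p["subject"] for p in recent_posts("u1")])
# ['First question', 'Second question']
```

The alternative raised in the message, repeating the subject and body in the index, would remove the second lookup at the cost of writing the post data twice on every update.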





RE: Hector vs cassandra-java-client

2010-06-23 Thread Kenneth Bartholet

Agreed, but at what cost?
It's my understanding that the big deterrent is the lack of 3rd party
dependencies in public maven repos (e.g. Thrift itself).

The alternative would be to publish a public maven repo containing all
dependencies, which ends up being more responsibility than the client
developers want to accept.
Any volunteers?

-Ken

 To: user@cassandra.apache.org
 From: bbo...@gmail.com
 Subject: Re: Hector vs cassandra-java-client
 Date: Tue, 22 Jun 2010 17:14:53 +0200
 
 Dop Sun su...@dopsun.com writes:
 
  Updated.
 
 the first Cassandra client lib to make it into the Maven repositories 
 will probably end up with a big audience.  :-)
 
 -Bjørn
 
  