Re: [xwiki-devs] XWiki on Cassandra

Caleb James DeLisle Thu, 04 Aug 2011 22:27:12 -0700


On 08/03/2011 04:20 AM, Jerome Velociter wrote:
> Hi Caleb,
> 
> This is exciting news!
> 
> On Tue, Aug 2, 2011 at 11:51 PM, Caleb James DeLisle
> <[email protected]> wrote:
>> I have an instance of XWiki finally running on Cassandra.
>> http://kk.l.to:8080/xwikiOnCassandra/
>>
>> Cassandra is a "NoSQL" database. Unlike a traditional SQL database, it cannot 
>> do advanced queries, but it can store data in a more flexible way, e.g. each 
>> row is like a hashtable where additional "columns" can be added at will.
> 
> Do we have a clear view of what XWQL queries will/will not be
> supported, or is it too soon to say?
>


Since XWQL is based on the idea of JPQL and the schema I designed is the same 
as the schema which the XWQL interpreter pretended to be using, as soon as I 
have JPQL functioning, simple XWQL queries such as:
"SELECT doc.fullName FROM XWikiDocument as doc WHERE doc.author = 
'XWiki.Admin'" will only need to be changed to:
"SELECT doc.fullName FROM 
com.xpn.xwiki.store.datanucleus.PersistableXWikiDocument as doc WHERE 
doc.author = 'XWiki.Admin'"
and obviously, "where SQL" will be the same.

The traditional XWQL object querying notation:  FROM XWikiDocument as doc, 
doc.object(XWiki.XWikiComments) as cmt WHERE ...
maps over to the JPQL statement:  FROM 
com.xpn.xwiki.store.datanucleus.PersistableXWikiDocument as doc, 
IN(doc.objects) cmt WHERE cmt.className = 'XWiki.XWikiComments' AND ...
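
To make the mapping concrete, here is a rough sketch of the rewrite in Python 
(not code from the XWiki tree). The helper name xwql_to_jpql and the regular 
expressions are made up for illustration, and it assumes only the two simple 
query shapes shown above with an AND-only WHERE clause; a real implementation 
would need a proper parser.

```python
import re

# Fully qualified entity name taken from the message above.
PERSISTABLE_DOC = "com.xpn.xwiki.store.datanucleus.PersistableXWikiDocument"

# Matches e.g. "doc.object(XWiki.XWikiComments) as cmt" ("as" is optional).
OBJECT_JOIN = re.compile(
    r"(\w+)\.object\(\s*([\w.]+)\s*\)\s+(?:as\s+)?(\w+)", re.IGNORECASE
)


def xwql_to_jpql(xwql):
    """Hypothetical helper: rewrite a simple XWQL statement into JPQL."""
    class_filters = []

    def rewrite_join(match):
        doc_alias, class_name, obj_alias = match.groups()
        # The class name moves out of the FROM clause into a WHERE condition.
        class_filters.append(f"{obj_alias}.className = '{class_name}'")
        return f"IN({doc_alias}.objects) {obj_alias}"

    jpql = OBJECT_JOIN.sub(rewrite_join, xwql)

    # Swap the short entity name in the FROM clause for the persistable class.
    jpql = re.sub(
        r"(\bFROM\s+)XWikiDocument\b",
        lambda m: m.group(1) + PERSISTABLE_DOC,
        jpql,
        flags=re.IGNORECASE,
    )

    if class_filters:
        condition = " AND ".join(class_filters)
        parts = re.split(r"\bWHERE\b", jpql, maxsplit=1, flags=re.IGNORECASE)
        if len(parts) == 2:
            # Assumes the existing WHERE clause only uses AND (no OR).
            head, tail = parts
            jpql = f"{head}WHERE {condition} AND {tail.strip()}"
        else:
            jpql = f"{jpql} WHERE {condition}"
    return jpql
```

For a comment query in the object notation above, this produces the 
IN(doc.objects) cmt form with the className check placed first in the WHERE 
clause.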

>> The most important feature of Cassandra is that multiple Cassandra nodes can 
>> be connected together into potentially very large "swarms" of nodes which 
>> reside in different racks or even data centers continents apart, yet all of 
>> them represent the same database.
>> Cassandra was developed by Facebook and their swarm was said to be over 200 
>> nodes strong.
>> In its application with XWiki, each node can have an XWiki engine sitting 
>> on top of it and users can be directed to the geographically closest node or 
>> to the node which is most likely to have a cache of the page which they are 
>> looking for.
>> Where a traditional cluster is a group of XWiki engines sitting atop a 
>> single MySQL engine, this allows for a group of XWiki engines to sit atop a 
>> group of Cassandra engines in a potentially very scalable way.
>> In a cloud setting, one would either buy access to a provided NoSQL store 
>> such as Google's BigTable or set up a number of XWiki/Cassandra 
>> stacks in a less managed cloud such as Rackspace's or Amazon's.
>>
>> How it works:
>> XWiki objects in the traditional Hibernate based storage engine are 
>> persisted by breaking them up into properties which are then joined again 
>> when the object is loaded.
>> A user object which has a name and an age will occupy a row in each of three 
>> tables, the xwikiobjects table, the xwikistrings table, and the 
>> xwikiintegers table.
>> The object's metadata will be in the xwikiobjects table while the name will 
>> be in a row in the xwikistrings table and the age, a number, will go in the 
>> xwikiintegers table.
>> The NoSQL/DataNucleus based storage engine does this differently: the same 
>> object only occupies space in the XWikiDocument table, where it takes 
>> advantage of Cassandra's flexibility by simply adding a new column for each 
>> property.
>> NOTE: this is not fully implemented yet; objects are still stored serialized.
>>
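
The difference between the two layouts can be sketched with plain 
dictionaries. This is a hypothetical illustration in Python, not the actual 
storage code: the row fields (id, name, value) and the 
"Class[number].property" column naming are assumptions made up for the 
example, and only string and integer properties are handled.

```python
def to_hibernate_rows(obj_id, class_name, properties):
    """Split one object into rows in per-type tables (Hibernate-style)."""
    tables = {
        "xwikiobjects": [{"id": obj_id, "className": class_name}],
        "xwikistrings": [],
        "xwikiintegers": [],
    }
    for name, value in properties.items():
        # Numbers go to the integer table, everything else to the string table.
        table = "xwikiintegers" if isinstance(value, int) else "xwikistrings"
        tables[table].append({"id": obj_id, "name": name, "value": value})
    return tables


def from_hibernate_rows(tables, obj_id):
    """Join the property rows back together when the object is loaded."""
    properties = {}
    for table in ("xwikistrings", "xwikiintegers"):
        for row in tables[table]:
            if row["id"] == obj_id:
                properties[row["name"]] = row["value"]
    return properties


def to_wide_row(doc_row, class_name, number, properties):
    """Add one column per property to the document's own row (wide-row style)."""
    row = dict(doc_row)  # copy so the caller's row is not modified
    for name, value in properties.items():
        # Hypothetical column naming scheme, chosen only for this sketch.
        row[f"{class_name}[{number}].{name}"] = value
    return row


user = {"name": "Alice", "age": 30}
tables = to_hibernate_rows(1, "XWiki.XWikiUsers", user)
wide = to_wide_row({"fullName": "XWiki.Alice"}, "XWiki.XWikiUsers", 0, user)
```

In the first layout, loading the object means visiting three tables and 
joining on the object id; in the second, everything comes back with the 
document's single row.
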
>> What works
>>
>> * Document storage
>> * Classes and Objects
>> * Attachments
>> * Links and Locks
>> * Basic querying with JDOQL
>>
>> What doesn't work
>>
>> * Querying inside of objects
>> * JPQL/XWQL queries
>> * Document history
>> * Permissions (requires unimplemented queries)
>> * The feature you want
>>
>>
>> I am interested in what the community thinks is the first priority. I can 
>> work on performance which will likely lead to patches being merged into 
>> master which will benefit everyone
> 
> You mean global performance of XWiki, or something in a specific area?
> FYI, in case you missed it, there was a mail by Paul
> Libbrecht about a possible fine-tuning of the Hibernate cache that could
> boost performance ([xwiki-devs] hibernate cache optimization?).
> 
> To answer your question, as a member of the community I am interested in
> the performance of XWiki with a large number of documents; I'd say both
> generic performance improvements and experimental work on Cassandra
> fit in this line :)

Cool. Cassandra itself should scale well; what we really need IMO is to run 
through the line of execution, find out what takes the longest, and fix it.

Caleb

> 
> Jerome
> 
>> or I can work on more features which will benefit people who want to use 
>> XWiki as a traditional application wiki but use it on top of Cassandra.
>> You can reply here or add comments to the wiki ;)
>>
>> Caleb
>>
>> _______________________________________________
>> devs mailing list
>> [email protected]
>> http://lists.xwiki.org/mailman/listinfo/devs
>>
> 

