Re: Data.js 0.4.0 released — A declarative interface to CouchDB

Michael Aufreiter Fri, 15 Jul 2011 07:02:25 -0700

Hi Mehdi,

Glad you are interested in using our projects. Answers are inline.


> Hi Michael,
> 
> Congratulations on data.js, it really looks awesome (as well as dejavis and 
> substance).
> 
> I am working on a project quite similar to yours, but it is far less 
> advanced. I am building a web-based reporting software (open source), 
> aiming at bringing innovation to the data-warehousing on several aspects, 
> such as taking advantage of nosql to use better data models than traditional 
> dimensional modelling (facts / dimensions).
> 
> data.js, substance, and datavis are components which I might use, I need to 
> evaluate them, as well as other options.
> 
> Right now I am developing a component of the software allowing to populate 
> data into couchdb from csv files (both online and local); version extracting 
> local files should be ready soon. html5 interface. Might be a starting point 
> to generate collections from user files.
> How solid is the project? Is it just you on your own, or is a company 
> supporting it?

I'd consider Data.js quite solid already. We use it for persistence 
(Substance, Dejavis) and for client-side/in-memory data manipulation (Dejavis).
Substance is our main project. I'm currently in the process of acquiring some 
public funding to back up development for the next months, which I hope will be 
successful. I also founded a company for that purpose, just days ago. :) 
Dejavis is a relatively new kid, as the version number 0.1.0-dev suggests. It 
is, however, already used in production by a client of ours for analyzing 
sales data. 

> What interface would you provide to dejavis user to add data sources? What 
> transformations would be possible? 
> What types of sources would you support?
We currently use the Data.Collection interface, which is a simplified 
interface (just one type) to an underlying Data.Graph. Supporting linked data 
through Data.Graph is also an option for the future. However, for sales data 
the Collection interface does a perfect job right now. There's already an 
interface where users can add data-sources on their own. We also support 
API tokens for securing external data-sources (web services providing 
up-to-date data). Currently, grouping (by N properties) and aggregating (SUM, 
AVG, MIN, MAX) are supported (right panel), as well as filtering data using 
faceted navigation (left panel).
Here's more background: http://substance.io/#dejavis/dejavis
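
To give a rough idea of what grouping by N properties and aggregating means, here is a minimal sketch in plain JavaScript. It is not the Dejavis or Data.js implementation; the `items` data, the property names and the `groupAndAggregate` helper are made up for illustration.

```javascript
// Hypothetical sales items; the property names are assumptions.
const items = [
  { region: 'Vienna', product: 'A', revenue: 100 },
  { region: 'Vienna', product: 'B', revenue: 50 },
  { region: 'Linz', product: 'A', revenue: 70 },
  { region: 'Vienna', product: 'A', revenue: 30 }
];

// The four aggregators mentioned above, each over an array of numbers.
const aggregators = {
  SUM: values => values.reduce((a, b) => a + b, 0),
  AVG: values => values.reduce((a, b) => a + b, 0) / values.length,
  MIN: values => Math.min(...values),
  MAX: values => Math.max(...values)
};

// Group items by N properties, then aggregate one numeric property per group.
function groupAndAggregate(items, groupBy, property, aggregator) {
  const groups = new Map();
  for (const item of items) {
    // The combined values of the grouping properties identify a group.
    const key = JSON.stringify(groupBy.map(p => item[p]));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item[property]);
  }
  return Array.from(groups, ([key, values]) => ({
    key: JSON.parse(key),
    value: aggregators[aggregator](values)
  }));
}

const result = groupAndAggregate(items, ['region', 'product'], 'revenue', 'SUM');
console.log(result);
```

Grouping by a single property works the same way, e.g. `groupAndAggregate(items, ['region'], 'revenue', 'AVG')`.
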
> When I use data.js, do I still need to read / write documents to couchdb, or 
> does data.js take care of this transparently?
No, Data.js takes care of it all. You typically implement filters (middleware) 
on the server side to secure read and write operations (see 
https://github.com/michael/substance/blob/master/src/server/filters.js#L9). For 
the most common use cases you don't need to write a single line of code on the 
server side. For complex querying and computation tasks, it's perfectly fine, 
and intended, to combine it with CouchDB map/reduce.
> How well would data.js behave with processing of large volume of data, both 
> at client and at server side?
There are limits, of course. At this early stage it gets slow if you're 
working with data-sets of more than ~50,000 data items, depending on the 
number of properties. However, we are optimistic that this can be optimized 
further. We already introduced a streaming interface for initially loading 
JSON data into memory in chunks so the UI doesn't block (see 
https://github.com/michael/dejavis/blob/master/public/javascripts/helpers.js#L54).
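
The general idea of loading in chunks can be sketched roughly like this. This is a simplified illustration, not the code from helpers.js; the `loadInChunks` function and its parameters are hypothetical names.

```javascript
// Process `items` in chunks and yield to the event loop between chunks,
// so other work (e.g. rendering) can run in between.
function loadInChunks(items, chunkSize, onItem, onDone) {
  let index = 0;

  function nextChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) onItem(items[index]);

    if (index < items.length) {
      setTimeout(nextChunk, 0); // schedule the next chunk, don't block
    } else {
      onDone();
    }
  }

  nextChunk(); // the first chunk is processed immediately
}

// Usage: load five items, two at a time.
const loaded = [];
loadInChunks([1, 2, 3, 4, 5], 2, item => loaded.push(item), () => {
  console.log('loaded', loaded.length, 'items');
});
```

The chunk size is a trade-off: smaller chunks keep the UI more responsive, larger ones finish sooner.
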
> Specifically, how is the performance when dealing with linked data? What kind 
> of algorithm are you using to perform the 'joins'?
As mentioned, linked data isn't supported in Dejavis yet. As of 0.4.0, 
however, Data.js uses multiple gets to resolve (and fetch) associated objects 
from Couch. Once in memory, you can traverse the graph like so: 
graph.get('/location/vienna').get('districts').first().get('name') etc.
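
To illustrate how such a chained traversal can work once everything is in memory, here is a tiny stand-in. It is not Data.js itself: the `Graph`, `GraphNode` and `Collection` classes and the sample data are hypothetical, and references are simply assumed to be arrays of node ids.

```javascript
// A list of resolved nodes, as returned for reference properties.
class Collection {
  constructor(nodes) { this.nodes = nodes; }
  first() { return this.nodes[0]; }
}

// A node that resolves references to other nodes on access.
class GraphNode {
  constructor(graph, data) {
    this.graph = graph;
    this.data = data;
  }
  get(property) {
    const value = this.data[property];
    // Assumption: arrays hold ids of other nodes and are resolved via the graph.
    if (Array.isArray(value)) {
      return new Collection(value.map(id => this.graph.get(id)));
    }
    return value;
  }
}

// An in-memory graph, keyed by node id.
class Graph {
  constructor(raw) {
    this.nodes = new Map();
    for (const [id, data] of Object.entries(raw)) {
      this.nodes.set(id, new GraphNode(this, data));
    }
  }
  get(id) { return this.nodes.get(id); }
}

const graph = new Graph({
  '/location/vienna': {
    name: 'Vienna',
    districts: ['/district/leopoldstadt', '/district/josefstadt']
  },
  '/district/leopoldstadt': { name: 'Leopoldstadt' },
  '/district/josefstadt': { name: 'Josefstadt' }
});

console.log(graph.get('/location/vienna').get('districts').first().get('name'));
// prints "Leopoldstadt"
```
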
> Do I understand correctly that the data processing you do is either done at 
> node.js or at client side, but not inside couchdb via views? 
True, though Data.js uses CouchDB views internally for indexing. However, you 
can, and are encouraged to, use a hybrid approach whenever it makes sense.
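
For the CouchDB side of such a hybrid approach, a view is built from a map function that calls `emit(key, value)` for each document. The sketch below is a hypothetical example, not one of the views Data.js creates; the `type` property and the sample documents are assumptions, and `emit` is stubbed so the function can be tried outside of CouchDB.

```javascript
// Stand-in for CouchDB's emit(), which only exists inside the view server.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Hypothetical map function: index every document by its `type` property.
function byType(doc) {
  if (doc.type) emit(doc.type, null);
}

// Hypothetical documents, run through the map function by hand.
const docs = [
  { _id: 'a', type: '/type/document' },
  { _id: 'b' },
  { _id: 'c', type: '/type/user' }
];
docs.forEach(byType);

console.log(rows);
```

In CouchDB itself, only the body of `byType` would go into a design document; the stub and the manual loop are just there for experimenting.
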
> How did you do to extract info from github about node.js? Is it with api v2? 
> From a quick reading of api v3 documentation it seemed not obvious to me, 
> since search feature seems to have disappeared.
We used v2. It's just meant as an example for a Dejavis data-source.
> Keep up the good work, tools such as the ones you are creating are definitely 
> useful.
> 
Thanks :)

Michael
