Hi Mehdi, glad you are interested in using our projects. Answers are inline.

> Hi Michael,
>
> Congratulations for data.js, it really looks awesome (as well as dejavis and
> substance).
>
> I am working on a project quite similar to yours, but it is far less
> advanced. I am building a web-based reporting software (open source),
> aiming at bringing innovation to the data-warehousing on several aspects,
> such as taking advantage of nosql to use better data models than traditional
> dimensional modelling (facts / dimensions).
>
> data.js, substance, and datavis are components which I might use, I need to
> evaluate them, as well as other options.
>
> Right now I am developing a component of the software allowing to populate
> data into couchdb from csv files (both online and local); version extracting
> local files should be ready soon. html5 interface. Might be a starting point
> to generate collections from user files.
> How solid is the project? Is it just you on your own, or is a company
> supporting it?

I'd consider Data.js quite solid already. We use it for persistence (Substance, Dejavis) and for client-side, in-memory data manipulation (Dejavis). Substance is our main project. I'm currently in the process of acquiring some public funding to back development for the next months, which I hope will be successful. I also founded a company for that purpose, just days ago. :)

Dejavis is a relatively new kid, as the version number 0.1.0-dev suggests. However, it is already used in production by a client of ours for analyzing sales data.

> What interface would you provide to dejavis user to add data sources? What
> transformations would be possible?
> What types of sources would you support?

We currently use the Data.Collection interface, which is a simplified interface (just one type) to an underlying Data.Graph. Supporting linked data through Data.Graph is also an option for the future. For sales data, however, the Collection interface does a perfect job right now.
There's already an interface where users can add data sources on their own. We also support API tokens for securing external data sources (web services providing up-to-date data). Currently, grouping (by N properties) and aggregating (SUM, AVG, MIN, MAX) are supported (right panel), as well as filtering data using faceted navigation (left panel). Here's more background: http://substance.io/#dejavis/dejavis

> When I use data.js, do I still need to read / write documents to couchdb, or
> does data.js take care of this transparently?

No, Data.js takes care of it all. You typically implement filters (middleware) on the server side to secure read and write operations (see https://github.com/michael/substance/blob/master/src/server/filters.js#L9). For the most common use cases you don't need to write a single line of code on the server side. For complex querying and computation tasks it's perfectly fine, and intended, to combine it with CouchDB map/reduce.

> How well would data.js behave with processing of large volume of data, both
> at client and at server side?

There are limits, of course. At this early stage it gets slow once you're working with data sets of more than about 50,000 data items, depending on the number of properties. However, we are optimistic that this can be further optimized. We already introduced a streaming interface for initially loading JSON data into memory in chunks so the UI doesn't block (see https://github.com/michael/dejavis/blob/master/public/javascripts/helpers.js#L54).

> Specifically, how is the performance when dealing with linked data? What kind
> of algorithm are you using to perform the 'joins'?

As said, linked data isn't supported in Dejavis yet. With 0.4.0, however, Data.js uses multiple gets to resolve (and fetch) associated objects from Couch. Once in memory, you can traverse the graph like so:

    graph.get('/location/vienna').get('districts').first().get('name')

and so on.
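
To make the traversal a bit more concrete, here is a minimal sketch of how such a chained lookup can work once everything is in memory. Please note that this is not the Data.js implementation: the Graph, GraphNode and NodeSet classes and the sample data are hypothetical stand-ins made up for illustration, and only the shape of the get(...).get(...).first().get(...) chain is taken from the example above.

```javascript
// Hypothetical stand-in, NOT Data.js itself: a tiny in-memory graph whose
// nodes resolve references to other nodes when a property is read.

// Ordered set of nodes, returned for properties that hold node ids.
class NodeSet {
  constructor(nodes) { this.nodes = nodes; }
  first() { return this.nodes[0]; }
}

class GraphNode {
  constructor(graph, id, props) {
    this.graph = graph;
    this.id = id;
    this.props = props;
  }
  // Assumption for this sketch: array values are lists of node ids,
  // everything else is a plain value.
  get(prop) {
    const value = this.props[prop];
    if (Array.isArray(value)) {
      return new NodeSet(value.map((id) => this.graph.get(id)));
    }
    return value;
  }
}

class Graph {
  constructor(data) {
    this.data = data;        // id -> raw properties
    this.cache = new Map();  // id -> GraphNode, so each node is built once
  }
  get(id) {
    if (!this.cache.has(id)) {
      if (!Object.prototype.hasOwnProperty.call(this.data, id)) {
        throw new Error('Unknown node: ' + id);
      }
      this.cache.set(id, new GraphNode(this, id, this.data[id]));
    }
    return this.cache.get(id);
  }
}

// Made-up sample data for illustration only.
const graph = new Graph({
  '/location/vienna': {
    name: 'Vienna',
    districts: ['/district/innere-stadt', '/district/leopoldstadt']
  },
  '/district/innere-stadt': { name: 'Innere Stadt' },
  '/district/leopoldstadt': { name: 'Leopoldstadt' }
});

const firstDistrictName = graph
  .get('/location/vienna')
  .get('districts')
  .first()
  .get('name');

console.log(firstDistrictName); // prints "Innere Stadt"
```

In this sketch the "join" is just an id lookup per referenced node, which is cheap as long as the referenced objects are already loaded.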

> Do I understand correctly that the data processing you do is either done at
> node.js or at client side, but not inside couchdb via views?

True, though Data.js uses CouchDB views internally for indexing. However, you can, and are encouraged to, use a hybrid approach whenever it makes sense.

> How did you do to extract info from github about node.js? Is it with api v2?
> From a quick reading of api v3 documentation it seemed not obvious to me,
> since search feature seems to have disappeared.

We used v2. It's just meant as an example of a Dejavis data source.

> Keep up the good work, tools such as the ones you are creating are definitely
> useful.
>

Thanks :)

Michael
