Hi Justin,

for the early stage I'm only going to use it myself and invite some friends who have companies, so that shouldn't be a problem. Later on it probably won't be that easy, because of the time it would take before a new query could go live.
I'd rather have a system which allows the queries that are most probably safe and only tells me to check those which look suspicious. For instance, I could check the amount of space the query is consuming, as one factor. This could even help the customers improve their queries. For instance, just last week I found out about the "include_docs" feature, which would also help to keep the views small.

Unfortunately, so far I haven't found anything that would tell me when a view was last executed, or how long it ran. Is there such a property? Otherwise I could try to run the view functions manually to see how long they take to execute.

Thanks,
Cheers
Peter

On Fri Nov 28 2014 at 7:50:38 AM <[email protected]> wrote:

> Hi Peter,
>
> An interesting concept ...
>
> This may sound simplistic, but is it viable for your application to
> initially have a process where new queries written are vetted by a human
> before they are run? The advantage of this is two-fold, namely:
> i) you'll be able to move on and prove your concept quickly
> ii) while doing this, you may learn enough (and things may change
> enough) for you to automate the vetting process
>
> Thanks,
> Justin
>
> -----Original Message-----
> From: Peter Grman [mailto:[email protected]]
> Sent: 28 November 2014 02:12
> To: [email protected]
> Subject: Re: Allow user-defined views
>
> No, I don't. The program should be for analysing logs (collected by
> fluentd) - should be open source and on github, however there isn't much
> done yet: https://github.com/logTank/
>
> The index rebuilding shouldn't be a problem as CouchDB will be only used
> for general stats and the user actually won't see the up to date data, but
> always with a delay - another advantage of CouchDB, I can read the queries
> without bothering the system, and once the data is outdated, I can update
> the index. At least so far the theory, I'll need to run some performance
> tests if that actually works, once I'll have an MVP.
> The other option is to use MongoDB for ad-hoc queries, but I was thinking
> that CouchDB will be more efficient as storage is so cheap.
>
> As I'm learning every time I look up info about CouchDB something new, and
> something becomes more clear, I'm also glad about feedback on the idea in
> general, how I want to use CouchDB.
>
> However I'd be also very happy if I could somehow solve the problem with
> the possible DoS attacks :). Maybe there is something in CouchDB or evalcx
> which I can configure - maximal runtime of a map/reduce function?
> (shouldn't be more than 1ms). Or there are some logged data by CouchDB
> about the resources required by views (CPU Time + HDD Space)?
>
> Cheers
> Peter
>
> On Fri Nov 28 2014 at 2:54:51 AM Alexander Gabriel <[email protected]>
> wrote:
>
> > sorry for being off-topic
> > Alex
> >
> >
> > 2014-11-28 2:52 GMT+01:00 Alexander Gabriel <[email protected]>:
> >
> > > sounds like a very interesting application
> > >
> > > seems like you don't care if the user has to wait for an index to be
> > > built when the user creates a query
> > >
> > > Alex
> > >
> > >
> > > 2014-11-28 2:23 GMT+01:00 Peter Grman <[email protected]>:
> > >
> > >> Hi Alex,
> > >>
> > >> Yes, the users would be able to import different sets of data,
> > >> which isn't relational, and use the platform to analyse it. The
> > >> analysed data would be in 99% of the cases append only (+ removing
> > >> old data) and the data can be defined by the user, as well as be
> > >> hierarchical.
> > >>
> > >> When I thought about the system in the beginning, CouchDB seemed
> > >> like an awesome choice as there would be only a couple of well
> > >> defined queries and storage is generally cheap, I thought that
> > >> CouchDB views and their caching are what I'm looking for.
> > >>
> > >> The problem is again only with people who want to trick the system.
> > >> I would be also happy with a solution which would detect bad views
> > >> once they have been deployed (uses too much space, takes too long
> > >> to compute) and deactivates and marks them for me to check. This
> > >> way I could check those few people who try a DoS attack and ban
> > >> them from the service.
> > >>
> > >> The additional main problem was, if it is really impossible to get
> > >> data from a different database inside the view and if the user
> > >> won't be able to access the underlying system, ..., or if it is
> > >> just very difficult => possible, if someone wants to do it they'll
> > >> find a way. But after reading more and understanding more, how the
> > >> views are executed using evalcx I think the other problems aren't a
> > >> big concern for me anymore, is that correct?
> > >>
> > >> Although I've found in the code "if possible, use evalcx (not
> > >> always available)" - how can I check that evalcx is available on my
> > >> system? Or is it just a note for older distributions, nothing to be
> > >> concerned about anymore?
> > >>
> > >> Thank you
> > >>
> > >> Cheers
> > >> Peter
> > >>
> > >> On Fri Nov 28 2014 at 1:37:57 AM Alexander Gabriel
> > >> <[email protected]>
> > >> wrote:
> > >>
> > >> > Hi Peter
> > >> >
> > >> > Will the users create their own datastructures too?
> > >> > If not this sounds like sql on relational tables might be a
> > >> > better tool for the problem.
> > >> > It seems to me you're hitting exactly the weak point of most
> > >> > nosql solutions.
> > >> >
> > >> > Alex
> > >> >
> > >> >
> > >> > 2014-11-28 0:49 GMT+01:00 Peter Grman <[email protected]>:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > this might sound like a terrible idea to someone who knows
> > >> > > CouchDB, and if that's the case, please just take a minute or
> > >> > > two, to explain why, otherwise, if the idea isn't so crazy
> > >> > > after all, I hope I'll get some solutions to my problem:
> > >> > >
> > >> > > I'm thinking of creating a platform based on CouchDB, where
> > >> > > each set of users (group, customer, ...) would get their own
> > >> > > CouchDB Database, to store and query data. I've heard in a
> > >> > > podcast, roughly a year ago, that this is how CouchDB was meant
> > >> > > to be - many smaller databases.
> > >> > >
> > >> > > To query the data, I want to allow them, to define their own
> > >> > > custom queries. Now I could (and want to) create a form which
> > >> > > allows to build a query and translates it to a JS view, but I
> > >> > > was thinking about additionally, on top of that, allowing them
> > >> > > to define their custom views directly in JS. They would
> > >> > > basically be allowed to define their custom Map/Reduce
> > >> > > functions.
> > >> > >
> > >> > > There is a lot which can go wrong with this; the worst ones I
> > >> > > came up with:
> > >> > > - DoS attack with endless loops inside the function
> > >> > > - DoS attack by emitting too much data (potentially in a loop
> > >> > > again)
> > >> > >
> > >> > > As far as I've understood, it's not possible to access other
> > >> > > Databases from within the view, is this understanding of mine
> > >> > > correct?
> > >> > >
> > >> > > Is it possible to access the filesystem or network services in
> > >> > > any way from the CouchDB view or is the JavaScript engine,
> > >> > > which is running the code, limiting enough?
> > >> > >
> > >> > > Are there any other things which could go wrong? - or did
> > >> > > actually somebody already use CouchDB like this, and it's
> > >> > > perfectly normal?
> > >> > >
> > >> > > Is there any way I could prevent the problem with endless loops
> > >> > > and data emitting from happening? - I can run JSLint, which
> > >> > > maybe will detect an endless loop, but that won't help against
> > >> > > a loop with a million iterations, which will be called for
> > >> > > every item inside CouchDB - still quite endless.
> > >> > >
> > >> > > Thank you for your help!
> > >> > >
> > >> > > Cheers,
> > >> > > Peter
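
As a rough illustration of the "allow by default, flag the suspicious ones" idea from this thread (views that use too much space or take too long to compute get marked for manual review), a minimal Python sketch might look like this. Everything in it is hypothetical: the `ViewStats` fields, the function names and the threshold values are assumptions made for the example, not anything CouchDB provides. The measurements would have to be collected separately, for example the on-disk index size from the design document info endpoint (`GET /{db}/_design/{ddoc}/_info`) and the duration from a manual test run of the view.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ViewStats:
    """Measurements for one user-defined view (hypothetical, collected elsewhere)."""
    name: str
    index_bytes: int      # on-disk size of the view index
    source_bytes: int     # size of the data the view was built from
    build_seconds: float  # duration of a manual test run of the view


def suspicious_reasons(stats: ViewStats,
                       max_index_ratio: float = 2.0,
                       max_build_seconds: float = 60.0) -> List[str]:
    """Return the reasons a view looks suspicious; an empty list means it passes.

    The default thresholds are arbitrary placeholders and would need tuning.
    """
    reasons = []
    # An index much larger than its source data hints at excessive emit() calls.
    if stats.source_bytes > 0 and stats.index_bytes / stats.source_bytes > max_index_ratio:
        reasons.append("index is large relative to the source data")
    # A very long build hints at expensive loops inside the map/reduce function.
    if stats.build_seconds > max_build_seconds:
        reasons.append("view took too long to build")
    return reasons


def views_to_review(all_stats: Iterable[ViewStats], **limits: float) -> List[str]:
    """Names of the views that should be deactivated and checked by a human."""
    return [s.name for s in all_stats if suspicious_reasons(s, **limits)]
```

Only the views returned by `views_to_review` would need a human look; everything else stays enabled. Note that this only catches a bad view after it has already consumed resources, so it does not replace a hard runtime limit on the view server.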
