+1000, thanks for this, Jan. It very much models both how I think about CouchDB and how I think about and architect PouchDB (much of that is echoed on http://pouchdb.com/).
I have quite a lot of thoughts on this, most of them technical, so I will raise them in separate threads, but I wanted to voice my agreement with the overall thinking here.

On 24 July 2013 22:36, Jan Lehnardt <[email protected]> wrote:

> I have a dream…
>
> (pardon the plagiarism)
>
> I want to live in a world where people are empowered to understand
> and are capable to decide where their data lives. I want to live in
> a world where developers build apps that support that, not because
> they went out of their way to implement it, but because it is a
> feature of the software platform they are using.
>
> I want to be able to help people improve their lives in regions of
> the world where ubiquitous network access isn’t — and sometimes that
> is just a major western capital’s subway — but more likely is it a
> lesser developed location, or a rural area that will never see mobile
> broadband, let alone wired broadband because there is no financial
> incentive.
>
> I want to live in a world where technology solves more problems than
> it creates. One of those ways is to allow people to use software wherever
> they are in whatever context they need it in. More often than not,
> that means far away from fast network access (Despite what @dhh is
> trying to tell you).
>
> My primary motivation for working on Apache CouchDB is to help build
> the world I want to live in. The same motivation drives my motivation
> behind Hoodie (http://hood.ie), which builds on top of CouchDB and
> wouldn’t be possible without it.
>
>
> * * *
>
> In the past year I have interviewed a fair number of people, let’s
> say 50, from those who have heard about CouchDB to users to core devs.
>
> The ONE feature that makes CouchDB relevant is multi-master replication.
> There is no exception, this is the ONE thing that makes CouchDB
> exceptional. NOBODY else has that, and even the decent proprietary
> solutions that are just coming to market suck where we KICK ASS.
>
> There are many other things that people like about CouchDB: reliability,
> no schema, HTTP interface, the view system, etc. But NONE of these people
> would care if CouchDB didn’t have multi-master replication.
>
>
> * * *
>
> The number one thing that people did NOT like about CouchDB is that it
> is confused. CouchDB has a torn identity, half database, half
> application server. It wasn’t clear (and I am part responsible for this)
> what CouchDB is and wants to be. In everybody’s defence, I think, it
> just took a while to figure it out. Now is a good time to put our
> findings in writing and fix this.
>
> The number one request from people was to clear up CouchDB’s story,
> to have a clear, bold vision that captures people and that they can
> easily understand and share and support and move forward.
>
>
> * * *
>
> Here is a narrative about what CouchDB has, that has formed in my head
> in the past year. I have shared this with some people privately for some
> feedback and they all liked it, so it has that going for it. I also tried
> out bringing some of these issues up in presentations I have given, to
> again great feedback.
>
> E.g.:
>
> http://www.youtube.com/watch?v=7mdG-iAizVc or
> http://www.youtube.com/watch?v=edbi9jJZkpg
>
> Before I lay it out, I understand that I will be ruffling some feathers.
> I think that is both necessary and healthy. I think the picture I am
> going to paint will make a lot of people in the CouchDB community happy,
> some with concessions, but I utterly and strongly believe that this
> vision of what CouchDB is has the power to set the course for the next
> five years of the project and attract a whole lot of new people both
> as users and contributors.
>
>
>
> * * *
>
> CouchDB is a database that replicates.
>
> Think of it as git for your data-layer.
> Not in a sense where you manage
> text files and diff and merge, but in the sense that you have a local
> version of your data and one or multiple remote ones and you can
> seamlessly move your data between them, back and forth and crossover.
>
> Imagine a local checkout of your data that you can work on, and then
> share it with Lucie across the table, she finds some issues and fixes
> up the data, and shares it with Tim across the room. Tim fixes two
> more issues and you pull both their changes into your copy. We conclude
> the whole thing is golden and we push it to staging, where our continuous
> integration runs and decides that the data is good to go into production,
> so it pushes it to production. There the data is picked up from various
> clients, some mobile over there, some web over here, a backup system
> in the Tokyo office…
>
> Or you have hospitals in remote regions in Africa that collect local
> health data, like how many malaria infections a region has and they all
> share their results over unreliable mobile connections and the data
> still makes it eventually maybe with a few hours delay and the malaria
> expert in the capital city sees an increased outbreak of some illness
> and is able to send out medicine in time to arrive for the patients
> to help. Where today the expert takes months to travel between the
> hospitals to collect that data manually and find out that there was
> a lethal outbreak two months ago and everybody died.
>
> (Somebody built this, CouchDB does save lives, I get teary every time
> I tell this story (like now). Our work doesn’t get more noble than
> this.)
>
> Or imagine millions of mobile users with access to terabytes of
> data in the cloud, replicating the bits they need to their phones
> and tablets, allowing super-fast low-latency access for a stellar
> user experience, while giving access to sheer amounts of data and
> allowing full write access on the mobile device to be replicated
> back to the cloud when connections exist.
>
> (Our friends at Cloudant have a couple of those customers.)
>
>
> That is the power of CouchDB.
>
>
> * * *
>
> Replication is the PRIMARY feature of CouchDB. “is a database” means
> “stores your data, safely and securely”, “that replicates” highlights
> the primary feature.
>
> There are many more very cool features of CouchDB, even the details
> on how we achieve reliability and data safety or how replication
> works are mindblowingly cool. The simple HTTP interface, the JSON
> store, the app-server features, map reduce views, all very excellent
> things that make CouchDB unique, but it is very important to understand
> that they are SECONDARY features.
>
>
> * * *
>
> I want to learn from understanding what the PRIMARY and SECONDARY
> features for CouchDB are. I already feel a bit bad about that the
> PRIMARY ones are two (“a database” *and* “that replicates”), but I
> think that is as little as it gets.
>
> I want CouchDB’s new identity to be a database that replicates. I want
> to provide a slide deck for a “CouchDB in 25 minutes” presentation* that
> everybody can take and give and customise, but I want that one of the
> first things you say “CouchDB is a database that replicates”. I want
> that if you ask anyone inside the CouchDB developer community (you!)
> about what CouchDB is to answer “CouchDB is a database that replicates”
> and then follow up explaining what we mean, and *then* add a few more
> of the SECONDARY features that you particularly like.
>
> * https://dl.dropboxusercontent.com/u/82149/CouchDB-in-25-Minutes.pdf
> Full talk at: http://vimeo.com/62599420 (sorry this one is German,
> still trying to find an English version of this)
>
> I want that people who barely look at CouchDB comment on an unrelated
> Hacker News thread write “…CouchDB is a database that replicates, maybe
> that is a better fit for your problem”.
>
> I want that the CTO of the newly funded startup thinks “I seem to have
> a replication problem to solve, maybe CouchDB can help.”
>
> I want to move CouchDB’s development forward, and when we ask ourselves
> whether to add a feature, we run it by our PRIMARY feature set and ask
> “does it support ‘CouchDB is a database that replicates’” and if it does
> we go ahead and build it, and if it doesn’t we may consider it as a
> SECONDARY feature, or we discard it altogether.
>
> (I don’t actually care what the final slogan will be, and please bike-shed
> this to no avail, but it should capture what I mean with “CouchDB is a
> database that replicates”, a phrase that we can burn into everybody’s
> head that captures CouchDB’s PRIMARY feature, its PRIMARY value
> proposition, the ONE thing that explains WHY we are excited about
> CouchDB.)
>
>
> * * *
>
> Now, you might be miffed that your pet feature didn’t make the PRIMARY
> list. Do not worry, I believe I have a solution for that.
>
> I have brought this up before, but I really do think the holy grail to all
> this is a very well done plugin system that allows us to follow the “small
> core, massive plugin repository” paradigm that others ever so successfully
> pioneered.
>
> This allows us to focus on what CouchDB is for internal and external
> communication, for roadmap discussions and attraction of developer talent.
>
> More importantly, it allows us to keep all the fringe things that make
> CouchDB so very appealing to a lot of different people.
> It also allows us
> to open up development to people who feel intimidated working on core
> CouchDB, but can easily write a little plugin or three (this is basically
> me, I have like 20 branches on GitHub that are useful to maybe 5% of our
> users and they don’t get used any).
>
> A wise person once said “Core is where features go to rot.”, and if you
> look at a number of CouchDB features, you can see that we suffer from that.
>
> We need a kick-ass plugin system that allows us to easily create, publish,
> maintain and update little pieces of code that allow our users to make
> their CouchDB their own. (I am signing up to build that, but I will need
> your help, there is a shit ton of work to do :)
>
>
> * * *
>
> ALERT: OPINION (your opinion may differ and we need to hear it)
>
> There is a discussion we need to have what the “small core” means for
> CouchDB. There is a discrepancy between the absolute minimum to fulfil
> the “CouchDB is a database that replicates” mantra and what would be
> a useful-out-of-the-box product that our users could set up and be
> productive with.
>
> My minimum set looks roughly like this:
>
> - core database management (crud dbs & json/mime-docs, clustering)
> - remote & local replication
> - MR-views & GeoCouch enabled by default (ideally abstracted
>   away with nice “query dsl”)
> - HTTP interface
> - Fu/Fauxton
> - configuration
> - stats
> - docs
> - plugin system with Erlang (and in the future JavaScript support
>   via Node.js)
>
> This makes for a useful CouchDB default setup.
>
> Everything else should be a plugin. A piece of code that can be installed
> with a quick search and a click of a button in Futon (or a `curl`-call on
> the HTTP interface). Not far away, definitely not “siberia” (if you get
> the PHP reference), but close to the core and encouraged to be used.
>
> And yes, this explicitly includes things like shows and lists and update
> functions and rewrites and vhosts.
> We should make it super simple to add
> these, but for a default experience, they are very, very confusing. We
> should have a single plugin “CouchApp Engine” which includes Benoit’s
> vision of CouchApps done right that is just a click away to install.
>
> In terms of highlighting the strengths of the core CouchDB “product”, this
> is what I’d put on the website:
>
> - Apache CouchDB implements the CouchDB vision:
>   It is a database that replicates.
>
> - Document Database:
>   - Data records are standard JSON.
>   - Unlimited Binary data storage with attachments.
>   - (alternatively arbitrary mime docs with special rules for JSON docs)
>
> - Fault-tolerant:
>   - Data is always safe. Tail-append storage ensures no messing with
>     already committed data.
>   - Errors are isolated, recovery is local and doesn’t affect other
>     parallel requests.
>   - Recovery of fatal errors is immediate. There is no “fixup phase”
>     after a restart.
>   - Software updates and bugfix deployment without downtime.
>
> - Highly Concurrent:
>   - Erlang makes good use of massively parallel network server
>     installations.
>   - Garbage collection happens roughly on a per-request basis.
>     GC in one request doesn’t affect other requests.
>
> - Cluster / BigCouch / Big Data:
>   - Includes a Dynamo-style clustering and cluster-management
>     feature that allows to spread data and load over multiple
>     physical machines.
>   - Scales up to Petabytes of data.
>
> - Secondary 2D and 3D indexing
>   - Using incremental and asynchronous index updates for
>     high-performance queries.
>
> - Makes good use of hardware:
>   - Tail-append storage allows for serial write access to
>     storage media, which is a best-case-scenario for spinning
>     disks and SSDs.
>
> - Small Core & Flexible Plugin System:
>   - Some features are only useful for a small group of people, these
>     can be installed with a super simple plugin management system that
>     is built into the admin interface.
>   - Get new features with a click or tap.
>   - Plugins can be written in Erlang (and in JavaScript in the future).
>
> - Cross Platform Support
>   - Runs on any POSIX UNIX as well as Windows.
>   - Support for some embedded devices like Android and RaspberryPi.
>
>
> I think this would make for a compelling list of technical features.
>
> (I’d probably also add a blip about the ASF and the Apache 2.0 License
> for good measure)
>
> ALERT END
>
>
> * * *
>
> And then, CouchDB is one more thing. CouchDB isn’t just the Erlang
> implementation of this whole replicating database idea. CouchDB is also
> the wire protocol, the specification that makes all the magic work.
> Apache CouchDB is the focal point for The Replicating Society*.
>
> (* cue your Blade Runner jokes)
>
> Apache CouchDB is THE standard for data freedom and exchange and is
> the clearing house, the centre for an ecosystem that includes fantastic
> projects like PouchDB and the TouchDBs, Max Ogden’s `dat` and whichever
> else follow these. Not saying we merge those projects in, they can stand
> on their own, but we should embrace everything that makes the
> interoperable replication world a reality.
>
> http://couchdb.apache.org is going to be the centre of the data
> replication universe.
>
>
> * * *
>
> Now all of this is my vision and I am bringing it to this table now.
> I have to admit that I am very nervous about this. A lot of things
> aren’t very well thought out and at the same time, I care very
> deeply about this project and its community and their future, so
> there is a little anxiety doing this little emotional striptease
> in front of all of you.
>
> What we will end up with, is not what I dream up and that’s that,
> but I hope I can inform and set the direction of where we are going,
> and then we can all together figure out the hard parts, and question
> my assumptions and change little things or lots.
>
> I don’t want to make this mine, but ours. To keep and to be proud of.
>
> The last thing I want is to stifle diversity, in thought and code,
> and I am very sure that some of you will find a lot to disagree with
> what I am saying, and that’s great, because this should, again, be
> ours, not mine.
>
> But the one thing I am convinced of is the little pivot that this
> project hinges on* between relative obscurity and blasting success
> is that we need to find our version of a simplified, streamlined
> and aligned way of defining, building and communicating what Apache
> CouchDB is.
>
> (* I suck at metaphors)
>
> And yes that means that some things that *YOU* think are important
> are getting a second row seat instead of the front row. Heck even
> some of my pet features get a second row seat, but that is fine
> because they aren’t gone, there is still room for all the crazy
> and not-so-crazy-but-not-essential stuff that people love in the
> plugin system, one click away. All this so we can benefit from
> being able to focus on building a modern, compelling, fun, humble
> and clever database that we can build the future, our future, on.
>
>
> * * *
>
> I want to live in a world where people are empowered to understand
> and are capable to decide where their data lives.
>
>
> I want to live in a world where technology solves more problems than
> it creates.
>
>
> My primary motivation for working on Apache CouchDB is to help build
> the world I want to live in.
>
>
> The ONE feature that makes CouchDB relevant is multi-master replication.
>
>
> I want to learn from understanding what the PRIMARY and SECONDARY
> features for CouchDB are.
>
>
> Apache CouchDB is the focal point for The Replicating Society.
>
>
> I don’t want to make this mine, but ours. To keep and to be proud of.
>
>
> * * *
>
>
> CouchDB is a database that replicates.
>
> I’m excited about your feedback! <3
>
> Sincerely,
> Jan
> --
>
>
> Thanks to Noah for kicking off this way overdue discussion.
>
>
> On Jul 24, 2013, at 15:28, Noah Slater <[email protected]> wrote:
>
> > Okay, here are some rough thoughts.
> >
> > Why?
> >
> > - We believe that distributed data should be easy
> >
> > How?
> >
> > - Painless multi-master replication
> > - Effortless clustering and sharding
> > - Co-location of data, queries, and views
> > - Deep browser and platform integration
> > - Built of the Web
> >
> > What?
> >
> > - Erlang
> > - HTTP
> > - JSON
> > - JavaScript
> > - MapReduce
> >
> > (That last list could go on, and on, and on...)
> >
> > Anyway. This is just a rough sketch of the sort of hierarchy I am thinking
> > about.
> >
> > Whatever this ends up looking like, I think this is how we should talk
> > about CouchDB. This structure could be a template for anything. A talk, a
> > sales pitch, the homepage itself. The important thing is that we start from
> > "why?" and we build up from foundations.
> >
> >
> > On 24 July 2013 13:15, Noah Slater <[email protected]> wrote:
> >
> >> I'm trying to imagine what our "I have a dream" speech would be like for
> >> CouchDB. If we were the Wright brothers, we might stand up and say "I have
> >> a dream that one day man will fly." We might say, "I have a dream that
> >> distributed data will be easy." (I mean, that about covers it, right?
> >> Doesn't have to be complex. The hard part is making sure we actually focus
> >> in on the root dream we all have.)
> >>
> >> Jan mentioned a few months ago that CouchDB almost wants to be the Git,
> >> for databases. What is Git? What would Git's "dream" be? I can imagine
> >> Linus saying "I have a dream that distributed version control will be
> >> easy." Same sorta thing, right?
> >>
> >>
> >> On 24 July 2013 13:06, Noah Slater <[email protected]> wrote:
> >>
> >>> Benoit,
> >>>
> >>> You should defo watch that video and see what you think. Note that it
> >>> does not matter if we are a company.
> >>> This insight applies to companies,
> >>> products, loose groups of people working towards one thing (like the Wright
> >>> brothers) and even individuals. (i.e. What is your personal "why" and how
> >>> are the things you are doing working towards that.)
> >>>
> >>> I also want to put you at ease by saying that having a single shared
> >>> "why" doesn't mean that anybody's vision, or personal goals have to be left
> >>> by the wayside. People can still come to the project with their own goals,
> >>> and their own perspective. But the project itself should have a clear sense
> >>> of what we are trying to accomplish.
> >>>
> >>> I think the "why" we come up with can easily be something that inspires
> >>> and is important to the Hoodie peeps, the Kanso peeps, the CouchApp peeps,
> >>> the "big data" peeps, the mobile platform peeps. Think about a why that
> >>> might evolve out of "your data, everywhere". Who (in our existing
> >>> communities) wouldn't love that and want to rally behind that? (But this is
> >>> just one idea.)
> >>>
> >>> Asking "what are the core features" misses the point. Why are these core
> >>> features? Why did we add them in the first place? What are we working
> >>> towards? See, you hit on it in your final sentence: "relax we take care
> >>> about your data and the way you exchange and render them wherever they
> >>> are". This! This is the kind of thing that I think we should hone, and
> >>> figure out, and document.
> >>>
> >>> Once we have that, it can inform our "how". When we're talking about
> >>> features, about product direction (i.e. what we add, what we subtract) we
> >>> can say "well, how is this related to what we're trying to do here?" Do you
> >>> see what I mean? :)
> >>>
> >>> "Painless distributed systems" is also a step in the right direction for
> >>> answering the question "why?"
> >>>
> >>> So far we have:
> >>>
> >>> * Relax
> >>> * Decentralised web
> >>> * Peer-to-peer replication of apps and datasets
> >>> * Your data, everywhere
> >>> * Put the data where you need it
> >>> * We handle your data / you handle display
> >>> * Painless distributed systems
> >>>
> >>> Somewhere in here ^ (and perhaps in a follow up reply) is a single shared
> >>> value system. Something we all hold dear.
> >>>
> >>>
> >>> On 24 July 2013 12:48, Benoit Chesneau <[email protected]> wrote:
> >>>
> >>>> Anyway, CouchDB is not like apple or dell. This isn't a company. And we
> >>>> don't have to share all the same vision, but only common values, a core.
> >>>> I'm not sure it enters into what you describe. What kind of vision are you
> >>>> speaking about?
> >>>>
> >>>> Also I would remove any pro-tip from your mail if we want to start from a
> >>>> neutral base.
> >>>>
> >>>> Couchdb is known for the replication but not only. Couchapps and the way
> >>>> people hack around is another (hoodie, kanso, erica/couchapp: all
> >>>> different visions of what a couchapp is, but all are using couchdb the
> >>>> same way). Message hub is another (nodejitsu, hoodie are using couchdb as a
> >>>> message hub somehow, not only but a lot of their arch is based on
> >>>> changes).
> >>>> And now we can add some kind of big data handling. Not forgetting people
> >>>> that are using apache couchdb on their mobile, they exist and the patches
> >>>> will be released.
> >>>>
> >>>> All have different visions. But they share some common features. I don't
> >>>> want to forget someone because of a vision of some. I only know that
> >>>> couchdb has some strong features that could be improved.
> >>>>
> >>>> All that to say that rather than thinking of a vision, maybe we could
> >>>> collect all the usages around and see what emerges from it.
> >>>> What are the
> >>>> core features, what couchdb should focus on and iterate on depending on
> >>>> the new usage. I guess it's some kind of philosophy: "relax we take care
> >>>> about your data and the way you exchange and render them wherever they are".
> >>>>
> >>>> - benoit
> >>>>
> >>>>
> >>>> On Wed, Jul 24, 2013 at 1:24 PM, Noah Slater <[email protected]> wrote:
> >>>>
> >>>>> Hi devs,
> >>>>>
> >>>>> I came across this video recently:
> >>>>>
> >>>>> Simon Sinek: How great leaders inspire action
> >>>>> http://www.ted.com/talks/simon_sinek_how_great_leaders_inspire_action.html
> >>>>>
> >>>>> In it he sets out what he calls the Golden Circle:
> >>>>>
> >>>>> Why
> >>>>>
> >>>>> - What's your purpose?
> >>>>> - What's your cause?
> >>>>> - What's your belief?
> >>>>>
> >>>>> How
> >>>>>
> >>>>> - How do we do it?
> >>>>> - How does our product differentiate?
> >>>>> - How are we different?
> >>>>> - How are we better?
> >>>>>
> >>>>> What
> >>>>>
> >>>>> - What do we do?
> >>>>> - What do we make?
> >>>>>
> >>>>> He points out the difference between companies like Apple and
> >>>>> companies like Dell.
> >>>>>
> >>>>> Dell tells you what they do, and how. "We make great computers. They're
> >>>>> well designed and work well. Wanna buy a computer?" Most companies do it
> >>>>> like this. But they often miss out the "why".
> >>>>>
> >>>>> But then you look at Apple, and they do it the other way around. Apple tell
> >>>>> you what their purpose is. The rest is almost an afterthought. "We believe
> >>>>> in challenging the status quo. We believe in thinking different. We do that
> >>>>> with great design and a focus on the user experience. We just happen to
> >>>>> make computers." He then jokingly quips: "Ready to buy one yet?"
> >>>>>
> >>>>> (His talk gives several other examples, with his thesis being that telling
> >>>>> your story from the outside in is what separates all the great companies
> >>>>> and leaders. One of his main examples is the Wright brothers.)
> >>>>>
> >>>>> He comments that if you talk about what you believe, you will attract those
> >>>>> that believe what you believe. That when you talk about what you believe,
> >>>>> people will join you for their own reasons, for their own purpose. And that
> >>>>> what you do simply serves as proof of what you believe. Or as he quips:
> >>>>> "Martin Luther King gave his 'I have a dream' speech, not his 'I have a
> >>>>> plan' speech."
> >>>>>
> >>>>> Why am I bringing this to the dev list?
> >>>>>
> >>>>> Because our message stinks. "Apache CouchDB™ is a database that uses JSON
> >>>>> for documents, JavaScript for MapReduce queries, and regular HTTP for an
> >>>>> API" is a terrible way to introduce who we are, what we stand for, and why
> >>>>> we build this thing. (And I'm allowed to say all that, because I'm the one
> >>>>> who wrote it, with lots of help from Jan.)
> >>>>>
> >>>>> So what am I proposing? I'm proposing that we figure out our why. That we
> >>>>> figure out what we stand for, what we believe in. And then we figure out
> >>>>> how we're gonna do that (pro tip: replication is more important than the
> >>>>> data format we use). Not only will this define a consistent internal vision
> >>>>> for the project (what *are* we working towards anyway?) but it will help us
> >>>>> to attract people who believe in what we believe.
> >>>>>
> >>>>> So, if you have any thoughts about this, speak up!
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> --
> >>>>> NS
> >>>>>
> >>>>
> >>>
> >>> --
> >>> NS
> >>>
> >>
> >> --
> >> NS
> >>
> >
> > --
> > NS
> >
