[Couchdb Wiki] Update of "The_CouchDB_Vision_Proposal_NS" by JoanTouzet

Apache Wiki Thu, 12 Apr 2018 07:10:00 -0700

Dear wiki user,

You have subscribed to a wiki page "Couchdb Wiki" for change notification.

The page "The_CouchDB_Vision_Proposal_NS" has been deleted by JoanTouzet:

https://wiki.apache.org/couchdb/The_CouchDB_Vision_Proposal_NS?action=diff&rev1=7&rev2=8

- ## page was renamed from The_CouchDB_Vision
- <<Include(EditTheWiki)>>

- = The CouchDB Vision Proposal (NS) =
-
- This is a WIP to move items from
[[http://markmail.org/search/?q=%22What%27s+our+Why%3F%22+list%3Aorg.apache.couchdb.dev+order%3Adate-forward|the
"What's our Why?" thread]] to a wiki page. I am hoping to form a concrete
proposal that I will bring to the dev@ list and vote on. I would like to
incorporate as much feedback and perspective as possible, but cannot promise to
accommodate everyone! If you have a comment, please post a note to the list. :)
— Noah
-
- <<TableOfContents(2)>>
-
- === Notes ===
-
- "We believe in challenging the status quo. We believe in thinking different.
We do that with great design and a focus on the user experience. We just happen
to make computers."
-
- If you talk about what you believe, you will attract those that believe what
you believe.
-
- When you talk about what you believe, people will join you for their own
reasons, for their own purpose.
-
- What you do simply serves as proof of what you believe.
-
- "Martin Luther King gave his 'I have a dream' speech, not his 'I have a plan'
speech."
-
- Our existing message stinks.
-
- We need to figure out what we stand for, what we believe in. And then we
figure out how we're gonna do that.
-
- This will define a consistent internal vision for the project and will help
us to attract people who believe in what we believe.
-
- Once we have our why, it can inform our how.
-
- When we're talking about product direction we can say "well, how is this
related to what we're trying to do here?"
-
- Whatever this ends up looking like, I think this is how we should talk about
CouchDB. This structure could be a template for anything. A talk, a sales
pitch, the homepage itself. The important thing is that we start from "why?"
and we build up from foundations.
-
- From Jan:
-
- "The number one thing that people did NOT like about CouchDB is that it is
confused. CouchDB has a torn identity, half database, half application server.
It wasn’t clear (and I am part responsible for this) what CouchDB is and wants
to be. In everybody’s defence, I think, it just took a while to figure it out.
Now is a good time to put our findings in writing and fix this."
-
- "The number one request from people was to clear up CouchDB’s story, to have
a clear, bold vision that captures people and that they can easily understand
and share and support and move forward."
-
- "Before I lay it out, I understand that I will be ruffling some feathers. I
think that is both necessary and healthy. I think the picture I am going to
paint will make a lot of people in the CouchDB community happy, some with
concessions, but I utterly and strongly believe that this vision of what
CouchDB is has the power to set the course for the next five years of the
project and attract a whole lot of new people both as users and contributors."
-
- -
-
- I want to learn from understanding what the PRIMARY and SECONDARY features
for CouchDB are. I already feel a bit bad that there are two PRIMARY ones
(“a database” *and* “that replicates”), but I think that is as few as it
gets.
-
- I want CouchDB’s new identity to be a database that replicates. I want to
provide a slide deck for a “CouchDB in 25 minutes” presentation* that everybody
can take and give and customise, but I want one of the first things you say to
be “CouchDB is a database that replicates”. I want anyone inside the CouchDB
developer community (you!) who is asked what CouchDB is to answer
“CouchDB is a database that replicates” and then follow up explaining what we
mean, and *then* add a few more of the SECONDARY features that you particularly
like.
-
- (Noah's commentary: how does this play with the idea that everything we do
should stem from our "why". why are we building a database that replicates?
what's our vision? what do we stand for? i think both models are compatible.
our existing approach is to say "what" couchdb is. jan's suggestion is to start
with "how", and then get to "what". i am suggesting that we add another one on
top of that, and start with "why". then say "how", and then say "what". i don't
think these are incompatible. apple might have "challenge the status quo" as a
"why", but its marketing can still lead with the one sentence "how" in the
same vein as "couchdb is a database that replicates". some thinking /
discussion to do here. i think it will depend on context. homepage, talk, etc,
etc. even jan's talk that was linked starts, essentially, with the "why". his
"why" is listed as "i <3 the web", "i <3 reliable web infrastructure"! so jan
is already doing this in his talks! so maybe this is a template. "we/i love X.
we believe in Y. [BEAT] which is why i hack on couchdb. it's a database that
replicates". voila! @@ this needs bringing back to the list, or working into
a questions/issues section)
-
- @@ go through
https://dl.dropboxusercontent.com/u/82149/CouchDB-in-25-Minutes.pdf
-
- I want people who have barely looked at CouchDB to comment on an unrelated
Hacker News thread and write “…CouchDB is a database that replicates, maybe
that is a better fit for your problem”.
-
- I want the CTO of a newly funded startup to think “I seem to have a
replication problem to solve, maybe CouchDB can help.”
-
- I want to move CouchDB’s development forward, and when we ask ourselves
whether to add a feature, we run it by our PRIMARY feature set and ask “does it
support ‘CouchDB is a database that replicates’” and if it does we go ahead and
build it, and if it doesn’t we may consider it as a SECONDARY feature, or we
discard it altogether.
-
- (Noah's commentary: again, i'm gonna go one step higher than this. and i'm
gonna suggest that we also ask ourselves "how does this help us work towards
our why?")
-
- (I don’t actually care what the final slogan will be, and please bike-shed
- this to no avail, but it should capture what I mean with “CouchDB is a
- database that replicates”, a phrase that we can burn into everybody’s
- head that captures CouchDB’s PRIMARY feature, its PRIMARY value
- proposition, the ONE thing that explains WHY we are excited about
- CouchDB.)
-
- [comments on plugin system elided]
-
- -
-
- Apache CouchDB is the focal point for The Replicating Society.
-
- -
-
- If I had to define CouchDB today, based on my rcouch experience and the
ports I did, I would say: "Apache CouchDB allows you to handle and synchronize
your data between different locations and devices in quasi-realtime, over and
on the web, in a P2P manner without a SPOF".
-
- So for me CouchDB isn't only a database that replicates; it is also a way to
ease the usage of your data, the way you can view it in your applications or
directly on the web and over the web.
-
- == Why? ==
-
- * What's your purpose?
- * What's your cause?
- * What's your belief?
-
- Note: values should be verbs. i.e. "to make distributed data easy". you can't
do nouns. easy to worm out of. if your values are verbs, you can measure them,
commit to them, challenge people on them. see
http://www.startwithwhy.com/Learn/LearningLibrary.aspx?control=ViewGalleryPhotos&HideLink=1&GalleryID=10&photoID=29&cat=1
for more
-
- Suggestions made:
-
- * peer-to-peer replication of apps and datasets
- * your data, everywhere
- * "relax"
- * Painless distributed systems
- * Decentralised web
- * Put the data where you need it
- * "I have a dream that distributed data will be easy"
- * "CouchDB almost wants to be the Git, for databases"
- * "We believe that distributed data should be easy"
-
- From Jan:
-
- "I want to live in a world where people are empowered to understand and are
capable of deciding where their data lives. I want to live in a world where
developers build apps that support that, not because they went out of their way
to implement it, but because it is a feature of the software platform they are
using."
-
- "I want to be able to help people improve their lives in regions of the world
where network access isn’t ubiquitous (sometimes that is just a major
western capital’s subway, but more likely it is a less developed location,
or a rural area that will never see mobile broadband, let alone wired broadband,
because there is no financial incentive)."
-
- "I want to live in a world where technology solves more problems than it
creates. One of those ways is to allow people to use software wherever they are,
in whatever context they need it in. More often than not, that means far away
from fast network access[...]"
-
- "My primary motivation for working on Apache CouchDB is to help build the
world I want to live in[...]"
-
- -
-
- I want to live in a world where people are empowered to understand and are
capable of deciding where their data lives.
-
- I want to live in a world where technology solves more problems than it
creates.
-
- My primary motivation for working on Apache CouchDB is to help build the
world I want to live in.
-
-
- == How? ==
-
- * How do we do it?
- * How does our product differentiate?
- * How are we different?
- * How are we better?
-
- Suggestions made:
-
- * Schema-less/document-oriented
- * Replication
- * "of the web"
- * "some kind of big data handling"
- * "couchdb on their mobile"
- * we take care of your data
- * we take care of exchanging your data
- * we take care of rendering your data
- * We handle your data / you handle display
- * Painless multi-master replication
- * Effortless clustering and sharding
- * Co-location of data, queries, and views
- * Deep browser and platform integration
- * Built of the Web
- * Database runs anywhere
-
- From Jan:
-
- "In the past year I have interviewed a fair number of people, let’s say 50,
from those who have heard about CouchDB to users to core devs."
-
- "The ONE feature that makes CouchDB relevant is multi-master replication.
There is no exception, this is the ONE thing that makes CouchDB exceptional.
NOBODY else has that, and even the decent proprietary solutions that are just
coming to market suck where we KICK ASS."
-
- "There are many other things that people like about CouchDB: reliability, no
schema, HTTP interface, the view system, etc. But NONE of these people would
care if CouchDB didn’t have multi-master replication."
-
- CouchDB is a database that replicates.
-
- Think of it as git for your data-layer. Not in a sense where you manage text
files and diff and merge, but in the sense that you have a local version of
your data and one or multiple remote ones and you can seamlessly move your data
between them, back and forth and crossover.
-
- Imagine a local checkout of your data that you can work on, and then share it
with Lucie across the table, she finds some issues and fixes up the data, and
shares it with Tim across the room. Tim fixes two more issues and you pull both
their changes into your copy. We conclude the whole thing is golden and we push
it to staging, where our continuous integration runs and decides that the data
is good to go into production, so it pushes it to production. There the data is
picked up from various clients, some mobile over there, some web over here, a
backup system in the Tokyo office…
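
As a rough illustration of the "push" and "pull" in the story above, here is a minimal Python sketch that only builds the JSON bodies one would POST to CouchDB's `/_replicate` endpoint. The host names and the database name `mydata` are made up for the example, and nothing is sent over the network; the helper `replication_request` is hypothetical, not part of any CouchDB client library.

```python
import json

def replication_request(source, target, continuous=False, create_target=False):
    """Build the (path, JSON body) pair for a CouchDB replication request.

    Hypothetical helper for illustration: it only constructs the request
    and does not talk to a server.
    """
    body = {"source": source, "target": target}
    if continuous:
        body["continuous"] = True      # keep replicating as changes arrive
    if create_target:
        body["create_target"] = True   # create the target database if missing
    return "/_replicate", json.dumps(body, sort_keys=True)

# Assumed, made-up database URLs.
LOCAL = "http://localhost:5984/mydata"
LUCIE = "http://lucie.example.com:5984/mydata"

# Share my copy with Lucie (push), then pull her fixes back into my copy.
push_path, push_body = replication_request(LOCAL, LUCIE, create_target=True)
pull_path, pull_body = replication_request(LUCIE, LOCAL)
```

Swapping `source` and `target` is all that distinguishes a push from a pull, which is what makes the "back and forth and crossover" movement of data symmetrical.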
-
- Or you have hospitals in remote regions in Africa that collect local health
data, like how many malaria infections a region has, and they all share their
results over unreliable mobile connections. The data still makes it
eventually, maybe with a few hours' delay, and the malaria expert in the capital
city sees an increased outbreak of some illness and is able to send out
medicine in time to help the patients. Today, by contrast, the expert
takes months to travel between the hospitals to collect that data manually, only
to find out that there was a lethal outbreak two months ago and everybody died.
-
- (Somebody built this, CouchDB does save lives, I get teary every time I tell
this story (like now). Our work doesn’t get more noble than this.)
-
- Or imagine millions of mobile users with access to terabytes of data in the
cloud, replicating the bits they need to their phones and tablets, allowing
super-fast low-latency access for a stellar user experience, while giving
access to sheer amounts of data and allowing full write access on the mobile
device to be replicated back to the cloud when connections exist.
-
- (Our friends at Cloudant have a couple of those customers.)
-
- That is the power of CouchDB.
-
- -
-
- Replication is the PRIMARY feature of CouchDB. “is a database” means “stores
your data, safely and securely”, “that replicates” highlights the primary
feature.
-
- do these bits belong here or in previous section? - There are many more very
cool features of CouchDB, even the details on how we achieve reliability and
data safety or how replication works are mindblowingly cool. The simple HTTP
interface, the JSON store, the app-server features, map reduce views, all very
excellent things that make CouchDB unique, but it is very important to
understand that they are SECONDARY features.
-
- (@@ does this bit go into the "what" bit? need to research difference. think
we can lead with replication as the primary feature, but include it in the
"what"?)
-
- -
-
- @@ where does this bit go? should it even be included? might be worth punting
the whole "couch-like" stuff to a separate doc, and only referencing it from
this vision statement?
-
- And then, CouchDB is one more thing. CouchDB isn’t just the Erlang
implementation of this whole replicating database idea. CouchDB is also the
wire protocol, the specification that makes all the magic work. Apache CouchDB
is the focal point for The Replicating Society*.
-
- (* cue your Blade Runner jokes)
-
- Apache CouchDB is THE standard for data freedom and exchange and is the
clearing house, the centre for an ecosystem that includes fantastic projects
like PouchDB and the TouchDBs, Max Ogden’s `dat` and whichever others follow
these. Not saying we merge those projects in, they can stand on their own, but
we should embrace everything that makes the interoperable replication world a
reality.
-
- http://couchdb.apache.org is going to be the centre of the data replication
universe.
-
- (Noah's commentary: I think we should call this "Couch" and capitalise on the
"-DB"-less prefix that people have used elsewhere. this should be a reclamation
effort on our part, to own, and define what a "couch-like" system is. this
needs further discussion on the list.)
-
- -
-
- The ONE feature that makes CouchDB relevant is multi-master replication.
-
- == What? ==
-
- * What do we do?
- * What do we make?
-
- Suggestions made:
-
- * Erlang
- * HTTP
- * JSON
- * JavaScript
- * MapReduce
- * hoodie
- * kanso
- * erica
- * couchapp
- * Message hub (Nodejitsu, hoodie are using couchdb as a message hub somehow)
-
- Jan outlines his idea of a "core":
-
- * remote & local replication
- * MR-views & GeoCouch enabled by default (ideally abstracted away with nice
“query dsl”)
- * HTTP interface
- * Fu/Fauxton
- * configuration
- * stats
- * docs
- * plugin system with Erlang (and in the future JavaScript support via
Node.js)
-
- Also:
-
- * plugin system
-
- Note also:
-
- "And yes, this explicitly includes things like shows and lists and update
functions and rewrites and vhosts. We should make it super simple to add these,
but for a default experience, they are very, very confusing. We should have a
single plugin “CouchApp Engine” which includes Benoit’s vision of CouchApps
done right that is just a click away to install."
-
- Jan lays out our "specs":
-
- * Apache CouchDB implements the CouchDB vision: It is a database that
replicates.
-
- * Document Database:
- * Data records are standard JSON.
- * Unlimited Binary data storage with attachments.
- * (alternatively arbitrary mime docs with special rules for JSON docs)
-
- * Fault-tolerant:
- * Data is always safe. Tail-append storage ensures no messing with already
committed data.
- * Errors are isolated, recovery is local and doesn’t affect other parallel
requests.
- * Recovery of fatal errors is immediate. There is no “fixup phase” after a
restart.
- * Software updates and bugfix deployment without downtime.
-
- * Highly Concurrent:
- * Erlang makes good use of massively parallel network server installations.
- * Garbage collection happens roughly on a per-request basis. GC in one
request doesn’t affect other requests.
-
- * Cluster / BigCouch / Big Data:
- * Includes a Dynamo-style clustering and cluster-management feature that
allows data and load to be spread over multiple physical machines.
- * Scales up to Petabytes of data.
-
- * Secondary 2D and 3D indexing
- * Using incremental and asynchronous index updates for high-performance
queries.
-
- * Makes good use of hardware:
- * Tail-append storage allows for serial write access to storage media,
which is a best-case-scenario for spinning disks and SSDs.
-
- * Small Core & Flexible Plugin System:
- * Some features are only useful for a small group of people, these can be
installed with a super simple plugin management system that is built into the
admin interface.
- * Get new features with a click or tap.
- * Plugins can be written in Erlang (and in JavaScript in the future).
-
- * Cross Platform Support
- * Runs on any POSIX UNIX as well as Windows.
- * Support for some embedded devices like Android and RaspberryPi.
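
To make the "tail-append storage" idea from the specs above more concrete, here is a toy Python sketch of an append-only store. It is an assumption-laden illustration of the general technique, not CouchDB's actual on-disk format: the class `AppendOnlyStore` and its in-memory list standing in for a file are invented for this example.

```python
import json

class AppendOnlyStore:
    """Toy append-only document store (illustrative; not CouchDB's file format)."""

    def __init__(self):
        self._log = []      # committed records; existing entries are never rewritten
        self._latest = {}   # doc id -> position of its newest record in the log

    def put(self, doc_id, doc):
        # An update appends a new record at the tail instead of
        # overwriting the previous version in place.
        self._log.append(json.dumps({"id": doc_id, "doc": doc}))
        self._latest[doc_id] = len(self._log) - 1

    def get(self, doc_id):
        # Reads follow the index to the most recently appended version.
        return json.loads(self._log[self._latest[doc_id]])["doc"]

store = AppendOnlyStore()
store.put("a", {"n": 1})
store.put("a", {"n": 2})   # the first record for "a" stays untouched in the log
```

Because writes only ever go to the end, already committed records cannot be damaged by a later write, and the access pattern on the storage medium is serial.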
-
- -
-
- The ONE feature that makes CouchDB relevant is multi-master replication.
-
- -
-
- An HTTP API, a small code base. The HTTP API is more important than some are
saying today. Of course we could use a binary protocol, and it would be faster.
But it is just a matter of time. With HTTP 2.0 coming at the end of the year,
and already-working implementations using SPDY, the HTTP CouchDB API will be
exchanged as a binary stream. Using CouchDB over and on the web is really one of
its key features.
-
- I didn't have to query them across multiple tables, simply map them then
query them to match some pattern. I didn't have to organize them at first in
tables or columns. I just had to store my documents and create views (indexes)
on them. The views can be edited later, but the documents, the way I
store the data, don't change. This perfectly fit the way I code, iterating
over features and sometimes completely changing the way I use/view the data
in the code. CouchDB gave me a way to manipulate data that I hadn't had in a
while, not since I played with HyperCard or Lotus Notes.
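
The "store documents, map them, then query" workflow described above can be sketched in a few lines of Python. This is a simplified in-memory model under stated assumptions: real CouchDB views are defined by JavaScript map functions and maintained incrementally on disk, and the names `build_view`, `query`, and `by_tag` are hypothetical.

```python
import bisect

def build_view(docs, map_fn):
    """Apply map_fn to every document and return (key, doc id, value) rows sorted by key."""
    rows = []
    for doc in docs:
        for key, value in map_fn(doc):
            rows.append((key, doc["_id"], value))
    rows.sort()
    return rows

def query(rows, startkey, endkey):
    """Return the rows whose key lies in the inclusive range [startkey, endkey]."""
    keys = [row[0] for row in rows]
    lo = bisect.bisect_left(keys, startkey)
    hi = bisect.bisect_right(keys, endkey)
    return rows[lo:hi]

def by_tag(doc):
    # Map step: emit one row per tag; documents without tags emit nothing.
    for tag in doc.get("tags", []):
        yield tag, doc["title"]

# Schema-less documents: no table or column layout is declared up front.
docs = [
    {"_id": "1", "title": "Intro", "tags": ["couchdb", "http"]},
    {"_id": "2", "title": "Sync", "tags": ["replication", "couchdb"]},
    {"_id": "3", "title": "Untagged"},
]
view = build_view(docs, by_tag)
couchdb_rows = query(view, "couchdb", "couchdb")
```

Changing `by_tag` and rebuilding the view changes how the data is seen, while the stored documents themselves stay as they are.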
-
- Incremental views and the way CouchDB stores data are designed for
the new storage media we have today (SSDs and others).
-
