Hi Chintan,
On 10. Jul 2019, at 18:25, Chintan Mishra <chin...@rebhu.com> wrote:
On 09/07/19 9:33 PM, Joan Touzet wrote:
Hi Chintan,
Reading through your proposal, I have one main point to make.
At the Apache Software Foundation, the people who lead the projects are
the people who do the work on them. We use the wrong word "meritocracy"
to explain this principle; a better word would be "do-ocracy."
http://www.apache.org/foundation/how-it-works.html#decision-making
https://incubator.apache.org/guides/participation.html#as_a_developer
https://communitywiki.org/wiki/DoOcracy
That means that your project can completely proceed on its own if it
wants to; the only thing over which you're not in control is whether
that project gets to call itself CouchDB or not. That decision is
reached by the people who have built CouchDB into what it is today.
I appreciate that you shared these links. I now understand what I
have to do next.
-----
On that last point, there's a lot that would need to be done for you to
convince the PMC that your vision is the one, true future of CouchDB.
What you propose is both a significant rewrite, as well as requiring an
entirely new set of skills from the developer base (Rust, MQTT, Kotlin,
Swift).
From Slack conversations, it appears the community has some
inclination towards building a Rust-based CouchDB some day. As for the
other technologies, those changes are not happening today. I do not
propose to start with all the changes at once. The storage engine is a
good place to start.
Since I brought it up in Slack, let me clarify: I do not suggest that
we should move CouchDB to Rust today or any time later.
What I am suggesting is that we should look at the things required to
support your idea of an IoT-capable CouchDB-like thing. My suggestion
is to not change CouchDB, but to make a new CouchDB compatible project.
Devices are only getting smaller, so a lower level language is needed
to ensure performance and good battery use. That leaves C, C++, Go and
Rust.
When I’m looking at which people I could likely excite to contribute to
such a thing, in my filter bubble that is folks getting into Rust and
the rest of the Rust community.
If it ends up being Go, C or C++, because someone who runs with this
prefers those, I don’t really mind.
* * *
In particular, we should look at a more detailed IoT use-case and how
CouchDB can help.
Correct me if I’m wrong, but this is mostly about devices with sensors
generating measurements over time that should be aggregated into a
cloud service for analysis.
In that world, a hypothetical API for an IoT app using our new RustyCouch
could look like this:
db = RustyCouch.open('file.foo')
db.save(measurement)
db.push('https://cloud.measurements.com')
repeat
This is a very small subset of the CouchDB API, but it would cover the
majority of your billion IoT use-cases.
There are a few things to be considered about data persistence and
concurrency control, but in another email, you already mentioned
SQLite, which solves most of those for you already.
db.save() would generate a JSON document with a uuid as _id and
corresponding _rev and an entry in an index that allows us to query,
at a later point: in what order were these docs written, which we are
going to need for db.push()
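To make the db.save() idea a bit more concrete, here is a rough sketch in
Python on top of SQLite. Everything in it is an assumption for illustration:
the LocalStore class, the table layout and the way the _rev string is derived
are made up here, and CouchDB computes its own revision hashes differently.
The point is only that an auto-incrementing sequence column gives you the
"in what order were these docs written" index for free.

```python
import hashlib
import json
import sqlite3
import uuid


class LocalStore:
    """Hypothetical local store; not an existing CouchDB or RustyCouch API."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        # seq is the write-order index; AUTOINCREMENT keeps it strictly increasing.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS docs ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
            "id TEXT NOT NULL UNIQUE, "
            "rev TEXT NOT NULL, "
            "body TEXT NOT NULL)"
        )

    def save(self, measurement):
        """Store one measurement (a dict) as a new document; return (_id, _rev)."""
        doc_id = uuid.uuid4().hex
        body = json.dumps(measurement, sort_keys=True)
        # Assumption: a first-generation rev shaped like "1-<hex digest>".
        # This is a stand-in, not CouchDB's actual revision algorithm.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:32]
        rev = "1-" + digest
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO docs (id, rev, body) VALUES (?, ?, ?)",
                (doc_id, rev, body),
            )
        return doc_id, rev

    def changes(self, since=0):
        """Return documents written after sequence `since`, in write order."""
        rows = self.conn.execute(
            "SELECT seq, id, rev, body FROM docs WHERE seq > ? ORDER BY seq",
            (since,),
        )
        return [
            {"seq": seq, "doc": {"_id": doc_id, "_rev": rev, **json.loads(body)}}
            for seq, doc_id, rev, body in rows
        ]


store = LocalStore()
first = store.save({"sensor": "t1", "temp": 21.5})
second = store.save({"sensor": "t1", "temp": 22.0})
written = store.changes()
```

Remembering the last seq that was pushed and calling changes(since=...)
would then give exactly the docs that a later push has to look at.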
db.push() then opens that index, checks with the cloud which docs
it already has (as per the standard CouchDB replication protocol)
and then sends all local docs that aren’t on the cloud yet in a
couple of _bulk_docs batches.
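A minimal sketch of that push step, again in Python and again under stated
assumptions: plan_push and fake_cloud are hypothetical names, the network is
replaced by an injected function, and checkpoints, retries and revision
histories are left out. The request and response shapes follow CouchDB's
_revs_diff and _bulk_docs endpoints (with new_edits set to false), which is
what the replication protocol uses.

```python
def plan_push(local_docs, revs_diff, batch_size=100):
    """Return the _bulk_docs request bodies needed to push local_docs.

    local_docs: list of dicts with "_id" and "_rev", in write order.
    revs_diff:  callable taking {"id": ["rev", ...]} and returning
                {"id": {"missing": ["rev", ...]}}, i.e. the shape of a
                POST /{db}/_revs_diff exchange. Injected so that this
                sketch needs no network access.
    """
    asked = {}
    for doc in local_docs:
        asked.setdefault(doc["_id"], []).append(doc["_rev"])
    answer = revs_diff(asked)
    # Keep only the docs whose revision the target reported as missing.
    missing = [
        doc for doc in local_docs
        if doc["_rev"] in answer.get(doc["_id"], {}).get("missing", [])
    ]
    # new_edits=False asks the target to store the revisions as given
    # instead of assigning new ones.
    return [
        {"docs": missing[i:i + batch_size], "new_edits": False}
        for i in range(0, len(missing), batch_size)
    ]


def fake_cloud(known):
    """Stand-in for the remote database: `known` is a set of (id, rev) pairs."""
    def revs_diff(asked):
        reply = {}
        for doc_id, revs in asked.items():
            not_there = [rev for rev in revs if (doc_id, rev) not in known]
            if not_there:
                reply[doc_id] = {"missing": not_there}
        return reply
    return revs_diff


docs = [
    {"_id": "a", "_rev": "1-aaa", "temp": 21.5},
    {"_id": "b", "_rev": "1-bbb", "temp": 22.0},
    {"_id": "c", "_rev": "1-ccc", "temp": 22.4},
]
# The cloud already has doc "a", so only "b" and "c" need to be sent.
batches = plan_push(docs, fake_cloud({("a", "1-aaa")}), batch_size=1)
```

Each entry in batches would be the JSON body of one POST to the target's
_bulk_docs endpoint.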
Voilà, a low-level, embeddable library that allows you to sync
stuff to CouchDB.
This is a scope that a single developer could make a prototype
of, even in a language that you are just starting out with.
With this in hand then, the next step is to talk to the folks
who build IoT platforms and applications to see if they want to
use something like that.
And once we have this, we can talk about changes to the replication
protocol.
* * *
If you want to take this further, and make a library that also
supports interactive querying, for say native applications on
phones and watches and whatnot, you already have a decent
foundation, but you’ll have a little more work to do.
* * *
But none of this requires changing CouchDB itself, or a 10 year
effort of porting something, while solving all the needs you
have.
* * *
Finally, I’d like to caution against being flippant about the
current project direction with FoundationDB. This is something
the team that has been doing this for over 10 years looked at
“in-depth” and decided it is the right thing to do.
The alternative would be to build a FoundationDB-like thing
ourselves, which is a multi-million dollar investment that
I haven’t seen anyone commit to at the moment.
In particular, I’m one of the champions for smaller CouchDB
installations in this project, and moving forward is always
a give-and-take. We are not in a position yet to gauge what
the problems are with an FDB-Couch for a single-node instance
but I’m sure going to work hard on making it easy for our
downstream users.
I’m the maintainer of the Mac binaries which are extremely
popular. Any database that can’t be set up with a download,
unzip and double click to start to get a dev environment is
going to have trouble attracting new developers. So I’ll make
sure we can retain this experience as much as possible.
* * *
Let’s be pragmatic and consider incremental change or small
scope side projects to move this forward. Grand visions,
in my experience, almost never work out. The only reason I have
trust in an FDB transition is because someone with authority
and budget said “the team that ostensibly built CouchDB 2.x
is going to do this”. That’s the only way it could possibly work.
And don’t mistake my RustyCouch suggestion as being
dismissive or sidelining. I’ve wanted something like this since
about 2008, and many people have tried with various attempts,
so my suggestion above is very serious *and* fed by the
experience of all these failed attempts.
CouchDB’s strength is its replication protocol. We didn’t
rewrite CouchDB in JavaScript because we suddenly realised
there are a billion browsers, but PouchDB came along with
a compatible data model and replication engine so that the
two projects complement each other perfectly and anyone on
the CouchDB team will tell you that PouchDB is one of the biggest
drivers of CouchDB adoption.
How about we just re-run this strategy for IoT: build a
small thing that is useful for one use-case and make it work,
then make it more complicated to be useful for more use-cases.
At each point, make sure replication with CouchDB works.
That’s a winning strategy. We already know it.
Best
Jan
—
It is in direct competition with the proposal being worked on
this list for the FoundationDB backend swap. With the addition of MQTT,
it sounds like the entire replication protocol and methodology would
need to be revisited, as the semantic changes you're proposing would
break existing client replication.
The HTTP replication protocol more or less remains the same in the
foreseeable future. A new MQTT replication strategy will be built
upon the existing method. The two will not work in parallel. Either
one of these will work per database.
Finally, the proposal to push into
the mobile space would directly compete with our sister project PouchDB,
who have put in tens of thousands of development hours as well.
The community will evolve at some point. And bringing people from the
sister project onto CouchDB will allow faster development. The
diagram in the proposal missed a part for Web Browser based CouchDB.
This missed part is an interface for JavaScript and CouchDB-Web
Browser. So, we will need some JavaScript developers too. And they
can help improve Fauxton.
This all adds up to a much bigger scoped project than CouchDB is today,
and I daresay may be bigger than I think even you realize.
I do realize that I want CouchDB to be in a billion mobile and
embedded devices by 2025. I understand this is a challenging scale. I
brought this here because I see how much we need a DB for a "Cluster
Of Unreliable Commodity Hardware". I assume the proposed path will take
somewhere between 18 and 21 months to come to fruition for a team of 15
people working 40 hours/week.
With my PMC hat on, I have to ask:
* Do you already have developers versed in these skills you can bring to
the project (beyond yourself)? Are they ready to commit the 40+ hours
a week each to making it a reality?
No, I do not have a team in place for this.
* Do you have experience in building a distributed system of this scale,
using the specific technologies you propose?
I have been reading about distributed systems. I want to take up an
Open Source project which solves the replication problem for devices
coming up with emerging technologies. CouchDB is the best fit as it
already solves the problem of replication across remote devices.
* How do you plan to convince other developers of your approach
specifically?
What got us (you) here, won't get us (you) there! -- Marshall Goldsmith
CouchDB led the way by being years ahead. This is just the same
thing happening again in a newer market. CouchDB is already great at
replication. What I am proposing is taking this simple-but-powerful
methodology a step further and building it for planetary-scale
use-cases (an idea derived from Lasp-Lang). Here are some ways in which
we can draw in more developers, users, and eyes.
* Helping users realize that CouchDB lets them relax while building
applications for devices with any form factor.
* Reaching out to the developers who have built their own solution for
replicating stuff from their device of any form factor to CouchDB
* On-boarding developers who will become early adopters and test it out
on their IoT devices, thus proving an unmet market need.
* Promoting offline-first strategy among mobile and embedded
developers will drive contributors from these communities.
* Documenting comparisons with existing mobile and embedded
solutions that provide replication, like Realm and Couchbase.
* How do you intend to train up our existing developers on the new
languages and technologies involved?
If people are excited about the future they are building then this
is a smaller problem to tackle. When and if people in this community
come to a consensus about the proposal, this can be tackled
by the 'Each one, teach one' approach followed by Yamaha Motors. This is
a buddy system where people get new partners to tackle a problem/PR. They
share issues, their understanding of the codebase and language, etc.
with each other. As buddies rotate, everyone gets on the same page
after a few cycles. I have found a 3-pair buddy system works best in
software. But this may differ based on culture, language, timezone,
and availability.
* How do you perceive the advantages and disadvantages of your approach
*specifically* vs. the FDB approach already outlined?
Proposal: FoundationDB
Pros:
* Improving what works for majority of existing users
* Iterates CouchDB to a better form
* Prospect of immediate consistency for ACID transactions
Cons:
* Losing some small and mid-sized developers
* Fragments community

Proposal: Polyglot-unification
Pros:
* Growth by tapping newer prospects
* Reduces fragmentation of user community and codebase
* Reimagines CouchDB as if it was built in 2019
Cons:
* Tons of work
* Uses RocksDB, overlooks FoundationDB migration
Email with subject "CouchDb Rewrite/Fork" by 'Reddy B.
<redd...@live.fr>' has mentioned some other concerns. This proposal
introduces a new story for CouchDB. This proposal would require
using RocksDB instead of FoundationDB.
-Joan
On 2019-07-09 10:28, Chintan Mishra wrote:
Hello team!!
Years of time and effort help move a product to the heights that CouchDB
has reached. And as a non-contributor, rather a very new CouchDB
user (1.5 years) who failed to find some relevant emails, I came up with
a version of the future for CouchDB that I thought would help us grow.
But Jan and Robert helped me realize that it takes a village to raise a
child (CouchDB). So this is a proposal to find a middle ground between
where we are headed and where the market is going next. The proposal I
wrote was solely driven by what I have read over the years about the
growth of the product and the community. I have attached the file or, if
you prefer reading in a browser, then click
here<https://gitlab.com/snippets/1873543>.
It will roughly take 4-5 minutes of your time. A proposed direction is
to start an entirely new project. That is not what I desire. I want to
join the community behind CouchDB, not build a new one using it. My goal
from this proposal is to generate leverage by creating early mover
advantage and help grow the community.
Thanking you.
--
Chintan Mishra
Rebhu Computing
Founder and CEO