Re: [Rdkit-discuss] planning a crowdsourcing project for mmpdb development

2019-07-31 Thread Greg Landrum
Dear all,

This is a somewhat unusual message for this list and I wanted people to
know that Andrew and I had talked about it (and his proposal) first and
that I completely agree that it's an appropriate message for the
rdkit-discuss mailing list. We don't agree about everything here, but
Andrew is exploring options for how to support himself while working with
and producing open-source software, and I have a ton of respect for that
and think that it's something that's likely to be a long-term good for the
RDKit community.

So, if you have an objection to this kind of thing showing up on the list:
please reply to Andrew and me privately. I don't think a public email
"debate" about the topic is likely to be fruitful. If you want to talk
about it in person and are planning on being at the UGM: Andrew and I will
both be there and we can have a long, fun argument in the real world.

Best,
-greg




On Wed, Jul 31, 2019 at 7:02 PM Andrew Dalke wrote:

> Hello RDKit users,
>
>  As some of you know, I have been exploring ways to fund commercial open
> source software projects in cheminformatics. I plan to try a crowdsourcing
> model to help develop and support the mmpdb matched molecular pair system.
> In short, I'm looking for several people or organizations interested in
> funding specific new mmpdb features.
>
> At this point I am looking to see if there is enough interest to be worth
> my time and effort. If you are potentially interested, or have feedback,
> please respond in private email.
>
> I am starting with a small trial.
>
> One company contacted me asking if I could add Postgres support and a way
> to export the property rules. I can deliver that to them. However, it
> typically takes 2-3 times as much effort to develop software for general
> use as it does for a single customer. (For example, the people who asked
> me to develop a feature don't need much documentation about what the
> feature does.) It doesn't seem right to ask them to pay twice as much just
> so I can add the result back to mmpdb.
>
> In addition, others may be interested in those features, and be willing to
> pay for them. If I contribute the new features back to mmpdb immediately,
> then they have no reason to pay me, since the features are already there.
> If I wait instead, I might get more consulting work redoing the same work.
>
> Even better, if several companies pay me for a feature, then the surplus
> can help fund general mmpdb maintenance beyond those specific features.
>
> Crowdsourcing
> =============
>
> I am planning to run the trial as a Kickstarter-like system. For those who
> are unfamiliar with Kickstarter, it's a crowdfunding platform. A project
> creator comes up with a funding goal, and a deadline. If enough people
> commit funding, the project starts. If there isn't enough funding, then no
> one does or pays anything.
>
> For example, if each of three companies commits to paying EUR 5 000, then
> the total commitment is EUR 15 000. That exceeds the basic funding goal of
> EUR 14 000, which means I will develop a new version of mmpdb with the
> features listed below, and deliver the new version to those backers.
>
> Like most Kickstarter projects, I also have a set of concrete stretch
> goals. I honestly don't think I'll get EUR 50 000 of funding for this
> trial, but if I do, then I'll be able to make many improvements to the
> mmpdb project, and make it all available through the main mmpdb repository
> as soon as it's ready.
>
> Goals and Funding levels
> ========================
>
> Here are the provisional goals and corresponding funding levels. At this
> point I am looking for feedback before committing to them.
>
>   o  Basic level - EUR 14 000
>
> This pays for my time to add:
>
>  - Postgres database support
>  - rule and property rule export
>  - environment fingerprints as SMILES fragments, instead of or in
> addition to SHA256 hashes
>
> Everyone who joins the crowdsourcing consortium would receive the new
> features, under the existing 3-clause BSD license. However, in order to
> incentivize people to join, this new code will not be sent upstream to the
> main mmpdb repository hosted by the RDKit project. (See the Q&A below.)
>
> Feedback question: are there any other minor modifications that several
> people would like, and are willing to pay for?
>
>   o  Delayed upstream level - EUR 20 000
>
> Within 6 months of completion, I will contribute the new features back
> upstream to the main mmpdb repository for anyone to download and use.
>
> The delay is meant to encourage people to fund the project now, rather
> than wait until it is available for free.
>
> Feedback questions: is this a reasonable enticement? Is the delay too
> short or too long? Perhaps I should consider something else? The main
> problem with open source funding has always been getting people to
> pay for things which are already available for free.
>
>   o  Documentation level - EUR 25 000
>
> At this le

[Rdkit-discuss] planning a crowdsourcing project for mmpdb development

2019-07-31 Thread Andrew Dalke
Hello RDKit users,

 As some of you know, I have been exploring ways to fund commercial open source 
software projects in cheminformatics. I plan to try a crowdsourcing model to 
help develop and support the mmpdb matched molecular pair system. In short, I'm 
looking for several people or organizations interested in funding specific new 
mmpdb features.

At this point I am looking to see if there is enough interest to be worth my 
time and effort. If you are potentially interested, or have feedback, please 
respond in private email.

I am starting with a small trial.

One company contacted me asking if I could add Postgres support and a way to 
export the property rules. I can deliver that to them. However, it typically 
takes 2-3 times as much effort to develop software for general use as it does
for a single customer. (For example, the people who asked me to
develop a feature don't need much documentation about what the feature does.) 
It doesn't seem right to ask them to pay twice as much just so I can add the 
result back to mmpdb.

In addition, others may be interested in those features, and be willing to pay 
for them. If I contribute the new features back to mmpdb immediately, then they
have no reason to pay me, since the features are already there. If I wait
instead, I might get more consulting work redoing the same work.

Even better, if several companies pay me for a feature, then the surplus can 
help fund general mmpdb maintenance beyond those specific features.

Crowdsourcing
=============

I am planning to run the trial as a Kickstarter-like system. For those who are 
unfamiliar with Kickstarter, it's a crowdfunding platform. A project creator 
comes up with a funding goal, and a deadline. If enough people commit funding, 
the project starts. If there isn't enough funding, then no one does or pays 
anything.

For example, if each of three companies commits to paying EUR 5 000, then the 
total commitment is EUR 15 000. That exceeds the basic funding goal of EUR 14 
000, which means I will develop a new version of mmpdb with the features listed 
below, and deliver the new version to those backers.
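
The all-or-nothing rule in this example can be written out as a short Python check. This is only an illustrative sketch: the function name `is_funded` and the pledge list are made up for the example, and treating a total that exactly equals the goal as funded is an assumption. The only figures taken from the proposal are the three EUR 5 000 commitments and the EUR 14 000 basic goal.

```python
# Basic funding goal stated in the proposal, in EUR.
BASIC_GOAL_EUR = 14_000


def is_funded(pledges_eur, goal_eur=BASIC_GOAL_EUR):
    """Return True if the combined pledges meet or exceed the goal.

    Assumption: a total exactly equal to the goal counts as funded.
    """
    return sum(pledges_eur) >= goal_eur


# Hypothetical pledges: three companies committing EUR 5 000 each.
pledges = [5_000, 5_000, 5_000]
total = sum(pledges)
print(f"total commitment: EUR {total}, funded: {is_funded(pledges)}")
```

With only two such pledges the total would be EUR 10 000, the goal would not be met, and under the Kickstarter-like model no one would pay anything.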

Like most Kickstarter projects, I also have a set of concrete stretch goals. I 
honestly don't think I'll get EUR 50 000 of funding for this trial, but if I 
do, then I'll be able to make many improvements to the mmpdb project, and make 
it all available through the main mmpdb repository as soon as it's ready.

Goals and Funding levels
========================

Here are the provisional goals and corresponding funding levels. At this point 
I am looking for feedback before committing to them.

  o  Basic level - EUR 14 000

This pays for my time to add:

 - Postgres database support
 - rule and property rule export
 - environment fingerprints as SMILES fragments, instead of or in
addition to SHA256 hashes

Everyone who joins the crowdsourcing consortium would receive the new features, 
under the existing 3-clause BSD license. However, in order to incentivize 
people to join, this new code will not be sent upstream to the main mmpdb 
repository hosted by the RDKit project. (See the Q&A below.)

Feedback question: are there any other minor modifications that several people 
would like, and are willing to pay for?

  o  Delayed upstream level - EUR 20 000

Within 6 months of completion, I will contribute the new features back upstream 
to the main mmpdb repository for anyone to download and use.

The delay is meant to encourage people to fund the project now, rather than 
wait until it is available for free.

Feedback questions: is this a reasonable enticement? Is the delay too short or 
too long? Perhaps I should consider something else? The main problem with open 
source funding has always been getting people to pay for things which
are already available for free.

  o  Documentation level - EUR 25 000

At this level of support, I rewrite and update the documentation from the 
current README into a better form that can be hosted on "Read the Docs". Of 
this, USD 400 will go to sponsoring Read the Docs.

  o  mmpdb/GitHub support - EUR 29 000

At this level of support, I will be able to handle support questions on the 
mmpdb project page on GitHub for a year.

  o  Instant upstream level - EUR 37 000

At this level of support, the code is distributed upstream at once, with no 
further delay.

  o  Test suite level - EUR 50 000

At this level of support, I develop a reasonable automated test suite for the 
project. Currently there are only a few tests to ensure that the basic
functionality works. This must be fleshed out in order to support long-term 
development. Good test suites often find new bugs, which I'll also fix.
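
To see which of the provisional levels a given total commitment would unlock, the thresholds above can be put in a small lookup. This is a sketch under one assumption: the levels are treated as cumulative, so reaching a higher level also includes every lower one. The names `FUNDING_LEVELS_EUR` and `levels_reached` are hypothetical and exist only for this example; the amounts are the provisional ones listed above.

```python
# Provisional funding levels from the proposal: (threshold in EUR, level name).
FUNDING_LEVELS_EUR = [
    (14_000, "Basic"),
    (20_000, "Delayed upstream"),
    (25_000, "Documentation"),
    (29_000, "mmpdb/GitHub support"),
    (37_000, "Instant upstream"),
    (50_000, "Test suite"),
]


def levels_reached(total_eur):
    """Return the names of all levels whose threshold the total meets.

    Assumption: levels are cumulative, so every level at or below the
    total commitment counts as reached.
    """
    return [name for threshold, name in FUNDING_LEVELS_EUR
            if total_eur >= threshold]


# Example totals: below the basic goal, the three-company example,
# and the top stretch goal.
for total_eur in (10_000, 15_000, 50_000):
    print(total_eur, levels_reached(total_eur))
```

An empty list corresponds to the case where the basic goal is not met and the project does not start.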


How to join
===========

To start, email me to say that you are interested, and how much you are willing 
to fund.

Kickstarter allows people to set their own donation levels, which is likely 
difficult to explain to the financial department. I can unde