Another anonymous response:
<quote>
I had never looked at the accumulo front page until this morning. I
think it does ok with "who are you?", but should do better at "*why* are
you?". It indirectly mentions the security model and iterators, but I
think it should make those front and center. And ingest performance is
huge.
I don't know how aggressive you want to get, but I think you really
ought to directly compare to hbase and cassandra, on various dimensions.
What market segments would you love accumulo to get into? (health care?
...). If I were a developer looking to spend my hobby time, the front
page might lead me to check out the other projects, and maybe not come
back (and a google of "hbase vs" lists a number of comparisons that did
not even include accumulo).
In general, I think getting more users would get more developers:
- I think that points to the marketing side of things
- NiFi is doing a stunningly good job with blog posts about low-pain
setup and examples, right out of the gate
Iterators are terrifying to implement/deploy:
- they are clearly a novel paradigm when reading the paper/docs, but
implementing and deploying a complex new iterator, or even an update to
an iterator that's been working for a long time, on a large cloud,
always makes me hold my breath until I'm about to pass out
- Even after I've added every possible unit test I can think of, I
still assume that I will see a storm of crashing tservers when I push
out to a large cloud.
- Some sort of systematic safety harness for vetting a new iterator
or combination of iterators would be great
- I think it's mostly scary because we don't really have a small live
playground into which we can copy data and make mistakes. Maybe the
solution is to create the playground (with real, non-cherry-picked
data), and be able to make mistakes that don't cost days to undo. But
that takes a good deal of work, and tools could be written to support that.
</quote>
Some personal thoughts:
Good points about being more assertive WRT marketing. I think it's fair
to say that we get "walked" often because we're not aggressive enough in
stating that Accumulo is a player.
We should make an iterator fuzzing framework. We know what the system
does that is unexpected and can likely codify that in a test
environment. It would take a little bit of effort to implement well, but
I do think it's feasible. Clone()'ing a table is one option if you have
real data in a real environment -- that will at least prevent you from
destroying existing data, but it doesn't protect you against tanking
your Accumulo instance with some thread/memory leak :)
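To make the fuzzing idea a bit more concrete, here is a rough sketch of
what such a harness could check. It is plain Python and deliberately does
not use the Accumulo API: an "iterator" here is just a hypothetical
function over a sorted stream of (key, value) pairs, and every name in it
(summing_combiner, first_three, fuzz) is made up for illustration. It
assumes one example of surprising system behavior, namely that a scan can
be torn down and resumed just past the last key returned, and checks that
an iterator gives the same answer either way.

```python
import random


def summing_combiner(entries):
    """Hypothetical iterator under test: sums the values of equal keys."""
    current_key, total = None, 0
    for key, value in entries:
        if key != current_key:
            if current_key is not None:
                yield current_key, total
            current_key, total = key, 0
        total += value
    if current_key is not None:
        yield current_key, total


def first_three(entries):
    """Hypothetical buggy iterator: keeps a count that a re-seek resets."""
    for count, entry in enumerate(entries):
        if count >= 3:
            return
        yield entry


def random_sorted_entries(rng, max_len=50):
    """Random (key, value) pairs sorted by key; duplicate keys are allowed."""
    n = rng.randint(0, max_len)
    entries = [(rng.randint(0, 20), rng.randint(-5, 5)) for _ in range(n)]
    entries.sort(key=lambda e: e[0])
    return entries


def fuzz(iterator, trials=500, seed=0):
    """Return a failure description, or None if every trial passed."""
    rng = random.Random(seed)
    for trial in range(trials):
        entries = random_sorted_entries(rng)
        full = list(iterator(iter(entries)))

        # Invariant 1: output keys must stay in sorted order.
        keys = [k for k, _ in full]
        if keys != sorted(keys):
            return f"trial {trial}: output keys out of order"

        # Invariant 2 (assumed teardown behavior): stop after a random
        # result, rebuild the iterator over the data strictly past the
        # last key returned, and expect the same overall output.
        if full:
            cut = rng.randrange(len(full))
            last_key = full[cut][0]
            remaining = [e for e in entries if e[0] > last_key]
            resumed = list(iterator(iter(remaining)))
            if full[:cut + 1] + resumed != full:
                return f"trial {trial}: output changed after re-seek"
    return None


if __name__ == "__main__":
    print("summing_combiner:", fuzz(summing_combiner))
    print("first_three:", fuzz(first_three))
```

A real harness would drive actual iterator classes against a test
instance (or a clone of a table) and would also need to watch for thread
and memory leaks, which this sketch does not attempt.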
Josh Elser wrote:
I meant to send this out closer to the new year (to ride on the new year
resolution stereotype), but I slacked. Forgive me.
As those paying attention should be aware, we have had very little
growth within the project over the past 6-9 months. We've had our normal
spattering of contributions, a few from some repeat people, but I don't
think we've grown as much as we could.
I wanted to see if anyone has any suggestions on what we could try to do
better in the coming year to help more people get involved with the
project. I don't want this to turn into a "we do X wrong" discussion, so
please try to stay positive and include suggestion(s) for every problem
presented when possible.
Also, everyone should feel welcome to participate in the discussion
here. If you fall into the "bucket" described, I'd love to hear from
you. If anyone doesn't want to publicly respond, please feel free to
email me privately and I'll anonymously post to the list on your behalf.
Some ideas to start off discussion:
* Help reduce barrier to entry for new developers
- Ensure simple/easy-to-process instructions for getting and building
code in common environments
- Instructions on running tests and reporting issues
* More high-level examples
- Maybe we start too deep in distributed-systems land and we scare away
devs who think they "don't know enough to help"
- Recording "newbie" tickets and providing adequate information for
anyone to come along and try to take it on
- Encourage/help/promote "concrete" ideas/code in the project. Something
that is more tangible for devs to wrap their head around (also can help
with adoption from new users)
* Better documentation and "marketing"
- We do "ok" with the occasional blog post, and the user manual is
usually thorough, but we can obviously do better.
- Can we create more "literature" to encourage more users and devs to
get involved, trying to lower the barrier to entry?
Thanks all.