A week or so ago I decided to try to figure out what was going on with the
Giraph upgrade - some of you may recall that we backed off doing that
upgrade for 3.3.0 at the last minute because it was hanging up Hadoop.
After messing with it for a bit, I started to wonder why this work was
necessary at all. Why not just suggest to the community that we drop
support for Giraph in 3.4.0?
1. This weird Hadoop hanging problem is preventing the upgrade. I suppose I
could try to figure it out, but Marko already tried and didn't succeed, so
I'm not sure what new insight I would bring since he was the expert.
2. It massively slows our integration tests. We could be many times more
productive without Giraph in the build.
3. The Giraph community seems dormant: there has not been a single post on
the user mailing list since November 2017, and very little is happening on
dev. While looking at recent posts, I also learned that they have an LGPL
dependency in their code, and no one has jumped on it to address it. The PR
from the community (https://github.com/apache/giraph/pull/61) is still open
and over 2 weeks old. It just doesn't seem like anyone is really
maintaining or advancing the project.
4. I could be wrong, but I sense that most organizations are using
gremlin-spark to do their work and Giraph is largely unused.
Anyway, it would be good to hear any feedback on this fairly major
decision. If there are no objections here on the dev list, I will take this
discussion to the user list and Twitter to see if what the user community