A week or so ago I decided to try to figure out what was going on with the Giraph upgrade. Some of you may recall that we backed off doing that upgrade for 3.3.0 at the last minute because it was causing Hadoop to hang. After messing with it for a bit, I started to wonder why this work was necessary at all. Why not just suggest to the community that we drop support for Giraph in 3.4.0?

Some reasoning:

1. This weird Hadoop hanging problem is preventing the upgrade. I suppose I could try to figure it out, but Marko already tried and didn't succeed, so I'm not sure what new insight I would bring since he was the expert.
2. It massively slows our integration tests. We could be many times more productive without Giraph in the build.
3. The Giraph community seems dormant: not one post on the user mailing list since November 2017 and very little happening on dev. While looking at recent posts, I also just learned that they have an LGPL dependency in their code. No one has jumped on it to address it, and the PR from the community is still open and over 2 weeks old: https://github.com/apache/giraph/pull/61. It just doesn't seem like anyone is really maintaining or advancing the project.
4. I could be wrong, but I sense that most organizations are using gremlin-spark to do their work and Giraph is largely unused.

Anyway, it would be good to hear any feedback on this fairly major decision. If there are no objections here on the dev list, I will take this discussion to the user list and Twitter to see what the user community thinks.

Thanks,
Stephen
