A number of users have asked for the ability to query a Gremlin Server and check for its health ahead of sending queries:
https://stackoverflow.com/questions/46505790/gremlin-server-health-check-endpoint-for-aws-elb
https://stackoverflow.com/questions/59396980/gremlin-query-to-check-connection-health

Various TinkerPop implementers also have architectures that require multiple Gremlin Servers for high availability. While some of the Gremlin Language Variants can define a Gremlin Server cluster using multiple endpoints, not all do. Some architectures require a load balancer in front of a Gremlin Server cluster. In this configuration, the load balancer needs to poll each backend Gremlin Server instance to determine its health and whether it should route requests to that server.

Today, Gremlin Server has no means of reporting its health other than returning the "no gremlin script supplied" error message or answering a simple Gremlin query such as g.inject(0).

I'm proposing that we add a /status API to Gremlin Server for the purpose of reporting the health of a Gremlin Server instance. The /status API can also be used to return additional telemetry for a given Gremlin Server, such as the uptime, TinkerPop version, and configuration information.

Here is a proposed response structure, unless others have better ideas of what might be included:

    {
      "status": "healthy",
      "startTime": "Wed Jul 12 13:50:20 UTC 2023",
      "gremlin-server-version": "3.6.0",
      "settings": {
        "channelizer": "WebSocketChannelizer",
        "host": "localhost",
        "port": 8182
      }
    }

This could be extended to include additional parameters from any TinkerPop implementation. As an example, we should make it possible for projects like JanusGraph to include a JanusGraph version and/or additional configuration details of the underlying storage layer being used with JanusGraph. A reference implementation could be included with the project that provides a status response for TinkerGraph and returns the various enabled features of the underlying graph object.
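
To make the load balancer use case concrete, here is a rough Python sketch of how a poller might interpret the proposed response. It is only an illustration under assumptions: the /status route does not exist in Gremlin Server today, the "status" field and its "healthy" value come from the sample structure above, and the function names `is_healthy` and `poll` are hypothetical.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def is_healthy(body: str) -> bool:
    """Return True only if body is a JSON object whose "status" is "healthy".

    The field name and value follow the proposed response structure;
    they are assumptions, not an existing Gremlin Server contract.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON, so treat the server as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def poll(url: str, timeout: float = 2.0) -> bool:
    """Fetch the (hypothetical) /status route and report health.

    Any connection problem, timeout, non-200 response, or malformed
    URL is treated as "unhealthy" rather than raised to the caller.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            if response.status != 200:
                return False
            return is_healthy(response.read().decode("utf-8"))
    except (URLError, OSError, ValueError):
        return False
```

A poller would call something like `poll("http://localhost:8182/status")` on an interval and stop routing to any server for which it returns False. Failing closed on any error is a design choice in this sketch, not part of the proposal.
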
Given the length of such a response, we could parameterize the call to return a summary by default, with additional details available via something like `/status/?details=true`.

Another feature that this could potentially expose is the new(ish) Service Registry. Users could leverage the /status API to fetch a list of services available for use with the call() step.

There are a lot of options this could enable. I'm looking for consensus from the community on whether we should implement such an API in this manner.

One additional item we will need to consider: today, any HTTP request is accepted regardless of the server route being used. This means that connections can be established to the root route ('/'), the default Gremlin route ('/gremlin'), or any arbitrary route ('/hokey-pokey'). This behavior would need to be altered to allow for defined routes. We may be able to keep accepting query/connection requests on the root route to avoid breaking applications that have not been using /gremlin.

Looking forward to feedback.

Thank you,
Taylor Riggan