Just a quick note, but I don't think we would go far wrong using Formatter for now. It is after all an "interpreter for printf-style format strings", where printf <https://en.wikipedia.org/wiki/Printf_format_string> has a POSIX specification <https://pubs.opengroup.org/onlinepubs/9699919799/> that is implemented in many programming languages. The Formatter docs go on to say that while Formatter is "inspired by" printf, it departs from printf in ways that are idiosyncratic to Java. My googling did not turn up a handy list of the ways in which Formatter differs from printf, nor did it turn up a POSIXly-correct printf library for Java. Both probably exist somewhere. Failing that, if we really wanted to be picky about portability, we might have to write a custom printf in each target language, including Java.
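To make the portability concern concrete, here is a minimal sketch in plain Java (no Gremlin involved) of what Formatter does with the two templates from the quoted message, plus one conversion that does not carry over. The class and method names are made up for illustration. The first two templates, "%s is %s years old" and "%1.1s", should behave the same in most printf implementations. "%b" is the odd one out: Java formats a boolean, whereas the POSIX printf utility treats %b as "interpret backslash escapes in the argument".

```java
import java.util.Locale;

// Hypothetical sketch class; not part of TinkerPop or of the format() prototype.
public class FormatterSketch {

    // Same template as the quoted format() example; %s accepts any object.
    static String describe(String name, int age) {
        return String.format(Locale.ROOT, "%s is %s years old", name, age);
    }

    // A precision on %s truncates the argument: the "poor man's substring".
    static String firstChar(String name) {
        return String.format(Locale.ROOT, "%1.1s", name);
    }

    // %b is a Java-specific boolean conversion, so this template is not portable.
    static String javaOnly(boolean flag) {
        return String.format(Locale.ROOT, "%b", flag);
    }

    public static void main(String[] args) {
        System.out.println(describe("marko", 29)); // marko is 29 years old
        System.out.println(firstChar("marko"));    // m
        System.out.println(javaOnly(true));        // true
    }
}
```

If we stuck to the subset of conversions that mean the same thing everywhere (%s, %d, width and precision), the JAVA engine would be portable in practice even without a POSIXly-correct library.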
W.r.t. outputting elements as JSON, IMO this is another area where a formal data model is going to help, and the output need not be limited to JSON. Thrift, Protobuf, Avro, JSON, GraphQL... any data model we have an appropriate API for and can describe in terms of primitive types, sums, and products, we can map schemas and data into. The target of format() would not be "JSON" but (for example) "JSON conforming to the JSON Schema equivalent of your graph schema". In the default graph schema, I think there will be one kind of vertex and one kind of edge, with labels more like properties than types. The generated JSON for a graph with this flat schema could be made to look something like GraphSON, though I wouldn't expect the default representation of a vertex to contain "inE" or "outE", because a vertex doesn't own/contain its incident edges. You could instead output something that *contains* both a vertex and its incident edges.

Josh

On Wed, Jan 22, 2020 at 12:11 PM Stephen Mallette <[email protected]> wrote:

> We've long had the issue of how to deal with strings better. Typically the
> concern lies with concatenation, but there are other use cases that have
> come up along the way as well. I started playing around with a format()
> step to try to capture all the odds and ends I have notes on in relation to
> this:
>
> Quickly hacked together I have something that allows:
>
> gremlin> g.V().hasLabel('person').format("%s is %s years old").by('name').by('age')
> ==>marko is 29 years old
> ==>vadas is 27 years old
> ==>josh is 32 years old
> ==>peter is 35 years old
>
> The engine behind the string formatting is the standard Java Formatter. I
> just wanted to see what it could look like so Formatter was an easy choice.
> Of course, Formatter might not be best - part of me would prefer a more
> non-JVM centric sort of templating, perhaps something like:
>
> g.V().hasLabel('person').format("{} is {} years old").by('name').by('age')
>
> which is fairly commonplace across languages (even used in Java in
> libraries like slf4j). That of course made me realize that it wouldn't be
> hard to overload format() to take a formatting engine as an argument so
> that it's extensible:
>
> g.V().hasLabel('person').format("{} is {} years old").by('name').by('age')
>
> g.V().hasLabel('person').format(JAVA, "%s is %s years old").by('name').by('age')
>
> The notion of a formatting engine argument made me think about another
> thing folks tend to want in relation to strings - clean output to JSON (not
> GraphSON exactly with all the embedded types - like think back to GraphSON
> 1 format) and other string formats:
>
> g.V().hasLabel('person').format(JSON)
>
> or perhaps it is just GraphSON??
>
> g.V().hasLabel('person').format(GraphSON_1)
>
> Providers who require special serializers could easily just override the
> FormatStep to configure the engines as necessary.
>
> I think format() helps solve a lot of the common issues with strings and
> Gremlin. Even with the basic Formatter you can do a poor man's sort of
> substring:
>
> gremlin> g.V().hasLabel('person').format("%1.1s").by('name')
> ==>m
> ==>v
> ==>j
> ==>p
>
> I'd imagine that with a more advanced engine we could get something more
> full featured if we wanted to cover even wider general function use cases.
> Not sure if things like substring should be more like first class citizens
> in Gremlin or not though. Anyway, happy to hear any thoughts on the idea of
> format() and what it might mean to Gremlin.
>
