Re: [DISCUSS] string formatting

Joshua Shinavier Thu, 23 Jan 2020 21:51:04 -0800

Just a quick note, but I don't think we would go far wrong using Formatter
for now. It is after all an "interpreter for printf-style format strings",
where printf <https://en.wikipedia.org/wiki/Printf_format_string> has a POSIX
specification <https://pubs.opengroup.org/onlinepubs/9699919799/> that is
implemented in many programming languages. The Formatter docs go on to say
that while Formatter is "inspired by" printf, it departs from printf in
ways that are idiosyncratic to Java. My googling did not turn up a handy
list of the ways in which Formatter differs from printf, nor did it turn
up a POSIXly-correct printf library for Java. It is likely that both of
these things exist somewhere. Otherwise, if we really wanted to be picky
about portability, we might have to write a custom printf in each
target language, including Java.
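
To give a rough feel for the kind of differences I mean, here is a minimal
sketch (the class name FormatterVsPrintf and the helper accepts() are made up
for illustration). It assumes nothing beyond the standard java.util.Formatter
behavior reached through String.format: the %s and %d conversions behave as
they do in printf, while the POSIX conversions %i and %u are rejected. It is
not an exhaustive list of differences.

```java
import java.util.Locale;
import java.util.UnknownFormatConversionException;

public class FormatterVsPrintf {

    /** Hypothetical helper: true if java.util.Formatter accepts the format string. */
    static boolean accepts(String format, Object... args) {
        try {
            String.format(Locale.ROOT, format, args);
            return true;
        } catch (UnknownFormatConversionException e) {
            // Formatter does not know this conversion character at all.
            return false;
        }
    }

    public static void main(String[] args) {
        // Conversions that work the same way as in printf.
        System.out.println(String.format(Locale.ROOT, "%s is %d years old", "marko", 29));

        // Precision on %s truncates the string, as in the "%1.1s" substring trick.
        System.out.println(String.format(Locale.ROOT, "%1.1s", "marko"));

        // POSIX printf conversions that Formatter rejects.
        System.out.println("%i accepted: " + accepts("%i", 29));
        System.out.println("%u accepted: " + accepts("%u", 29));

        // In Java, %b formats its argument as a boolean.
        System.out.println(String.format(Locale.ROOT, "%b", true));
    }
}
```

A portable format() would presumably have to either restrict itself to the
conversions that behave identically everywhere or translate format strings
per target language.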


W.r.t. outputting elements as JSON, IMO this is another area where a formal
data model is going to help, and the output need not be limited to JSON.
Thrift, Protobuf, Avro, JSON, GraphQL... any data model we have an
appropriate API for and can describe in terms of primitive types, sums, and
products, we can map schemas and data into. The target of format() would
not be "JSON" but (for example) "JSON conforming to the JSON Schema
equivalent of your graph schema". In the default graph schema, I think
there will be one kind of vertex, and one kind of edge, with labels more
like properties than types. The generated JSON for a graph with this flat
schema could be made to look something like GraphSON, though I wouldn't
expect the default representation of a vertex to contain "inE" or "outE"
because a vertex doesn't own/contain its incident edges. You could output
something that *contains* a vertex and also contains the incident edges.
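
As a rough sketch of that last point (everything here is hypothetical: the
class FlatGraphJson, the field names, and the "vertex"/"edges" wrapper are
illustrative choices, not a proposed format), a vertex and an edge can each
be modeled as a simple product of primitive fields. The vertex JSON carries
no "inE" or "outE"; a separate wrapper object contains the vertex together
with its incident edges.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatGraphJson {

    /** Product type (id, label). A vertex does not own its incident edges. */
    static final class Vertex {
        final long id;
        final String label;

        Vertex(long id, String label) {
            this.id = id;
            this.label = label;
        }

        String toJson() {
            return "{\"id\":" + id + ",\"label\":" + quote(label) + "}";
        }
    }

    /** Product type (id, label, outV, inV). Edges refer to vertices by id. */
    static final class Edge {
        final long id;
        final String label;
        final long outV;
        final long inV;

        Edge(long id, String label, long outV, long inV) {
            this.id = id;
            this.label = label;
            this.outV = outV;
            this.inV = inV;
        }

        String toJson() {
            return "{\"id\":" + id + ",\"label\":" + quote(label)
                    + ",\"outV\":" + outV + ",\"inV\":" + inV + "}";
        }
    }

    /** A wrapper that contains a vertex and, alongside it, the incident edges. */
    static String neighborhoodJson(Vertex v, List<Edge> allEdges) {
        String edges = allEdges.stream()
                .filter(e -> e.outV == v.id || e.inV == v.id)
                .map(Edge::toJson)
                .collect(Collectors.joining(","));
        return "{\"vertex\":" + v.toJson() + ",\"edges\":[" + edges + "]}";
    }

    /** Minimal JSON string quoting; handles only backslashes and double quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Vertex marko = new Vertex(1, "person");
        List<Edge> edges = Arrays.asList(
                new Edge(7, "knows", 1, 2),
                new Edge(8, "created", 4, 3));
        System.out.println(marko.toJson());
        System.out.println(neighborhoodJson(marko, edges));
    }
}
```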

Josh

On Wed, Jan 22, 2020 at 12:11 PM Stephen Mallette <[email protected]>
wrote:

> We've long had the issue of how to deal with strings better. Typically the
> concern lies with concatenation, but there are other use cases that have
> come up along the way as well. I started playing around with a format()
> step to try to capture all the odds and ends I have notes on in relation to
> this:
>
> Quickly hacked together I have something that allows:
>
> gremlin> g.V().hasLabel('person').format("%s is %s years old").by('name').by('age')
> ==>marko is 29 years old
> ==>vadas is 27 years old
> ==>josh is 32 years old
> ==>peter is 35 years old
>
> The engine behind the string formatting is the standard Java Formatter. I
> just wanted to see what it could look like so Formatter was an easy choice.
> Of course, Formatter might not be best - part of me would prefer a more
> non-JVM centric sort of templating, perhaps something like:
>
> g.V().hasLabel('person').format("{} is {} years old").by('name').by('age')
>
> which is fairly commonplace across languages (even used in Java in
> libraries like slf4j). That of course made me realize that it wouldn't be
> hard to overload format() to take a formatting engine as an argument so
> that it's extensible:
>
> g.V().hasLabel('person').format("{} is {} years old").by('name').by('age')
>
> g.V().hasLabel('person').format(JAVA, "%s is %s years old").by('name').by('age')
>
> The notion of a formatting engine argument made me think about another
> thing folks tend to want in relation to strings - clean output to JSON (not
> GraphSON exactly with all the embedded types - like think back to GraphSON
> 1 format) and other string formats:
>
> g.V().hasLabel('person').format(JSON)
>
> or perhaps it is just GraphSON??
>
> g.V().hasLabel('person').format(GraphSON_1)
>
> Providers who require special serializers could easily just override the
> FormatStep to configure the engines as necessary.
>
> I think format() helps solve a lot of the common issues with strings and
> Gremlin. Even with the basic Formatter you can do a poor man's sort of
> substring:
>
> gremlin> g.V().hasLabel('person').format("%1.1s").by('name')
> ==>m
> ==>v
> ==>j
> ==>p
>
> I'd imagine that with a more advanced engine we could get something more
> full featured if we wanted to cover even wider general function use cases.
> Not sure if things like substring should be more like first class citizens
> in Gremlin or not though. Anyway, happy to hear any thoughts on the idea of
> format() and what it might mean to Gremlin.
>
