On 20.03.24 13:03, Matthias van de Meent wrote:
On Wed, 20 Mar 2024 at 12:49, Peter Eisentraut <pe...@eisentraut.org> wrote:

On 19.03.24 17:13, Peter Eisentraut wrote:
On 11.03.24 21:52, Matthias van de Meent wrote:
* v7-0003-gen_node_support.pl-Mark-location-fields-as-type-.patch

This looks sensible, but maybe making Location a global type is a bit
much?  Maybe something more specific like ParseLocation, or ParseLoc, to
keep it under 12 characters.
I've gone with ParseLoc in the attached v8 patchset.

I have committed this one.

Next, I was looking at
v8-0003-pg_node_tree-Don-t-store-query-text-locations-in-.patch.

[...]

So anyway, my idea was that we should turn this around and make
nodeToString() always drop location information, and instead add
nodeToStringWithLocations() for the few debugging uses.  And this would
also be nice because then it matches exactly with the existing
stringToNodeWithLocations().

That seems reasonable, yes.

I have committed that one.

This takes care of your patches v8-0002 and v8-0003.

About the rest of your patch set:

As long as we have only one output format, we need to balance several uses, including debugging, storage size, (de)serialization performance.

Your patches v8-0005 and up are clearly positive for storage size but negative for debugging. So I think we can't consider them now.

Your patches v8-0001 ("pg_node_tree: Omit serialization of fields with default values.") and v8-0004 ("gen_node_support.pl: Add a TypMod type for signalling TypMod behaviour") are also good for storage size. I don't know how they affect serialization performance. I also don't know how good they are for debugging. I have argued here and there that omitting unset fields can make node dumps more readable. But that's just me. I have looked at a lot of Query and RangeTblEntry nodes lately, which contain many rarely used fields. But other people might have completely different experiences, with other node and tree types. We didn't get much feedback from anyone else in this thread, so I'm very hesitant to impose this on everyone without any consensus.

I could see "Omit serialization of fields with default values" as a separate toggle for debug node dumps.

Also, there is clearly some lingering interesting in a separate binary-ish serialization format for internal use. This should probably also take a look at (de)serialization performance, which we haven't really done in this thread. In a way, with the omit default values patch, the serialization and deserialization does more work, so it could have an adverse impact. But we don't know.

I think to proceed we need more buy-in on the "what do I want from my node dumps" side, and more performance numbers on the other side. Saving a few hundred kilobytes on view storage is fine but isn't by itself that useful, I think, if it potentially negatively affects other uses.


Reply via email to