On 2023-10-24 Tu 12:08, Robert Haas wrote:

It looks like each file entry in the manifest takes about 150 bytes, so
1 GB would allow for 1024**3/150 = 7158278 files.  That seems fine for now?
I suspect a few people have more files than that. They'll just have to
wait to use this feature until we get incremental JSON parsing (or
undo the decision to use JSON for the manifest).


Robert asked me to work on this quite some time ago, and most of this work was done last year.

Here's my WIP for an incremental JSON parser. It works and passes all the usual json and jsonb tests. It implements Algorithm 4.3 in the Dragon Book. The reason I haven't posted it before is that, in my testing, it's about 50% slower in pure parsing speed than the current recursive descent parser. I've tried various things to make it faster, but none of them has had much impact. One of my colleagues is going to take a fresh look at it, but maybe someone on the list can see where we can save some cycles.

If we can't make it faster, I guess we could use the RD parser for non-incremental cases and only use the non-RD parser for incremental parsing, although that would be a bit sad. However, I don't think we can make the RD parser suitable for incremental parsing: there's too much state held in the call stack.
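
To illustrate why an explicit parse stack makes incremental parsing straightforward, here is a minimal sketch of a table-driven, non-recursive predictive parser for a simplified JSON grammar. It is not code from the attached patch. The grammar, the token names ("str", "scalar", "$"), and the IncrementalParser class are all assumptions made for illustration, and the input is assumed to be already split into tokens. All parser state lives in self.stack, so parsing can stop after any token and resume when the next chunk arrives.

```python
# Illustrative sketch only: hypothetical names, simplified grammar,
# and pre-lexed tokens. It validates structure; it builds no values.

TERMINALS = {"{", "}", "[", "]", ",", ":", "str", "scalar", "$"}

# Predictive parsing table: (nonterminal, lookahead) -> production body.
TABLE = {
    ("VALUE", "scalar"): ["scalar"],
    ("VALUE", "str"): ["str"],
    ("VALUE", "{"): ["{", "MEMBERS", "}"],
    ("VALUE", "["): ["[", "ELEMENTS", "]"],
    ("MEMBERS", "}"): [],
    ("MEMBERS", "str"): ["str", ":", "VALUE", "MORE_PAIRS"],
    ("MORE_PAIRS", "}"): [],
    ("MORE_PAIRS", ","): [",", "str", ":", "VALUE", "MORE_PAIRS"],
    ("ELEMENTS", "]"): [],
    ("ELEMENTS", "scalar"): ["VALUE", "MORE_ELEMENTS"],
    ("ELEMENTS", "str"): ["VALUE", "MORE_ELEMENTS"],
    ("ELEMENTS", "{"): ["VALUE", "MORE_ELEMENTS"],
    ("ELEMENTS", "["): ["VALUE", "MORE_ELEMENTS"],
    ("MORE_ELEMENTS", "]"): [],
    ("MORE_ELEMENTS", ","): [",", "VALUE", "MORE_ELEMENTS"],
}


class IncrementalParser:
    """Keeps the whole parse state in an explicit stack (top = end of list)."""

    def __init__(self):
        # "$" marks end of input; "VALUE" is the start symbol.
        self.stack = ["$", "VALUE"]

    def feed(self, tokens):
        """Consume one chunk of tokens; may be called any number of times."""
        for tok in tokens:
            self._step(tok)

    def finish(self):
        """Signal end of input; raises if the document is incomplete."""
        self._step("$")

    def _step(self, tok):
        # Expand nonterminals until a terminal on the stack matches tok.
        while True:
            if not self.stack:
                raise ValueError(f"unexpected token {tok!r} after end of document")
            top = self.stack.pop()
            if top in TERMINALS:
                if top != tok:
                    raise ValueError(f"expected {top!r}, got {tok!r}")
                return
            production = TABLE.get((top, tok))
            if production is None:
                raise ValueError(f"unexpected token {tok!r} while parsing {top}")
            # Push the production body reversed so its first symbol is on top.
            self.stack.extend(reversed(production))


# Usage: the input arrives in two chunks, split in the middle of an object.
parser = IncrementalParser()
parser.feed(["{", "str", ":"])
parser.feed(["[", "scalar", ",", "scalar", "]", "}"])
parser.finish()
```

A recursive descent parser keeps the equivalent of self.stack implicitly in the C call stack, which is why it cannot simply return to the caller in the middle of a document and pick up again later.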


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment: json_incremental_parser-2023-09-25.patch.gz
Description: application/gzip
