Following up on Jim’s and other responses I would venture to say that one good solution memory and performance wise would be: - loop thru records using NEXT RECORD, or GOTO SELECTED RECORD - preallocate a BLOB to build your text/json; say 5MB is your limit, or maybe 50Mb, or 500Mb, you pick your number - fill in the BLOB with your text/json, one record at a time, using INSERT IN BLOB - when the BLOB is full, or you’re done with the selection, APPEND the blob to your output file, and reset the BLOB offset to 0
running server side on SSD, that should probably a good alternative, I think ;-) (no testing done) hth julio > On Sep 15, 2018, at 7:33 PM, Jim Crate via 4D_Tech <4d_tech@lists.4d.com> > wrote: > > On Sep 15, 2018, at 12:07 AM, David Adams via 4D_Tech <4d_tech@lists.4d.com> > wrote: >> >> Short version: >> I need to load some fields from records into a big text thingy. >> >> The code runs on the server-side only. >> >> I'm keen to preserve RAM. > > [snip] > >> TL;DR version >> I'm working in V17 and I'm hoping that someone has done some real-world >> tests already that could help me out with a question. Here's the setup: I >> need to load up some fields from lots of records and push them into an >> external system. It's going to Postgres, but that's not an important >> detail, the result is a ginormous text object. The result could just as >> well be a text or JSON file dump. The main constraint is available memory. > > > Are you generating large enough text objects to be constrained by memory? If > so, is more memory an option? It’s relatively cheap, probably cheaper than > the time taken to run tests. > > Since you’re running server side, looping through records is going to be fast > enough that you shouldn’t see any meaningful difference in timing between > selection to array and GOTO SELECTED RECORD, especially if the server is > using SSDs. In any case, it should be faster to loop through reading records > than for Postgres to do a search+insert/update; for this particular case, do > you need to build text faster than Postgres can process it? If the Postgres > server is not on a LAN, then WAN speed could end up being the primary > bottleneck and how fast you can build text may be irrelevant. > > I don’t think anyone could have meaningful test results, since you’re asking > about memory usage and also mentioning timing. But, my opinion is if you’re > building text blocks under 2 MB or so, there is no point to test anything. If > you’re building text blocks over 5MB small pieces at a time, the only test > that really matters is appending to a text vs putting text into a pre-sized > blob, especially under memory pressure. If your text object has to be copied > to a different place in memory 10K times plus forcing memory reorganization > several times (assuming 4D even reorganizes its own memory) because you’re > appending to a large text variable in a low-memory situation, then whether > you used SELECTION TO ARRAY or GOTO SELECTED RECORD on the server probably > doesn’t matter. > > If you want to use arrays, the query is fast, and timing is not as critical, > using SQL with LIMIT/OFFSET gets you arrays without the memory load of having > to load an entire selection of millions of records into arrays. > > If the server runs an SSD, memory is truly a problem, and the SSD has plenty > of space, then building a text file is probably a reasonably fast solution. > > >> So I bundle multiple rows >> into a mutli-value insert, and then several of those into a transaction. It >> goes pretty quickly and for smaller pushes, it probably doesn't matter how >> I build up the text, it just isn't that big. (Note for the PG people in the >> crowd, I'm using an UPSERT, so COPY TO isn't an option.) > > Using a staging table, you can effectively combine upsert with copy. Besides > the potential speed increase on the Postgres side, not having to build SQL > statements with 4D’s weak text-building tools is a huge plus. > > https://www.flydata.com//blog/how-to-improve-performance-upsert-amazon-redshift/ > > Jim Crate > > ********************************************************************** > 4D Internet Users Group (4D iNUG) > Archive: http://lists.4d.com/archives.html > Options: https://lists.4d.com/mailman/options/4d_tech > Unsub: mailto:4d_tech-unsubscr...@lists.4d.com > ********************************************************************** -- Julio Carneiro jjfo...@gmail.com ********************************************************************** 4D Internet Users Group (4D iNUG) Archive: http://lists.4d.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4d_tech-unsubscr...@lists.4d.com **********************************************************************