On Sep 15, 2018, at 12:07 AM, David Adams via 4D_Tech <4d_tech@lists.4d.com> 
wrote:
> 
> Short version:
> I need to load some fields from records into a big text thingy.
> 
> The code runs on the server-side only.
> 
> I'm keen to preserve RAM.

[snip]

> TL;DR version
> I'm working in V17 and I'm hoping that someone has done some real-world
> tests already that could help me out with a question. Here's the setup: I
> need to load up some fields from lots of records and push them into an
> external system. It's going to Postgres, but that's not an important
> detail, the result is a ginormous text object. The result could just as
> well be a text or JSON file dump. The main constraint is available memory.


Are you generating text objects large enough to be constrained by memory? If 
so, is more memory an option? It’s relatively cheap, probably cheaper than the 
time it takes to run tests.

Since you’re running server-side, looping through records is going to be fast 
enough that you shouldn’t see any meaningful difference in timing between 
SELECTION TO ARRAY and GOTO SELECTED RECORD, especially if the server is using 
SSDs. In any case, reading records in a loop should be faster than Postgres 
doing a search plus insert/update; for this particular case, do you need to 
build text faster than Postgres can process it? If the Postgres server is not 
on a LAN, then WAN speed could end up being the primary bottleneck, and how 
fast you can build text may be irrelevant.
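
For reference, the record-by-record version is just something along these 
lines (the table name is a placeholder, and it’s only a sketch of the shape, 
not tested code):

  // Walk the current selection by position; no arrays, one record loaded at a time
C_LONGINT($i)
For ($i;1;Records in selection([MyTable]))
  GOTO SELECTED RECORD([MyTable];$i)
    // ...read the fields you need and build that row's text...
End for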

I don’t think anyone is likely to have meaningful test results, since you’re 
asking about memory usage but also mentioning timing. My opinion: if you’re 
building text blocks under 2 MB or so, there is no point testing anything. If 
you’re building text blocks over 5 MB in small pieces at a time, the only test 
that really matters is appending to a text variable vs. writing text into a 
pre-sized BLOB, especially under memory pressure. If your text object has to 
be copied to a different place in memory 10K times, plus forcing memory 
reorganization several times (assuming 4D even reorganizes its own memory), 
because you’re appending to a large text variable in a low-memory situation, 
then whether you used SELECTION TO ARRAY or GOTO SELECTED RECORD on the server 
probably doesn’t matter.
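
To be concrete about the pre-sized BLOB idea, here’s roughly what I mean; the 
table/field names and the 50 MB pre-size are made up, and it’s an untested 
sketch:

  // Pre-size the buffer once, then write each row's text at an offset
  // instead of appending to a text variable
C_BLOB($vBuffer)
C_LONGINT($vOffset)
C_TEXT($vChunk)

SET BLOB SIZE($vBuffer;50*1024*1024)  // guess high enough to avoid repeated reallocation
$vOffset:=0

FIRST RECORD([MyTable])
While (Not(End selection([MyTable])))
  $vChunk:=[MyTable]SomeText+Char(Carriage return)  // whatever text you build per record
  TEXT TO BLOB($vChunk;$vBuffer;UTF8 text without length;$vOffset)  // offset advances by bytes written
  NEXT RECORD([MyTable])
End while

SET BLOB SIZE($vBuffer;$vOffset)  // trim back to the bytes actually written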

If you want to use arrays, the query is fast, and timing is not as critical, 
then using SQL with LIMIT/OFFSET gets you arrays without having to load an 
entire selection of millions of records into memory at once.
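
Something like this, one page at a time (placeholder table/field names; I 
haven’t verified whether 4D’s SQL takes bound variables in LIMIT/OFFSET, so 
you may need to build the statement text and run it with SQL EXECUTE to 
advance the offset between pages):

  // Fetch one page of rows into arrays instead of the whole selection
ARRAY LONGINT($alID;0)
ARRAY TEXT($atName;0)

Begin SQL
  SELECT ID, Name
  FROM MyTable
  ORDER BY ID
  LIMIT 10000 OFFSET 0
  INTO :$alID, :$atName;
End SQL

  // Repeat with a larger OFFSET until a page comes back with fewer than 10000 rows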

If the server has an SSD, memory is truly a problem, and the SSD has plenty of 
space, then building a text file on disk is probably a reasonably fast solution.
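
A sketch of that, reusing the same per-record loop; the file name is made up, 
and SEND PACKET just appends each chunk to the open document, so nothing large 
ever accumulates in memory:

C_TIME($vDoc)
C_TEXT($vChunk)

$vDoc:=Create document("export_rows.sql")  // hypothetical file name
If (OK=1)
  FIRST RECORD([MyTable])
  While (Not(End selection([MyTable])))
    $vChunk:=[MyTable]SomeText+Char(Carriage return)
    SEND PACKET($vDoc;$vChunk)  // append to the document; memory stays flat
    NEXT RECORD([MyTable])
  End while
  CLOSE DOCUMENT($vDoc)
End if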


> So I bundle multiple rows
> into a multi-value insert, and then several of those into a transaction. It
> goes pretty quickly and for smaller pushes, it probably doesn't matter how
> I build up the text, it just isn't that big. (Note for the PG people in the
> crowd, I'm using an UPSERT, so COPY TO isn't an option.)

Using a staging table, you can effectively combine an UPSERT with COPY. 
Besides the potential speed increase on the Postgres side, not having to build 
SQL statements with 4D’s weak text-building tools is a huge plus.

https://www.flydata.com//blog/how-to-improve-performance-upsert-amazon-redshift/
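
On the Postgres side it’s roughly this (made-up table and column names, and it 
assumes a unique constraint on id; the COPY would stream the text you build in 
4D, from STDIN or from a file):

-- Load the raw rows into an unindexed staging table first...
CREATE TEMP TABLE staging (LIKE target INCLUDING DEFAULTS);

COPY staging (id, name, updated_at) FROM STDIN;

-- ...then upsert from staging into the real table in one statement
INSERT INTO target (id, name, updated_at)
SELECT id, name, updated_at FROM staging
ON CONFLICT (id) DO UPDATE
  SET name       = EXCLUDED.name,
      updated_at = EXCLUDED.updated_at;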

Jim Crate

**********************************************************************
4D Internet Users Group (4D iNUG)
Archive:  http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**********************************************************************
