Following up on Jim’s and other responses, I would venture that one good
solution, memory- and performance-wise, would be:
- loop through the records using NEXT RECORD, or GOTO SELECTED RECORD
- preallocate a BLOB to build your text/JSON; say 5 MB is your limit, or maybe 
50 MB, or 500 MB, you pick your number
- fill the BLOB with your text/JSON, one record at a time, using INSERT IN 
BLOB
- when the BLOB is full, or you’re done with the selection, APPEND the BLOB to 
your output file, and reset the BLOB offset to 0

Running server-side on SSD, that should probably be a good alternative, I 
think ;-) (no testing done); a rough sketch is below.
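
Untested, and table/field names are made up; note I used TEXT TO BLOB with an 
offset instead of INSERT IN BLOB to write each chunk, and SEND PACKET to 
append the buffer to the document:

  C_BLOB($buffer)
  C_LONGINT($offset;$limit)
  C_TIME($doc)
  C_TEXT($chunk)

  $limit:=50*1024*1024  // pick your number: 5 MB, 50 MB, 500 MB...
  SET BLOB SIZE($buffer;$limit)  // preallocate once, so the BLOB isn't resized on every write
  $offset:=0
  $doc:=Create document("export.json")  // made-up output file name

  FIRST RECORD([MyTable])
  While (Not(End selection([MyTable])))
    $chunk:=[MyTable]SomeField+Char(Carriage return)  // build this record's text/JSON here
    TEXT TO BLOB($chunk;$buffer;UTF8 text without length;$offset)  // writes at $offset and advances it

    If ($offset>=$limit)  // buffer is "full": flush it to disk
      SET BLOB SIZE($buffer;$offset)  // trim to what was actually written
      SEND PACKET($doc;$buffer)  // append to the output file
      SET BLOB SIZE($buffer;$limit)  // restore the preallocated size
      $offset:=0
    End if

    NEXT RECORD([MyTable])
  End while

  If ($offset>0)  // flush whatever is left over
    SET BLOB SIZE($buffer;$offset)
    SEND PACKET($doc;$buffer)
  End if
  CLOSE DOCUMENT($doc)

The point is that the BLOB is allocated once and only flushed every $limit 
bytes, instead of growing a text variable (or the BLOB itself) record by 
record.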

hth
julio

> On Sep 15, 2018, at 7:33 PM, Jim Crate via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> On Sep 15, 2018, at 12:07 AM, David Adams via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
>> 
>> Short version:
>> I need to load some fields from records into a big text thingy.
>> 
>> The code runs on the server-side only.
>> 
>> I'm keen to preserve RAM.
> 
> [snip]
> 
>> TL;DR version
>> I'm working in V17 and I'm hoping that someone has done some real-world
>> tests already that could help me out with a question. Here's the setup: I
>> need to load up some fields from lots of records and push them into an
>> external system. It's going to Postgres, but that's not an important
>> detail, the result is a ginormous text object. The result could just as
>> well be a text or JSON file dump. The main constraint is available memory.
> 
> 
> Are you generating large enough text objects to be constrained by memory? If 
> so, is more memory an option? It’s relatively cheap, probably cheaper than 
> the time taken to run tests.
> 
> Since you’re running server side, looping through records is going to be fast 
> enough that you shouldn’t see any meaningful difference in timing between 
> SELECTION TO ARRAY and GOTO SELECTED RECORD, especially if the server is 
> using SSDs.  In any case, it should be faster to loop through reading records 
> than for Postgres to do a search+insert/update; for this particular case, do 
> you need to build text faster than Postgres can process it? If the Postgres 
> server is not on a LAN, then WAN speed could end up being the primary 
> bottleneck and how fast you can build text may be irrelevant. 
> 
> I don’t think anyone could have meaningful test results, since you’re asking 
> about memory usage and also mentioning timing. But, my opinion is if you’re 
> building text blocks under 2 MB or so, there is no point to test anything. If 
> you’re building text blocks over 5 MB in small pieces at a time, the only test 
> that really matters is appending to a text vs putting text into a pre-sized 
> blob, especially under memory pressure. If your text object has to be copied 
> to a different place in memory 10K times plus forcing memory reorganization 
> several times (assuming 4D even reorganizes its own memory) because you’re 
> appending to a large text variable in a low-memory situation, then whether 
> you used SELECTION TO ARRAY or GOTO SELECTED RECORD on the server probably 
> doesn’t matter.
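
The "only test that really matters" here is cheap to sketch, by the way; a 
throwaway comparison might look like this (untested, chunk and iteration count 
made up):

  C_TEXT($text;$chunk)
  C_BLOB($buffer)
  C_LONGINT($i;$offset;$start;$msText;$msBlob)

  $chunk:="some per-record text, call it 100 bytes or so"  // stand-in for real data

  // 1) naive approach: append to a text variable
  $start:=Milliseconds
  $text:=""
  For ($i;1;100000)
    $text:=$text+$chunk  // each append may copy the whole text built so far
  End for
  $msText:=Milliseconds-$start

  // 2) pre-sized BLOB approach
  $start:=Milliseconds
  SET BLOB SIZE($buffer;100000*Length($chunk))  // rough upper bound (assumes 1 byte per character)
  $offset:=0
  For ($i;1;100000)
    TEXT TO BLOB($chunk;$buffer;UTF8 text without length;$offset)
  End for
  $msBlob:=Milliseconds-$start

Compare $msText and $msBlob, ideally while the machine is actually under 
memory pressure.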
> 
> If you want to use arrays, the query is fast, and timing is not as critical, 
> then using SQL with LIMIT/OFFSET gets you arrays without the memory load of 
> loading an entire selection of millions of records into arrays all at once.
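
Roughly, for that LIMIT/OFFSET idea (untested, table/field names invented, and 
I'm not certain bound variables are accepted in LIMIT/OFFSET; if not, the 
statement would have to be built as a string):

  ARRAY LONGINT($aIDs;0)
  ARRAY TEXT($aNames;0)
  C_LONGINT($offset;$batchSize)

  $batchSize:=10000
  $offset:=0
  Repeat
    Begin SQL
      SELECT ID, Name
      FROM MyTable
      LIMIT :$batchSize OFFSET :$offset
      INTO :$aIDs, :$aNames;
    End SQL
    // build text from $aIDs/$aNames here, one batch at a time
    $offset:=$offset+$batchSize
  Until (Size of array($aIDs)<$batchSize)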
> 
> If the server runs an SSD, memory is truly a problem, and the SSD has plenty 
> of space, then building a text file is probably a reasonably fast solution. 
> 
> 
>> So I bundle multiple rows
>> into a multi-value insert, and then several of those into a transaction. It
>> goes pretty quickly and for smaller pushes, it probably doesn't matter how
>> I build up the text, it just isn't that big. (Note for the PG people in the
>> crowd, I'm using an UPSERT, so COPY TO isn't an option.)
> 
> Using a staging table, you can effectively combine UPSERT with COPY. Besides 
> the potential speed increase on the Postgres side, not having to build SQL 
> statements with 4D’s weak text-building tools is a huge plus.
> 
> https://www.flydata.com//blog/how-to-improve-performance-upsert-amazon-redshift/
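
On the Postgres side, the staging-table pattern looks roughly like this 
(table and column names invented, untested); the linked article is about 
Redshift, but on plain Postgres the upsert from the staging table can be a 
single INSERT ... ON CONFLICT statement:

  -- load the whole batch into a throwaway staging table
  CREATE TEMP TABLE staging (LIKE target INCLUDING DEFAULTS);
  COPY staging (id, name, updated_at) FROM STDIN;

  -- then upsert from staging into the real table in one statement
  INSERT INTO target (id, name, updated_at)
  SELECT id, name, updated_at FROM staging
  ON CONFLICT (id) DO UPDATE
    SET name = EXCLUDED.name,
        updated_at = EXCLUDED.updated_at;

  DROP TABLE staging;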
> 
> Jim Crate
> 

--
Julio Carneiro
jjfo...@gmail.com



**********************************************************************
4D Internet Users Group (4D iNUG)
Archive:  http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**********************************************************************
