On Tuesday, January 19, 2016 at 1:13:44 AM UTC-5, Brian Adkins wrote:
> I've finalized the sequential version of my program to convert a large 
> fixed-length field file into two distinct output files (one per table) 
> suitable for bulk import into postgres.
> 
> https://gist.github.com/lojic/413f972bcaf1a6b156e2
> 
> On a single core, it runs 5.35x longer than the C program and sustains 
> ~15 MB/s of file output. So, I'd now like to create a parallel version using 
> places, and I would like some general, high-level design advice from anyone 
> who has been in the trenches with applications using places.
> 
> What organizational structure would you recommend for a program that reads 
> byte string lines from one file, transforms them, and writes them to two 
> output files?
> 
> One option is the following:
> 
> InputFile -> InputPlace -> InputQueue -> N ProcessPlaces -> OutputQueue -> 
> OutputPlace -> [ OutputFile1, OutputFile2 ]
> 
> Another option would be to have N shared byte strings (one per process place) 
> and rotate through them as lines arrive. I'm not sure whether they can be 
> synchronized on directly.
> 
> Another option would be to have the input place write lines to the channel of 
> specific process places in a round robin manner.
> 
> I'll also need some mechanism to determine when to write the special last 
> line of each of the two output files.
> 
> I don't mind experimenting with a few approaches, but if I can glean some 
> community wisdom from folks who have been through the trial & error already, 
> that would be great!
> 
> Thanks,
> Brian

The main loop is lines 22 through 59 of parser.rkt in the gist:

https://gist.github.com/lojic/413f972bcaf1a6b156e2
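
For concreteness, here is a minimal sketch of the round-robin option from the
quoted message. It is not taken from the gist: transform-line, start-worker,
run-round-robin and the 'done sentinel are hypothetical names, and the
transformation is a placeholder for the real per-line parsing. It assumes the
number of workers is at least 1, and it sends every line before reading any
result, which only suits small inputs because place channels buffer without
bound.

```racket
#lang racket/base
;; Sketch only: round-robin dispatch of byte-string lines to N worker places.
(require racket/place)

;; Hypothetical stand-in for the real per-line transformation.
(define (transform-line line)
  (bytes-append #"out:" line))

;; Each worker reads lines from its channel, sends back the transformed
;; line, and exits when it receives the 'done sentinel (an assumption of
;; this sketch, not something places provide by themselves).
(define (start-worker)
  (place ch
    (let loop ()
      (define msg (place-channel-get ch))
      (unless (eq? msg 'done)
        (place-channel-put ch (transform-line msg))
        (loop)))))

;; lines : list of byte strings, n : positive number of worker places.
(define (run-round-robin lines n)
  (define workers (for/list ([i (in-range n)]) (start-worker)))
  ;; Send line i to worker (i mod n).
  (for ([line (in-list lines)]
        [w (in-cycle workers)])
    (place-channel-put w line))
  ;; Read results back in the same rotation. Each worker answers its own
  ;; lines in FIFO order, so this recovers the original line order.
  (define results
    (for/list ([line (in-list lines)]
               [w (in-cycle workers)])
      (place-channel-get w)))
  ;; Ask the workers to stop, then wait for them to exit.
  (for ([w (in-list workers)])
    (place-channel-put w 'done))
  (for-each place-wait workers)
  results)

(module+ main
  (for ([out (in-list (run-round-robin (list #"a" #"b" #"c") 2))])
    (displayln out)))
```

Since the main place sees every result before it sends 'done, that is also
one possible spot to write the special last line of each output file. A real
version would likely interleave sends and receives, or batch many lines per
message, to bound memory and reduce per-message copying.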
