If I understand correctly, you're ultimately looking for a general way to write this kind of record-processing code simply in the future, and, right now, you're investing some one-time experimental effort to assess feasibility and to find an approach/guidelines that you can apply more simply later.

Regarding feasibility, unless I misunderstand this pilot application, I think that it could be done in Racket in a way that scales almost perfectly with each additional core added, limited only by the ultimate filesystem I/O. That might involve Places, perhaps with larger work units or more efficient communication and coordination; or, as I think you said earlier, you could simply fork separate processes after partitioning (just via file positions) the input file. These approaches could get towards the shallow end of what people do when writing really performance-sensitive systems, like some Internet or transaction processing servers (or database engines, or operating system kernels) -- they are not necessarily simple. Though you can generalize the hard work for simple future use (e.g., `for-each-bytes-line-from-file/unordered/indexes`, `for-each-foo-format-record`, `query-from-foo-format-file-to-bar-file`, `define-foo-file-transformation`, etc.). Sometimes people on this list will spend 5-30 minutes figuring out a code solution to a problem posted on the list, but harder performance-sensitive stuff can take days or longer to do well, and it has to be done well, by definition.
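To make the "partitioning just via file positions" idea concrete, here's a hedged sketch (not your actual code; `partition-file-at-newlines` is a name I made up) of computing N byte ranges whose boundaries fall on newlines, so each range could be handed off to a separate place or forked process:

```racket
#lang racket
;; Hypothetical sketch: split a file into n byte ranges whose
;; boundaries land just past a newline, so each worker sees only
;; whole records.  Each (start . end) pair can go to its own
;; place or forked process.
(define (partition-file-at-newlines path n)
  (define size (file-size path))
  (define in (open-input-file path))
  ;; Advance a raw position to just past the next newline byte.
  (define (align pos)
    (file-position in pos)
    (let loop ([p pos])
      (define b (read-byte in))
      (cond [(eof-object? b) size]
            [(= b 10)        (add1 p)]   ; 10 = LF
            [else            (loop (add1 p))])))
  (define boundaries
    (for/list ([i (in-range 1 n)])
      (align (quotient (* i size) n))))
  (close-input-port in)
  ;; Pair each start with the following boundary (or end of file).
  (for/list ([s (cons 0 boundaries)]
             [e (append boundaries (list size))])
    (cons s e)))
```

Each worker would then open the file itself, seek to its `start`, and read lines until it reaches `end` -- no coordination needed until results are combined.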

Going back to "simply", rather than simply-after-upfront-hard-work-done-by-application-programmer, maybe there's opportunity for simply-after-further-hard-work-done-by-core-Racket-programmers... For example, perhaps some of the core Racket string routines could be optimized further (I imagine they already got a working-over when Unicode was added), so that even simple programs run faster. And maybe there are Places facilities that could be optimized further, or some new facility added. And maybe there's a research project in better parallelization support.

BTW, there might still be a few relatively simple efficiency tweaks in your current approach (e.g., while skimming, I think I saw a snippet of code doing something like `(write-bytes (bytes-append ...))`, perhaps to try to keep a chunk of bytes contiguous, for interleaving with other threads' writes).
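One way to keep a record's bytes contiguous in the output without paying for an intermediate `bytes-append` allocation per record is to guard the small writes with a semaphore instead. A hedged sketch (assuming the goal really is uninterleaved output across threads; `emit-record` and the tab-separated layout are my invention):

```racket
#lang racket
;; Hypothetical sketch: rather than
;;   (write-bytes (bytes-append field1 #"\t" field2 #"\n") out)
;; which allocates a fresh byte string per record, hold a semaphore
;; across several small writes so they stay contiguous relative to
;; other threads' writes, with no intermediate allocation.
(define write-sema (make-semaphore 1))

(define (emit-record out field1 field2)
  (call-with-semaphore write-sema
    (lambda ()
      (write-bytes field1 out)
      (write-bytes #"\t" out)
      (write-bytes field2 out)
      (write-bytes #"\n" out))))
```

Whether this wins depends on record size and contention; for very small records the semaphore overhead could dominate, so it's worth measuring both ways.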

If the work were parsing HTML or JSON, then the places version would probably be worth it on a 4-core machine.

For HTML and JSON parsing, unlike your records application, I think the parser itself has to be one thread, but you could probably put some expensive application-specific behavior that happens during the parse in other threads. Neither my HTML parser nor my JSON parser was designed to be used that way, but my streaming JSON parser might be amenable to it. The HTML parser is intended to build a potentially-big AST in one shot, so no other threads run while it's working, though it should be reasonably fast about it (it was written on a 166MHz Pentium laptop with 48MB RAM, usually on battery power).
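The single-parser-thread-plus-workers shape could be sketched with an async channel (note that plain Racket threads give concurrency, not parallelism -- for true parallelism the workers would be places; `expensive-per-item-work` is a stand-in for the application-specific behavior):

```racket
#lang racket
(require racket/async-channel)

;; Stand-in for whatever expensive application-specific work
;; happens per parsed item.
(define (expensive-per-item-work item)
  (void))

;; Bounded channel so the parser thread blocks instead of
;; outrunning the workers.
(define items (make-async-channel 100))

;; Worker threads: drain items until they see eof.
(define workers
  (for/list ([_ (in-range 4)])
    (thread
     (lambda ()
       (let loop ()
         (define item (async-channel-get items))
         (unless (eof-object? item)
           (expensive-per-item-work item)
           (loop)))))))

;; The (single) parser thread would call
;;   (async-channel-put items parsed-item)
;; for each item, then put one eof per worker, then
;; (for-each thread-wait workers) to finish.
```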

Neil V.

--
You received this message because you are subscribed to the Google Groups "Racket Users" group.