If I understand correctly, you're ultimately looking for a general way
that you can write this kind of record processing code simply in the
future. And that, right now, you're investing some one-time
experimental effort to assess feasibility and to find an approach and
guidelines that you can apply more simply in the future.
Regarding feasibility, unless I misunderstand this pilot application, I
think that it could be done in Racket in a way that scales almost
perfectly with each additional core, limited ultimately only by
filesystem I/O. That might involve Places, perhaps with larger work
units or more efficient communication and coordination; or, as I think
you said earlier, you could simply fork separate processes after
partitioning (just via file positions) the input file. These approaches
could get towards the shallow end of what people do when writing really
performance-sensitive systems, like some Internet or transaction
processing servers (or database engines, or operating system kernels) --
they are not necessarily simple. Though you can generalize the hard
work for simple future use (e.g.,
`for-each-bytes-line-from-file/unordered/indexes`,
`for-each-foo-format-record`, `query-from-foo-format-file-to-bar-file`,
`define-foo-file-transformation`, etc.). Sometimes people on this list
will spend 5-30 minutes figuring out a code solution to a problem posted
on the list, but harder performance-sensitive stuff can take days or
longer to do well, and it has to be done well, by definition.
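As one concrete sketch of the partition-by-file-position idea above
(every name here is invented for illustration, and it assumes
newline-delimited records):

```racket
#lang racket/base

;; Hypothetical sketch: compute `n` start positions for a
;; newline-delimited file, each aligned to the byte just after a
;; newline, so that separate forked processes (or Places) can each
;; handle one range independently.
(define (partition-positions path n)
  (define size (file-size path))
  (call-with-input-file path
    (lambda (in)
      (cons 0
            (for/list ([i (in-range 1 n)])
              ;; Seek to an evenly-spaced guess, then advance past the
              ;; next newline so the range starts on a record boundary.
              (file-position in (quotient (* i size) n))
              (void (read-bytes-line in))
              (file-position in))))))
```

Each returned position begins a range that one worker would scan up to
the next position; in real use you'd also deduplicate positions, since
two guesses can land inside the same line near end of file.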
Going back to "simply", rather than
simply-after-upfront-hard-work-done-by-application-programmer, maybe
there's opportunity for
simply-after-further-hard-work-done-by-core-Racket-programmers... For
example, perhaps some of the core Racket string routines could be
optimized further (I imagine they already got a working-over when
Unicode was added), so that even simple programs run faster. And maybe
there are Places facilities that could be optimized further, or some new
facility added. And maybe there's a research project for better
parallelizing support.
BTW, there might still be a few relatively simple efficiency tweaks in
your current approach (e.g., while skimming, I think I saw a snippet of
code doing something like `(write-bytes (bytes-append ...))`, perhaps to
keep a chunk of bytes contiguous so that it isn't interleaved with other
threads' writes).
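One such tweak, sketched here under the assumption that the
`bytes-append` exists only to make the write atomic: guard the output
port with a semaphore and write the pieces directly, which avoids the
intermediate copy (`write-record` and `out-sema` are made-up names):

```racket
#lang racket/base

;; Serialize access to the shared output port, so several `write-bytes`
;; calls from one record can't interleave with another thread's writes,
;; without first allocating a combined byte string via `bytes-append`.
(define out-sema (make-semaphore 1))

(define (write-record out . chunks)
  (call-with-semaphore out-sema
    (lambda ()
      (for ([c (in-list chunks)])
        (write-bytes c out)))))
```

Whether this wins depends on contention: under heavy contention the
single allocated-and-appended write can be cheaper than holding the
semaphore across several writes, so it's worth measuring both.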
> If the work was parsing HTML or JSON, then the places version would
> probably be worth it on a 4 core machine.
For HTML and JSON parsing, unlike your records application, I think the
parser itself has to be one thread, but you could probably put some
expensive application-specific behavior that happens during the parse in
other threads. Neither my HTML parser nor my JSON parser was designed to be used
that way, but my streaming JSON parser might be amenable to it. The HTML
parser is intended to build a potentially-big AST in one shot, so no
other threads while it's working, though it should be reasonably fast
about it (it was written on a 166MHz Pentium laptop with 48MB RAM,
usually on battery power).
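For the streaming-JSON case, the offloading might take roughly the
following shape. This sketch uses ordinary Racket threads and an async
channel, which shows the coordination but not true parallelism; to use
another core, the same pattern would move to a Place and place channels.
`handle-event` and the parser-side calls are stand-ins, not the actual
parser's API:

```racket
#lang racket/base
(require racket/async-channel)

;; A single parser thread hands each parsed event to a worker over an
;; async channel, so expensive application-specific work happens off
;; the parsing thread.
(define events (make-async-channel))

(define (handle-event ev)  ; hypothetical expensive application hook
  (void ev))

(define worker
  (thread
   (lambda ()
     (let loop ()
       (define ev (async-channel-get events))
       (unless (eof-object? ev)
         (handle-event ev)
         (loop))))))

;; Parser side: (async-channel-put events ev) for each parsed event,
;; then (async-channel-put events eof), and finally (thread-wait worker).
```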
Neil V.
--
You received this message because you are subscribed to the Google Groups "Racket
Users" group.