On Fri, Aug 1, 2008 at 9:50 AM, Matthew Brand <[EMAIL PROTECTED]> wrote:
> I was wondering whether it is possible to write an adverb that auto-splits
> whatever is coming in to it on the basis of available memory, or to
> make some kind of chunkification happen automatically... but I don't think it
> would be easy to do. One might input a set of asserts to guide the splitting
> process and use info about the RAM size to determine whether it is necessary
> and how many splits to do.

The concept of available memory seems rather ambiguous, given that the
problem has to do with J using available virtual memory.

However, using a pre-determined block size is not all that bad.  The only
issue, really, is that you want to work on complete records (complete
lines) rather than on arbitrarily sized blocks.
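To illustrate what working on complete lines means here, a minimal sketch: split a buffer just after its last line feed, process the first part, and carry the rest over to the next block. The name splitAtLastLF is hypothetical, made up for this illustration.

```j
NB. Hypothetical helper: split a character buffer after its last LF.
NB. Result is two boxes: the complete lines, then the leftover partial line.
splitAtLastLF=: 3 : 0
  k=. y i: LF                 NB. index of the last LF, or #y if there is none
  end=. (k < #y) * >: k       NB. 0 when the buffer contains no LF at all
  (end {. y) ; end }. y
)

splitAtLastLF 'one',LF,'two',LF,'thr'
```

The second box ('thr' here) is what would be prepended to the next block that is read.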

Anyway, I would suggest something like this:
blocksOfLines=: 2 :0
:
  NB. u: monad applied to each block of complete lines
  NB. n: base block size (characters read per block)
  NB. y: src file name
  NB. x: dest file name
  blks=.  (,.~ [EMAIL PROTECTED]) -.&0 n (| ,~ [ #~ <[EMAIL PROTECTED]) 1!:4<y
  pfx=. ''
  '' 1!:2<x
  for_blk. blks do.
    raw=. pfx,1!:11 y;blk
    end=. raw ([EMAIL PROTECTED] 0:^:< >:@i:) LF
    pfx=. end }. raw
    (u end {. raw) 1!:3<x
  end.
  if.#pfx do.(u pfx) 1!:3<x end.
  1!:4<x
)

chun=: blocksOfLines 1e6
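
Since 1!:11 reads from file;index,length, each row of blks is presumably an (offset, length) pair. A rough sketch of how such a table could be computed from a file size and a block size follows; the name blockTable is hypothetical, and this is my reading of the intent rather than the exact expression used above.

```j
NB. Hypothetical sketch: (offset, length) rows covering a file of y
NB. characters in blocks of x characters.
blockTable=: 4 : 0
  lens=. ((<. y % x) # x) , x | y   NB. the full blocks, then the remainder
  lens=. lens -. 0                  NB. drop an empty trailing block
  offs=. }: 0 , +/\ lens            NB. exclusive running sum of the lengths
  offs ,. lens
)

10 blockTable 25
```

For a 25-character file and a block size of 10 this gives the rows 0 10, 10 10 and 20 5.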

For example:
   'test.out' ([: ; <@('> ' , ]);.2) chun 'test.in'
would create a file test.out containing every line of test.in with
the character sequence '> ' prepended to it.  It would process the
file in blocks of approximately 1 million characters each.  (Note
that I have assumed, for this example, that the file is well
formed: that its final character is a line feed.)
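
The per-block verb can be tried on a small in-memory string before running it against a file. A minimal sketch, where the name prefixLines is just a label I am giving the verb from the example:

```j
NB. The verb from the example, named for convenience (name is hypothetical).
prefixLines=: [: ; <@('> ' , ]);.2

prefixLines 'first',LF,'second',LF
```

Here ;.2 takes the last character of its argument as the delimiter and keeps it at the end of each piece, which is why the example relies on the text ending in a line feed.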

FYI,

-- 
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
