Thanks Rob.

I only intend to take a few samples, so the waste of resources is not really significant. I was almost sure there had to be a "pipeline-only" trick to do the job (and was very curious to know it!), and the "put the record number before the line and play with it" trick was exactly what I was looking for.

But I would never have thought of your second solution. I thought "diskrand" had to be the first stage in a pipeline; in fact, I had never really noticed the "Placement" note in the stages' help. The placements are not as obvious as I thought they were...

Thanks again
Michaël

-----Original Message-----
From: Rob van der Heij <[email protected]>
Sent: 22/12/2009 09:49
To: [email protected] <[email protected]>
Cc:
Subject: Re: Keeping only records number 1, 1+N, 1+2N, ...
On Tue, Dec 22, 2009 at 9:40 AM, DUGALEIX Michaël
<[email protected]> wrote:
Hello all,

I'd like to take a sample of a big "real" file (each line of the file having
the same structure) to test some programs, by putting in my sample only the
lines (for example) 1, 1001, 2001, 3001, ...

(1) something like "pipe < inputFile | spec 1-* 1 read read read ... read |
..." which would get rid of the 999 lines I don't want
(2) something like "pipe diskrand 1 1001 2001 ... | ..."

Well, if you really only want 0.1% of the data, it's probably a bit
wasteful of resources to read the entire file and skip the remaining
records, even when you can do that in an interesting way like this:

  | spec number 1.10 r 1-* n | pick 10 == ,1, | substr 11-*

How about this?

 \ literal | dup 99 | spec number by 1000 1 | diskrand big file a  | ...

Sir Rob the Plumber
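
For readers who don't use CMS Pipelines, the two approaches discussed above can be sketched roughly in Python. This is only an illustration of the ideas, not a translation of the stages themselves: the function names are hypothetical, and the random-access variant assumes fixed-length records so that a record number maps directly to a byte offset (an assumption made for this sketch, not something stated in the thread).

```python
import io


def sample_sequential(lines, step):
    """Keep records 1, 1+step, 1+2*step, ... by scanning every record.

    Mirrors the "number the records, then select on the number" idea:
    every record is read, and most are discarded.
    """
    return [line for number, line in enumerate(lines, start=1)
            if (number - 1) % step == 0]


def sample_random_access(stream, record_length, step, count):
    """Read only the wanted records from a binary stream.

    Mirrors the "generate the record numbers, then read by number" idea.
    ASSUMPTION: every record is exactly record_length bytes long.
    """
    picked = []
    for k in range(count):
        # Record number 1 + k*step starts at byte offset k*step*record_length.
        stream.seek(k * step * record_length)
        record = stream.read(record_length)
        if len(record) < record_length:
            break  # ran past the end of the data
        picked.append(record)
    return picked


if __name__ == "__main__":
    # Made-up test data: 3001 records of 8 bytes each.
    records = [f"rec{n:05d}" for n in range(1, 3002)]
    print(sample_sequential(records, 1000))

    stream = io.BytesIO("".join(records).encode("ascii"))
    print(sample_random_access(stream, 8, 1000, 100))
```

The first function touches every record, while the second only seeks to the records it needs, which is the resource difference pointed out above.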
