Re: [CMS-PIPELINES] Keeping only records number 1, 1+N, 1+2N, ...

DUGALEIX Michaël Tue, 22 Dec 2009 02:21:55 -0800

Thanks Rob.

I only intend to take a few samples, so the waste of resources is not really significant. I was just nearly sure that there had to be a "pipeline-only" trick to do the job (and very curious to know it!), and the "put the record number before the line and play with it" trick was exactly what I was searching for.

But I would never have thought of your second solution (I thought "diskrand" had to be the first stage in a pipeline; in fact, I never really noticed the "Placement" note in the stages' help; the placements are not as obvious as I thought they were ...).


Thanks again
Michaël

-----Original Message-----
From: Rob van der Heij <[email protected]>
Sent: 22/12/2009 09:49
To: [email protected]<[email protected]>
Cc:
Subject: Re: Keeping only records number 1, 1+N, 1+2N, ...

On Tue, Dec 22, 2009 at 9:40 AM, DUGALEIX Michaël
<[email protected]> wrote:

Hello all,

I'd like to take a sample of a big "real" file (each line of the file having
the same structure) to test some programs, by putting in my sample only the
lines (for example) 1, 1001, 2001, 3001, ...

(1) something like "pipe < inputFile | spec 1-* 1 read read read ... read |
..." which would get rid of the 999 lines I don't want
(2) something like "pipe diskrand 1 1001 2001 ... | ..."


Well, if you really only want 0.1% of the data, it's probably a bit
of a waste of resources to read the entire file and skip the remaining
records. Even when you can do that in an interesting way like this:

  | spec number 1.10 r 1-* n | pick 10 == ,1, | substr 11-*
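For readers without a CMS system at hand, the "number each record, select on the number, strip the number again" idea can be sketched in Python. This is only an illustration of the logic, not a translation of the pipeline stages; the function name `every_nth` and the `step` parameter are made up for this example.

```python
from typing import Iterable, Iterator


def every_nth(records: Iterable[str], step: int = 1000) -> Iterator[str]:
    """Yield records 1, 1+step, 1+2*step, ... (1-based numbering).

    Hypothetical helper: the whole input is still read, only the
    unwanted records are thrown away.
    """
    # Pair each record with its record number (the "prefix the number" step).
    numbered = enumerate(records, start=1)
    # Keep the wanted numbers and drop the number again.
    return (rec for n, rec in numbered if (n - 1) % step == 0)


# Example input: 3001 generated lines standing in for the big file.
sample = list(every_nth((f"line {i}" for i in range(1, 3002)), step=1000))
```

As in the pipeline version, every record is read, so the cost grows with the size of the file rather than with the size of the sample.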

How about this?

 \ literal | dup 99 | spec number by 1000 1 | diskrand big file a  | ...
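The second idea, generating the wanted record numbers first and then fetching only those records, can be sketched in Python as well. This is an illustration under a loud assumption: the file has fixed-length records of `reclen` bytes, so a record's position can be computed from its number. The names `sample_numbers`, `read_records` and `reclen` are hypothetical and do not come from CMS Pipelines.

```python
from typing import Iterable, Iterator


def sample_numbers(count: int, step: int = 1000) -> range:
    """Record numbers 1, 1+step, 1+2*step, ... (`count` of them)."""
    return range(1, 1 + count * step, step)


def read_records(path: str, numbers: Iterable[int], reclen: int) -> Iterator[bytes]:
    """Yield the listed 1-based records from a fixed-length-record file.

    Assumption: every record is exactly `reclen` bytes long.
    """
    with open(path, "rb") as f:
        for n in numbers:
            f.seek((n - 1) * reclen)  # jump straight to record n
            rec = f.read(reclen)
            if len(rec) < reclen:     # ran past the end of the file
                break
            yield rec
```

With `sample_numbers(100)` this touches at most 100 records, however large the file is, which is the point of the random-access approach.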

Sir Rob the Plumber
