Hello:

I hope this is not a double post.

I want to do something simple:

I have a data file, mydata.log, formatted like this:

a1 | b1 | c=foo&d=bar | e1
a2 | b2 | c=john&d=doe | e2
a3 | b3 | c=foo&d=doe | e3
...

and I want to LOAD the data USING <something> so that the AS clause is
(A, B, C, D, E), i.e. two fields are extracted from the third one.

For example :

data = LOAD 'mydata.log' USING <something> AS (A, B, C, D, E);

That is, I want the third field (the one formatted as 'cx=foox&dx=barx') to
be parsed to yield the C and D in my AS list of fields,
so that later on I can do things like:

data_cfoo = FILTER data BY C == 'foo';
data_cfoo_ddoe = FILTER data_cfoo BY D == 'doe';


There has to be a simple way to do that, no?
Passing a regex, a Ruby script, or something else as a parameter to PigStorage, or
using something other than PigStorage?
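
The closest I can picture myself is a two-step sketch like the one below:
load with the stock PigStorage, then split the third field in a FOREACH.
It assumes a Pig version where TRIM and REGEX_EXTRACT are available as
built-in functions, and that c always comes before d in the third field.
The aliases raw, a, b, cd and e are names I made up for illustration.

```pig
-- Step 1: split each line on '|' with the stock PigStorage loader.
-- The fields keep their surrounding spaces at this point.
raw = LOAD 'mydata.log' USING PigStorage('|')
      AS (a:chararray, b:chararray, cd:chararray, e:chararray);

-- Step 2: trim the padding and pull c and d out of the third field.
-- Assumption: the third field always looks like 'c=<value>&d=<value>'.
data = FOREACH raw GENERATE
         TRIM(a) AS A,
         TRIM(b) AS B,
         REGEX_EXTRACT(TRIM(cd), 'c=([^&]*)&d=([^&]*)', 1) AS C,
         REGEX_EXTRACT(TRIM(cd), 'c=([^&]*)&d=([^&]*)', 2) AS D,
         TRIM(e) AS E;
```

That would probably work, but I would rather do the parsing at LOAD time
if there is a cleaner way.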

Many thanks

Yves

