Dump of A:
(100,123.98.11.123,google.com,{(google)},20121201_G,20121201)
(95,500.98.11.123,yahoo.com,{(yahoo)},20121201_Y,20121201)
(107,123.98.11.123,google.com,{(google)},20121201_G,20121201)
(156,123.98.11.123,cnn.com,{(cnn)},20121201_C,20121201)
(100,500.98.11.123,ndtv.com,{(ndtv)},20121201_N,20121201)
(200,123.98.11.123,google.com,{(google)},20121202_G,20121202)
(283,500.98.11.123,yahoo.com,{(yahoo)},20121202_Y,20121202)
(283,500.98.11.123,pinterest.com,{(pinterest)},20121202_P,20121202)
(204,600.10.100.221,bbc.com,{(bbc)},20121202_B,20121202)


Dump of B:
(100,g,20121201)
(95,y,20121201)
(107,g,20121201)
(156,c,20121201)
(100,n,20121201)
(200,g,20121202)
(283,y,20121202)
(283,p,20121202)
(204,b,20121202)

ILLUSTRATE B:

| B | ip:chararray  | domain_first_char:chararray | filename:chararray |
|   | 123.98.11.123 | g                           | 20121202           |

As the dump of B shows, the first field of each tuple is the ts value, not
the ip value that the schema in ILLUSTRATE B reports for that position.
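
To make the expected result concrete, here is a small Python sketch (not Pig) that applies the same projection as B to three of the tuples copied from the dump of A. The helper name project_b and the in-memory tuple layout are assumptions made only for illustration; they are not part of the script.

```python
# Rows copied from the dump of A:
# (ts, ip, domain, answer, domain_index, filename)
A = [
    (100, "123.98.11.123", "google.com", "{(google)}", "20121201_G", "20121201"),
    (95, "500.98.11.123", "yahoo.com", "{(yahoo)}", "20121201_Y", "20121201"),
    (204, "600.10.100.221", "bbc.com", "{(bbc)}", "20121202_B", "20121202"),
]


def project_b(row):
    """Hypothetical helper mirroring the Pig statement
    B = foreach A generate ip, SUBSTRING(domain, 0, 1), filename;
    """
    ts, ip, domain, answer, domain_index, filename = row
    # Keep ip, the first character of domain, and filename; drop the rest.
    return (ip, domain[0:1], filename)


expected_b = [project_b(row) for row in A]
for row in expected_b:
    print(row)
```

For the first row this gives ('123.98.11.123', 'g', '20121201'), whereas the dump of B shows (100,g,20121201), with ts in the position where ip was expected.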


On Sun, Feb 3, 2013 at 11:56 AM, Prabu Dhakshinamurthy <
[email protected]> wrote:

> I am using the -tagsource option when loading the input data in order to
> identify the input source. Later, when I project only selected fields from
> the input tuple, certain fields seem to be projected every time even though
> I try to leave them out.
>
> Take a look at my script.
>
> rawdata = load 'data/201212*' using PigStorage(' ', '-tagsource') as
> (filename:chararray, ts: int, ip: chararray, domain: chararray, answer:
> chararray);
>
> A = foreach rawdata generate ts, ip, domain, answer,
> CONCAT(CONCAT(filename, '_'), UPPER(SUBSTRING(domain, 0, 1))) as
> domain_index, filename as filename;
> B = foreach A generate ip as ip, SUBSTRING(domain, 0, 1) as
> domain_first_char, filename;
> dump A;
> dump B;
> ILLUSTRATE B;
>
> While creating B, I am trying to include only selected fields from A.
> However, when I dump B, the 'ts' field (the first field in A) keeps
> appearing in B, whereas ILLUSTRATE B shows the expected output.
>
> I appreciate any help. Thanks!
>
> --
>
> Prabu D
>
>


-- 

Prabu Dhakshinamurthy
Graduate student | CSE | UCSD
