I tried to subscribe, but a mail client window came up instead, which is not
what I wanted, so we'll see if this works.

I wrote this script:

register s3n://uw-cse344-code/myudfs.jar


-- load the test file into Pig
--raw = LOAD 's3n://uw-cse344-test/cse344-test-file' USING TextLoader as (line:chararray);
-- later you will load to other files, example:
raw = LOAD 's3n://uw-cse344/btc-2010-chunk-000' USING TextLoader as
(line:chararray);

-- parse each line into ntriples
ntriples = foreach raw generate FLATTEN(myudfs.RDFSplit3(line)) as
(subject:chararray,predicate:chararray,object:chararray);

--filter 1
subjects1 = filter ntriples by subject matches '.*rdfabout\\.com.*'
PARALLEL 50;
--filter 2
subjects2 = subjects1;

but I got the error:
2012-03-10 01:19:18,039 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1200: <line 4, column 21>  mismatched input ';' expecting LEFT_PAREN
Details at logfile: /home/hadoop/pig_1331342327467.log

How do I simply set one variable equal to another?

I also tried subjects2 = dump subjects1;

Thanks!


-- 
~Colleen Ross
