[ https://issues.apache.org/jira/browse/PIG-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy resolved PIG-5018. ------------------------------------- Resolution: Invalid > Mohan.V > ------- > > Key: PIG-5018 > URL: https://issues.apache.org/jira/browse/PIG-5018 > Project: Pig > Issue Type: Bug > Reporter: mohan > > I am trying to write Hadoop Pig script which will take 2 files and filter > based on string i.e > words.txt > google > facebook > twitter > linkedin > tweets.json > {"created_time": "18:47:31 ", "text": "RT @Joey7Barton: ..give a facebook > about whether the americans wins a Ryder cup. I mean surely he has slightly > more important matters. #fami ...", "user_id": 450990391, "id": > 252479809098223616, "created_date": "Sun Sep 30 2012"} > SCRIPT > twitter = LOAD 'Twitter.json' USING JsonLoader('created_time:chararray, > text:chararray, user_id:chararray, id:chararray, created_date:chararray'); > filtered = FILTER twitter BY (text MATCHES '.*facebook.*'); > extracted = FOREACH filtered GENERATE 'facebook' AS pattern,id, user_id, > created_time, created_date, text; > final = GROUP extracted BY pattern; > dump final; > OUTPUT > (facebook,{(facebook,252545104890449921,291041644,23:06:59 ,Sun Sep 30 > 2012,RT @Joey7Barton: ..give a facebook about whether the americans wins a > Ryder cup. I mean surely he has slightly more important matters. #fami ...)}) > the output that im getting is, without loading the words.txt file i.e by > filtering the tweet directly. > I need to get the output as > (facebook)(complete tweet of that facebook word contained) > i.e it should read the words.txt and as words are reading according to that > it should get all the tweets from tweets.json file > Any help > Mohan.V -- This message was sent by Atlassian JIRA (v6.3.4#6332)