Hi,
I'm trying to use regular expressions in PIG, but it's failing. Based on the
documentation http://pig.apache.org/docs/r0.12.0/func.html#regex-extract I am
trying this:
[watrous@c0003913 ~]$ pig -x local
which: no hadoop in
R u planning to use
org.apache.pig.builtin.REGEX_EXTRACT
?
On 12/4/13 9:28 AM, Watrous, Daniel daniel.t.watr...@hp.com wrote:
Hi,
I'm trying to use regular expressions in PIG, but it's failing. Based on
the documentation
http://pig.apache.org/docs/r0.12.0/func.html#regex-extract I am trying
It's not valid PigLatin...
The Grunt shell doesn't let you try out functions and UDFs as you're
trying to use them.
A = LOAD 'data' USING PigStorage() as (ip: chararray);
B = FOREACH A GENERATE REGEX_EXTRACT(ip, '(.*):(.*)', 1);
DUMP B;
You always have to load a dataset and work
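
For reference, here is a rough sketch of what the pattern in the script above should pull out of a value such as '192.168.1.5:8020'. It uses plain java.util.regex, on the assumption that REGEX_EXTRACT behaves like a find-style match and returns the requested capture group. The class name, the extract helper and the sample values are hypothetical and are not Pig's own code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExtractSketch {

    // Hypothetical stand-in for REGEX_EXTRACT (an assumption, not Pig's implementation):
    // returns the requested capture group, or null when the pattern is not found.
    public static String extract(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        String pattern = "(.*):(.*)";
        // Group 1 is everything before the last colon, group 2 everything after it.
        System.out.println(extract("192.168.1.5:8020", pattern, 1));
        System.out.println(extract("192.168.1.5:8020", pattern, 2));
        // No colon in the input, so nothing matches and the helper returns null.
        System.out.println(extract("no-colon-here", pattern, 1));
    }
}
```

Because the first (.*) is greedy, a value with several colons would be split at the last one, which may or may not be what you want for your data.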
That's what I was trying first, but then I tried defining it too.
-Original Message-
From: Ankit Bhatnagar [mailto:ank...@yahoo-inc.com]
Sent: Wednesday, December 04, 2013 11:15 AM
To: user@pig.apache.org; Watrous, Daniel
Subject: Re: Trouble with REGEX in PIG
R u planning to use
Pradeep,
Does the documentation here need to be updated:
http://pig.apache.org/docs/r0.12.0/func.html#regex-extract
It suggests that the function can run against a string and should return the
expected value.
I did confirm that I can use REGEX_EXTRACT on values loaded from a file.
Thank
I have this bug that is killing me, where I can't self-join/cross a dataset
with itself. It's blocking my work :(
The script is like this:
businesses = LOAD
'yelp_phoenix_academic_dataset/yelp_academic_dataset_business.json' using
com.twitter.elephantbird.pig.load.JsonLoader() as json:map[];
/*
There was a bug in the script on the 2nd to last line. Fixed it, still have
same issue.
I found a workaround: if I store the CROSSED relation immediately after the
CROSS, then load it... it works. Something about resetting the plan. This
is a bug. I'll file a JIRA.
On Wed, Dec 4, 2013 at 1:21
I tried the following script (not exactly the same) and it worked correctly
for me.
businesses = LOAD 'dataset' using PigStorage(',') AS (a, b, c,
business_id: chararray, lat: double, lng: double);
locations = FOREACH businesses GENERATE business_id, lat, lng;
STORE locations INTO 'locations.tsv';
Hi everyone,
I am having some weird classpath issues with a UDF that returns a custom tuple.
My custom tuple has an arraylist of custom objects. It looks like:
class MyTuple
private ArrayList<MyClass> list;
When the UDF is called, everything works fine: the tuples are created and the
UDF
If you store immediately after the CROSS, it works. If you do another
FOREACH/GENERATE, etc. it does not.
On Wed, Dec 4, 2013 at 1:41 PM, Pradeep Gollakota pradeep...@gmail.com wrote:
I tried the following script (not exactly the same) and it worked correctly
for me.
businesses = LOAD