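Take a look at REGEX_EXTRACT:
http://archive.cloudera.com/cdh/3/pig/piglatin_ref2.html#REGEX_EXTRACT

It should suit your needs: it returns the part of a string that matches a
given capture group of a regular expression.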
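
A minimal sketch of how this could look for the case described below. The
input path, the delimiter, the 'id' field, and the alias names are
hypothetical; only the field name 'test_data' and the B75...; pattern come
from the question. Adjust the schema to match your actual data.

```pig
-- Hypothetical input path and schema; only 'test_data' is from the question.
raw = LOAD 'input.tsv' USING PigStorage('\t')
      AS (id:chararray, test_data:chararray);

-- Group 1 captures 'B75' plus everything up to (not including) the next ';'.
-- REGEX_EXTRACT yields null for rows where the pattern is not found.
with_b75 = FOREACH raw GENERATE
               *,
               REGEX_EXTRACT(test_data, '(B75[^;]*);', 1) AS b75_value;

DUMP with_b75;
```

If the trailing semicolon should be part of the extracted value, move it
inside the parentheses: '(B75[^;]*;)'.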


On Mon, May 12, 2014 at 12:16 AM, kartik manocha <[email protected]> wrote:

> Hi,
>
> I am new to Pig and am having trouble extracting a string from a field.
> Here is the scenario:
>
> - I am loading data with several fields; one of those fields is named
>   'test_data'.
> - This field contains a lot of content, and I want to extract the string
>   that starts with B75 and ends with a semicolon.
> - After extracting this string, I want to add it as a new field to the
>   existing bag that was loaded.
>
> I tried using the INDEXOF UDF, but that works for a single character only,
> and even when I tried it with a single character, it returned () instead of
> an index number. As a test, I passed the indexes manually to the SUBSTRING
> UDF, and that did generate the string.
>
> But I am unable to get the position using the INDEXOF UDF, or maybe there is
> a better way of doing this.
>
> If you have any pointers / suggestions, please share.
>
> Thanks in advance.
>
>
> Best,
> Kartik
>
