Check out http://archive.cloudera.com/cdh/3/pig/piglatin_ref2.html#REGEX_EXTRACT
This may suit your needs.

On Mon, May 12, 2014 at 12:16 AM, kartik manocha <[email protected]> wrote:

> Hi,
>
> I am new to pig & facing an issue in filtering out a string from a field,
> mentioned is the scenario.
>
>    - I am loading data with several fields, among those fields there is
>      field name called 'test_data'
>    - There are lot of things in this field, I wanted to filter out a string
>      from this field which starts from B75 & ends with semi colon.
>    - After taking this string out, wanted to add this as a new field to the
>      existing bag which was loaded
>
> I tried using INDEXOF UDF, but that works for a single character only,
> however when I tried using that for single character, it returns () only
> instead of index number. I was just testing, & by manually providing
> indexes in SUBSTRING UDF, it was generating string.
>
> But unable to get the position using indexof UDF, or may be there could be
> a better of doing this.
>
> If you have any pointers / suggestions, please share.
>
> Thanks in advance.
>
>
> Best,
> Kartik
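
For the scenario quoted above, a rough, untested sketch with REGEX_EXTRACT might look like the following. Only the field name 'test_data' comes from the question; the input path 'input_data', the tab delimiter, the 'id' field, and the alias names are placeholders I made up for illustration. The pattern is padded with `.*?` and `.*` so that it should behave the same whether your Pig version matches the regex against part of the string or the whole string.

```pig
-- Hypothetical input path, delimiter, and schema; adjust to the real data.
raw = LOAD 'input_data' USING PigStorage('\t')
      AS (id:chararray, test_data:chararray);

-- Group 1 captures the text that starts with B75 and runs up to and
-- including the first semicolon after it.
with_b75 = FOREACH raw GENERATE
    id,
    test_data,
    REGEX_EXTRACT(test_data, '.*?(B75[^;]*;).*', 1) AS b75_string;

DUMP with_b75;
```

Rows where 'test_data' contains no such string should come back with a null in the new field, so you may want to FILTER on `b75_string IS NOT NULL` afterwards. If the field can contain line breaks, the `.` in the pattern will not cross them and the regex would need adjusting.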
