The index is the capturing-group number, not the match number. Your
pattern '(".+?")' has only one group, so index 3 does not exist. You need
something like:

(REGEX_EXTRACT(base, '(".+?")[^"]*(".+?")[^"]*(".+?")', 3))

2012/9/6 Joao Salcedo <[email protected]>

> Hi All,
>
> I am using regular expressions to parse my string
>
> I have the following
>
> "GET /javascript/quicksearch.js HTTP/1.0" 200 1947 "
> http://www.gothiclolitawigs.com/gothic-lolita-wigs/straight-split-ss-blonde-white/
> "
> "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko)
> Chrome/19.0.1084.56 Safari/536.5"
>
> I perform:
>
> agent = FOREACH base GENERATE (REGEX_EXTRACT(base, '(".+?")', 1));
>
> And I get the first match
>
> ----------------------------------------------------------------------------
> | agent | org.apache.pig.builtin.regex_extract_base_1661:chararray |
> ----------------------------------------------------------------------------
> |       | "GET /javascript/quicksearch.js HTTP/1.0"                |
> ----------------------------------------------------------------------------
>
> If I do the same command:
>
> agent = FOREACH base GENERATE (REGEX_EXTRACT(base, '(".+?")', 3));
>
> in order to get the 3rd string between quotes, I get the following.
> I do not understand why. Any ideas?
>
> 2012-09-06 10:38:23,785 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> java.lang.NullPointerException
>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.getCounter(TaskInputOutputContext.java:84)
>     at org.apache.pig.tools.pigstats.PigStatusReporter.getCounter(PigStatusReporter.java:55)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger.warn(PigHadoopLogger.java:56)
>     at org.apache.pig.EvalFunc.warn(EvalFunc.java:186)
>     at org.apache.pig.builtin.REGEX_EXTRACT.exec(REGEX_EXTRACT.java:90)
>     at org.apache.pig.builtin.REGEX_EXTRACT.exec(REGEX_EXTRACT.java:47)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:305)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:322)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:194)
>     at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:257)
>     at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:238)
>     at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:103)
>     at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:98)
>     at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:166)
>     at org.apache.pig.PigServer.getExamples(PigServer.java:1245)
>     at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:698)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:591)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:306)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>     at org.apache.pig.Main.run(Main.java:495)
>     at org.apache.pig.Main.main(Main.java:111)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
> 2012-09-06 10:38:23,794 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception : null
>

-- 
Best regards,
Vitalii Tymchyshyn
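
P.S. To see the difference between a group number and a match number outside Pig, here is a minimal sketch in Python. It is an illustration only, not Pig code: REGEX_EXTRACT is backed by Java's regex engine, but for these simple patterns Python's re module should behave the same way. The sample line is the log entry from the question joined back onto one line (an assumption, since the mail wrapped it), and all variable names are made up for the example.

```python
import re

# Log entry from the question, joined onto a single line (assumed layout).
line = (
    '"GET /javascript/quicksearch.js HTTP/1.0" 200 1947 '
    '"http://www.gothiclolitawigs.com/gothic-lolita-wigs/straight-split-ss-blonde-white/" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) '
    'Chrome/19.0.1084.56 Safari/536.5"'
)

# Pattern from the question: a single capturing group.
one_group = re.compile(r'(".+?")')
# Pattern from the reply: three capturing groups, one per quoted string.
three_groups = re.compile(r'(".+?")[^"]*(".+?")[^"]*(".+?")')

# Group 1 of the first match is the request string.
first = one_group.search(line).group(1)

# The single-group pattern defines exactly one group, so there is no group 3.
group_count = one_group.groups

# With three groups in one match, group 3 is the user agent.
agent = three_groups.search(line).group(3)

# Alternative view: the third *match* of the single-group pattern.
third_match = one_group.findall(line)[2]

print(first)
print(group_count)
print(agent)
```

The third match of the one-group pattern and group 3 of the three-group pattern are the same string here, but REGEX_EXTRACT only exposes the latter, which is why the pattern has to spell out all three quoted fields.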
