DUMP works as expected. If I write the exact same STORE statement on one line, it works. I remember seeing a JIRA for this some time back, but I am not able to find it now.
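
Since the single-line form is the one that parses, one possible workaround is to keep the schema in a readable multi-line form and collapse it to one line before it reaches Pig. Below is a minimal Python sketch of that idea, not a fix for the parser itself. The helper name `schema_to_single_line` is made up for illustration; the schema and output path are the ones from the script quoted in this thread. It assumes the schema contains no single quotes, since those would end the Pig string literal early.

```python
import json


def schema_to_single_line(schema_text: str) -> str:
    """Parse a (possibly multi-line) JSON schema and re-serialize it on one line."""
    parsed = json.loads(schema_text)  # raises ValueError if the text is not valid JSON
    one_line = json.dumps(parsed)     # without indent, json.dumps emits no newlines
    if "'" in one_line:
        # Assumption: the schema is embedded in a single-quoted Pig string literal.
        raise ValueError("schema contains a single quote; it would break the Pig literal")
    return one_line


# Readable, multi-line version of the schema used in the failing script.
multi_line_schema = """
{
  "type": "record",
  "name": "test",
  "fields": [
    {"name": "my_region", "type": "string"}
  ]
}
"""

one_line = schema_to_single_line(multi_line_schema)

# Build the STORE statement with the schema on a single line.
store_stmt = (
    "STORE A INTO '/user/hshankar/out1' USING "
    "org.apache.pig.piggybank.storage.avro.AvroStorage('" + one_line + "');"
)
print(store_stmt)
```

The generated statement can then be pasted into the script, or written out by whatever wrapper launches Pig, so the schema itself can live in its own file.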

On Wed, Dec 14, 2011 at 12:23 AM, Stan Rosenberg <[email protected]> wrote:

> There is something syntactically wrong with your script.
> MismatchedTokenException seems to indicate that the semicolon
> character was expected (ttype==93).
> What happens if you replace the entire "STORE A ..." line by say "DUMP A"?
>
> On Tue, Dec 13, 2011 at 1:17 PM, IGZ Nick <[email protected]> wrote:
> > Hi Stan,
> >
> > Here is my pig script:
> > REGISTER avro-1.4.0.jar
> > REGISTER joda-time-1.6.jar
> > REGISTER json-simple-1.1.jar
> > REGISTER jackson-core-asl-1.5.5.jar
> > REGISTER jackson-mapper-asl-1.5.5.jar
> > REGISTER pig-0.9.1-SNAPSHOT.jar
> > REGISTER dwh-udf-0.1.jar
> > REGISTER piggybank.jar
> > REGISTER linkedin-pig-0.8.jar
> > REGISTER google-collect-1.0-rc2.jar;
> >
> > A = LOAD '/user/hshankar/temp' USING PigStorage();RMF
> > '/user/hshankar/out1';STORE A INTO '/user/hshankar/out1' USING
> > org.apache.pig.piggybank.storage.avro.AvroStorage('{"type": "record",
> > "name": "test", "fields": [{"name":"my_region", "type": "string"}]}');
> >
> > On executing it, I get this error:
> > 2011-12-13 18:16:35,133 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> > ERROR 1200: Pig script failed to parse: MismatchedTokenException(93!=3)
> > Details at logfile: /export/home/hshankar/pig_scripts/pig_1323800194535.log
> >
> > Log file contains:
> > Pig Stack Trace
> > ---------------
> > ERROR 1200: Pig script failed to parse: MismatchedTokenException(93!=3)
> >
> > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Pig script failed to parse: MismatchedTokenException(93!=3)
> >     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1652)
> >     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1597)
> >     at org.apache.pig.PigServer.registerQuery(PigServer.java:583)
> >     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
> >     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
> >     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
> >     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
> >     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
> >     at org.apache.pig.Main.run(Main.java:553)
> >     at org.apache.pig.Main.main(Main.java:108)
> > Caused by: Failed to parse: Pig script failed to parse: MismatchedTokenException(93!=3)
> >     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:178)
> >     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1644)
> >     ... 9 more
> > Caused by: MismatchedTokenException(93!=3)
> >     at org.apache.pig.parser.AstValidator.recoverFromMismatchedToken(AstValidator.java:209)
> >     at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
> >     at org.apache.pig.parser.AstValidator.func_clause(AstValidator.java:3497)
> >     at org.apache.pig.parser.AstValidator.store_clause(AstValidator.java:4626)
> >     at org.apache.pig.parser.AstValidator.op_clause(AstValidator.java:970)
> >     at org.apache.pig.parser.AstValidator.general_statement(AstValidator.java:574)
> >     at org.apache.pig.parser.AstValidator.statement(AstValidator.java:396)
> >     at org.apache.pig.parser.AstValidator.query(AstValidator.java:306)
> >     at org.apache.pig.parser.QueryParserDriver.validateAst(QueryParserDriver.java:236)
> >     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:168)
> >     ... 10 more
> > ================================================================================
> >
> >
> > On Tue, Dec 13, 2011 at 9:05 PM, Stan Rosenberg <[email protected]> wrote:
> >
> >> The following test script works for me:
> >> =============================================
> >>
> >> A = load '$LOGS' using org.apache.pig.piggybank.storage.avro.AvroStorage();
> >> describe A;
> >>
> >> B = foreach A generate region as my_region, google_ip;
> >>
> >> dump B;
> >>
> >> store B into './output' using
> >> org.apache.pig.piggybank.storage.avro.AvroStorage(
> >> '{"debug": 5,
> >> "schema": {"type": "record", "name": "test", "fields": [{"name":
> >> "my_region", "type": ["null", "string"]}, {"name": "ip", "type":
> >> ["null", "string"]}]}
> >> }');
> >> =============================================================
> >> Note you don't need to pass the first parameter, i.e., 'schema'; you
> >> can just pass a string formatted in json.
> >> If you're still getting MismatchException, please compile a small
> >> repro and send it to the list.
> >>
> >> stan
> >>
> >> On Tue, Dec 13, 2011 at 5:49 AM, IGZ Nick <[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > I want to keep the pig script and storage schema separate. Is it possible
> >> > to do this in a clean way? The only way that has worked so far is to do
> >> > like:
> >> > AvroStorage('schema',
> >> > '{"name":"xyz","type":"record","fields":[{"name":"abc","type":"string"}]}');
> >> >
> >> > That too, all the schema in one line. If I split it onto multiple lines, I
> >> > get a MismatchException (93-3) or something like that. Is there no way to
> >> > do AvroStorage('file', <hdfs path of schema file>) or something of that
> >> > sort, or at least be able to specify the schema in multiple lines?
> >> >
> >> > Thanks,
> >> >
