Hi Ashutosh,

It worked smoothly. :-)
Thanks a lot for the support. I think it would be better if HCatalog documented this kind of information in its wiki, as samples, troubleshooting notes, FAQs, etc.

Regards,
Subroto Sanyal
________________________________________
From: Ashutosh Chauhan [[email protected]]
Sent: Thursday, February 09, 2012 1:02 AM
To: [email protected]
Subject: Re: Data Loding in HCatalog using PIG

You need to provide a schema in your load statement. If this file is the same as /etc/passwd, the following should work; otherwise, modify the load statement to match the schema of your file:

A = load 'hdfs://linux-emzg:9000/subroto/passwd' using PigStorage(':') as (uname : chararray, perm : chararray, int1 : int, int2 : int, grp : chararray, loc : chararray, dir : chararray);

Ashutosh

On Tue, Feb 7, 2012 at 20:44, Subroto sanyal <[email protected]> wrote:

Hi Ashutosh,

I tried the suggestion but ended up with another issue:
===================================================================
Pig Stack Trace
---------------
ERROR 1115: Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2042: Error in new logical plan. Try -Dpig.usenewlogicalplan=false.
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:313)
    at org.apache.pig.PigServer.compilePp(PigServer.java:1365)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1207)
    at org.apache.pig.PigServer.execute(PigServer.java:1201)
    at org.apache.pig.PigServer.access$100(PigServer.java:129)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:1528)
    at org.apache.pig.PigServer.executeBatchEx(PigServer.java:373)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:340)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:115)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:172)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
    at org.apache.pig.Main.run(Main.java:500)
    at org.apache.pig.Main.main(Main.java:107)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1115: Output Location Validation Failed for: 'default.passwdTableAnalysis More info to follow:
Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:82)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:76)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:52)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:292)
    ... 13 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1115: Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
    at org.apache.hcatalog.pig.HCatBaseStorer.validateAlias(HCatBaseStorer.java:410)
    at org.apache.hcatalog.pig.HCatBaseStorer.doSchemaValidations(HCatBaseStorer.java:328)
    at org.apache.hcatalog.pig.HCatStorer.setStoreLocation(HCatStorer.java:105)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:75)
    ... 21 more
================================================================================
PIG Version: 0.8.1
Hadoop Version: 0.20.2 CDHB34
HCatalog Version: 0.2.0 incubating
RDBMS: db-derby-10.5.3.0

Regards,
Subroto Sanyal
________________________________________
From: Ashutosh Chauhan [[email protected]]
Sent: Wednesday, February 08, 2012 12:30 AM
To: [email protected]
Subject: Re: Data Loding in HCatalog using PIG

Try the following:

store B into 'default.passwdTableAnalysis' using org.apache.hcatalog.pig.HCatStorer('test=2012');

Use single quotes for 'test=2012' instead of double quotes (").

Ashutosh

On Tue, Feb 7, 2012 at 10:54, Subroto sanyal <[email protected]> wrote:

Hi,

While loading an HCatalog table using a PIG script I get this error:
=============================================================
Pig Stack Trace
---------------
ERROR 1000: Error during parsing. Lexical error at line 3, column 87. Encountered: "t" (116), after : "\""

org.apache.pig.impl.logicalLayer.parser.TokenMgrError: Lexical error at line 3, column 87.
Encountered: "t" (116), after : "\""
    at org.apache.pig.impl.logicalLayer.parser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1829)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.jj_ntk(QueryParser.java:9457)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.StringList(QueryParser.java:1667)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.NonEvalFuncSpec(QueryParser.java:5560)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.StoreClause(QueryParser.java:3968)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1501)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:1013)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:825)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1612)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1562)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:534)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:871)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:388)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
    at org.apache.pig.Main.run(Main.java:500)
    at org.apache.pig.Main.main(Main.java:107)
=============================================================
The HCatalog table structure is:
=============================================================
username     string  from deserializer
description  string  from deserializer
shell        string  from deserializer
test         string
=============================================================
The PIG script used in this context is:
=============================================================
A = load 'hdfs://linux-emzg:9000/subroto/passwd' using PigStorage(':');
B = foreach A generate $0, $4, $6, $7;
store B into 'default.passwdTableAnalysis' using org.apache.hcatalog.pig.HCatStorer("test=2012");
=============================================================
The sample file loaded by the PIG script is shown below. The file is available in HDFS at the correct location:
=============================================================
user1:x:1:1:test user1:myhome1:bash1:20120203
user2:x:1:2:test user2:myhome2:bash2:20120204
user3:x:1:3:test user3:myhome3:bash3:20120205
user3:x:1:4:test user4:myhome4:bash4:20120206
user4:x:1:5:test user5:myhome5:bash5:20120207
user5:x:1:6:test user6:myhome6:bash6:20120208
=============================================================
Request support to resolve this problem.

Regards,
Subroto Sanyal
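
Putting the two replies in this thread together (a schema in the load statement, and single quotes around the HCatStorer argument), a corrected script might look roughly like the sketch below. The thread does not show the final script that worked, so this is only an illustration. The field names are assumptions chosen to match the table columns (username, description, shell) and the eight colon-separated fields of the sample file. It also assumes that, because the partition value is passed statically as 'test=2012', only the three non-partition columns need to be stored.

```pig
-- Sketch only: field names below are assumed, not taken from the thread.
-- Naming every field gives HCatStorer the column names it asks for in ERROR 1115.
A = load 'hdfs://linux-emzg:9000/subroto/passwd' using PigStorage(':')
    as (username : chararray, passwd : chararray, uid : int, gid : int,
        description : chararray, home : chararray, shell : chararray,
        datestamp : chararray);

-- Project by name so the aliases match the HCatalog table columns.
B = foreach A generate username, description, shell;

-- Single quotes around the partition spec avoid the lexical error (ERROR 1000).
store B into 'default.passwdTableAnalysis'
    using org.apache.hcatalog.pig.HCatStorer('test=2012');
```

If the file really has the seven-field /etc/passwd layout instead, the schema in the load statement would need to be adjusted to match, as noted in the reply above.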
