Satish Subhashrao Saley created PIG-5263:
--------------------------------------------

Summary: Using wildcard doesn't work with OrcStorage
Key: PIG-5263
URL: https://issues.apache.org/jira/browse/PIG-5263
Project: Pig
Issue Type: Bug
Reporter: Satish Subhashrao Saley
Priority: Minor


myinput = LOAD '/user/saley/data/datestamp=20170301*' USING OrcStorage();

It's throwing an exception:

{{Caused by: java.io.FileNotFoundException: File hdfs://localhost:8020/user/saley/data/datestamp=20170301* does not exist.}}

Full stack trace:
{code}
2017-03-03 18:50:12,651 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: serious problem
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:279)
	at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateNewSplits(MRInputHelpers.java:411)
	at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:292)
	at org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.LoaderProcessor.processLoads(LoaderProcessor.java:169)
	at org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.LoaderProcessor.visitTezOp(LoaderProcessor.java:182)
	at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:259)
	at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:56)
	at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
	at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
	at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher.processLoadAndParallelism(TezLauncher.java:503)
	at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher.launchPig(TezLauncher.java:187)
	at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:286)
	at org.apache.pig.PigServer.launchPlan(PigServer.java:1401)
	at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1386)
	at org.apache.pig.PigServer.storeEx(PigServer.java:1045)
	at org.apache.pig.PigServer.store(PigServer.java:1008)
	at org.apache.pig.PigServer.openIterator(PigServer.java:921)
	at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:762)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
	at org.apache.pig.Main.run(Main.java:630)
	at org.apache.pig.Main.main(Main.java:176)
Caused by: java.lang.RuntimeException: serious problem
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1119)
	at org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat.getSplits(OrcNewInputFormat.java:121)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:265)
	... 23 more
Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File hdfs://localhost:8020/user/saley/data/datestamp=20170301* does not exist.
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1087)
	... 25 more
Caused by: java.io.FileNotFoundException: File hdfs://localhost:8020/user/saley/data/datestamp=20170301* does not exist.
	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:948)
	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:927)
	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:872)
	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:868)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:886)
	at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1697)
	at org.apache.hadoop.hive.shims.Hadoop23Shims.listLocatedStatus(Hadoop23Shims.java:665)
	at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:361)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.callInternal(OrcInputFormat.java:692)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.access$600(OrcInputFormat.java:659)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator$1.run(OrcInputFormat.java:682)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator$1.run(OrcInputFormat.java:679)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:679)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:659)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
{code}


--
This message was sent by Atlassian JIRA
(v6.4.14#64029)