Thanks, guys, for the reply. The following query also did not work:

hive> select count(*), filename from (select INPUT__FILE__NAME as filename from netflow) tmp where filename='vzb.1351794600.0' group by filename;

FAILED: SemanticException java.lang.RuntimeException: cannot find field input__file__name from [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@1d264bf5, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@3d44d0c6
I forgot to mention that my table uses partitions. Do you guys know any other way to filter files?

Thanks and Regards,
--
Jitendra Kumar Singh
Mobile: (+91) 9891314709

On Sat, Jun 15, 2013 at 12:33 PM, Navis류승우 <navis....@nexr.com> wrote:
> Firstly, the exception seems to be
> https://issues.apache.org/jira/browse/HIVE-3926.
>
> Secondly, file selection on virtual columns (file name, etc.) is
> https://issues.apache.org/jira/browse/HIVE-1662.
>
> Neither of them is fixed yet.
>
> 2013/6/14 Nitin Pawar <nitinpawar...@gmail.com>:
> > Jitendra,
> > I am really not sure you can use virtual columns in a where clause
> > (I have never tried it, so I may be wrong as well).
> >
> > Can you try executing your query as below?
> >
> > select count(*), filename from (select INPUT__FILE__NAME as filename from
> > netflow) tmp where filename='vzb.1351794600.0';
> >
> > Please check the query syntax; I am giving an idea and have not verified
> > the query.
> >
> > On Fri, Jun 14, 2013 at 4:57 PM, Jitendra Kumar Singh
> > <jksingh26...@gmail.com> wrote:
> >> Hi Guys,
> >>
> >> Executing a Hive query with a filter on the virtual column
> >> INPUT__FILE__NAME results in the following exception:
> >>
> >> hive> select count(*) from netflow where
> >> INPUT__FILE__NAME='vzb.1351794600.0';
> >>
> >> FAILED: SemanticException java.lang.RuntimeException: cannot find field
> >> input__file__name from
> >> [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@1d264bf5,
> >> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@3d44d0c6,
> >> .
> >> .
> >> .
> >> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@7e6bc5aa]
> >>
> >> This error is different from the one we get when the column name is wrong:
> >>
> >> hive> select count(*) from netflow where
> >> INPUT__FILE__NAM='vzb.1351794600.0';
> >>
> >> FAILED: SemanticException [Error 10004]: Line 1:35 Invalid table alias or
> >> column reference 'INPUT__FILE__NAM': (possible column names are: first,
> >> last, ....)
> >>
> >> But using this virtual column in the select clause works fine:
> >>
> >> hive> select INPUT__FILE__NAME from netflow group by INPUT__FILE__NAME;
> >>
> >> Total MapReduce jobs = 1
> >> Launching Job 1 out of 1
> >> Number of reduce tasks not specified. Estimated from input data size: 4
> >> In order to change the average load for a reducer (in bytes):
> >>   set hive.exec.reducers.bytes.per.reducer=<number>
> >> In order to limit the maximum number of reducers:
> >>   set hive.exec.reducers.max=<number>
> >> In order to set a constant number of reducers:
> >>   set mapred.reduce.tasks=<number>
> >> Starting Job = job_201306041359_0006, Tracking URL =
> >> http://192.168.0.224:50030/jobdetails.jsp?jobid=job_201306041359_0006
> >> Kill Command = /opt/hadoop/bin/../bin/hadoop job -kill
> >> job_201306041359_0006
> >> Hadoop job information for Stage-1: number of mappers: 12; number of
> >> reducers: 4
> >> 2013-06-14 18:20:10,265 Stage-1 map = 0%, reduce = 0%
> >> 2013-06-14 18:20:33,363 Stage-1 map = 8%, reduce = 0%
> >> .
> >> .
> >> .
> >> 2013-06-14 18:21:15,554 Stage-1 map = 100%, reduce = 100%
> >> Ended Job = job_201306041359_0006
> >> MapReduce Jobs Launched:
> >> Job 0: Map: 12  Reduce: 4  HDFS Read: 3107826046  HDFS Write: 55 SUCCESS
> >> Total MapReduce CPU Time Spent: 0 msec
> >> OK
> >> hdfs://192.168.0.224:9000/data/jk/vzb/vzb.1351794600.0
> >> Time taken: 78.467 seconds
> >>
> >> I am trying to create an external Hive table on data already present in
> >> HDFS, and there are extra files in the folder that I want to ignore. This
> >> is similar to what is asked and suggested in the following Stack Overflow
> >> questions: "how to make hive take only specific files as input from hdfs
> >> folder" and "when creating an external table in hive can I point the
> >> location to specific files in a directory?"
> >>
> >> Any help would be appreciated. The full stack trace I am getting is as
> >> follows:
> >>
> >> 2013-06-14 15:01:32,608 ERROR ql.Driver
> >> (SessionState.java:printError(401)) - FAILED: SemanticException
> >> java.lang.RuntimeException: cannot find field input__
> >>
> >> org.apache.hadoop.hive.ql.parse.SemanticException:
> >> java.lang.RuntimeException: cannot find field input__file__name from
> >> [org.apache.hadoop.hive.serde2.object
> >>     at org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:122)
> >>     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
> >>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
> >>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:124)
> >>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)
> >>     at org.apache.hadoop.hive.ql.optimizer.pcr.PartitionConditionRemover.transform(PartitionConditionRemover.java:86)
> >>     at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:102)
> >>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8163)
> >>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
> >>     at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50)
> >>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
> >>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
> >>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
> >>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:893)
> >>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> >>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> >>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
> >>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:755)
> >>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
> >>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> >>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> >>     at java.lang.reflect.Method.invoke(Unknown Source)
> >>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >>
> >> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> >> java.lang.RuntimeException: cannot find field input__file__name from
> >> [org.apache.hadoop.hive.ser
> >>     at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:231)
> >>     at org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:112)
> >>     ... 23 more
> >>
> >> Caused by: java.lang.RuntimeException: cannot find field input__file__name
> >> from
> >> [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyF
> >>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:344)
> >>     at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldRef(UnionStructObjectInspector.java:100)
> >>     at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.init
> >>
> >> Thanks and Regards,
> >> --
> >> Jitendra Kumar Singh
> >> Mobile: (+91) 9891314709
> >
> >
> > --
> > Nitin Pawar
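
Since the thread shows that INPUT__FILE__NAME works in the select clause but not in the where clause, one possible workaround, sketched here but untested (the table name netflow_with_file and the column name source_file are made up for illustration), is to materialize the virtual column into a real column first and then filter on the real column. Note from the query output above that INPUT__FILE__NAME holds the full URI (hdfs://192.168.0.224:9000/data/jk/vzb/vzb.1351794600.0), so even where such a filter does compile, an equality test against the bare file name 'vzb.1351794600.0' would match nothing; compare against the full URI, or match the suffix with LIKE:

hive> create table netflow_with_file as
    >   select t.*, INPUT__FILE__NAME as source_file from netflow t;

hive> select count(*) from netflow_with_file
    >   where source_file like '%/vzb.1351794600.0';

The obvious cost of this sketch is that the CTAS rewrites the whole data set (and drops the partitioning), so it only makes sense as a one-off or on a modest table.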
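
For the original goal of ignoring extra files in the folder, another untested sketch is to copy only the wanted files into a separate HDFS directory and point the external table at that directory, so no filtering is needed at query time. The /data/jk/vzb-selected path and the netflow_selected table name below are made up, and the column list is left as a placeholder because the real schema is not shown in the thread; it would need to be filled in with the actual column definitions:

$ hadoop fs -mkdir /data/jk/vzb-selected
$ hadoop fs -cp /data/jk/vzb/vzb.1351794600.0 /data/jk/vzb-selected/

hive> create external table netflow_selected (
    >   -- same column definitions as netflow
    > )
    > location '/data/jk/vzb-selected';

This trades extra storage for a clean table location, and since the new table is unpartitioned it also sidesteps the partition-related failure reported above.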