Let me summarize the steps:

1. Create an external table on HDFS (STORED AS TEXTFILE) and INSERT OVERWRITE into that table.
2. Use "hadoop dfs -getmerge" to copy the data to the local filesystem.
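The two steps above could be sketched roughly as follows. This is only an illustration, not something taken from the thread: the table name result_export, the HDFS path /user/hive/result_export, the local file /home/hive/result.txt, the comma delimiter, and the EXPORT_RUN guard variable are all assumptions. The column list is copied from the EXPLAIN output quoted below. Since that plan shows "compressed: true", the sketch also turns off output compression so the merged file is plain text.

```shell
#!/bin/sh
# Assumed locations (hypothetical, adjust to your cluster).
HDFS_DIR=/user/hive/result_export
LOCAL_FILE=/home/hive/result.txt

# Step 1: HiveQL that creates the external text table and fills it.
# Columns follow the EXPLAIN output for staticacs; "time" may need
# quoting if it is a reserved word in your Hive version.
export_query() {
  cat <<EOF
SET hive.exec.compress.output=false;
CREATE EXTERNAL TABLE result_export (
  time STRING, application STRING, count1 INT, count2 INT, dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '${HDFS_DIR}';
INSERT OVERWRITE TABLE result_export SELECT * FROM staticacs;
EOF
}

# Only touch the cluster when EXPORT_RUN=1, so the generated
# statements can be inspected first with: export_query
if [ "${EXPORT_RUN:-0}" = "1" ]; then
  hive -e "$(export_query)"
  # Step 2: merge all files under the table directory into one local file.
  hadoop dfs -getmerge "${HDFS_DIR}" "${LOCAL_FILE}"
fi
```

Because the row format is declared on the external table, this route also controls the delimiter of the files that end up on local disk.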
So there is no need to run "INSERT OVERWRITE LOCAL DIRECTORY".

Zheng

On Fri, Jun 5, 2009 at 2:16 AM, Min Zhou <[email protected]> wrote:

> hive> explain INSERT OVERWRITE LOCAL DIRECTORY '/home/hive/result' SELECT *
> FROM staticacs;
> OK
> ABSTRACT SYNTAX TREE:
>   (TOK_QUERY (TOK_FROM (TOK_TABREF staticacs)) (TOK_INSERT (TOK_DESTINATION (TOK_LOCAL_DIR '/home/hive/result')) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF))))
>
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
>
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Alias -> Map Operator Tree:
>         staticacs
>             Select Operator
>               expressions:
>                     expr: time
>                     type: string
>                     expr: application
>                     type: string
>                     expr: count1
>                     type: int
>                     expr: count2
>                     type: int
>                     expr: dt
>                     type: string
>               File Output Operator
>                 compressed: true
>                 GlobalTableId: 1
>                 table:
>                     input format: org.apache.hadoop.mapred.TextInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>
>   Stage: Stage-0
>     Move Operator
>       files:
>           hdfs directory: false
>           destination: /home/hive/result
>
> Thanks,
> Min
>
> On Fri, Jun 5, 2009 at 4:53 PM, Zheng Shao <[email protected]> wrote:
>
>> Can you give an example? Please also include the results of "explain
>> <query>" so we can see how many map-reduce jobs there are.
>>
>> Zheng
>>
>> On Fri, Jun 5, 2009 at 1:21 AM, Min Zhou <[email protected]> wrote:
>>
>>> But is there any way to define the row format of files output by the command
>>> "INSERT OVERWRITE LOCAL DIRECTORY '/tmp/aaa' SELECT ..."? If I create an
>>> external table to store the result, then copy it to local with that command,
>>> more mapred jobs are needed.
>>>
>>> Min
>>>
>>> On Fri, Jun 5, 2009 at 2:04 PM, Zheng Shao <[email protected]> wrote:
>>>
>>>> Use dfs -getmerge.
>>>>
>>>> By the way, if you just want text file format, "INSERT OVERWRITE LOCAL
>>>> DIRECTORY '/tmp/aaa' SELECT ..." is good enough. You don't need to split
>>>> the process into 2 steps.
>>>>
>>>> Zheng
>>>>
>>>> On Thu, Jun 4, 2009 at 8:22 PM, Min Zhou <[email protected]> wrote:
>>>>
>>>>> So how do you copy them back?
>>>>> Use dfs -getmerge or INSERT OVERWRITE LOCAL DIRECTORY?
>>>>>
>>>>> Thanks,
>>>>> Min
>>>>>
>>>>> On Thu, Jun 4, 2009 at 1:48 PM, Zheng Shao <[email protected]> wrote:
>>>>>
>>>>>> No, we would need to copy the file back from HDFS. But I think there
>>>>>> is no overhead - we are doing the same thing as INSERT OVERWRITE LOCAL
>>>>>> DIRECTORY, I think.
>>>>>>
>>>>>> Zheng
>>>>>>
>>>>>> On Wed, Jun 3, 2009 at 10:45 PM, Min Zhou <[email protected]> wrote:
>>>>>>
>>>>>>> Can an external table be stored in a local directory?
>>>>>>>
>>>>>>> On Thu, Jun 4, 2009 at 1:42 PM, Zheng Shao <[email protected]> wrote:
>>>>>>>
>>>>>>>> You can first create an external table and then insert into that
>>>>>>>> table.
>>>>>>>>
>>>>>>>> Zheng
>>>>>>>>
>>>>>>>> On Wed, Jun 3, 2009 at 10:31 PM, Min Zhou <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Any help?
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>> Min
>>>>>>>>> --
>>>>>>>>> My research interests are distributed systems, parallel computing
>>>>>>>>> and bytecode based virtual machine.
>>>>>>>>>
>>>>>>>>> My profile:
>>>>>>>>> http://www.linkedin.com/in/coderplay
>>>>>>>>> My blog:
>>>>>>>>> http://coderplay.javaeye.com
>>>>>>>>
>>>>>>>> --
>>>>>>>> Yours,
>>>>>>>> Zheng

--
Yours,
Zheng
