Got it! That's how I do it too.

Thanks,
Min
On Fri, Jun 5, 2009 at 5:29 PM, Zheng Shao <[email protected]> wrote:

> Let me summarize the steps:
> 1. Create an external table on HDFS (STORED AS TEXTFILE) and INSERT OVERWRITE
>    that table.
> 2. Use "hadoop dfs -getmerge" to get the data.
>
> So there is no need to run an "INSERT OVERWRITE LOCAL DIRECTORY".
>
> Zheng
>
>
> On Fri, Jun 5, 2009 at 2:16 AM, Min Zhou <[email protected]> wrote:
>
>> hive> explain INSERT OVERWRITE LOCAL DIRECTORY '/home/hive/result' SELECT
>> * FROM staticacs;
>> OK
>> ABSTRACT SYNTAX TREE:
>>   (TOK_QUERY (TOK_FROM (TOK_TABREF staticacs)) (TOK_INSERT
>>   (TOK_DESTINATION (TOK_LOCAL_DIR '/home/hive/result')) (TOK_SELECT
>>   (TOK_SELEXPR TOK_ALLCOLREF))))
>>
>> STAGE DEPENDENCIES:
>>   Stage-1 is a root stage
>>   Stage-0 depends on stages: Stage-1
>>
>> STAGE PLANS:
>>   Stage: Stage-1
>>     Map Reduce
>>       Alias -> Map Operator Tree:
>>         staticacs
>>           Select Operator
>>             expressions:
>>               expr: time
>>               type: string
>>               expr: application
>>               type: string
>>               expr: count1
>>               type: int
>>               expr: count2
>>               type: int
>>               expr: dt
>>               type: string
>>             File Output Operator
>>               compressed: true
>>               GlobalTableId: 1
>>               table:
>>                 input format: org.apache.hadoop.mapred.TextInputFormat
>>                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>>
>>   Stage: Stage-0
>>     Move Operator
>>       files:
>>         hdfs directory: false
>>         destination: /home/hive/result
>>
>> Thanks,
>> Min
>>
>>
>> On Fri, Jun 5, 2009 at 4:53 PM, Zheng Shao <[email protected]> wrote:
>>
>>> Can you give an example? Please also include the results of "explain
>>> <query>" so we can see how many map-reduce jobs there are.
>>>
>>> Zheng
>>>
>>>
>>> On Fri, Jun 5, 2009 at 1:21 AM, Min Zhou <[email protected]> wrote:
>>>
>>>> But is there any way to define the row format of the files output by the
>>>> command "INSERT OVERWRITE LOCAL DIRECTORY '/tmp/aaa' SELECT ..."? If I
>>>> create an external table to store the result and then copy it to local
>>>> disk with that command, more mapred jobs are needed.
>>>>
>>>> Min
>>>>
>>>>
>>>> On Fri, Jun 5, 2009 at 2:04 PM, Zheng Shao <[email protected]> wrote:
>>>>
>>>>> Use dfs -getmerge.
>>>>>
>>>>> By the way, if you just want text file format, "INSERT OVERWRITE LOCAL
>>>>> DIRECTORY '/tmp/aaa' SELECT ..." is good enough. You don't need to
>>>>> split the process into 2 steps.
>>>>>
>>>>> Zheng
>>>>>
>>>>>
>>>>> On Thu, Jun 4, 2009 at 8:22 PM, Min Zhou <[email protected]> wrote:
>>>>>
>>>>>> So how do you copy them back?
>>>>>> Use dfs -getmerge or INSERT OVERWRITE LOCAL DIRECTORY?
>>>>>>
>>>>>> Thanks,
>>>>>> Min
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 4, 2009 at 1:48 PM, Zheng Shao <[email protected]> wrote:
>>>>>>
>>>>>>> No, we would need to copy the file back from HDFS. But I think there
>>>>>>> is no overhead - we are doing the same thing as INSERT OVERWRITE
>>>>>>> LOCAL DIRECTORY, I think.
>>>>>>>
>>>>>>> Zheng
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 3, 2009 at 10:45 PM, Min Zhou <[email protected]> wrote:
>>>>>>>
>>>>>>>> Can an external table be stored in a local directory?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 4, 2009 at 1:42 PM, Zheng Shao <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> You can first create an external table and then insert into that
>>>>>>>>> table.
>>>>>>>>>
>>>>>>>>> Zheng
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 3, 2009 at 10:31 PM, Min Zhou <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Any help?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Min
>
> --
> Yours,
> Zheng

--
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com
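
For reference, the two steps Zheng summarized could be sketched roughly as the shell function below. This is only an illustration under assumptions, not something posted in the thread: the function name export_to_local, the "_export" table suffix, the tab delimiter, and the HDFS path are all hypothetical, and the column list is copied from the explain output for staticacs. The plan above shows "compressed: true", so the sketch also assumes that output compression is controlled by hive.exec.compress.output and turns it off, so that the merged file is plain text.

```shell
# Sketch only: names, paths and the delimiter are hypothetical placeholders.
export_to_local() {
  local src_table="$1"    # source table, e.g. staticacs
  local hdfs_dir="$2"     # HDFS directory backing the external table
  local local_file="$3"   # local file that receives the merged output

  # Step 1: create an external text table with the desired row format
  # and fill it with INSERT OVERWRITE.
  hive -e "
    SET hive.exec.compress.output=false;
    CREATE EXTERNAL TABLE IF NOT EXISTS ${src_table}_export (
      time STRING, application STRING, count1 INT, count2 INT, dt STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE
    LOCATION '${hdfs_dir}';
    INSERT OVERWRITE TABLE ${src_table}_export SELECT * FROM ${src_table};
  " || return 1

  # Step 2: concatenate the part files in that directory into one local file.
  hadoop dfs -getmerge "${hdfs_dir}" "${local_file}"
}
```

It would be called along the lines of: export_to_local staticacs /user/hive/export/staticacs /home/hive/result.txt (both paths are made up). The row format is declared on the external table, which is what makes the delimiter configurable in this approach.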
