Re: How to define the format of files output by the INSERT OVERWRITE LOCAL DIRECTORY command?

Min Zhou Fri, 05 Jun 2009 03:12:15 -0700

Got it! That is how I do it too.

Thanks,
Min


On Fri, Jun 5, 2009 at 5:29 PM, Zheng Shao <[email protected]> wrote:

> Let me summarize the steps:
> 1. Create an external table on HDFS (STORED AS TEXTFILE) and INSERT OVERWRITE
> that table.
> 2. Use "hadoop dfs -getmerge" to get the data.
>
> So there is no need to run an "INSERT OVERWRITE LOCAL DIRECTORY".
>
> Zheng
>
>
> On Fri, Jun 5, 2009 at 2:16 AM, Min Zhou <[email protected]> wrote:
>
>> hive> explain INSERT OVERWRITE LOCAL DIRECTORY '/home/hive/result' SELECT
>> * FROM staticacs;
>> OK
>> ABSTRACT SYNTAX TREE:
>>   (TOK_QUERY (TOK_FROM (TOK_TABREF staticacs)) (TOK_INSERT
>> (TOK_DESTINATION (TOK_LOCAL_DIR '/home/hive/result')) (TOK_SELECT
>> (TOK_SELEXPR TOK_ALLCOLREF))))
>>
>> STAGE DEPENDENCIES:
>>   Stage-1 is a root stage
>>   Stage-0 depends on stages: Stage-1
>>
>> STAGE PLANS:
>>   Stage: Stage-1
>>     Map Reduce
>>       Alias -> Map Operator Tree:
>>         staticacs
>>             Select Operator
>>               expressions:
>>                     expr: time
>>                     type: string
>>                     expr: application
>>                     type: string
>>                     expr: count1
>>                     type: int
>>                     expr: count2
>>                     type: int
>>                     expr: dt
>>                     type: string
>>               File Output Operator
>>                 compressed: true
>>                 GlobalTableId: 1
>>                 table:
>>                     input format: org.apache.hadoop.mapred.TextInputFormat
>>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>>
>>   Stage: Stage-0
>>     Move Operator
>>       files:
>>             hdfs directory: false
>>             destination: /home/hive/result
>>
>> Thanks,
>> Min
>>
>>
>> On Fri, Jun 5, 2009 at 4:53 PM, Zheng Shao <[email protected]> wrote:
>>
>>> Can you give an example? Please also include the results of "explain
>>> <query>" so we can see how many map-reduce jobs there are.
>>>
>>> Zheng
>>>
>>>
>>>
>>> On Fri, Jun 5, 2009 at 1:21 AM, Min Zhou <[email protected]> wrote:
>>>
>>>> But is there any way to define the row format of the files output by the
>>>> command "INSERT OVERWRITE LOCAL DIRECTORY '/tmp/aaa' SELECT ..."? If I create
>>>> an external table to store the result and then copy it to the local
>>>> directory with that command, more map-reduce jobs are needed.
>>>>
>>>> Min
>>>>
>>>>
>>>> On Fri, Jun 5, 2009 at 2:04 PM, Zheng Shao <[email protected]> wrote:
>>>>
>>>>> Use dfs -getmerge.
>>>>>
>>>>> By the way, if you just want text file format, "INSERT OVERWRITE LOCAL
>>>>> DIRECTORY '/tmp/aaa' SELECT ..." is good enough. You don't need to split
>>>>> the process into 2 steps.
>>>>>
>>>>> Zheng
>>>>>
>>>>>
>>>>> On Thu, Jun 4, 2009 at 8:22 PM, Min Zhou <[email protected]> wrote:
>>>>>
>>>>>> So how do you copy them back?
>>>>>> Use dfs -getmerge or INSERT OVERWRITE LOCAL DIRECTORY?
>>>>>>
>>>>>> Thanks,
>>>>>> Min
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 4, 2009 at 1:48 PM, Zheng Shao <[email protected]> wrote:
>>>>>>
>>>>>>> No, we would need to copy the file back from HDFS. But I think there
>>>>>>> is no overhead: we would be doing the same thing as INSERT OVERWRITE
>>>>>>> LOCAL DIRECTORY.
>>>>>>>
>>>>>>> Zheng
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 3, 2009 at 10:45 PM, Min Zhou <[email protected]> wrote:
>>>>>>>
>>>>>>>> Can an external table be stored in a local directory?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 4, 2009 at 1:42 PM, Zheng Shao <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> You can first create an external table and then insert into that
>>>>>>>>> table.
>>>>>>>>>
>>>>>>>>> Zheng
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 3, 2009 at 10:31 PM, Min Zhou <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Any help?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>> Min
>
>
>
> --
> Yours,
> Zheng
>
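
The two steps Zheng summarizes above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the table name staticacs_export, the HDFS location, the local file path, the tab delimiter, and the run helper are all assumptions made for the example. The column list is copied from the explain output above. Because that plan shows "compressed: true", the sketch also turns off output compression so that the merged file is plain text. By default the script only prints the commands; set RUN=1 to actually execute them.

```shell
#!/bin/sh
# Sketch of the two-step export. All names below are hypothetical.
TABLE=staticacs_export                 # assumed external table name
HDFS_DIR=/user/hive/staticacs_export   # assumed HDFS location of that table
LOCAL_FILE=/home/hive/result.txt       # assumed local target file

# Step 1: an external text table with an explicit row format,
# filled from the source table. Columns follow the explain output.
# Note: CREATE fails if the table already exists from an earlier run.
HQL="
SET hive.exec.compress.output=false;
CREATE EXTERNAL TABLE $TABLE (
  time STRING, application STRING, count1 INT, count2 INT, dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '$HDFS_DIR';
INSERT OVERWRITE TABLE $TABLE SELECT * FROM staticacs;
"

# Print the command unless RUN=1 is set, in which case execute it.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

run hive -e "$HQL"

# Step 2: merge the table's part files into a single local file.
run hadoop dfs -getmerge "$HDFS_DIR" "$LOCAL_FILE"
```

The row format is declared on the external table, which is what makes the delimiter configurable; the getmerge step just concatenates the files that the insert wrote.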



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com
