[
https://issues.apache.org/jira/browse/HIVE-9012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072238#comment-16072238
]
Steve Loughran commented on HIVE-9012:
--------------------------------------
This is just rename() being emulated in S3 with a copy-and-delete.
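To illustrate the point in the comment above: S3 has no native rename, so a filesystem client emulates a directory "move" per object as a full byte copy to the destination key followed by a delete of the source key. The sketch below (a hypothetical local-file analogue, not the actual Hadoop S3 client code) shows why such a move is O(data size) rather than the O(1) metadata operation it is on HDFS, and hence why moving ~500GB of scratch output can appear to hang.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class EmulatedRename {

    // Emulated rename, as on an object store: there is no atomic move,
    // so we copy every byte to the destination and then delete the source.
    // The cost grows with the data volume being "renamed".
    static void emulatedRename(Path src, Path dst) throws IOException {
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING); // full byte copy
        Files.delete(src);                                          // then drop the source
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch-output file standing in for one S3 object.
        Path src = Files.createTempFile("hive-scratch-", ".part");
        Files.writeString(src, "query output");
        Path dst = src.resolveSibling("final.part");
        emulatedRename(src, dst);
        System.out.println("moved=" + Files.exists(dst)
                + " sourceGone=" + !Files.exists(src));
        Files.deleteIfExists(dst); // clean up
    }
}
```

A real S3 "rename" of a directory repeats this copy-then-delete for every object under the prefix, which is why the move phase scales with the size of the query output.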
> Not able to move and populate the data fully on to the table when the scratch
> directory is on S3
> ------------------------------------------------------------------------------------------------
>
> Key: HIVE-9012
> URL: https://issues.apache.org/jira/browse/HIVE-9012
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.13.1
> Environment: Amazon AMI and S3 as storage service
> Reporter: Kolluru Som Shekhar Sharma
> Priority: Blocker
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> I have set hive.exec.scratchDir to point to a directory on S3, and the
> external table is also stored on S3.
> I ran a simple query which extracts key/value pairs from a JSON string
> without any WHERE clause; the amount of data is ~500GB. The query ran
> fine, but when it tries to move the data from the scratch directory it
> doesn't complete, so I had to kill the process and move the data
> manually.
> The data size in the scratch directory was nearly ~550GB
> I tried the same scenario with less data and a WHERE clause; it
> completed successfully and the data was populated into the table. I
> checked the sizes in the table and in the scratch directory: the table
> showed 2MB while the scratch directory held 48.6GB.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)