[ 
https://issues.apache.org/jira/browse/IMPALA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-8821.
-----------------------------------
    Fix Version/s: Not Applicable
       Resolution: Won't Fix

I don't think we need this anymore, so I'm closing this. If there is a desire 
for this functionality, this can be reopened.

> Dataload for remote clusters should use recover partitions
> ----------------------------------------------------------
>
>                 Key: IMPALA-8821
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8821
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 3.3.0
>            Reporter: Joe McDonnell
>            Priority: Major
>             Fix For: Not Applicable
>
>
> Some test setups have data already in place and only need to run the DDLs to 
> sync up the metadata. This corresponds to running 
> testdata/bin/create-load-data.sh using a data snapshot but without 
> skip_metadata_load.
> Right now, for partitioned tables where the partitions are created 
> dynamically as part of the insert, generate-schema-statements.py forces a 
> reload:
> {noformat}
> # Force reloading of the table if the user specified the --force option or
> # if the table is partitioned and there was no ALTER section specified. This is to
> # ensure the partition metadata is always properly created. The ALTER section is
> # used to create partitions, so if that section exists there is no need to force
> # reload.
> # IMPALA-6579: Also force reload all Kudu tables. The Kudu entity referenced
> # by the table may or may not exist, so requiring a force reload guarantees
> # that the Kudu entity is always created correctly.
> # TODO: Rename the ALTER section to ALTER_TABLE_ADD_PARTITION
> force_reload = options.force_reload or (partition_columns and not alter) or \
>     file_format == 'kudu'{noformat}
> In the case where the data is already in place, this would drop that data and 
> reload it. Instead, we should just use "recover partitions" on that table to 
> get all the partition information.
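
For illustration only, the proposal quoted above could be sketched roughly as follows. This is not the actual generate-schema-statements.py code: the helper names and the data_in_place flag are hypothetical, and how the script would detect that data is already in place is left open. The only Impala syntax assumed is the ALTER TABLE ... RECOVER PARTITIONS statement.

```python
# Hypothetical sketch of the idea in IMPALA-8821; not the real dataload code.

def needs_force_reload(force_reload_option, partition_columns, alter,
                       file_format, data_in_place=False):
    """Return True if the table should be dropped and reloaded.

    'data_in_place' is an assumed flag meaning the table's files already
    exist on the filesystem (for example, loaded from a data snapshot).
    """
    # An explicit --force and Kudu tables keep the existing behavior.
    if force_reload_option or file_format == 'kudu':
        return True
    # Partitions are created dynamically by the insert when the table is
    # partitioned and has no ALTER section.
    dynamically_partitioned = bool(partition_columns) and not alter
    # With the data already present, the partitions can be recovered
    # instead of reloading the data.
    return dynamically_partitioned and not data_in_place


def recover_partitions_stmt(db_name, table_name):
    """Build the statement that rediscovers partitions from existing data."""
    return "ALTER TABLE {0}.{1} RECOVER PARTITIONS".format(db_name, table_name)
```

With data_in_place=True, a dynamically partitioned non-Kudu table would skip the reload, and the dataload would instead run the statement from recover_partitions_stmt() after the CREATE TABLE.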



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
