Or use Falcon ...

I would try to avoid the Spark JDBC route. JDBC is not designed for these big 
data bulk operations: data has to be transferred uncompressed, and there is the 
serialization/deserialization overhead of query result -> wire protocol -> Java 
objects -> writing to the target storage format, etc.
This costs more time than you may think.

> On 25 May 2016, at 18:05, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> 
> There are multiple ways of doing this without relying on any vendor's release.
> 
> 1) Using the Hive EXPORT/IMPORT utility:
> 
> EXPORT TABLE table_or_partition TO hdfs_path;
> IMPORT [[EXTERNAL] TABLE table_or_partition] FROM hdfs_path [LOCATION 
> [table_location]];
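> 
> For example, roughly (database, table, path and cluster names below are just 
> placeholders):
> 
> -- on PROD
> EXPORT TABLE mydb.mytable TO '/tmp/export/mytable';
> 
> # copy the exported data and metadata files across clusters, e.g.
> hadoop distcp hdfs://prodnn:8020/tmp/export/mytable hdfs://uatnn:8020/tmp/export/mytable
> 
> -- on UAT
> IMPORT TABLE mydb.mytable FROM '/tmp/export/mytable';
> 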
> 2) This works for individual tables, but you can easily write a generic script 
> that picks up the table names for a given database from the Hive metastore. 
> For example:
> 
> SELECT
>           t.owner AS Owner
>         , d.NAME AS DBName
>         , t.TBL_NAME AS Tablename
>         , TBL_TYPE
> FROM tbls t, dbs d
> WHERE
>           t.DB_ID = d.DB_ID
> AND
>           TBL_TYPE IN ('MANAGED_TABLE','EXTERNAL_TABLE')
> ORDER BY 1,2
> 
> Such a Linux shell script will take 5 minutes max to create, and you have full 
> control of the code. You can even run multiple EXPORT/IMPORT operations at the 
> same time.
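> 
> A rough, untested sketch of such a wrapper (database name and export path are 
> placeholders, and it uses SHOW TABLES rather than the metastore query above):
> 
> #!/bin/bash
> # export every table of one database from PROD, several at a time
> DB=mydb
> EXPORT_DIR=/tmp/export/${DB}
> for TABLE in $(hive -S -e "USE ${DB}; SHOW TABLES;")
> do
>   hive -e "EXPORT TABLE ${DB}.${TABLE} TO '${EXPORT_DIR}/${TABLE}';" &
> done
> wait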
> 
> 3) Easier still: create a shared NFS mount between PROD and UAT so you can put 
> the tables' data and metadata on this NFS mount.
> 
> 4) Use a Spark shell script to get the data via JDBC from the source database 
> and push the schema and data into the new environment. Again, this is no 
> different from getting the underlying data from an Oracle or Sybase database 
> and putting it in Hive.
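> 
> Something along these lines in spark-shell would do it (untested; the JDBC 
> URL, credentials and table names are just placeholders):
> 
> // started as: spark-shell --jars /path/to/ojdbc6.jar
> val df = sqlContext.read.format("jdbc").
>   option("url", "jdbc:oracle:thin:@//proddb:1521/MYDB").
>   option("dbtable", "myschema.mytable").
>   option("user", "myuser").
>   option("password", "xxxx").
>   load()
> // write the schema and data into Hive in the new environment
> df.write.mode("overwrite").saveAsTable("mydb.mytable")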
> 
> 5) Use a vendor's product to do the same. I am not sure vendors parallelise 
> this sort of thing.
> 
> HTH
> 
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
>  
> 
>> On 25 May 2016 at 14:50, Suresh Kumar Sethuramaswamy <rock...@gmail.com> 
>> wrote:
>> Hi
>> 
>>   If you are using CDH, you can do inter-cluster Hive data transfer, 
>> including metadata, via Cloudera Manager (Backup -> Replications).
>> 
>> Regards
>> Suresh
>> 
>> 
>>> On Wednesday, May 25, 2016, mahender bigdata <mahender.bigd...@outlook.com> 
>>> wrote:
>>> Is there any documentation on it?
>>> 
>>>> On 4/8/2016 6:28 PM, Will Du wrote:
>>>> Did you try the EXPORT and IMPORT statements in HQL?
>>>> 
>>>>> On Apr 8, 2016, at 6:24 PM, Ashok Kumar <ashok34...@yahoo.com> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> Does anyone have suggestions on how to create and copy Hive and Spark 
>>>>> tables from Production to UAT?
>>>>> 
>>>>> One way would be to copy the table data to external files, move those 
>>>>> files to a directory local to the target, and then populate the tables in 
>>>>> the target Hive with the data.
>>>>> 
>>>>> Is there an easier way of doing so?
>>>>> 
>>>>> thanks
> 
