[jira] [Commented] (HIVE-1918) Add export/import facilities to the hive system

2013-07-04 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699829#comment-13699829
 ] 

Gelesh commented on HIVE-1918:
--

Wish:-
Can we have an option to export the meta information alone?

UseCase:-
So that, for DistCp, we can create just a _meta file without first copying the 
Hive files (with their partition folder and clustered file structure) to a temp 
location.

Then we can DistCp the Hive files (the partitioned and clustered file structure) 
as such and re-create the Hive table in the new cluster.

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Query Processor
Reporter: Krishna Kumar
Assignee: Krishna Kumar
 Fix For: 0.8.0

 Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, 
 HIVE-1918.patch.3.txt, HIVE-1918.patch.4.txt, HIVE-1918.patch.5.txt, 
 HIVE-1918.patch.txt, hive-metastore-er.pdf


 This is an enhancement request to add export/import features to hive.
 With this language extension, the user can export the data of the table - 
 which may be located in different hdfs locations in case of a partitioned 
 table - as well as the metadata of the table into a specified output 
 location. This output location can then be moved over to another different 
 hadoop/hive instance and imported there.  
 This should work independent of the source and target metastore dbms used; 
 for instance, between derby and mysql.
 For partitioned tables, the ability to export/import a subset of the 
 partition must be supported.
 Howl will add more features on top of this: The ability to create/use the 
 exported data even in the absence of hive, using MR or Pig. Please see 
 http://wiki.apache.org/pig/Howl/HowlImportExport for these details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-03-14 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13006582#comment-13006582
 ] 

Paul Yang commented on HIVE-1918:
-

+1 Looks good, will test and commit



[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-28 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13000445#comment-13000445
 ] 

Carl Steinbach commented on HIVE-1918:
--

I discarded my reviewboard request after Krishna's original request. 
Unfortunately this doesn't appear to prevent people from commenting on it, and 
reviewboard won't let me delete the request outright.

Please review the patch here: [https://reviews.apache.org/r/430/]






[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-22 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12998129#comment-12998129
 ] 

Paul Yang commented on HIVE-1918:
-

Made a couple of comments on reviewboard.





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-21 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997296#comment-12997296
 ] 

Krishna Kumar commented on HIVE-1918:
-

There are a few reasons why I took this approach

 - The decision on compatibility (forward/backward) checks as in 
EximUtil.checkCompatibility needs to be taken consciously. That is, automatically 
breaking backward compatibility is not an option here I think.

 - What needs to be serialized/deserialized also requires a human decision. 
For instance, even now, authorization details are not transferred by an 
export/import.

 - The serialization/deserialization methods are also used by howl codebase 
outside of a hive context. It will be good to have this code only loosely 
coupled to the metastore code.
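
The first point can be illustrated with a small sketch. This is not the code of EximUtil.checkCompatibility; the class name, the "major.minor" version format, and the idea of the writer stamping an optional forward-compatible version are all assumptions made for illustration. It only shows why the decision is a deliberate one: the reader accepts older dumps by default, and accepts newer dumps only when the writer explicitly declared them readable.

```java
/** Hypothetical sketch of an explicit export/import format check (not Hive's EximUtil). */
final class CompatibilitySketch {
    // Format version this reader implements (assumed "major.minor" scheme).
    static final String READER_VERSION = "0.2";

    /**
     * @param dumpVersion version the writer stamped into the export metadata
     * @param forwardCompatibleVersion oldest reader version the writer declares
     *        can still read this dump, or null if the writer made no such promise
     */
    static boolean isCompatible(String dumpVersion, String forwardCompatibleVersion) {
        // Backward compatibility: a dump at or below the reader's version is accepted.
        if (compare(dumpVersion, READER_VERSION) <= 0) {
            return true;
        }
        // Forward compatibility: a newer dump is accepted only on an explicit promise.
        return forwardCompatibleVersion != null
            && compare(forwardCompatibleVersion, READER_VERSION) <= 0;
    }

    /** Compares dotted numeric versions component by component. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(isCompatible("0.1", null));  // older dump
        System.out.println(isCompatible("0.3", "0.2")); // newer dump, declared readable
        System.out.println(isCompatible("0.3", null));  // newer dump, no promise
    }
}
```

With a scheme like this, bumping the format version is a change someone has to make by hand, which is the property argued for above.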





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-20 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997160#comment-12997160
 ] 

Namit Jain commented on HIVE-1918:
--

@Paul, do you have any additional comments?





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-20 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997158#comment-12997158
 ] 

Namit Jain commented on HIVE-1918:
--

Krishna, the code changes look good - I had one concern only.

The functions in EximUtil.java like:


  private static Element createStorageDescriptor(Document doc,
  String location,
  String inputFormatClass,


have an implicit dependency on the metastore schema. 
If the schema changes, export/import will break, and it will be difficult
to add the new fields.

Do you want to think about it?

I mean, add an API in the metastore thrift to generate this, or something like 
this?
So that, this code is auto-generated and is amenable to new fields.
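
To make the concern concrete, here is a minimal sketch of a hand-written DOM serializer in the style of the fragment quoted above. It is not the code from the patch: the class name, the parameter list beyond the three shown in the fragment, and the element and attribute names are assumptions. The point is only that each field is written by an explicit line, so a field added to the metastore schema later is silently left out of the export until someone edits this method.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch of hand-coded metadata serialization (names are assumptions). */
final class StorageDescriptorXmlSketch {

    /** Writes the fields this method knows about; any other field is simply not exported. */
    static Element createStorageDescriptor(Document doc,
                                           String location,
                                           String inputFormatClass,
                                           String outputFormatClass) {
        Element sd = doc.createElement("storagedescriptor");
        sd.setAttribute("location", location);
        sd.setAttribute("inputformat", inputFormatClass);
        sd.setAttribute("outputformat", outputFormatClass);
        // A field added to the schema later needs another line here,
        // plus a matching line in the reader.
        return sd;
    }

    /** Creates an empty DOM document using the JDK's default parser. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable XML parser", e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element sd = createStorageDescriptor(doc, "/warehouse/example_table",
            "org.apache.hadoop.mapred.TextInputFormat",
            "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat");
        doc.appendChild(sd);
        System.out.println(sd.getAttribute("inputformat"));
    }
}
```

A generated serializer would remove the need to edit such a method by hand, at the cost of tying the export format to the schema definition.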






[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-17 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995767#comment-12995767
 ] 

Krishna Kumar commented on HIVE-1918:
-

https://reviews.apache.org/r/430/ added (with hive-git as repository).

Carl, can you take down 339 as that is now superseded?





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995669#comment-12995669
 ] 

Namit Jain commented on HIVE-1918:
--

Can you upload a patch to review-board?





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-09 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12992373#comment-12992373
 ] 

Krishna Kumar commented on HIVE-1918:
-

Importing into existing tables is now supported, but the checks (to see whether 
the imported table and the target table are compatible) have been kept fairly 
simple for now. Please see ImportSemanticAnalyzer.checkTable. The schemas 
(column and partition) of the two should match exactly, except for comments. 
Since we are just moving files (rather than rewriting records), I think there 
will be issues if the metadata schema does not match (in terms of types, number 
etc) the data serialization exactly.
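
The rule described above (schemas must match exactly, except for comments) can be sketched as follows. This is not the code of ImportSemanticAnalyzer.checkTable; the Column class and the sameSchema method are hypothetical names, and comparing names and types as plain strings is an assumption.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of an "exact match except comments" schema check. */
final class SchemaMatchSketch {

    /** Minimal stand-in for a column or partition-key definition. */
    static final class Column {
        final String name;
        final String type;
        final String comment;

        Column(String name, String type, String comment) {
            this.name = name;
            this.type = type;
            this.comment = comment;
        }
    }

    /** True when both lists have the same names and types in the same order. */
    static boolean sameSchema(List<Column> exported, List<Column> target) {
        if (exported.size() != target.size()) {
            return false;
        }
        for (int i = 0; i < exported.size(); i++) {
            Column a = exported.get(i);
            Column b = target.get(i);
            // Comments are deliberately not compared.
            if (!a.name.equals(b.name) || !a.type.equals(b.type)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Column> exported = Arrays.asList(
            new Column("id", "int", "primary key"),
            new Column("name", "string", null));
        List<Column> target = Arrays.asList(
            new Column("id", "int", "identifier"),
            new Column("name", "string", "display name"));
        System.out.println(sameSchema(exported, target));
    }
}
```

Because the import only moves files, a looser check (for example, allowing an int column to be imported into a bigint column) would leave data on disk that no longer agrees with the declared types.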

Re the earlier comment re outputs/inputs, got what you meant. I will add the 
table/partition to the inputs in ExportSemanticAnalyzer. But in the case of the 
imports, I see that the tasks themselves add the entity operated upon to the 
inputs/outputs list. Isn't that too late for authorization/concurrency, even 
though it may work for replication? Or are both the sem.analyzers and the tasks 
expected to add them? In the case of a newly created table/partition, the 
sem.analyzer does not have a handle?





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12992555#comment-12992555
 ] 

Namit Jain commented on HIVE-1918:
--

Tasks only add them when they may not be available at compile time - for example, 
in case of dynamic partitions.
Semantic Analyzer is supposed to add them.





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-09 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12992606#comment-12992606
 ] 

Krishna Kumar commented on HIVE-1918:
-

Hmm. LoadSemanticAnalyzer (which knows the table) does not add it to the 
outputs, but the MoveTask it schedules, does. 

Similarly, CREATE-TABLE does not add the entity but the DDLTask it schedules, 
does. This may be fine only because the entity does not exist at compile time?

ADD-PARTITION adds the table as an *input* at compile time and the partition 
itself is added as an output at execution time. Should not the table be an 
output (at compile time) as well - for authorization/concurrency purposes?

Anyway, where the import operates on existing tables/partitions, I will add 
them at compile time. If the entity is being created as part of the task, then 
the task will be adding them to inputs/outputs at runtime. Is this fine?






[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12992612#comment-12992612
 ] 

Namit Jain commented on HIVE-1918:
--

Please file bugs for the above cases.


The changes for import look fine.
You also need to make similar changes for export.





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-06 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12991282#comment-12991282
 ] 

Paul Yang commented on HIVE-1918:
-

Looking at it as well..





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-04 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12990678#comment-12990678
 ] 

Namit Jain commented on HIVE-1918:
--

Can you create a review-board request, or is the old one still valid?





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-04 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12990826#comment-12990826
 ] 

Namit Jain commented on HIVE-1918:
--

@Paul, you should definitely review it also, before it gets in.
I am reviewing it right now





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-04 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12990860#comment-12990860
 ] 

Carl Steinbach commented on HIVE-1918:
--

I updated the diff on reviewboard: https://reviews.apache.org/r/339

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
Assignee: Krishna Kumar
 Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, 
 HIVE-1918.patch.3.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf






[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-01-24 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12985556#action_12985556
 ] 

Krishna Kumar commented on HIVE-1918:
-

@Edward: Both the existing data model (prettified ER diagram attached) and the 
object model (class org.apache.hadoop.hive.metastore.api.Partition) allow 
parameters to be specified on a per-partition basis, so I am not adding new 
fields to either of these models. With proposal 2 above, I will not be adding 
any ctor parameters to org.apache.hadoop.hive.ql.metadata.Partition either.

Your point about providing manageability via DDL statements for all aspects of 
the data/object model is taken. But I am not adding new aspects to either 
model, so if we do need to address current manageability gaps, should they not 
be addressed via a separate enhancement request rather than this one, which 
aims simply to add export/import facilities?

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
 Attachments: HIVE-1918.patch.txt



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-01-24 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12985744#action_12985744
 ] 

Edward Capriolo commented on HIVE-1918:
---


{quote}
But I am not adding new aspects to either model, so if indeed we need to 
address current manageability gaps, should they not be addressed via another 
enhancement request, rather than this one, which aims simply to add 
export/import facilities?
{quote}
This depends on the lag between the feature being added and the enhancement 
being added. With wishful thinking the two events will be close, but history 
does not always agree. Management becomes more important as different people 
begin using the metastore for different purposes. I believe that if we add any 
feature, support for managing it completely through DML is mandatory. 
Currently the metastore is changing for security, import/export, and indexes; 
if each of these features is only half manageable at time X, releases will be 
awkward and half functional.

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
 Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.txt, 
 hive-metastore-er.pdf





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-01-23 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12985527#action_12985527
 ] 

Namit Jain commented on HIVE-1918:
--

A couple of issues:

1. Can you add a new test directory? Exporting into /tmp/.. means that there 
will be problems with concurrent tests on the same machine.
2. Do you want to support errors at import time? What happens if one of the 
rows is bad? Should there be an option to specify the number of rows to 
ignore, and to dump them somewhere else?

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
 Attachments: HIVE-1918.patch.txt





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-01-21 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984981#action_12984981
 ] 

Carl Steinbach commented on HIVE-1918:
--

@Krishna: I tried applying the patch to trunk and running the tests. I noticed 
a couple of things up front that need to be fixed:

* The patch doesn't apply cleanly with 'patch -p0'. To get this from a Git 
repo, you need to generate the patch using 'git diff --no-prefix ...'
* Most of the new exim* tests fail with diffs. Can you please fix this and 
update the patch?
* 'hive.test.exim' needs to be added to HiveConf and hive-default.xml. There is 
already a 'hive.test.mode.*' namespace defined in HiveConf, so you should 
probably follow this convention and change the name to 'hive.test.mode.exim'. I 
think an even better solution would be to define a new conf property called 
'hive.exim.uri.scheme.whitelist' instead and make it a comma-separated list of 
acceptable URI schemes for import and export.
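
The whitelist idea in the last bullet could be checked roughly as follows. This is only a sketch under stated assumptions: the class name EximSchemeWhitelist and its methods are hypothetical, it does not read HiveConf, and it is not taken from the patch. It simply parses a comma-separated scheme list and tests the scheme of a location URI.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hive: validates export/import locations
// against a comma-separated list of allowed URI schemes.
public class EximSchemeWhitelist {
    private final Set<String> allowed;

    // commaSeparated stands in for the value of a property such as
    // 'hive.exim.uri.scheme.whitelist', for example "hdfs, file".
    public EximSchemeWhitelist(String commaSeparated) {
        this.allowed = Arrays.stream(commaSeparated.split(","))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .map(s -> s.toLowerCase(Locale.ROOT))
            .collect(Collectors.toSet());
    }

    // Returns true only when the location has an explicit scheme that is listed.
    // URI.create throws IllegalArgumentException for syntactically invalid input.
    public boolean isAllowed(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme != null && allowed.contains(scheme.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        EximSchemeWhitelist whitelist = new EximSchemeWhitelist("hdfs, file");
        System.out.println(whitelist.isAllowed("hdfs://nn:8020/exports/t1"));
        System.out.println(whitelist.isAllowed("ftp://host/exports/t1"));
    }
}
```

Scheme-less paths are rejected in this sketch; a real implementation would more likely resolve them against the default filesystem first.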


 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
 Attachments: HIVE-1918.patch.txt





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-01-20 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984128#action_12984128
 ] 

Krishna Kumar commented on HIVE-1918:
-

OK. Will take care of this via a delegating ctor.

A process question: I guess I should wait for more comments from other 
reviewers before creating another patch, in case others are reviewing the 
current patch?

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
 Attachments: HIVE-1918.patch.txt





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-01-20 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984348#action_12984348
 ] 

Edward Capriolo commented on HIVE-1918:
---

I was not implying that we should definitely not add 
{noformat}, Map<String, String> partParams{noformat}. What I am asking is: what 
is the rationale for doing it? I think we should not need to add things to the 
metastore to export its information, but I might be missing something.

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
 Attachments: HIVE-1918.patch.txt





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-01-20 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984563#action_12984563
 ] 

Krishna Kumar commented on HIVE-1918:
-

Why export/import needs this change: It is not the export part but rather the 
import part that needs this change. While creating a partition as part of an 
import, we need to be able to create the partition along with its ancillary 
data, including partition parameters. But the first part of the existing create 
partition flow (AddPartitionDesc -> DDLTask.addPartition -> 
Hive.createPartition) did not support specifying partition params, whereas the 
second part (metastore.api.Partition -> IMetaStoreClient.add_partition -> 
HiveMetaStore.HMSHandler.add_partition -> ObjectStore.addPartition) does. So I 
added the ability to pass the partition parameters along in the first part of 
the flow.

In terms of options for compatible changes, there are two I can see:

1. The solution suggested above. Add an additional ctor so that no existing 
code breaks.

{noformat}
public Partition(Table tbl, Map<String, String> partSpec, Path location) {
  this(tbl, partSpec, location, null);
}

public Partition(Table tbl, Map<String, String> partSpec, Path location,
    Map<String, String> partParams) {...}
{noformat}

2. Keep only the current ctor, but in Hive.createPartition get the underlying 
metastore.api.Partition and set the parameters on it before passing it on to 
the metastoreClient.

Thoughts?
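
For comparison, option 2 might look roughly like the sketch below. It is an illustration only: ApiPartition and QlPartition are hypothetical stand-ins for metastore.api.Partition and ql.metadata.Partition, so that the example compiles without Hive on the classpath, and createPartition here only mimics the step in Hive.createPartition where the parameters would be set.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of option 2 using stand-in classes; none of this is Hive code.
public class PartitionParamsSketch {

    // Stand-in for metastore.api.Partition, which already carries parameters.
    public static class ApiPartition {
        private Map<String, String> parameters = new HashMap<>();
        public void setParameters(Map<String, String> parameters) { this.parameters = parameters; }
        public Map<String, String> getParameters() { return parameters; }
    }

    // Stand-in for ql.metadata.Partition; its ctor keeps the current signature
    // (no partParams argument) and it wraps the API-level partition object.
    public static class QlPartition {
        private final Map<String, String> partSpec;
        private final ApiPartition tPartition = new ApiPartition();
        public QlPartition(Map<String, String> partSpec) { this.partSpec = partSpec; }
        public Map<String, String> getSpec() { return partSpec; }
        public ApiPartition getTPartition() { return tPartition; }
    }

    // Mimics the create-partition step: build the partition with the existing
    // ctor, then set the parameters on the underlying API object. In the real
    // flow the returned object would be handed to the metastore client.
    public static ApiPartition createPartition(Map<String, String> partSpec,
                                               Map<String, String> partParams) {
        QlPartition partition = new QlPartition(partSpec);
        ApiPartition tPartition = partition.getTPartition();
        if (partParams != null) {
            tPartition.setParameters(new HashMap<>(partParams)); // defensive copy
        }
        return tPartition;
    }
}
```

The trade-off is that option 1 keeps the parameters visible in the constructor signature, while option 2 leaves every existing signature untouched.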

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
 Attachments: HIVE-1918.patch.txt





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-01-18 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983532#action_12983532
 ] 

Krishna Kumar commented on HIVE-1918:
-

Design notes:

 - Export/import is modeled on the existing load functionality. No new tasks 
are added; the existing tasks for copy/move/create table/add partition et al. 
are reused.

  - EXPORT TABLE table [PARTITION (partition_col=partition_colval, ...)] TO location
  - IMPORT [[EXTERNAL] TABLE table [PARTITION (partition_col=partition_colval, ...)]] FROM sourcelocation [LOCATION targetlocation]

 - The export is stored in the target directory as an XML-serialized file for 
the metadata, plus sub-directories for the data files.
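
To make the two statement forms concrete, here is a small sketch that renders them from their parts. The EximStatements class, the table name, and the paths are hypothetical, and quoting partition values and locations with single quotes is an assumption rather than something the grammar above specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper that renders EXPORT/IMPORT statements following the
// grammar in the design notes. It does no escaping, so it is illustration only.
public class EximStatements {

    // Renders " PARTITION (col='val', ...)" or an empty string when no spec is given.
    static String partitionClause(Map<String, String> spec) {
        if (spec == null || spec.isEmpty()) {
            return "";
        }
        return " PARTITION (" + spec.entrySet().stream()
            .map(e -> e.getKey() + "='" + e.getValue() + "'")
            .collect(Collectors.joining(", ")) + ")";
    }

    public static String exportTable(String table, Map<String, String> spec, String location) {
        return "EXPORT TABLE " + table + partitionClause(spec) + " TO '" + location + "'";
    }

    // table and targetLocation are optional (null), mirroring the bracketed parts
    // of the grammar; EXTERNAL and the partition spec only apply with a table.
    public static String importTable(String table, boolean external, Map<String, String> spec,
                                     String source, String targetLocation) {
        StringBuilder sb = new StringBuilder("IMPORT");
        if (table != null) {
            sb.append(external ? " EXTERNAL" : "")
              .append(" TABLE ").append(table)
              .append(partitionClause(spec));
        }
        sb.append(" FROM '").append(source).append("'");
        if (targetLocation != null) {
            sb.append(" LOCATION '").append(targetLocation).append("'");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("ds", "2011-01-18");
        System.out.println(exportTable("page_views", spec, "/exports/page_views"));
        System.out.println(importTable(null, false, null, "/exports/page_views", null));
    }
}
```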

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
 Attachments: HIVE-1918.patch.txt

