[ 
https://issues.apache.org/jira/browse/HIVE-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880526#comment-13880526
 ] 

Sushanth Sowmyan commented on HIVE-6109:
----------------------------------------

Apparently reviewboard is having a fair number of issues, so I'm copy-pasting 
my comments from reviewboard on to here:

hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
 (Diff revision 1)
{noformat}
73      
  private static final String DYNTEMP_DIR_NAME = "_DYN";
{noformat}
"_DYN" is already defined in FosterStorageHandler, needs to have one place 
where it's defined. I'm okay with it being defined here if the 
FosterStorageHandler constant is removed and references to that are changed to 
reference this.


hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
 (Diff revision 1)
{noformat}
public void commitJob(JobContext jobContext) throws IOException {
272     
    
{noformat}
whitespace errors - git refers to a bunch of these through the patch when we 
try to apply, please correct for final patch upload.

hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
 (Diff revision 1)
{noformat}
private void registerPartitions(JobContext context) throws IOException{
662     
720     
        if (customDynamicLocationUsed) {
{noformat}
A bit about code readability - if we add a special case, then it makes sense to 
add the special case as an else, rather than as an if - that way, the default 
behaviour is visible first, and then the special case - please swap this around 
so that this is a if (!customDynamicLocationUsed) structure.


hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
 (Diff revision 1)
{noformat}
private void registerPartitions(JobContext context) throws IOException{
764     
          if (customDynamicLocationUsed) {
{noformat}
This is now significant amount of code repetition from line 720-741 above, 
please see if we can refactor this into a separate method.


hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputJobInfo.java
 (Diff revision 1)
{noformat}
public String getLocation() {
178     
  public void setCustomDynamicLocation(String customDynamicRoot, String 
customDynamicPath) {
{noformat}
This becomes the primary API point with this change, wherein, a user that is 
using HCatOutputFormat will generate an OutputJobInfo, and then call 
setCustomDynamicLocation on it. This is fine for M/R users of HCat, but is 
something that will wind up having to be implemented for each M/R user. It 
might have been better to define a constant in HCatConstants, say 
"hcat.dynamic.partitioning.custom.pattern", and to use that as a JobInfo 
parameter. That makes it easier for other tools to integrate with this feature. 
For example, with your patch, we still do not support the ability for the 
HCatStorer from pig to be able to write to custom dynamic partitions, while we 
do want to keep feature parity where possible between HCatOutputFormat and 
HCatStorer.

In fact, as a design goal for HCat, we're trying to move away from 
letting(requiring) users explicitly muck around with OutputJobInfo and 
InputJobInfo, and stick to static calls to HCatInputFormat/HCatOutputFormat.

I would like to see this call be something the HCatOutputFormat automatically 
calls if a jobConf parameter(as above) is set. That way, we can solve pig 
compatibility as well easily

> Support customized location for EXTERNAL tables created by Dynamic 
> Partitioning
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-6109
>                 URL: https://issues.apache.org/jira/browse/HIVE-6109
>             Project: Hive
>          Issue Type: Improvement
>          Components: HCatalog
>            Reporter: Satish Mittal
>            Assignee: Satish Mittal
>         Attachments: HIVE-6109.1.patch.txt, HIVE-6109.2.patch.txt, 
> HIVE-6109.pdf
>
>
> Currently when dynamic partitions are created by HCatalog, the underlying 
> directories for the partitions are created in a fixed 'Hive-style' format, 
> i.e. root_dir/key1=value1/key2=value2/.... and so on. However in case of 
> external table, user should be able to control the format of directories 
> created for dynamic partitions.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to