[ 
https://issues.apache.org/jira/browse/HCATALOG-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083856#comment-13083856
 ] 

Francis Liu edited comment on HCATALOG-64 at 8/12/11 1:42 AM:
--------------------------------------------------------------

Thanks for addressing my concerns.

a) So users can specify implementation-specific hints but not parameters? If 
your concern is abstraction, this exposes the implementation anyway, except 
that the user is "not sure". Most users won't supply hints unless they know 
the actual underlying storage system.

b) An HBaseTableInfo sounds like a good idea, but it won't solve our use case 
since it will be populated with metastore information and not job-specific 
information (at least in this patch :-)). You probably meant an 
HBaseInputJobInfo class, which was one of my suggestions in hbase-dev. I didn't 
go this route because it was too invasive and I felt we needed to experiment 
with the first few drops to really figure things out. BTW the twiki is a bit 
outdated because of the refactoring and some redesign. I'll try to migrate the 
internal one into Confluence.

c) For input we need support for passing a specific version and a range of 
versions, per column family. We need this to support the repeatable read 
feature we are trying to develop. For output we need just a single version.

It seems your main concern now is avoiding a properties field for 
implementation-specific parameters. We can definitely explore that route, but 
why don't we experiment with this simpler solution while we iron out a better 
one?
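
To make the simpler solution concrete, here is a minimal sketch of how a plain 
properties field could carry the per-column-family versions from (c). The class 
name, the key prefix and the "min:max" encoding are all assumptions made up for 
illustration; none of them come from the attached patch.

```java
import java.util.Properties;

// Hypothetical helper: the names and key layout are illustrative assumptions,
// not part of HCatalog or of the attached patch.
final class VersionHints {
    // Assumed key layout: "hbase.input.versions.<columnFamily>" -> "min:max"
    static final String PREFIX = "hbase.input.versions.";

    /** Records an inclusive version range [min, max] for one column family. */
    static void setRange(Properties props, String family, long min, long max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        props.setProperty(PREFIX + family, min + ":" + max);
    }

    /** A single version is stored as a range whose bounds are equal. */
    static void setVersion(Properties props, String family, long version) {
        setRange(props, family, version, version);
    }

    /** Returns {min, max} for the family, or null if no hint was supplied. */
    static long[] getRange(Properties props, String family) {
        String raw = props.getProperty(PREFIX + family);
        if (raw == null) {
            return null;
        }
        int sep = raw.indexOf(':');
        long min = Long.parseLong(raw.substring(0, sep));
        long max = Long.parseLong(raw.substring(sep + 1));
        return new long[] { min, max };
    }
}
```

An input job would set a range (or a single version) per family, while an 
output job would only ever call setVersion. The storage driver reads the values 
back with getRange and ignores families that have no entry.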

      was (Author: toffer):
    Thanks for addressing my concerns.

a) So users can specify implementation-specific hints but not parameters? If 
your concern is abstraction, this exposes the implementation anyway, except 
that the user is "not sure". Most users won't supply hints unless they know 
the actual underlying storage system.

b) An HBaseTableInfo sounds like a good idea, but it won't solve our use case 
since it will be populated with metastore information and not job-specific 
information (at least in this patch :-)). You probably meant an 
HBaseInputJobInfo class, which was one of my suggestions in hbase-dev. I didn't 
go this route because it was too invasive and I felt we needed to experiment 
with the first few drops to really figure things out. BTW the twiki is a bit 
outdated because of the refactoring and some redesign. I'll try to migrate the 
internal one into Confluence.

c) For input we need support for passing a specific version and a range of 
versions, possibly per column family. We need this to support the repeatable 
read feature we are trying to develop. For output we need just a single version.

It seems your main concern now is avoiding a properties field for 
implementation-specific parameters. We can definitely explore that route, but 
why don't we experiment with this simpler solution while we iron out a better 
one?
  
> Refactor HCatTableInfo, JobInfo and OutputJobInfo
> -------------------------------------------------
>
>                 Key: HCATALOG-64
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-64
>             Project: HCatalog
>          Issue Type: Improvement
>    Affects Versions: 0.1, 0.2
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>             Fix For: 0.2
>
>         Attachments: HCatTableInfo_JobInfo_OutputJobInfo_3.patch
>
>
> These classes and their roles have become convoluted. HCatTableInfo should be 
> an HCat abstraction of a table and thus not have any job-specific information 
> and should not contain different information depending on usage. *JobInfo 
> classes should contain job-specific information (user provided, derived from 
> metastore info, etc). Since *JobInfo contains such information, it should be 
> the object which is passed to HCatInputFormat.setInput and 
> HCatOutputFormat.setOutput. Also, JobInfo should be renamed to InputJobInfo 
> for consistency and clarity. Finally, there needs to be a way to pass 
> implementation-specific configuration information down to the actual storage 
> driver.
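
A rough sketch of the separation described in the issue, assuming simplified 
stand-in classes: the table abstraction holds no job state, the job info holds 
everything job specific (including driver properties), and setInput takes the 
job info. All class names, fields and configuration keys below are hypothetical 
and are not the ones in the attached patch.

```java
import java.util.Map;
import java.util.Properties;

/** Table abstraction only: nothing job specific lives here. */
final class TableInfoSketch {
    final String database;
    final String table;

    TableInfoSketch(String database, String table) {
        this.database = database;
        this.table = table;
    }
}

/** Job-specific input settings, including driver-specific properties. */
final class InputJobInfoSketch {
    final TableInfoSketch tableInfo;
    final String partitionFilter;        // user provided; may be null
    final Properties storageProperties;  // passed through to the storage driver

    InputJobInfoSketch(TableInfoSketch tableInfo, String partitionFilter,
                       Properties storageProperties) {
        this.tableInfo = tableInfo;
        this.partitionFilter = partitionFilter;
        // Defensive copy so later changes by the caller do not leak into the job.
        this.storageProperties = new Properties();
        if (storageProperties != null) {
            this.storageProperties.putAll(storageProperties);
        }
    }
}

final class InputFormatSketch {
    /** Copies what the job needs from the job info into the job configuration. */
    static void setInput(Map<String, String> jobConf, InputJobInfoSketch info) {
        jobConf.put("sketch.input.database", info.tableInfo.database);
        jobConf.put("sketch.input.table", info.tableInfo.table);
        if (info.partitionFilter != null) {
            jobConf.put("sketch.input.filter", info.partitionFilter);
        }
        // Driver-specific settings travel under their own prefix.
        for (String key : info.storageProperties.stringPropertyNames()) {
            jobConf.put("sketch.storage." + key,
                        info.storageProperties.getProperty(key));
        }
    }
}
```

The point of the split is that the same TableInfoSketch can be reused by input 
and output jobs without carrying different contents depending on usage.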

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
