[ 
https://issues.apache.org/jira/browse/SQOOP-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated SQOOP-2299:
--------------------------------------
    Attachment: SQOOP-2299.patch

Attaching a partial patch. I have a working Derby repository, but support in
the PostgreSQL repository is still missing. As a fairly large unit of work is
done (and PostgreSQL will be "the same"), I think it would be great if others
could take a look and comment on whether my direction makes sense.

Nevertheless, please do not commit this patch yet. I will add the PostgreSQL
repository support and probably clean the patch up a bit before committing.

> Sqoop2: Store Context classes in repository
> -------------------------------------------
>
>                 Key: SQOOP-2299
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2299
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.99.5
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>             Fix For: 1.99.7
>
>         Attachments: SQOOP-2299.patch
>
>
> While looking into persisting state from incremental jobs (SQOOP-1803), I
> uncovered a Hadoop bug where any Hadoop 2 release returns an incorrect
> {{job.xml}} when the {{JobClient}} APIs are used to get a job's details. The
> issue is hard to track: it was initially fixed in Hadoop 2.7.0 via
> MAPREDUCE-5875, but the fix was subsequently reverted because of
> MAPREDUCE-6288, and it's not clear to me when/if a fix will be provided. This
> is relevant to us because we store our {{Context}} classes in the job conf.
> I've looked into why nobody has seen this problem before, and it seems that
> projects generally persist properties in their own repositories rather than
> using Hadoop APIs to retrieve the {{Configuration}} object back.
> Thinking about it a bit more, I think it would be useful to keep track
> of the context classes, as they contain additional information that can be
> useful for debugging purposes. I'm not yet sure whether we should expose those
> objects over the REST interface, as they can possibly contain sensitive
> information, but it seems useful to at least persist them.
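
To make the idea in the description concrete, here is a minimal sketch of persisting context key/value pairs in a repository at submission time and reading them back from there, instead of asking Hadoop for the job's {{Configuration}}. It is not taken from the attached patch: the class name, the method names, the "FROM" context type, and the example key are all hypothetical, and an in-memory map stands in for the Derby/PostgreSQL tables.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Sqoop or Hadoop.
class ContextRepositorySketch {

    // Stand-in for a repository table keyed by (submission id, context type).
    private final Map<String, Map<String, String>> rows = new LinkedHashMap<>();

    private static String rowKey(long submissionId, String contextType) {
        return submissionId + "/" + contextType;
    }

    // Persist a defensive copy of the context when the job is submitted.
    void saveContext(long submissionId, String contextType, Map<String, String> context) {
        rows.put(rowKey(submissionId, contextType), new LinkedHashMap<>(context));
    }

    // Read the context back from the repository, never from the cluster.
    // Returns an empty map when nothing was stored for that key.
    Map<String, String> loadContext(long submissionId, String contextType) {
        Map<String, String> stored = rows.get(rowKey(submissionId, contextType));
        return stored == null ? new LinkedHashMap<>() : new LinkedHashMap<>(stored);
    }

    public static void main(String[] args) {
        ContextRepositorySketch repo = new ContextRepositorySketch();
        Map<String, String> fromContext = new LinkedHashMap<>();
        fromContext.put("example.last.value", "42"); // hypothetical key
        repo.saveContext(1L, "FROM", fromContext);
        System.out.println(repo.loadContext(1L, "FROM"));
    }
}
```

Because the stored copy is independent of the cluster, later reads do not depend on what {{JobClient}} returns for {{job.xml}}. Whether such values should also be visible over REST is a separate question, as noted in the description.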



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
