[ 
https://issues.apache.org/jira/browse/SQOOP-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Veena Basavaraj updated SQOOP-1719:
-----------------------------------
    Description: 
Today the Matcher code only checks that at least one schema exists.
{code}

public Matcher(Schema fromSchema, Schema toSchema) {
    if (fromSchema.isEmpty() && toSchema.isEmpty()) {
      throw new SqoopException(MatcherError.MATCHER_0000,
          "Neither a FROM or TO schemas been provided.");
    } else if (toSchema.isEmpty()) {
      this.fromSchema = fromSchema;
      this.toSchema = fromSchema;
    } else if (fromSchema.isEmpty()) {
      this.fromSchema = toSchema;
      this.toSchema = toSchema;
    } else {
      this.fromSchema = fromSchema;
      this.toSchema = toSchema;
    }
  }
{code}
If both exist, then in addition to this we need to validate that the two 
schemas are compatible.

For instance, if the From schema has one column of type String and the To 
schema has one column of type INTEGER, we should warn or fail before even 
starting the job, since such a transfer may not be advisable. These validation 
rules are not documented in Sqoop and, if implemented, should be configurable 
externally per job where possible.
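
As a rough illustration of what such a check could look like, here is a 
minimal, self-contained sketch. It does not use the real Sqoop Schema or 
Column classes: ColumnType, Column, Policy, isCompatible, findMismatches and 
validate are all hypothetical names, the compatibility rules are an assumed 
example rule set, and columns are assumed to be matched by position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaCompatibilitySketch {

    // Hypothetical column types; Sqoop's real type system is richer.
    enum ColumnType { TEXT, INTEGER, FLOAT, DATE }

    // Hypothetical per-job setting controlling how mismatches are handled.
    enum Policy { IGNORE, WARN, FAIL }

    // Hypothetical stand-in for a schema column.
    static final class Column {
        final String name;
        final ColumnType type;

        Column(String name, ColumnType type) {
            this.name = name;
            this.type = type;
        }
    }

    // Assumed example rules: identical types match, anything can be written
    // as TEXT, and INTEGER can widen to FLOAT. Everything else is rejected.
    static boolean isCompatible(ColumnType from, ColumnType to) {
        if (from == to || to == ColumnType.TEXT) {
            return true;
        }
        return from == ColumnType.INTEGER && to == ColumnType.FLOAT;
    }

    // Compares the two column lists position by position and returns a
    // human-readable description of every mismatch found.
    static List<String> findMismatches(List<Column> from, List<Column> to) {
        List<String> problems = new ArrayList<>();
        if (from.size() != to.size()) {
            problems.add("Column count differs: FROM has " + from.size()
                + ", TO has " + to.size());
            return problems;
        }
        for (int i = 0; i < from.size(); i++) {
            Column f = from.get(i);
            Column t = to.get(i);
            if (!isCompatible(f.type, t.type)) {
                problems.add("Column " + i + ": " + f.name + " (" + f.type
                    + ") -> " + t.name + " (" + t.type + ")");
            }
        }
        return problems;
    }

    // Returns true when the schemas are compatible or the check is disabled.
    // Under WARN it logs each mismatch and returns false; under FAIL it throws.
    static boolean validate(List<Column> from, List<Column> to, Policy policy) {
        if (policy == Policy.IGNORE) {
            return true;
        }
        List<String> problems = findMismatches(from, to);
        if (problems.isEmpty()) {
            return true;
        }
        if (policy == Policy.FAIL) {
            throw new IllegalStateException("Incompatible schemas: " + problems);
        }
        for (String problem : problems) {
            System.err.println("WARN: " + problem);
        }
        return false;
    }

    public static void main(String[] args) {
        // The String -> INTEGER case from the description above.
        List<Column> from = Arrays.asList(new Column("id", ColumnType.TEXT));
        List<Column> to = Arrays.asList(new Column("id", ColumnType.INTEGER));
        System.out.println("compatible: " + validate(from, to, Policy.WARN));
    }
}
```

Keeping the policy as a separate argument is one way to make the behaviour 
configurable per job: the same rule set can warn for one job and fail for 
another without changing the check itself.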


Second, such validation should happen before the job is submitted. But for that 
we need to get the schemas, so it may not be possible to avoid starting the 
job.

  was:
Today the Matcher code only checks that at least one schema exists.
{code}

public Matcher(Schema fromSchema, Schema toSchema) {
    if (fromSchema.isEmpty() && toSchema.isEmpty()) {
      throw new SqoopException(MatcherError.MATCHER_0000,
          "Neither a FROM or TO schemas been provided.");
    } else if (toSchema.isEmpty()) {
      this.fromSchema = fromSchema;
      this.toSchema = fromSchema;
    } else if (fromSchema.isEmpty()) {
      this.fromSchema = toSchema;
      this.toSchema = toSchema;
    } else {
      this.fromSchema = fromSchema;
      this.toSchema = toSchema;
    }
  }
{code}
If both exist, then in addition to this we need to validate that the two 
schemas are compatible.

For instance, if the From schema has one column of type String and the To 
schema has one column of type INTEGER, we should warn or fail before even 
starting the job, since such a transfer may not be advisable.

These validation rules are not documented in Sqoop and should at least be 
configurable externally, if possible.


> Schema Validation Rules between From and To Schema
> --------------------------------------------------
>
>                 Key: SQOOP-1719
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1719
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: sqoop2-framework
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.5
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
