[ 
https://issues.apache.org/jira/browse/AVRO-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742948#comment-16742948
 ] 

Rumeshkrishnan commented on AVRO-2299:
--------------------------------------

[~cutting] Thanks, SchemaNormalization is helpful. However, toParsingForm 
keeps only the attributes that are relevant to parsing data: type, name, 
fields, symbols, items, values, size. I expected `default` to be kept as well. 

Example Code: 

 
{code:java}
import scala.collection.JavaConverters._
import org.apache.avro._
import org.apache.avro.Schema.Parser
import scala.util.Try

object Test extends App {

  val schemaOne: String =
    """{
      |  "name": "testSchema",
      |  "namespace": "com.avro",
      |  "type": "record",
      |  "fields": [
      |    {
      |      "name": "email",
      |      "type": "string",
      |      "doc": "email id",
      |      "user_field_prop": "xxxxx"
      |    }
      |  ],
      |  "user_schema_prop": "xxxxxx"
      |}""".stripMargin

  val schemaTwo: String =
    """{
      |  "name": "testSchema",
      |  "namespace": "com.avro",
      |  "type": "record",
      |  "fields": [
      |    {
      |      "name": "email",
      |      "type": "string",
      |      "doc": "email id",
      |      "user_field_prop": "xxxxx"
      |    },
      |    {
      |      "name": "country",
      |      "type": "string",
      |      "doc": "country",
      |      "user_field_prop": "xxxxx",
      |      "default": "IN"
      |    }
      |  ],
      |  "user_schema_prop": "xxxxxx"
      |}""".stripMargin

  def getSchema(schemaString: String): Schema = new Parser().parse(schemaString)
  // Round-trip through the Parsing Canonical Form to get a normalized Schema
  def normalize(schema: Schema): Schema =
    getSchema(SchemaNormalization.toParsingForm(schema))

  // Validator to check full compatibility on two schema versions
  val getFullValidator =
    new SchemaValidatorBuilder().mutualReadStrategy().validateLatest()

  def fullCompatibilityCheck(oldSchema: Schema, newSchema: Schema): Boolean =
    Try {
      getFullValidator.validate(newSchema, List(oldSchema).asJava)
      true
    }.toOption.getOrElse(false)
  
  val schema1: Schema = getSchema(schemaOne)
  val schema2: Schema = getSchema(schemaTwo)

  val normSchema1 = normalize(schema1)
  val normSchema2 = normalize(schema2)

  val compatibilityResultBeforeNormalization =
    fullCompatibilityCheck(schema1, schema2)
  val compatibilityResultAfterNormalization =
    fullCompatibilityCheck(normSchema1, normSchema2)
  
  // display result
  println(compatibilityResultBeforeNormalization)
  println(compatibilityResultAfterNormalization)
  
}{code}
Result:
{code:java}
true
false{code}
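When I run the compatibility check before and after normalisation, the 
results differ because the `default` property is missing from the normalised 
schema. Without a default on `country`, a reader using schemaTwo cannot fill 
that field when reading data written with schemaOne, so the mutual-read check 
fails. 

Is it possible to change the normalisation logic so that it does not strip 
`default` when it exists?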
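
The missing default can also be checked directly on a single field. The 
following is only a minimal sketch, assuming Avro 1.8+ on the classpath 
(where Field.defaultVal() is available); the object name 
DefaultStrippedCheck is made up for illustration. 

```scala
import org.apache.avro.{Schema, SchemaNormalization}

// Hypothetical object name, used only for this illustration.
object DefaultStrippedCheck extends App {

  // A reduced version of schemaTwo: one field that carries a default.
  val schemaJson: String =
    """{
      |  "type": "record",
      |  "name": "testSchema",
      |  "namespace": "com.avro",
      |  "fields": [
      |    {"name": "country", "type": "string", "default": "IN"}
      |  ]
      |}""".stripMargin

  // Parse the original schema; a fresh Parser avoids name redefinition errors.
  val original: Schema = new Schema.Parser().parse(schemaJson)

  // Convert to the Parsing Canonical Form and parse that text again.
  val canonicalForm: String = SchemaNormalization.toParsingForm(original)
  val normalized: Schema = new Schema.Parser().parse(canonicalForm)

  // Print the canonical form, then the default of "country" in each schema.
  println(canonicalForm)
  println(original.getField("country").defaultVal())
  println(normalized.getField("country").defaultVal())
}
```

I would expect the last line to print null, because the canonical form no 
longer carries the default, while the line before it prints the original 
value. 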

 

 

> Get Plain Schema
> ----------------
>
>                 Key: AVRO-2299
>                 URL: https://issues.apache.org/jira/browse/AVRO-2299
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.8.2
>            Reporter: Rumeshkrishnan
>            Priority: Minor
>              Labels: features
>             Fix For: 1.9.0, 1.8.3, 1.8.4
>
>
> {panel:title=Avro Schema Reserved Keys:}
> "doc", "fields", "items", "name", "namespace",
>  "size", "symbols", "values", "type", "aliases", "default"
> {panel}
> Avro also supports user-defined properties for both Schema and Field.
> Is there a way to get the schema with only the reserved properties (key, value)? 
> Input Schema: 
> {code:java}
> {
>   "name": "testSchema",
>   "namespace": "com.avro",
>   "type": "record",
>   "fields": [
>     {
>       "name": "email",
>       "type": "string",
>       "doc": "email id",
>       "user_field_prop": "xxxxx"
>     }
>   ],
>   "user_schema_prop": "xxxxxx"
> }{code}
> Expected Plain Schema:
> {code:java}
> {
>   "name": "testSchema",
>   "namespace": "com.avro",
>   "type": "record",
>   "fields": [
>     {
>       "name": "email",
>       "type": "string",
>       "doc": "email id"
>     }
>   ]
> }
> {code}



