[ 
https://issues.apache.org/jira/browse/AVRO-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661058#comment-16661058
 ] 

ASF GitHub Bot commented on AVRO-2090:
--------------------------------------

cutting closed pull request #350: AVRO-2090 second try
URL: https://github.com/apache/avro/pull/350
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/doc/src/content/xdocs/gettingstartedjava.xml 
b/doc/src/content/xdocs/gettingstartedjava.xml
index fe6c7d284..7f331e347 100644
--- a/doc/src/content/xdocs/gettingstartedjava.xml
+++ b/doc/src/content/xdocs/gettingstartedjava.xml
@@ -319,6 +319,43 @@ $ mvn compile # includes code generation via Avro Maven 
plugin
 $ mvn -q exec:java -Dexec.mainClass=example.SpecificMain
         </source>
       </section>
+      <section>
+        <title>Beta feature: Generating faster code</title>
+        <p>
+          In this release we have introduced a new approach to
+          generating code that speeds up decoding of objects by more
+          than 10% and encoding by more than 30% (future performance
+          enhancements are underway).  To ensure a smooth introduction
+          of this change into production systems, this feature is
+          controlled by a feature flag, the system
+          property <code>org.apache.avro.specific.use_custom_coders</code>.
+          In this first release, this feature is off by default.  To
+          turn it on, set the system flag to <code>true</code> at
+          runtime.  In the sample above, for example, you could enable
+          the fater coders as follows:
+        </p>
+        <source>
+$ mvn -q exec:java -Dexec.mainClass=example.SpecificMain \
+    -Dorg.apache.avro.specific.use_custom_coders=true
+        </source>
+        <p>
+          Note that you do <em>not</em> have to recompile your Avro
+          schema to have access to this feature.  The feature is
+          compiled and built into your code, and you turn it on and
+          off at runtime using the feature flag.  As a result, you can
+          turn it on during testing, for example, and then off in
+          production.  Or you can turn it on in production, and
+          quickly turn it off if something breaks.
+        </p>
+        <p>
+          We encourage the Avro community to exercise this new feature
+          early to help build confidence.  (For those paying
+          one-demand for compute resources in the cloud, it can lead
+          to meaningful cost savings.)  As confidence builds, we will
+          turn this feature on by default, and eventually eliminate
+          the feature flag (and the old code).
+        </p>
+      </section>
     </section>
 
     <section>
diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/io/ResolvingDecoder.java 
b/lang/java/avro/src/main/java/org/apache/avro/io/ResolvingDecoder.java
index cb9a82b48..073ca27b0 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/io/ResolvingDecoder.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/io/ResolvingDecoder.java
@@ -116,9 +116,7 @@ public static Object resolve(Schema writer, Schema reader)
    * the above loop will always be correct.
    *
    * Throws a runtime exception if we're not just about to read the
-   * field of a record.  Also, this method will consume the field
-   * information, and thus may only be called <em>once</em> before
-   * reading the field value.  (However, if the client knows the
+   * first field of a record.  (If the client knows the
    * order of incoming fields, then the client does <em>not</em>
    * need to call this method but rather can just start reading the
    * field values.)
diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java 
b/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java
index 1c179007a..79558ba47 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java
@@ -66,6 +66,9 @@
 
 /** Utilities to use existing Java classes and interfaces via reflection. */
 public class ReflectData extends SpecificData {
+  @Override
+  public boolean useCustomCoders() { return false; }
+
   /** {@link ReflectData} implementation that permits null field values.  The
    * schema generated for each field is a union of its declared type and
    * null. */
diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java 
b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java
index 44de5c434..60d43dcf0 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java
@@ -122,6 +122,22 @@ public DatumWriter createDatumWriter(Schema schema) {
   /** Return the singleton instance. */
   public static SpecificData get() { return INSTANCE; }
 
+  private boolean useCustomCoderFlag
+    = 
Boolean.parseBoolean(System.getProperty("org.apache.avro.specific.use_custom_coders","false"));
+
+  /** Retrieve the current value of the custom-coders feature flag.
+    * Defaults to <code>true</code>, but this default can be overriden
+    * using the system property
+    * <code>org.apache.avro.specific.use_custom_coders</code>, and can
+    * be set dynamically by {@link useCustomCoders}.  See <a
+    * 
href="https://avro.apache.org/docs/current/gettingstartedjava.html#Beta+feature:+Generating+faster+code"Getting
 started with Java</a> for more about this
+    * feature flag. */
+  public boolean useCustomCoders() { return useCustomCoderFlag; }
+
+  /** Dynamically set the value of the custom-coder feature flag.
+   *  See {@link useCustomCoders}. */
+  public void setCustomCoders(boolean flag) { useCustomCoderFlag = flag; }
+
   @Override
   protected boolean isEnum(Object datum) {
     return datum instanceof Enum || super.isEnum(datum);
diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java
 
b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java
index 29c989b78..ccf8107ac 100644
--- 
a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java
+++ 
b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java
@@ -101,6 +101,23 @@ private Class getPropAsClass(Schema schema, String prop) {
     }
   }
 
+  @Override
+  protected Object readRecord(Object old, Schema expected, ResolvingDecoder in)
+    throws IOException {
+    SpecificData data = getSpecificData();
+    if (data.useCustomCoders()) {
+      old = data.newRecord(old, expected);
+      if (old instanceof SpecificRecordBase) {
+        SpecificRecordBase d = (SpecificRecordBase) old;
+        if (d.hasCustomCoders()) {
+          d.customDecode(in);
+          return d;
+        }
+      }
+    }
+    return super.readRecord(old, expected, in);
+  }
+
   @Override
   protected void readField(Object r, Schema.Field f, Object oldDatum,
                            ResolvingDecoder in, Object state)
diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumWriter.java
 
b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumWriter.java
index 1204f4955..3d5e7ff4f 100644
--- 
a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumWriter.java
+++ 
b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumWriter.java
@@ -71,6 +71,19 @@ protected void writeString(Schema schema, Object datum, 
Encoder out)
     writeString(datum, out);
   }
 
+  @Override
+  protected void writeRecord(Schema schema, Object datum, Encoder out)
+    throws IOException {
+    if (datum instanceof SpecificRecordBase && 
this.getSpecificData().useCustomCoders()) {
+      SpecificRecordBase d = (SpecificRecordBase) datum;
+      if (d.hasCustomCoders()) {
+        d.customEncode(out);
+        return;
+      }
+    }
+    super.writeRecord(schema, datum, out);
+  }
+
   @Override
   protected void writeField(Object datum, Schema.Field f, Encoder out,
                             Object state) throws IOException {
diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificRecordBase.java 
b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificRecordBase.java
index 1902cbc52..eed41b514 100644
--- 
a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificRecordBase.java
+++ 
b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificRecordBase.java
@@ -25,6 +25,8 @@
 import org.apache.avro.Conversion;
 import org.apache.avro.Schema;
 import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.io.ResolvingDecoder;
+import org.apache.avro.io.Encoder;
 
 /** Base class for generated record classes. */
 public abstract class SpecificRecordBase
@@ -90,4 +92,19 @@ public void readExternal(ObjectInput in)
     new SpecificDatumReader(getSchema())
       .read(this, SpecificData.getDecoder(in));
   }
+
+  /** Returns true iff an instance supports the {@link #encode} and
+    * {@link #decode} operations.  Should only be used by
+    * <code>SpecificDatumReader/Writer</code> to selectively use
+    * {@link #customEncode} and {@link #customDecode} to optimize the 
(de)serialization of
+    * values. */
+  protected boolean hasCustomCoders() { return false; }
+
+  protected void customEncode(Encoder out) throws IOException {
+    throw new UnsupportedOperationException();
+  }
+
+  protected void customDecode(ResolvingDecoder in) throws IOException {
+    throw new UnsupportedOperationException();
+  }
 }
diff --git a/lang/java/compiler/pom.xml b/lang/java/compiler/pom.xml
index ee260c7f1..c7cef915f 100644
--- a/lang/java/compiler/pom.xml
+++ b/lang/java/compiler/pom.xml
@@ -113,7 +113,57 @@
           </execution>
         </executions>
       </plugin>
-
+      <plugin>
+        <groupId>org.codehaus.mojo</groupId>
+        <artifactId>exec-maven-plugin</artifactId>
+        <version>1.6.0</version>
+        <executions>
+          <execution>
+            <phase>generate-test-sources</phase>
+            <goals>
+              <goal>exec</goal>
+            </goals>
+            <configuration>
+              <executable>java</executable>
+              <workingDirectory>/tmp</workingDirectory>
+              <classpathScope>test</classpathScope>
+              <arguments>
+                <argument>-classpath</argument>
+                <classpath></classpath>
+                
<argument>org.apache.avro.compiler.specific.SchemaTask</argument>
+                
<argument>${project.basedir}/src/test/resources/full_record_v1.avsc</argument>
+                
<argument>${project.basedir}/src/test/resources/full_record_v2.avsc</argument>
+                
<argument>${project.basedir}/target/generated-test-sources</argument>
+              </arguments>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.codehaus.mojo</groupId>
+        <artifactId>build-helper-maven-plugin</artifactId>
+        <version>3.0.0</version>
+        <executions>
+          <execution>
+            <!--
+              Usually code is generated using a special-purpose maven plugin 
and the plugin
+              automatically adds the generated sources into project.
+              Here since general-purpose exec plugin is used for generating 
code, we need to manually
+              add the sources.
+            -->
+            <id>add-source</id>
+            <phase>generate-test-sources</phase>
+            <goals>
+              <goal>add-test-source</goal>
+            </goals>
+            <configuration>
+              <sources>
+                
<source>${project.basedir}/target/generated-test-sources</source>
+              </sources>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
     </plugins>
   </build>
 
diff --git 
a/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SchemaTask.java
 
b/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SchemaTask.java
index 9d2c244d3..89e2f882c 100644
--- 
a/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SchemaTask.java
+++ 
b/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SchemaTask.java
@@ -32,5 +32,15 @@ protected void doCompile(File src, File dest) throws 
IOException {
     compiler.setStringType(getStringType());
     compiler.compileToDestination(src, dest);
   }
+
+  public static void main(String[] args) throws IOException {
+    if (args.length < 2) {
+      System.err.println("Usage: SchemaTask <schema.avsc>... <output-folder>");
+      System.exit(1);
+    }
+    File dst = new File(args[args.length-1]);
+    for (int i = 0; i < args.length-1; i++)
+      new SchemaTask().doCompile(new File(args[i]), dst);
+  }
 }
 
diff --git 
a/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SpecificCompiler.java
 
b/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SpecificCompiler.java
index 575462e73..58c43d094 100644
--- 
a/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SpecificCompiler.java
+++ 
b/lang/java/compiler/src/main/java/org/apache/avro/compiler/specific/SpecificCompiler.java
@@ -624,9 +624,24 @@ private Schema addStringType(Schema s, Map<Schema,Schema> 
seen) {
     return result;
   }
 
-  private String getStringType(JsonNode overrideClassProperty) {
-    if (overrideClassProperty != null)
-      return overrideClassProperty.getTextValue();
+  /** Utility for template use (and also internal use).  Returns
+    * a string giving the FQN of the Java type to be used for a string
+    * schema or for the key of a map schema.  (It's an error to call
+    * this on a schema other than a string or map.) */
+  public String getStringType(Schema s) {
+    String prop;
+    switch (s.getType()) {
+    case MAP:
+      prop = SpecificData.KEY_CLASS_PROP;
+      break;
+    case STRING:
+      prop = SpecificData.CLASS_PROP;
+      break;
+    default:
+      throw new IllegalArgumentException("Can't check string-type of 
non-string/map type: " + s);
+    }
+    JsonNode override = s.getJsonProp(prop);
+    if (override != null) return override.getTextValue();
     switch (stringType) {
     case String:        return "java.lang.String";
     case Utf8:          return "org.apache.avro.util.Utf8";
@@ -635,6 +650,17 @@ private String getStringType(JsonNode 
overrideClassProperty) {
    }
   }
 
+  /** Utility for template use.  Returns true iff a STRING-schema or
+    * the key of a MAP-schema is what SpecificData defines as
+    * "stringable" (which means we need to call toString on it before
+    * before writing it). */
+  public boolean isStringable(Schema schema) {
+    String t = getStringType(schema);
+    return ! (t.equals("java.lang.String")
+              || t.equals("java.lang.CharSequence")
+              || t.equals("org.apache.avro.util.Utf8"));
+  }
+
   private static final Schema NULL_SCHEMA = Schema.create(Schema.Type.NULL);
 
   /** Utility for template use.  Returns the java type for a Schema. */
@@ -659,15 +685,14 @@ private String javaType(Schema schema, boolean 
checkConvertedLogicalType) {
       return "java.util.List<" + javaType(schema.getElementType()) + ">";
     case MAP:
       return "java.util.Map<"
-        + getStringType(schema.getJsonProp(SpecificData.KEY_CLASS_PROP))+","
-        + javaType(schema.getValueType()) + ">";
+        + getStringType(schema)+ "," + javaType(schema.getValueType()) + ">";
     case UNION:
       List<Schema> types = schema.getTypes(); // elide unions with null
       if ((types.size() == 2) && types.contains(NULL_SCHEMA))
         return javaType(types.get(types.get(0).equals(NULL_SCHEMA) ? 1 : 0));
       return "java.lang.Object";
     case STRING:
-      return getStringType(schema.getJsonProp(SpecificData.CLASS_PROP));
+      return getStringType(schema);
     case BYTES:   return "java.nio.ByteBuffer";
     case INT:     return "java.lang.Integer";
     case LONG:    return "java.lang.Long";
@@ -708,6 +733,58 @@ public String javaUnbox(Schema schema) {
     }
   }
 
+
+  /** Utility for template use.  Return a string with a given number
+    * of spaces to be used for indentation purposes. */
+  public String indent(int n) {
+    return new String(new char[n]).replace('\0', ' ');
+  }
+
+  /** Utility for template use.  For a two-branch union type with
+    * one null branch, returns the index of the null branch.  It's an
+    * error to use on anything other than a two-branch union with on
+    * null branch. */
+  public int getNonNullIndex(Schema s) {
+    if (s.getType() != Schema.Type.UNION
+        || s.getTypes().size() != 2
+        || ! s.getTypes().contains(NULL_SCHEMA))
+      throw new IllegalArgumentException("Can only be used on 2-branch union 
with a null branch: " + s);
+    return (s.getTypes().get(0).equals(NULL_SCHEMA) ? 1 : 0);
+  }
+
+  /** Utility for template use.  Returns true if the encode/decode
+    * logic in record.vm can handle the schema being presented. */
+  public boolean isCustomCodable(Schema schema) {
+    if (schema.isError()) return false;
+    return isCustomCodable(schema, new HashSet<Schema>());
+  }
+
+  private boolean isCustomCodable(Schema schema, Set<Schema> seen) {
+    if (! seen.add(schema)) return true;
+    if (schema.getLogicalType() != null) return false;
+    boolean result = true;
+    switch (schema.getType()) {
+    case RECORD:
+      for (Schema.Field f : schema.getFields())
+        result &= isCustomCodable(f.schema(), seen);
+      break;
+    case MAP:
+      result = isCustomCodable(schema.getValueType(), seen);
+      break;
+    case ARRAY:
+      result = isCustomCodable(schema.getElementType(), seen);
+      break;
+    case UNION:
+      List<Schema> types = schema.getTypes();
+      // Only know how to handle "nulling" unions for now
+      if (types.size() != 2 || ! types.contains(NULL_SCHEMA)) return false;
+      for (Schema s : types) result &= isCustomCodable(s, seen);
+      break;
+    default:
+    }
+    return result;
+  }
+
   public boolean hasLogicalTypeField(Schema schema) {
     for (Schema.Field field : schema.getFields()) {
       if (field.schema().getLogicalType() != null) {
diff --git 
a/lang/java/compiler/src/main/velocity/org/apache/avro/compiler/specific/templates/java/classic/record.vm
 
b/lang/java/compiler/src/main/velocity/org/apache/avro/compiler/specific/templates/java/classic/record.vm
index 045c7e1ef..5a59a6ee9 100644
--- 
a/lang/java/compiler/src/main/velocity/org/apache/avro/compiler/specific/templates/java/classic/record.vm
+++ 
b/lang/java/compiler/src/main/velocity/org/apache/avro/compiler/specific/templates/java/classic/record.vm
@@ -19,7 +19,9 @@
 package $schema.getNamespace();
 #end
 
+import org.apache.avro.generic.GenericArray;
 import org.apache.avro.specific.SpecificData;
+import org.apache.avro.util.Utf8;
 #if (!$schema.isError())
 import org.apache.avro.message.BinaryMessageEncoder;
 import org.apache.avro.message.BinaryMessageDecoder;
@@ -496,4 +498,294 @@ public class ${this.mangle($schema.getName())}#if 
($schema.isError()) extends or
     READER$.read(this, SpecificData.getDecoder(in));
   }
 
+#if ($this.isCustomCodable($schema))
+  @Override protected boolean hasCustomCoders() { return true; }
+
+  @Override protected void customEncode(org.apache.avro.io.Encoder out)
+    throws java.io.IOException
+  {
+#set ($nv = 0)## Counter to ensure unique var-names
+#set ($maxnv = 0)## Holds high-water mark during recursion
+#foreach ($field in $schema.getFields())
+#set ($n = $this.mangle($field.name(), $schema.isError()))
+#set ($s = $field.schema())
+#encodeVar(0 "this.${n}" $s)
+
+#set ($nv = $maxnv)
+#end
+  }
+
+  @Override protected void customDecode(org.apache.avro.io.ResolvingDecoder in)
+    throws java.io.IOException
+  {
+    org.apache.avro.Schema.Field[] fieldOrder = in.readFieldOrder();
+    for (int i = 0; i < $schema.getFields().size(); i++) {
+      switch (fieldOrder[i].pos()) {
+#set ($fieldno = 0)
+#set ($nv = 0)## Counter to ensure unique var-names
+#set ($maxnv = 0)## Holds high-water mark during recursion
+#foreach ($field in $schema.getFields())
+      case $fieldno:
+#set ($n = $this.mangle($field.name(), $schema.isError()))
+#set ($s = $field.schema())
+#set ($rs = "SCHEMA$.getField(""${n}"").schema()")
+#decodeVar(4 "this.${n}" $s $rs)
+        break;
+
+#set ($nv = $maxnv)
+#set ($fieldno = $fieldno + 1)
+#end
+      default:
+        throw new java.io.IOException("Corrupt ResolvingDecoder.");
+      }
+    }
+  }
+#end
 }
+
+#macro( encodeVar $indent $var $s )
+#set ($I = $this.indent($indent))
+##### Compound types (array, map, and union) require calls
+##### that will recurse back into this encodeVar macro:
+#if ($s.Type.Name.equals("array"))
+#encodeArray($indent $var $s)
+#elseif ($s.Type.Name.equals("map"))
+#encodeMap($indent $var $s)
+#elseif ($s.Type.Name.equals("union"))
+#encodeUnion($indent $var $s)
+##### Use the generated "encode" method as fast way to write
+##### (specific) record types:
+#elseif ($s.Type.Name.equals("record"))
+$I    ${var}.customEncode(out);
+##### For rest of cases, generate calls out.writeXYZ:
+#elseif ($s.Type.Name.equals("null"))
+$I    out.writeNull();
+#elseif ($s.Type.Name.equals("boolean"))
+$I    out.writeBoolean(${var});
+#elseif ($s.Type.Name.equals("int"))
+$I    out.writeInt(${var});
+#elseif ($s.Type.Name.equals("long"))
+$I    out.writeLong(${var});
+#elseif ($s.Type.Name.equals("float"))
+$I    out.writeFloat(${var});
+#elseif ($s.Type.Name.equals("double"))
+$I    out.writeDouble(${var});
+#elseif ($s.Type.Name.equals("string"))
+#if ($this.isStringable($s))
+$I    out.writeString(${var}.toString());
+#else
+$I    out.writeString(${var});
+#end
+#elseif ($s.Type.Name.equals("bytes"))
+$I    out.writeBytes(${var});
+#elseif ($s.Type.Name.equals("fixed"))
+$I    out.writeFixed(${var}.bytes(), 0, ${s.FixedSize});
+#elseif ($s.Type.Name.equals("enum"))
+$I    out.writeEnum(${var}.ordinal());
+#else
+## TODO -- singal a code-gen-time error
+#end
+#end
+
+#macro( encodeArray $indent $var $s )
+#set ($I = $this.indent($indent))
+#set ($et = $this.javaType($s.ElementType))
+$I    long size${nv} = ${var}.size();
+$I    out.writeArrayStart();
+$I    out.setItemCount(size${nv});
+$I    long actualSize${nv} = 0;
+$I    for ($et e${nv}: ${var}) {
+$I      actualSize${nv}++;
+$I      out.startItem();
+#set ($var = "e${nv}")
+#set ($nv = $nv + 1)
+#set ($maxnv = $nv)
+#set ($indent = $indent + 2)
+#encodeVar($indent $var $s.ElementType)
+#set ($nv = $nv - 1)
+#set ($indent = $indent - 2)
+#set ($I = $this.indent($indent))
+$I    }
+$I    out.writeArrayEnd();
+$I    if (actualSize${nv} != size${nv})
+$I      throw new java.util.ConcurrentModificationException("Array-size 
written was " + size${nv} + ", but element count was " + actualSize${nv} + ".");
+#end
+
+#macro( encodeMap $indent $var $s )
+#set ($I = $this.indent($indent))
+#set ($kt = $this.getStringType($s))
+#set ($vt = $this.javaType($s.ValueType))
+$I    long size${nv} = ${var}.size();
+$I    out.writeMapStart();
+$I    out.setItemCount(size${nv});
+$I    long actualSize${nv} = 0;
+$I    for (java.util.Map.Entry<$kt, $vt> e${nv}: ${var}.entrySet()) {
+$I      actualSize${nv}++;
+$I      out.startItem();
+#if ($this.isStringable($s))
+$I      out.writeString(e${nv}.getKey().toString());
+#else
+$I      out.writeString(e${nv}.getKey());
+#end
+$I      $vt v${nv} = e${nv}.getValue();
+#set ($var = "v${nv}")
+#set ($nv = $nv + 1)
+#set ($maxnv = $nv)
+#set ($indent = $indent + 2)
+#encodeVar($indent $var $s.ValueType)
+#set ($nv = $nv - 1)
+#set ($indent = $indent - 2)
+#set ($I = $this.indent($indent))
+$I    }
+$I    out.writeMapEnd();
+$I    if (actualSize${nv} != size${nv})
+      throw new java.util.ConcurrentModificationException("Map-size written 
was " + size${nv} + ", but element count was " + actualSize${nv} + ".");
+#end
+
+#macro( encodeUnion $indent $var $s )
+#set ($I = $this.indent($indent))
+#set ($et = $this.javaType($s.Types.get($this.getNonNullIndex($s))))
+$I    if (${var} == null) {
+$I      out.writeIndex(#if($this.getNonNullIndex($s)==0)1#{else}0#end);
+$I      out.writeNull();
+$I    } else {
+$I      out.writeIndex(${this.getNonNullIndex($s)});
+#set ($indent = $indent + 2)
+#encodeVar($indent $var $s.Types.get($this.getNonNullIndex($s)))
+#set ($indent = $indent - 2)
+#set ($I = $this.indent($indent))
+$I    }
+#end
+
+
+#macro( decodeVar $indent $var $s $rs )
+#set ($I = $this.indent($indent))
+##### Compound types (array, map, and union) require calls
+##### that will recurse back into this decodeVar macro:
+#if ($s.Type.Name.equals("array"))
+#decodeArray($indent $var $s $rs)
+#elseif ($s.Type.Name.equals("map"))
+#decodeMap($indent $var $s $rs)
+#elseif ($s.Type.Name.equals("union"))
+#decodeUnion($indent $var $s $rs)
+##### Use the generated "decode" method as fast way to write
+##### (specific) record types:
+#elseif ($s.Type.Name.equals("record"))
+$I    if (${var} == null) {
+$I      ${var} = new ${this.javaType($s)}();
+$I    }
+$I    ${var}.customDecode(in);
+##### For rest of cases, generate calls in.readXYZ:
+#elseif ($s.Type.Name.equals("null"))
+$I    in.readNull();
+#elseif ($s.Type.Name.equals("boolean"))
+$I    $var = in.readBoolean();
+#elseif ($s.Type.Name.equals("int"))
+$I    $var = in.readInt();
+#elseif ($s.Type.Name.equals("long"))
+$I    $var = in.readLong();
+#elseif ($s.Type.Name.equals("float"))
+$I    $var = in.readFloat();
+#elseif ($s.Type.Name.equals("double"))
+$I    $var = in.readDouble();
+#elseif ($s.Type.Name.equals("string"))
+#decodeString( "$I" $var $s )
+#elseif ($s.Type.Name.equals("bytes"))
+$I    $var = in.readBytes(${var});
+#elseif ($s.Type.Name.equals("fixed"))
+$I    if (${var} == null) {
+$I      ${var} = new ${this.javaType($s)}();
+$I    }
+$I    in.readFixed(${var}.bytes(), 0, ${s.FixedSize});
+#elseif ($s.Type.Name.equals("enum"))
+$I    $var = ${this.javaType($s)}.values()[in.readEnum()];
+#else
+## TODO -- singal a code-gen-time error
+#end
+#end
+
+#macro( decodeString $II $var $s )
+#set ($st = ${this.getStringType($s)})
+#if ($this.isStringable($s))
+$II    ${var} = new ${st}(in.readString());
+#elseif ($st.equals("java.lang.String"))
+$II    $var = in.readString();
+#elseif ($st.equals("org.apache.avro.util.Utf8"))
+$II    $var = in.readString(${var});
+#else
+$II    $var = in.readString(${var} instanceof Utf8 ? (Utf8)${var} : null);
+#end
+#end
+
+#macro( decodeArray $indent $var $s $rs )
+#set ($I = $this.indent($indent))
+#set ($t = $this.javaType($s))
+#set ($et = $this.javaType($s.ElementType))
+#set ($gat = "SpecificData.Array<${et}>")
+$I    long size${nv} = in.readArrayStart();
+## Need fresh variable name due to limitation of macro system
+$I    $t a${nv} = ${var};
+$I    if (a${nv} == null) {
+$I      a${nv} = new ${gat}((int)size${nv}, ${rs});
+$I      $var = a${nv};
+$I    } else a${nv}.clear();
+$I    $gat ga${nv} = (a${nv} instanceof SpecificData.Array ? (${gat})a${nv} : 
null);
+$I    for ( ; 0 < size${nv}; size${nv} = in.arrayNext()) {
+$I      for ( ; size${nv} != 0; size${nv}--) {
+$I        $et e${nv} = (ga${nv} != null ? ga${nv}.peek() : null);
+#set ($var = "e${nv}")
+#set ($nv = $nv + 1)
+#set ($maxnv = $nv)
+#set ($indent = $indent + 4)
+#decodeVar($indent $var $s.ElementType "${rs}.getElementType()")
+#set ($nv = $nv - 1)
+#set ($indent = $indent - 4)
+#set ($I = $this.indent($indent))
+$I        a${nv}.add(e${nv});
+$I      }
+$I    }
+#end
+
+#macro( decodeMap $indent $var $s $rs )
+#set ($I = $this.indent($indent))
+#set ($t = $this.javaType($s))
+#set ($kt = $this.getStringType($s))
+#set ($vt = $this.javaType($s.ValueType))
+$I    long size${nv} = in.readMapStart();
+$I    $t m${nv} = ${var}; // Need fresh name due to limitation of macro system
+$I    if (m${nv} == null) {
+$I      m${nv} = new java.util.HashMap<${kt},${vt}>((int)size${nv});
+$I      $var = m${nv};
+$I    } else m${nv}.clear();
+$I    for ( ; 0 < size${nv}; size${nv} = in.mapNext()) {
+$I      for ( ; size${nv} != 0; size${nv}--) {
+$I        $kt k${nv} = null;
+#decodeString( "$I    " "k${nv}" $s )
+$I        $vt v${nv} = null;
+#set ($var = "v${nv}")
+#set ($nv = $nv + 1)
+#set ($maxnv = $nv)
+#set ($indent = $indent + 4)
+#decodeVar($indent $var $s.ValueType "${rs}.getValueType()")
+#set ($nv = $nv - 1)
+#set ($indent = $indent - 4)
+#set ($I = $this.indent($indent))
+$I        m${nv}.put(k${nv}, v${nv});
+$I      }
+$I    }
+#end
+
+#macro( decodeUnion $indent $var $s $rs )
+#set ($I = $this.indent($indent))
+#set ($et = $this.javaType($s.Types.get($this.getNonNullIndex($s))))
+#set ($si = $this.getNonNullIndex($s))
+$I    if (in.readIndex() != ${si}) {
+$I      in.readNull();
+$I      ${var} = null;
+$I    } else {
+#set ($indent = $indent + 2)
+#decodeVar($indent $var $s.Types.get($si) "${rs}.getTypes().get(${si})")
+#set ($indent = $indent - 2)
+#set ($I = $this.indent($indent))
+$I    }
+#end
diff --git 
a/lang/java/compiler/src/test/java/org/apache/avro/compiler/specific/TestSpecificCompiler.java
 
b/lang/java/compiler/src/test/java/org/apache/avro/compiler/specific/TestSpecificCompiler.java
index b5b3a0ab2..ceae52c12 100644
--- 
a/lang/java/compiler/src/test/java/org/apache/avro/compiler/specific/TestSpecificCompiler.java
+++ 
b/lang/java/compiler/src/test/java/org/apache/avro/compiler/specific/TestSpecificCompiler.java
@@ -81,7 +81,7 @@ public void setUp() {
   }
 
   @After
-  public void tearDow() {
+  public void tearDown() {
     if (this.outputFile != null) {
       this.outputFile.delete();
     }
@@ -622,8 +622,4 @@ public void 
testConversionInstanceWithDecimalLogicalTypeEnabled() throws Excepti
     Assert.assertEquals("Should use null for decimal if the flag is off",
         "null", compiler.conversionInstance(uuidSchema));
   }
-
-  public void testToFromByteBuffer() {
-
-  }
 }
diff --git 
a/lang/java/compiler/src/test/java/org/apache/avro/specific/TestGeneratedCode.java
 
b/lang/java/compiler/src/test/java/org/apache/avro/specific/TestGeneratedCode.java
new file mode 100644
index 000000000..394a5900c
--- /dev/null
+++ 
b/lang/java/compiler/src/test/java/org/apache/avro/specific/TestGeneratedCode.java
@@ -0,0 +1,93 @@
+/*
+ * Copyright 2017 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.avro.specific;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+
+import org.apache.avro.Schema;
+import org.apache.avro.io.Encoder;
+import org.apache.avro.io.EncoderFactory;
+import org.apache.avro.io.Decoder;
+import org.apache.avro.io.DecoderFactory;
+import org.apache.avro.io.DatumReader;
+import org.apache.avro.io.DatumWriter;
+import org.apache.avro.util.Utf8;
+
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.avro.specific.test.FullRecordV1;
+import org.apache.avro.specific.test.FullRecordV2;
+
+public class TestGeneratedCode {
+
+  private final static SpecificData MODEL = new SpecificData();
+  private final static Schema V1S = FullRecordV1.getClassSchema();
+  private final static Schema V2S = FullRecordV2.getClassSchema();
+
+  @Before
+  public void setUp() {
+    MODEL.setCustomCoders(true);
+  }
+
+  @Test
+  public void withoutSchemaMigration() throws IOException {
+    FullRecordV1 src = new FullRecordV1(true, 87231, 731L, 54.2832F, 38.321, 
"Hi there", null);
+    Assert.assertTrue("Test schema must allow for custom coders.",
+                      ((SpecificRecordBase)src).hasCustomCoders());
+
+    ByteArrayOutputStream out = new ByteArrayOutputStream(1024);
+    Encoder e = EncoderFactory.get().directBinaryEncoder(out, null);
+    DatumWriter<FullRecordV1> w = 
(DatumWriter<FullRecordV1>)MODEL.createDatumWriter(V1S);
+    w.write(src, e);
+    e.flush();
+
+    ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
+    Decoder d = DecoderFactory.get().directBinaryDecoder(in, null);
+    DatumReader<FullRecordV1> r = 
(DatumReader<FullRecordV1>)MODEL.createDatumReader(V1S);
+    FullRecordV1 dst = r.read(null, d);
+
+    Assert.assertEquals(src, dst);
+  }
+
+  @Test
+  public void withSchemaMigration() throws IOException {
+    FullRecordV2 src = new FullRecordV2(true, 731, 87231, 38L, 54.2832F, "Hi 
there",
+                                        
ByteBuffer.wrap(Utf8.getBytesFor("Hello, world!")));
+    Assert.assertTrue("Test schema must allow for custom coders.",
+                      ((SpecificRecordBase)src).hasCustomCoders());
+
+    ByteArrayOutputStream out = new ByteArrayOutputStream(1024);
+    Encoder e = EncoderFactory.get().directBinaryEncoder(out, null);
+    DatumWriter<FullRecordV2> w = 
(DatumWriter<FullRecordV2>)MODEL.createDatumWriter(V2S);
+    w.write(src, e);
+    e.flush();
+
+    ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
+    Decoder d = DecoderFactory.get().directBinaryDecoder(in, null);
+    DatumReader<FullRecordV1> r = 
(DatumReader<FullRecordV1>)MODEL.createDatumReader(V2S, V1S);
+    FullRecordV1 dst = r.read(null, d);
+
+    FullRecordV1 expected = new FullRecordV1(true, 87231, 731L, 54.2832F, 
38.0, null,
+                                             "Hello, world!");
+    Assert.assertEquals(expected, dst);
+  }
+}
diff --git a/lang/java/compiler/src/test/resources/full_record_v1.avsc 
b/lang/java/compiler/src/test/resources/full_record_v1.avsc
new file mode 100644
index 000000000..4e2218875
--- /dev/null
+++ b/lang/java/compiler/src/test/resources/full_record_v1.avsc
@@ -0,0 +1,30 @@
+{
+  "type" : "record",
+  "name" : "FullRecordV1",
+  "doc" : "Test schema changes: this is the 'old' schema the SpecificRecord 
expects to see",
+  "namespace" : "org.apache.avro.specific.test",
+  "fields" : [ {
+    "name" : "b",
+    "type" : "boolean"
+  }, {
+    "name" : "i32",
+    "type" : "int"
+  }, {
+    "name" : "i64",
+    "type" : "long"
+  }, {
+    "name" : "f32",
+    "type" : "float"
+  }, {
+    "name" : "f64",
+    "type" : "double"
+  }, {
+    "name" : "s",
+    "type" : [ "null", "string" ],
+    "default" : null
+  }, {
+    "name" : "h",
+    "type" : [ "null", "string" ]
+  } ]
+}
+
diff --git a/lang/java/compiler/src/test/resources/full_record_v2.avsc 
b/lang/java/compiler/src/test/resources/full_record_v2.avsc
new file mode 100644
index 000000000..b80b9b4ae
--- /dev/null
+++ b/lang/java/compiler/src/test/resources/full_record_v2.avsc
@@ -0,0 +1,29 @@
+{
+  "type" : "record",
+  "name" : "FullRecordV2",
+  "doc" : "Test schema changes: this is the 'new' schema actually used to 
write data",
+  "namespace" : "org.apache.avro.specific.test",
+  "fields" : [ {
+    "name" : "b",
+    "type" : "boolean"
+  }, {
+    "name" : "i64",
+    "type" : "int"
+  }, {
+    "name" : "i32",
+    "type" : "int"
+  }, {
+    "name" : "f64",
+    "type" : "long"
+  }, {
+    "name" : "f32",
+    "type" : [ "float", "null" ]
+  }, {
+    "name" : "newfield",
+    "type" : "string"
+  }, {
+    "name" : "h",
+    "type" : "bytes"
+  } ]
+}
+
diff --git a/lang/java/pom.xml b/lang/java/pom.xml
index 471c34044..dd2c28533 100644
--- a/lang/java/pom.xml
+++ b/lang/java/pom.xml
@@ -195,6 +195,21 @@
           <groupId>org.apache.maven.plugins</groupId>
           <artifactId>maven-surefire-plugin</artifactId>
           <version>${surefire-plugin.version}</version>
+          <executions>
+            <execution>
+              <id>test-with-custom-coders</id>
+              <phase>test</phase>
+              <goals>
+                <goal>test</goal>
+              </goals>
+              <configuration>
+                <systemPropertyVariables>
+                  
<org.apache.avro.specific.use_custom_coders>true</org.apache.avro.specific.use_custom_coders>
+                  <test.dir>${project.basedir}/target/</test.dir>
+                </systemPropertyVariables>
+              </configuration>
+            </execution>
+          </executions>
           <configuration>
             <includes>
               <!-- Avro naming convention for JUnit tests -->
diff --git 
a/lang/java/tools/src/test/compiler/output-string/avro/examples/baseball/Player.java
 
b/lang/java/tools/src/test/compiler/output-string/avro/examples/baseball/Player.java
index 569f42711..26cc31fc0 100644
--- 
a/lang/java/tools/src/test/compiler/output-string/avro/examples/baseball/Player.java
+++ 
b/lang/java/tools/src/test/compiler/output-string/avro/examples/baseball/Player.java
@@ -5,7 +5,9 @@
  */
 package avro.examples.baseball;
 
+import org.apache.avro.generic.GenericArray;
 import org.apache.avro.specific.SpecificData;
+import org.apache.avro.util.Utf8;
 import org.apache.avro.message.BinaryMessageEncoder;
 import org.apache.avro.message.BinaryMessageDecoder;
 import org.apache.avro.message.SchemaStore;
@@ -472,4 +474,80 @@ public Player build() {
     READER$.read(this, SpecificData.getDecoder(in));
   }
 
+  @Override protected boolean hasCustomCoders() { return true; }
+
+  @Override protected void customEncode(org.apache.avro.io.Encoder out)
+    throws java.io.IOException
+  {
+    out.writeInt(this.number);
+
+    out.writeString(this.first_name);
+
+    out.writeString(this.last_name);
+
+    long size0 = this.position.size();
+    out.writeArrayStart();
+    out.setItemCount(size0);
+    long actualSize0 = 0;
+    for (avro.examples.baseball.Position e0: this.position) {
+      actualSize0++;
+      out.startItem();
+      out.writeEnum(e0.ordinal());
+    }
+    out.writeArrayEnd();
+    if (actualSize0 != size0)
+      throw new java.util.ConcurrentModificationException("Array-size written 
was " + size0 + ", but element count was " + actualSize0 + ".");
+
+  }
+
+  @Override protected void customDecode(org.apache.avro.io.ResolvingDecoder in)
+    throws java.io.IOException
+  {
+    org.apache.avro.Schema.Field[] fieldOrder = in.readFieldOrder();
+    for (int i = 0; i < 4; i++) {
+      switch (fieldOrder[i].pos()) {
+      case 0:
+        this.number = in.readInt();
+        break;
+
+      case 1:
+        this.first_name = in.readString();
+        break;
+
+      case 2:
+        this.last_name = in.readString();
+        break;
+
+      case 3:
+        long size0 = in.readArrayStart();
+        java.util.List<avro.examples.baseball.Position> a0 = this.position;
+        if (a0 == null) {
+          a0 = new 
SpecificData.Array<avro.examples.baseball.Position>((int)size0, 
SCHEMA$.getField("position").schema());
+          this.position = a0;
+        } else a0.clear();
+        SpecificData.Array<avro.examples.baseball.Position> ga0 = (a0 
instanceof SpecificData.Array ? 
(SpecificData.Array<avro.examples.baseball.Position>)a0 : null);
+        for ( ; 0 < size0; size0 = in.arrayNext()) {
+          for ( ; size0 != 0; size0--) {
+            avro.examples.baseball.Position e0 = (ga0 != null ? ga0.peek() : 
null);
+            e0 = avro.examples.baseball.Position.values()[in.readEnum()];
+            a0.add(e0);
+          }
+        }
+        break;
+
+      default:
+        throw new java.io.IOException("Corrupt ResolvingDecoder.");
+      }
+    }
+  }
 }
+
+
+
+
+
+
+
+
+
+
diff --git a/lang/java/tools/src/test/compiler/output/Player.java 
b/lang/java/tools/src/test/compiler/output/Player.java
index 5bbb3b018..8eaf5d7ad 100644
--- a/lang/java/tools/src/test/compiler/output/Player.java
+++ b/lang/java/tools/src/test/compiler/output/Player.java
@@ -5,7 +5,9 @@
  */
 package avro.examples.baseball;
 
+import org.apache.avro.generic.GenericArray;
 import org.apache.avro.specific.SpecificData;
+import org.apache.avro.util.Utf8;
 import org.apache.avro.message.BinaryMessageEncoder;
 import org.apache.avro.message.BinaryMessageDecoder;
 import org.apache.avro.message.SchemaStore;
@@ -472,4 +474,80 @@ public Player build() {
     READER$.read(this, SpecificData.getDecoder(in));
   }
 
+  @Override protected boolean hasCustomCoders() { return true; }
+
+  @Override protected void customEncode(org.apache.avro.io.Encoder out)
+    throws java.io.IOException
+  {
+    out.writeInt(this.number);
+
+    out.writeString(this.first_name);
+
+    out.writeString(this.last_name);
+
+    long size0 = this.position.size();
+    out.writeArrayStart();
+    out.setItemCount(size0);
+    long actualSize0 = 0;
+    for (avro.examples.baseball.Position e0: this.position) {
+      actualSize0++;
+      out.startItem();
+      out.writeEnum(e0.ordinal());
+    }
+    out.writeArrayEnd();
+    if (actualSize0 != size0)
+      throw new java.util.ConcurrentModificationException("Array-size written 
was " + size0 + ", but element count was " + actualSize0 + ".");
+
+  }
+
+  @Override protected void customDecode(org.apache.avro.io.ResolvingDecoder in)
+    throws java.io.IOException
+  {
+    org.apache.avro.Schema.Field[] fieldOrder = in.readFieldOrder();
+    for (int i = 0; i < 4; i++) {
+      switch (fieldOrder[i].pos()) {
+      case 0:
+        this.number = in.readInt();
+        break;
+
+      case 1:
+        this.first_name = in.readString(this.first_name instanceof Utf8 ? 
(Utf8)this.first_name : null);
+        break;
+
+      case 2:
+        this.last_name = in.readString(this.last_name instanceof Utf8 ? 
(Utf8)this.last_name : null);
+        break;
+
+      case 3:
+        long size0 = in.readArrayStart();
+        java.util.List<avro.examples.baseball.Position> a0 = this.position;
+        if (a0 == null) {
+          a0 = new 
SpecificData.Array<avro.examples.baseball.Position>((int)size0, 
SCHEMA$.getField("position").schema());
+          this.position = a0;
+        } else a0.clear();
+        SpecificData.Array<avro.examples.baseball.Position> ga0 = (a0 
instanceof SpecificData.Array ? 
(SpecificData.Array<avro.examples.baseball.Position>)a0 : null);
+        for ( ; 0 < size0; size0 = in.arrayNext()) {
+          for ( ; size0 != 0; size0--) {
+            avro.examples.baseball.Position e0 = (ga0 != null ? ga0.peek() : 
null);
+            e0 = avro.examples.baseball.Position.values()[in.readEnum()];
+            a0.add(e0);
+          }
+        }
+        break;
+
+      default:
+        throw new java.io.IOException("Corrupt ResolvingDecoder.");
+      }
+    }
+  }
 }
+
+
+
+
+
+
+
+
+
+


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve encode/decode time for SpecificRecord using code generation
> -------------------------------------------------------------------
>
>                 Key: AVRO-2090
>                 URL: https://issues.apache.org/jira/browse/AVRO-2090
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Raymie Stata
>            Assignee: Raymie Stata
>            Priority: Major
>         Attachments: customcoders.md, perf-data.txt
>
>
> New implementation for generation of code for SpecificRecord that improves 
> decoding by over 10% and encoding over 30% (more improvements are on the 
> way).  This feature is behind a feature flag 
> ({{org.apache.avro.specific.use_custom_coders}}) and (for now) turned off by 
> default.  See [Getting Started 
> (Java)|https://avro.apache.org/docs/current/gettingstartedjava.html#Beta+feature:+Generating+faster+code]
>  for instructions.
> (A bit more info: Compared to GenericRecords, SpecificRecords offer 
> type-safety plus the performance of traditional getters/setters/instance 
> variables.  But these are only beneficial to Java code accessing those 
> records.  SpecificRecords inherit serialization and deserialization code from 
> GenericRecords, which is dynamic and thus slow (in fact, benchmarks show that 
> serialization and deserialization is _slower_ for SpecificRecord than for 
> GenericRecord).  This patch extends record.vm to generate custom, 
> higher-performance encoder and decoder functions for SpecificRecords.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to