[
https://issues.apache.org/jira/browse/AVRO-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720966#comment-13720966
]
Vincenz Priesnitz commented on AVRO-1341:
-----------------------------------------
You are right. The patch made record reading and writing take roughly 1.6 to
2.1 times as long (for example, ReflectRecordWrite went from 3537 ms to 7412 ms).
Here is the reflection performance of the trunk:
{noformat}
test name                                     time  M entries/sec  M bytes/sec  bytes/cycle
ReflectRecordRead:                         5646 ms          2.952      114.543       808498
ReflectRecordWrite:                        3537 ms          4.711      182.822       808498
ReflectBigRecordRead:                      6044 ms          1.654      101.558       767380
ReflectBigRecordWrite:                     4222 ms          2.368      145.384       767380
ReflectFloatRead:                          5519 ms          0.000      144.932      1000004
ReflectFloatWrite:                         1210 ms          0.001      660.832      1000004
ReflectDoubleRead:                         7310 ms          0.000      218.876      2000004
ReflectDoubleWrite:                        2190 ms          0.000      730.585      2000004
ReflectIntArrayRead:                       8980 ms          1.856       76.589       859709
ReflectIntArrayWrite:                      2707 ms          6.156      254.031       859709
ReflectLongArrayRead:                      4569 ms          1.824      140.991       805344
ReflectLongArrayWrite:                     1781 ms          4.677      361.609       805344
ReflectDoubleArrayRead:                    5396 ms          1.853      121.281       818144
ReflectDoubleArrayWrite:                   1652 ms          6.051      396.060       818144
ReflectFloatArrayRead:                     9788 ms          2.043       69.156       846172
ReflectFloatArrayWrite:                    2309 ms          8.661      293.156       846172
ReflectNestedFloatArrayRead:              11524 ms          1.735       58.738       846172
ReflectNestedFloatArrayWrite:              4506 ms          4.438      150.199       846172
ReflectNestedObjectArrayRead:              9895 ms          0.404       52.156       645104
ReflectNestedObjectArrayWrite:             5745 ms          0.696       89.822       645104
ReflectNestedLargeFloatArrayRead:          7262 ms          0.459      119.783      1087381
ReflectNestedLargeFloatArrayWrite:         2006 ms          1.661      433.513      1087381
ReflectNestedLargeFloatArrayBlockedRead:   7401 ms          0.450      119.034      1101357
ReflectNestedLargeFloatArrayBlockedWrite:  4797 ms          0.695      183.666      1101357
{noformat}
With the patch applied:
{noformat}
test name                                     time  M entries/sec  M bytes/sec  bytes/cycle
ReflectRecordRead:                         9332 ms          1.786       69.305       808498
ReflectRecordWrite:                        7412 ms          2.248       87.252       808498
ReflectBigRecordRead:                      9533 ms          1.049       64.392       767380
ReflectBigRecordWrite:                     8132 ms          1.230       75.487       767380
ReflectFloatRead:                          5432 ms          0.000      147.256      1000004
ReflectFloatWrite:                         1172 ms          0.001      682.323      1000004
ReflectDoubleRead:                         6885 ms          0.000      232.387      2000004
ReflectDoubleWrite:                        2303 ms          0.000      694.613      2000004
ReflectIntArrayRead:                       8244 ms          2.022       83.426       859709
ReflectIntArrayWrite:                      2517 ms          6.619      273.148       859709
ReflectLongArrayRead:                      4534 ms          1.838      142.076       805344
ReflectLongArrayWrite:                     1729 ms          4.819      372.619       805344
ReflectDoubleArrayRead:                    4999 ms          2.000      130.928       818144
ReflectDoubleArrayWrite:                   1431 ms          6.985      457.167       818144
ReflectFloatArrayRead:                     9139 ms          2.188       74.066       846172
ReflectFloatArrayWrite:                    2401 ms          8.329      281.898       846172
ReflectNestedFloatArrayRead:              12295 ms          1.627       55.056       846172
ReflectNestedFloatArrayWrite:              4975 ms          4.020      136.058       846172
ReflectNestedObjectArrayRead:             14627 ms          0.273       35.281       645104
ReflectNestedObjectArrayWrite:            10045 ms          0.398       51.375       645104
ReflectNestedLargeFloatArrayRead:          7315 ms          0.456      118.910      1087381
ReflectNestedLargeFloatArrayWrite:         2029 ms          1.642      428.657      1087381
ReflectNestedLargeFloatArrayBlockedRead:   7429 ms          0.449      118.597      1101357
ReflectNestedLargeFloatArrayBlockedWrite:  5330 ms          0.625      165.280      1101357
{noformat}
I added the proposed booleans to FieldAccessor, which brought performance
almost back to pre-patch levels:
{noformat}
test name                                     time  M entries/sec  M bytes/sec  bytes/cycle
ReflectRecordRead:                         6391 ms          2.607      101.189       808498
ReflectRecordWrite:                        4180 ms          3.987      154.712       808498
ReflectBigRecordRead:                      6276 ms          1.593       97.812       767380
ReflectBigRecordWrite:                     4926 ms          2.030      124.610       767380
ReflectFloatRead:                          5580 ms          0.000      143.356      1000004
ReflectFloatWrite:                         1285 ms          0.001      622.420      1000004
ReflectDoubleRead:                         6847 ms          0.000      233.657      2000004
ReflectDoubleWrite:                        2325 ms          0.000      688.114      2000004
ReflectIntArrayRead:                       7973 ms          2.090       86.252       859709
ReflectIntArrayWrite:                      2760 ms          6.038      249.168       859709
ReflectLongArrayRead:                      4720 ms          1.765      136.489       805344
ReflectLongArrayWrite:                     1762 ms          4.728      365.527       805344
ReflectDoubleArrayRead:                    5253 ms          1.903      124.587       818144
ReflectDoubleArrayWrite:                   1637 ms          6.107      399.693       818144
ReflectFloatArrayRead:                     9280 ms          2.155       72.942       846172
ReflectFloatArrayWrite:                    2182 ms          9.163      310.143       846172
ReflectNestedFloatArrayRead:              11072 ms          1.806       61.134       846172
ReflectNestedFloatArrayWrite:              4058 ms          4.928      166.812       846172
ReflectNestedObjectArrayRead:             11122 ms          0.360       46.399       645104
ReflectNestedObjectArrayWrite:             6689 ms          0.598       77.152       645104
ReflectNestedLargeFloatArrayRead:          7320 ms          0.455      118.834      1087381
ReflectNestedLargeFloatArrayWrite:         1837 ms          1.814      473.434      1087381
ReflectNestedLargeFloatArrayBlockedRead:   7383 ms          0.451      119.326      1101357
ReflectNestedLargeFloatArrayBlockedWrite:  4839 ms          0.689      182.069      1101357
{noformat}
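For readers without the patch at hand, the idea behind caching booleans in a
field accessor can be sketched roughly as follows. This is only an
illustration under assumptions: the annotation types, the class
CachedFieldAccessor and its method names are hypothetical and are not taken
from the patch. The sketch assumes the flags record whether a field carries a
particular annotation, so the annotation lookup happens once when the accessor
is built instead of once per record.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

public class CachedFlagsSketch {

  // Hypothetical stand-in for an annotation such as @AvroEncode.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface CustomEncoded {}

  // Hypothetical stand-in for @Stringable placed on a field.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface StringEncoded {}

  /** Wraps one reflected field and remembers its annotation flags. */
  static final class CachedFieldAccessor {
    private final Field field;
    private final boolean customEncoded;
    private final boolean stringEncoded;

    CachedFieldAccessor(Field field) {
      this.field = field;
      field.setAccessible(true);
      // The annotation lookups run once here, not on every read or write.
      this.customEncoded = field.isAnnotationPresent(CustomEncoded.class);
      this.stringEncoded = field.isAnnotationPresent(StringEncoded.class);
    }

    boolean isCustomEncoded() { return customEncoded; }

    boolean isStringEncoded() { return stringEncoded; }

    Object get(Object record) throws IllegalAccessException {
      return field.get(record);
    }
  }

  // Small record type used only to exercise the accessor.
  static class Event {
    @CustomEncoded long timestamp = 42L;
    @StringEncoded String id = "a";
    int count = 7;
  }

  public static void main(String[] args) throws Exception {
    Event event = new Event();
    for (Field f : Event.class.getDeclaredFields()) {
      if (f.isSynthetic()) continue;
      CachedFieldAccessor accessor = new CachedFieldAccessor(f);
      System.out.println(f.getName()
          + " custom=" + accessor.isCustomEncoded()
          + " string=" + accessor.isStringEncoded()
          + " value=" + accessor.get(event));
    }
  }
}
```

In the hot read/write loop, only the two boolean getters are consulted, which
is a plain field read rather than a reflective annotation query.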
Attached is a new patch with the improved performance.
> Allow controlling avro via java annotations when using reflection.
> -------------------------------------------------------------------
>
> Key: AVRO-1341
> URL: https://issues.apache.org/jira/browse/AVRO-1341
> Project: Avro
> Issue Type: New Feature
> Components: java
> Reporter: Vincenz Priesnitz
> Assignee: Vincenz Priesnitz
> Fix For: 1.7.5
>
> Attachments: AVRO-1341.patch, AVRO-1341.patch, AVRO-1341.patch,
> AVRO-1341.patch, AVRO-1341.patch
>
>
> It would be great if one could control Avro with Java annotations. As of now,
> it is already possible to mark fields as Nullable or classes as being encoded
> as a String. I propose a larger set of annotations to control the behavior of
> Avro on fields and classes. Such annotations have proven useful in Jackson's
> JSON serialization and Morphia's MongoDB serialization.
> I propose the following additional annotations:
> @AvroName("alternativeName")
> @AvroAlias(alias="alias", space="space")
> @AvroIgnore
> @AvroMeta(key="K", value="V")
> @AvroEncode(using=CustomEncoding.class)
> Java fields with the @AvroName("alternativeName") annotation will be renamed
> in the induced schema. When reading an Avro file via reflection, the
> reflection reader will look for a schema field named "alternativeName".
> For example:
> {code}
> @AvroName("foo")
> int bar;
> {code}
> is serialized as
> {code}
> { "name" : "foo", "type" : "int" }
> {code}
> The @AvroAlias annotation will add a new alias to the induced schema of a
> record, enum or field. The space parameter is optional and defaults to the
> namespace of the named schema the alias is added to.
> Fields with the @AvroIgnore annotation will be treated as if they had a
> transient modifier, i.e. they will not be written to or read from avro files.
> The @AvroMeta(key="K", value="V") annotation allows you to store an arbitrary
> key : value pair at every node in the schema.
> {code}
> @AvroMeta(key="fieldKey", value="fieldValue")
> int foo;
> {code}
> will create the following schema
> {code}
> {"name" : "foo", "type" : "int", "fieldKey" : "fieldValue" }
> {code}
> Fields can be custom encoded with the @AvroEncode(using=CustomEncoding.class)
> annotation. This annotation is a generalization of the @Stringable
> annotation. The @Stringable annotation is limited to classes with string
> argument constructors. Some classes can be similarly reduced to a smaller
> class or even a single primitive, but don't fit the requirements for
> @Stringable. A prominent example is java.util.Date, whose instances can
> essentially be described with a single long. Such classes can now be encoded
> with a CustomEncoding, which reads and writes directly from the
> encoder/decoder.
> One simply extends the abstract CustomEncoding class by implementing a
> schema, a read method and a write method. A Java field can then be annotated
> like this:
> {code}
> @AvroEncode(using=DateAsLongEncoding.class)
> Date date;
> {code}
> The custom encoding implementation would look like
> {code}
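> import java.io.IOException;
> import java.util.Date;
> import org.apache.avro.Schema;
> import org.apache.avro.io.Decoder;
> import org.apache.avro.io.Encoder;
> import org.apache.avro.reflect.CustomEncoding;
>
> public class DateAsLongEncoding extends CustomEncoding<Date> {
>   // Instance initializer: a Date is stored as a plain long.
>   {
>     schema = Schema.create(Schema.Type.LONG);
>     schema.addProp("CustomEncoding", "DateAsLongEncoding");
>   }
>
>   @Override
>   public void write(Object datum, Encoder out) throws IOException {
>     out.writeLong(((Date) datum).getTime());
>   }
>
>   @Override
>   public Date read(Object reuse, Decoder in) throws IOException {
>     // Reuse the supplied instance, if any, to avoid an allocation.
>     if (reuse != null) {
>       ((Date) reuse).setTime(in.readLong());
>       return (Date) reuse;
>     }
>     return new Date(in.readLong());
>   }
> }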
> {code}
> I implemented said annotations and a custom encoding for java.util.Date as a
> proof of concept and also extended the @Stringable annotations to fields.
> This issue is a followup of AVRO-1328 and AVRO-1330.
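The renaming and ignoring rules described in the issue can be illustrated
without Avro itself. The sketch below is a standalone approximation, not the
patch: the annotations Name and Ignore and the helper schemaFieldNames are
hypothetical stand-ins for @AvroName, @AvroIgnore and the schema induction
step. It maps each Java field name to the name it would get in the induced
schema, skipping ignored and transient fields.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

public class AnnotationNamingSketch {

  // Hypothetical stand-in for @AvroName.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface Name {
    String value();
  }

  // Hypothetical stand-in for @AvroIgnore.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface Ignore {}

  /** Maps Java field names to the names used in the induced schema. */
  static Map<String, String> schemaFieldNames(Class<?> type) {
    // TreeMap keeps the result ordered by Java field name, since
    // getDeclaredFields() makes no ordering guarantee.
    Map<String, String> names = new TreeMap<>();
    for (Field f : type.getDeclaredFields()) {
      int mods = f.getModifiers();
      boolean skipped = f.isSynthetic()
          || Modifier.isStatic(mods)
          || Modifier.isTransient(mods)
          || f.isAnnotationPresent(Ignore.class); // treated like transient
      if (skipped) continue;
      Name renamed = f.getAnnotation(Name.class);
      names.put(f.getName(), renamed != null ? renamed.value() : f.getName());
    }
    return names;
  }

  // Sample type mirroring the examples in the issue description.
  static class Sample {
    @Name("foo") int bar;
    @Ignore int cache;
    transient int scratch;
    int plain;
  }

  public static void main(String[] args) {
    // prints {bar=foo, plain=plain}
    System.out.println(schemaFieldNames(Sample.class));
  }
}
```

Here the field bar appears under the name foo, while cache and scratch are
both left out, matching the description of @AvroIgnore behaving like the
transient modifier.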