[ 
https://issues.apache.org/jira/browse/AVRO-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609486#comment-13609486
 ] 

Alexandre Normand commented on AVRO-1268:
-----------------------------------------

I've hit a snag. The parallel tree solution mostly works, but not with 
recursive schemas: building the tree for a recursive schema fails with a 
StackOverflowError. To work around that, state elements would need to point 
to previously built state, which requires some mechanism to look those states 
up. That essentially means going back to using a Map. Or maybe we could be 
clever and keep such a Map only while building the state tree. 
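
To make the "Map only while building" idea concrete, here is a minimal sketch. It is not the code from the patch: ToySchema, State and StateTreeSketch are hypothetical stand-ins I made up for illustration, not Avro classes. The point is only that registering each state in an identity-keyed map before descending into its children turns a recursive reference into a back-pointer instead of unbounded recursion, and the map can be dropped once the tree is built.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: ToySchema and State are hypothetical stand-ins,
// not Avro's Schema or the state classes from the patch.
final class StateTreeSketch {

    static final class ToySchema {
        final String name;
        final List<ToySchema> fields = new ArrayList<>();
        ToySchema(String name) { this.name = name; }
    }

    static final class State {
        final String name;
        final List<State> children = new ArrayList<>();
        State(String name) { this.name = name; }
    }

    // Lives only for the duration of one build() call. Keyed by identity so
    // that the same schema object always maps to the same state object.
    private final Map<ToySchema, State> seen = new IdentityHashMap<>();

    static State build(ToySchema root) {
        return new StateTreeSketch().visit(root);
    }

    private State visit(ToySchema schema) {
        State existing = seen.get(schema);
        if (existing != null) {
            return existing; // recursive reference: point back, don't rebuild
        }
        State state = new State(schema.name);
        seen.put(schema, state); // register BEFORE descending into fields
        for (ToySchema field : schema.fields) {
            state.children.add(visit(field));
        }
        return state;
    }
}
```

With this shape, lookups at read/write time would still follow direct child pointers; the Map cost is paid once, at build time.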

In any event, I tried running the perf test even with the current broken state 
of recursive schemas and the results are similar to the ones I had when doing 
the Map lookup (some are a bit slower than my last results but they might just 
be run-to-run variations):
{code}
Executing tests: 
[IntTest, SmallLongTest, LongTest, FloatTest, DoubleTest, BoolTest, BytesTest, 
StringTest, ArrayTest, MapTest, RecordTest, ValidatingRecord, ResolvingRecord, 
RecordWithDefault, RecordWithOutOfOrder, RecordWithPromotion, GenericTest, 
GenericStrings, GenericNested, GenericNestedFake, GenericWithDefault, 
GenericWithOutOfOrder, GenericWithPromotion, GenericOneTimeDecoderUse, 
GenericOneTimeReaderUse, GenericOneTimeUse, FooBarSpecificRecordTest]
 readTests:true
 writeTests:true
 cycles=800
                    test name     time    M entries/sec   M bytes/sec  bytes/cycle
                      IntRead:    712 ms     280.712       706.637        629325
                     IntWrite:   1456 ms     137.289       345.598        629325
                SmallLongRead:    801 ms     249.549       628.189        629325
               SmallLongWrite:   1478 ms     135.299       340.589        629325
                     LongRead:   1690 ms     118.319       516.984       1092353
                    LongWrite:   2708 ms      73.842       322.645       1092353
                    FloatRead:    374 ms     533.773      2135.093       1000000
                   FloatWrite:   1201 ms     166.403       665.611       1000000
                   DoubleRead:    353 ms     564.997      4519.978       2000000
                  DoubleWrite:   1906 ms     104.890       839.122       2000000
                  BooleanRead:    249 ms     802.739       802.739        250000
                 BooleanWrite:    524 ms     381.048       381.048        250000
                    BytesRead:   1613 ms      24.797       881.263       1776937
                   BytesWrite:   2138 ms      18.708       664.843       1776937
                   StringRead:   8433 ms       4.743       168.929       1780910
                  StringWrite:   8097 ms       4.940       175.956       1780910
                    ArrayRead:    403 ms     495.969      1983.888       1000006
                   ArrayWrite:   1162 ms     172.068       688.277       1000006
                      MapRead:   1368 ms     146.168       730.842       1250004
                     MapWrite:   2127 ms      94.017       470.089       1250004
                   RecordRead:    650 ms      51.265      1989.599       1617069
                  RecordWrite:   1987 ms      16.769       650.802       1617069
         ValidatingRecordRead:   3790 ms       8.793       341.252       1617069
        ValidatingRecordWrite:   3656 ms       9.116       353.806       1617069
          ResolvingRecordRead:   4159 ms       8.014       311.034       1617069
        RecordWithDefaultRead:  11357 ms       2.935       113.905       1617069
     RecordWithOutOfOrderRead:   3295 ms      10.115       392.571       1617069
      RecordWithPromotionRead:   3603 ms       9.251       359.046       1617069
                  GenericRead:   5192 ms       3.210       124.566        808498
                 GenericWrite:   3285 ms       5.072       196.853        808498
           GenericStringsRead:   5913 ms       2.818       300.434       2220873
          GenericStringsWrite:  13740 ms       1.213       129.301       2220873
           GenericNested_Read:   8459 ms       1.970        76.459        808498
          GenericNested_Write:   4873 ms       3.420       132.708        808498
       GenericNestedFake_Read:   3397 ms       4.906       190.390        808498
      GenericNestedFake_Write:   1596 ms      10.442       405.222        808498
      GenericWithDefault_Read:  10029 ms       1.662        64.492        808498
   GenericWithOutOfOrder_Read:   5355 ms       3.112       120.782        808498
    GenericWithPromotion_Read:   5369 ms       3.104       120.464        808498
GenericOneTimeDecoderUse_Read:   5324 ms       3.130       121.473        808498
 GenericOneTimeReaderUse_Read:   8190 ms       2.035        78.965        808498
       GenericOneTimeUse_Read:   8274 ms       2.014        78.171        808498
 FooBarSpecificRecordTestRead:  37938 ms       0.439        73.409       3481319
FooBarSpecificRecordTestWrite:  31674 ms       0.526        87.927       3481319
{code} 

I'm not sure if it's worth continuing down that path. If you get better results 
with the parsed-based solution, it's probably the way to go. 
                
> Add java-class, java-key-class and java-element-class support for stringable 
> types to SpecificData
> --------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-1268
>                 URL: https://issues.apache.org/jira/browse/AVRO-1268
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.4
>            Reporter: Alexandre Normand
>            Assignee: Alexandre Normand
>            Priority: Minor
>             Fix For: 1.7.5
>
>         Attachments: AVRO-1268-needs-work.patch, AVRO-1268.patch, 
> AVRO-1268.patch, AVRO-1268-performance.patch, AVRO-1268.sh, 
> GenericStringsPerf.patch, pseudo.patch, pseudo.patch
>
>
> Stringable types are Java classes that can be serialized through strings 
> (which requires a single-string constructor and a valid toString() 
> implementation). ReflectData currently has support for stringable types, but 
> it would be desirable to get this feature with SpecificData. 
> The work involves changes to the SpecificCompiler (depends on {{@java-class}} 
> support in AVRO-1267) to generate the specific sources with the proper java 
> type as well as moving the ReflectDatumReader and ReflectDatumWriter to read 
> the java-class/java-key-class and java-element-class properties. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
