[ https://issues.apache.org/jira/browse/AVRO-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17969551#comment-17969551 ]
ASF subversion and git services commented on AVRO-4090:
-------------------------------------------------------

Commit 5324d94ebe2ca7145ef5d81aa01590cedbc46fb2 in avro's branch refs/heads/branch-1.12 from Thiago Romão Barcala
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=5324d94ebe ]

AVRO-4090: Avoid repeating data validation (#3241)

(cherry picked from commit d5d5466d8d8a36fcbecbc924515174638f7ad515)

> PHP data is validated multiple times for nested schemas
> -------------------------------------------------------
>
>                 Key: AVRO-4090
>                 URL: https://issues.apache.org/jira/browse/AVRO-4090
>             Project: Apache Avro
>          Issue Type: Improvement
>            Reporter: Thiago Romão Barcala
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Consider the test script below:
> {code:php}
> <?php
> use Apache\Avro\Datum\AvroIOBinaryEncoder;
> use Apache\Avro\Datum\AvroIODatumWriter;
> use Apache\Avro\IO\AvroStringIO;
> use Apache\Avro\Schema\AvroSchema;
> require_once 'vendor/autoload.php';
> $writer = new AvroIODatumWriter();
> $schemaJson = <<<'JSON'
> {
>   "type": "record",
>   "name": "A",
>   "fields": [
>     {
>       "name": "a",
>       "type": {
>         "type": "record",
>         "name": "B",
>         "fields": [
>           {
>             "name": "b",
>             "type": {
>               "type": "record",
>               "name": "C",
>               "fields": [
>                 {
>                   "name": "c",
>                   "type": {
>                     "type": "record",
>                     "name": "D",
>                     "fields": [
>                       {
>                         "name": "d",
>                         "type": {
>                           "type": "record",
>                           "name": "E",
>                           "fields": [
>                             {
>                               "name": "e",
>                               "type": "string"
>                             }
>                           ]
>                         }
>                       }
>                     ]
>                   }
>                 }
>               ]
>             }
>           }
>         ]
>       }
>     }
>   ]
> }
> JSON
> ;
> $data = ['a' => ['b' => ['c' => ['d' => ['e' => 'value']]]]];
> $schema = AvroSchema::parse($schemaJson);
> $io = new AvroStringIO();
> $writer->writeData($schema, $data, new AvroIOBinaryEncoder($io));
> var_dump($io->__toString());
> {code}
> Running the script above with the command line below and inspecting the
> profiler output shows that the method AvroSchema::isValidDatum is called
> 21 times:
> {code:bash}
> php -dxdebug.start_with_request=true -dxdebug.mode=profile -dxdebug.output_dir=$(pwd) test.php
> {code}
> The validation should be called only 6 times though: once for each of the
> five records, and once for the string value. This happens because writeData
> is called for every field of a record, and each writeData call validates
> the entire data graph below it.
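
The 21 versus 6 figures in the description can be reproduced without the Avro library. The sketch below is a toy model, not the code touched by the commit: the schema representation (the string 'string', or an array with a 'fields' key) and the functions writeDataNaive, writeDataOnce and writeValidated are hypothetical names made up for illustration. Only the name isValidDatum is borrowed from the issue, and here it is a plain function that counts its own invocations. The "validate once, then write without re-validating" variant is one plausible way to avoid the repetition and is not necessarily how the actual change is structured.

```php
<?php
declare(strict_types=1);

// Toy schema model (assumption): either the string 'string',
// or ['fields' => [fieldName => schema, ...]] for a record.

// Validates the datum and everything below it, counting every invocation.
function isValidDatum($schema, $datum, int &$calls): bool
{
    $calls++;
    if ($schema === 'string') {
        return is_string($datum);
    }
    if (!is_array($datum)) {
        return false;
    }
    foreach ($schema['fields'] as $name => $fieldSchema) {
        if (!array_key_exists($name, $datum)
            || !isValidDatum($fieldSchema, $datum[$name], $calls)) {
            return false;
        }
    }
    return true;
}

// Behaviour described in the issue: every level validates its whole
// subtree and then recurses into its fields, which validate again.
function writeDataNaive($schema, $datum, int &$calls): void
{
    if (!isValidDatum($schema, $datum, $calls)) {
        throw new InvalidArgumentException('datum does not match schema');
    }
    if ($schema !== 'string') {
        foreach ($schema['fields'] as $name => $fieldSchema) {
            writeDataNaive($fieldSchema, $datum[$name], $calls);
        }
    }
}

// Walks the already validated datum; a real encoder would emit bytes here.
function writeValidated($schema, $datum): void
{
    if ($schema !== 'string') {
        foreach ($schema['fields'] as $name => $fieldSchema) {
            writeValidated($fieldSchema, $datum[$name]);
        }
    }
}

// Alternative: validate the graph once at the entry point only.
function writeDataOnce($schema, $datum, int &$calls): void
{
    if (!isValidDatum($schema, $datum, $calls)) {
        throw new InvalidArgumentException('datum does not match schema');
    }
    writeValidated($schema, $datum);
}

// Build the same shape as the issue: records A > B > C > D > E > string.
$schema = 'string';
foreach (['e', 'd', 'c', 'b', 'a'] as $field) {
    $schema = ['fields' => [$field => $schema]];
}
$data = ['a' => ['b' => ['c' => ['d' => ['e' => 'value']]]]];

$naiveCalls = 0;
writeDataNaive($schema, $data, $naiveCalls);

$onceCalls = 0;
writeDataOnce($schema, $data, $onceCalls);

echo "naive: $naiveCalls, once: $onceCalls\n"; // prints "naive: 21, once: 6"
```

In the naive variant the top-level call validates 6 nodes, the call for field a validates 5, and so on down to 1 for the string, giving 6 + 5 + 4 + 3 + 2 + 1 = 21. The cost grows quadratically with nesting depth, while validating once at the entry point stays linear.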