mxm commented on code in PR #13340:
URL: https://github.com/apache/iceberg/pull/13340#discussion_r2162371074


##########
flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/TableMetadataCache.java:
##########
@@ -220,37 +238,59 @@ SchemaInfo getSchemaInfo() {
    */
   static class SchemaInfo {
     private final Map<Integer, Schema> schemas;
-    private final Map<Schema, Tuple2<Schema, CompareSchemasVisitor.Result>> lastResults;
+    private final Cache<Schema, SchemaCompareInfo> lastResults;

Review Comment:
   I've modified the benchmark to write records with one missing optional field.
   
   Status quo:
   
   ```
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 10643477950764794 written 5000000 records in 8256 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 3412597467473839149 written 5000000 records in 8435 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 1378730447561228687 written 5000000 records in 8131 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 6395019319803016710 written 5000000 records in 8185 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 2342353485823757966 written 5000000 records in 8368 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 793547938713805163 written 5000000 records in 8276 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 5594118694338812518 written 5000000 records in 8455 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 7557960821227856897 written 5000000 records in 8429 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 2950550055628646422 written 5000000 records in 8292 ms
   ```
   
   This PR:
   
   ```
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 4441515740804838126 written 5000000 records in 6278 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 693561994983735461 written 5000000 records in 6170 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 4396434136742105340 written 5000000 records in 6341 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 5676638108798250077 written 5000000 records in 6153 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 7335685996694279072 written 5000000 records in 6176 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 3978788459948623506 written 5000000 records in 6239 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 714790977727422517 written 5000000 records in 6081 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 1410967688172637753 written 5000000 records in 6170 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 6639921374375882181 written 5000000 records in 6055 ms
   ```
   
   The timing is at the end of each log line: this PR takes about 6.1 to 6.3 s 
per run versus 8.1 to 8.5 s for the status quo, roughly a quarter less time.
   
   I've also tested the out-of-the-box performance, without any data conversion 
(matching schema):
   
   Status quo:
   ```
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 6883744294623845605 written 5000000 records in 7041 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 8343965097603744595 written 5000000 records in 7256 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 5909108840912148637 written 5000000 records in 6819 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 1881752959951611409 written 5000000 records in 6653 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 2178333854655314502 written 5000000 records in 6674 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 4156942011317503970 written 5000000 records in 6547 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 60883210314639382 written 5000000 records in 6841 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 6777767015628214045 written 5000000 records in 6673 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 5424435409654578112 written 5000000 records in 6696 ms
   ```
   
   This PR:
   ```
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 4717988736607451439 written 5000000 records in 7422 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 9163422086331213140 written 5000000 records in 6858 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 3565484125246072275 written 5000000 records in 6710 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 8125595062326341668 written 5000000 records in 6751 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 1101967860013610846 written 5000000 records in 6691 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 2683175175057794685 written 5000000 records in 6779 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 5488538346290472866 written 5000000 records in 6755 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 3160734148700801147 written 5000000 records in 6577 ms
   [Test worker] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPerf - TEST RESULT: For table default.t_0 snapshot 3081221917974416074 written 5000000 records in 6758 ms
   ```
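   
   To make the four result sets easier to compare, here is a minimal sketch that 
averages the nine timings from each run and reports the relative change. The 
class and method names (`BenchmarkSummary`, `meanMs`, `percentChange`) are 
hypothetical and not part of this PR; the timings are copied from the logs 
above.
   
   ```java
   import java.util.Arrays;

   public class BenchmarkSummary {
     // Timings in ms, copied from the benchmark logs (9 runs per configuration).
     static final long[] STATUS_QUO_MISSING_FIELD =
         {8256, 8435, 8131, 8185, 8368, 8276, 8455, 8429, 8292};
     static final long[] THIS_PR_MISSING_FIELD =
         {6278, 6170, 6341, 6153, 6176, 6239, 6081, 6170, 6055};
     static final long[] STATUS_QUO_MATCHING_SCHEMA =
         {7041, 7256, 6819, 6653, 6674, 6547, 6841, 6673, 6696};
     static final long[] THIS_PR_MATCHING_SCHEMA =
         {7422, 6858, 6710, 6751, 6691, 6779, 6755, 6577, 6758};

     // Arithmetic mean of the run times; NaN for an empty array.
     static double meanMs(long[] runs) {
       return Arrays.stream(runs).average().orElse(Double.NaN);
     }

     // Relative change of the mean in percent; negative means "after" is faster.
     static double percentChange(long[] before, long[] after) {
       double beforeMean = meanMs(before);
       return (meanMs(after) - beforeMean) / beforeMean * 100.0;
     }

     public static void main(String[] args) {
       System.out.printf(
           "missing optional field: %.0f ms -> %.0f ms (%+.1f%%)%n",
           meanMs(STATUS_QUO_MISSING_FIELD),
           meanMs(THIS_PR_MISSING_FIELD),
           percentChange(STATUS_QUO_MISSING_FIELD, THIS_PR_MISSING_FIELD));
       System.out.printf(
           "matching schema:        %.0f ms -> %.0f ms (%+.1f%%)%n",
           meanMs(STATUS_QUO_MATCHING_SCHEMA),
           meanMs(THIS_PR_MATCHING_SCHEMA),
           percentChange(STATUS_QUO_MATCHING_SCHEMA, THIS_PR_MATCHING_SCHEMA));
     }
   }
   ```
   
   With these numbers, the missing-field case averages about 8314 ms for the 
status quo and about 6185 ms with this PR (about 25.6% less time). The 
matching-schema case averages 6800 ms for the status quo and about 6811 ms with 
this PR, a difference of well under 1%, which is within the run-to-run spread 
of either set.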



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

