aiborodin commented on code in PR #13340:
URL: https://github.com/apache/iceberg/pull/13340#discussion_r2165747315


##########
flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/TableMetadataCache.java:
##########
@@ -220,37 +238,59 @@ SchemaInfo getSchemaInfo() {
    */
   static class SchemaInfo {
     private final Map<Integer, Schema> schemas;
-    private final Map<Schema, Tuple2<Schema, CompareSchemasVisitor.Result>> lastResults;
+    private final Cache<Schema, SchemaCompareInfo> lastResults;

Review Comment:
   I reverted the cache to use a `LinkedHashMap`. I also added [this benchmark](https://github.com/aiborodin/iceberg/blob/optimise-row-data-conversion/flink/v2.0/flink/src/jmh/java/org/apache/iceberg/flink/sink/dynamic/CacheBenchmark.java#L44) to compare it with the Caffeine implementation. Here, `LRUCache` is the `LinkedHashMap`-based cache.
   As you suggested, @pvary, the `LinkedHashMap` cache is roughly twice as fast (about 1.8x on get and 2.2x on put):
   ```
   Benchmark                                    Mode   Cnt     Score     Error  Units
   CacheBenchmark.testCaffeineCacheMaxSize_Get  thrpt    5   929.031 ±  84.510  ops/s
   CacheBenchmark.testCaffeineCacheMaxSize_Put  thrpt    5   548.677 ±  12.191  ops/s
   CacheBenchmark.testLRUCache_Get              thrpt    5  1657.313 ±  71.981  ops/s
   CacheBenchmark.testLRUCache_Put              thrpt    5  1206.151 ± 112.609  ops/s
   ```
   We should probably replace the Caffeine cache [here](https://github.com/apache/iceberg/blob/main/flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/TableMetadataCache.java#L55) as well.
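
   For readers unfamiliar with the pattern, a `LinkedHashMap`-based LRU cache can be sketched roughly as follows. This is a minimal illustration, not the `LRUCache` class from the PR: the name `LruCacheSketch` and its constructor parameter are hypothetical, and the real implementation may differ (for example, it may use composition or an eviction callback).

   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;

   /** Hypothetical sketch of a size-bounded LRU cache; not the class from the PR. */
   final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
     private final int maximumSize;

     LruCacheSketch(int maximumSize) {
       // accessOrder = true: iteration order runs from least to most recently accessed.
       super(16, 0.75f, true);
       this.maximumSize = maximumSize;
     }

     @Override
     protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
       // Invoked by LinkedHashMap after each insertion; returning true evicts the eldest entry.
       return size() > maximumSize;
     }
   }
   ```

   With a maximum size of 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, because `a` was accessed more recently. One trade-off to keep in mind: `LinkedHashMap` is not thread-safe, and in access-ordered mode even `get` modifies the map, so a cache like this is only suitable where it is confined to a single thread or externally synchronized.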
   
   Thank you for your review and valuable feedback! I hope we can get this in soon :)


