https://bz.apache.org/bugzilla/show_bug.cgi?id=58740
--- Comment #1 from Javen O'Neal <one...@apache.org> ---

Out of curiosity, how many styles are you creating to cause the reported times of several minutes/10 seconds? Are any of the styles identical? I ask because most workbooks I create use under 100 styles. Unless your workbook is a styles demo that shows every permutation of font, data format, border, background and foreground color, it seems difficult in most applications to get a style count that high without duplicates.

I've rolled my own code to manage styles so I can avoid creating millions of styles. Usually I want to change the data format of an existing cell without affecting other cells that use the same style. Rather than cloning the cell's style (that's how you get millions of styles), I temporarily change the cell style's data format and search the style table for another style that matches. If there is one, I change the cell's style reference to the match; otherwise I clone the style. Finally, I revert my data format change on the original style. Because I apply style changes to thousands of cells, I have some extra data structures to make the style lookup faster than a linear search of the style table. I mention this because it may solve your problem if you don't really need 1 million styles.

POI could benefit from a way to consolidate duplicate styles, either when the workbook is written to disk or through an explicit call.

Glancing at your patch, it looks like your change adds some data structures. The consequences are:
1) higher memory consumption
2) extra processing power spent updating multiple data structures
3) potential for the data structures to get out of sync, especially considering projects that subclass POI.

My recommendation is to use a single data structure that is a container for cell styles and combines the features you need to give fast by-index and by-style lookup. If such a data structure isn't available off-the-shelf, you may want to write your own.
Most trivially, this is just a class that contains an ArrayList and a HashMap for inverted array lookups, where any call to the hybrid data structure updates both underlying data structures. Encapsulating the complexity is the key to solving my third concern, and makes #2 and #1 easier to solve down the road.

Alternatively, you could read the styles from the array into a HashMap, clear the array, maintain only the HashMap throughout the life of the style table, and recreate the array when you need to save the style table. That saves the memory and performance overhead and also avoids the potential for out-of-sync data structures, so long as you clearly mark the array as unmaintained (maybe clear it or set it to null, or don't make it an instance variable, plus comments).
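
To make the ArrayList-plus-HashMap idea concrete, here is a minimal sketch of such a hybrid container. The class name StyleRegistry and its methods are hypothetical and are not part of POI. The sketch assumes the element type has value-based equals() and hashCode() and is not mutated after it is stored, which may not hold for POI's own style classes without a wrapper key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical container (not part of POI) that keeps by-index and
 * by-value lookup in sync behind a single API. S must implement
 * value-based equals() and hashCode(), and stored elements must not
 * be mutated afterwards, or the map lookups will silently miss.
 */
final class StyleRegistry<S> {
    private final List<S> byIndex = new ArrayList<>();
    private final Map<S, Integer> indexByStyle = new HashMap<>();

    /** Returns the index of an equal style, appending it first if absent. */
    int intern(S style) {
        Integer existing = indexByStyle.get(style);
        if (existing != null) {
            return existing;
        }
        int index = byIndex.size();
        // Both structures are updated here and nowhere else.
        byIndex.add(style);
        indexByStyle.put(style, index);
        return index;
    }

    /** Fast by-index lookup, backed by the list. */
    S get(int index) {
        return byIndex.get(index);
    }

    /** Fast by-style lookup, backed by the map; -1 if the style is absent. */
    int indexOf(S style) {
        Integer index = indexByStyle.get(style);
        return index == null ? -1 : index;
    }

    int size() {
        return byIndex.size();
    }
}
```

Calling intern() twice with equal styles returns the same index and stores the style only once, so duplicates never enter the table. The sketch deliberately omits removal, since deleting an entry would shift the indices held by every later style and both structures would need to be rebuilt.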