[ https://issues.apache.org/jira/browse/GEODE-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570858#comment-16570858 ]
Barry Oglesby commented on GEODE-5534:
--------------------------------------

The PdxStrings are referencing decompressed byte[]s. In the case of PDX values and compressed region entries, PdxStrings should not be used. Instead, Strings should be used.

With indexes, compression, and the change to use String keys when compression is enabled:
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:        502136       80427400  [B
   2:       2000000       48000000  org.apache.geode.internal.concurrent.CompactConcurrentHashSet2$Node
   3:        499990       35999280  org.apache.geode.internal.cache.entries.VersionedThinDiskRegionEntryHeapStringKey2
   4:        500004       24000192  org.apache.geode.internal.cache.DiskId$PersistenceWithIntOffset
   5:        535915       23319712  [C
   6:          1503       15457776  [Lorg.apache.geode.internal.concurrent.CompactConcurrentHashSet2$Node;
   7:        535644       12855456  java.lang.String
   8:        501508       12036192  java.util.concurrent.ConcurrentSkipListMap$Node
   9:        251648        6039552  java.util.concurrent.ConcurrentSkipListMap$Index
  10:            80        4196096  [Lorg.apache.geode.internal.util.concurrent.CustomEntryConcurrentHashMap$HashEntry;
Total       5477564      269878296
{noformat}

> With oql indexes and compression enabled, memory usage is greater after compression than before
> -----------------------------------------------------------------------------------------------
>
>                 Key: GEODE-5534
>                 URL: https://issues.apache.org/jira/browse/GEODE-5534
>             Project: Geode
>          Issue Type: Bug
>          Components: querying
>            Reporter: Barry Oglesby
>            Assignee: Barry Oglesby
>            Priority: Major
>              Labels: swat
>
> A test that shows the behavior is:
> - xml configuration like:
> {noformat}
> <region name="order">
>   <region-attributes scope="distributed-ack" data-policy="persistent-replicate" disk-store-name="orderDiskStore" disk-synchronous="false" cloning-enabled="true">
>     <compressor>
>       <class-name>org.apache.geode.compression.SnappyCompressor</class-name>
>     </compressor>
>   </region-attributes>
>   <index name="order_clOrdID" from-clause="/order index_from" expression="index_from.clOrdID" type="range"/>
>   <index name="order_externalOrderID" from-clause="/order index_from" expression="index_from.externalOrderID" type="range"/>
>   <index name="order_externalOrderIDSource" from-clause="/order index_from" expression="index_from.externalOrderIDSource" type="range"/>
>   <index name="order_orderID" from-clause="/order index_from" expression="index_from.orderID" type="range"/>
>   <index name="order_parentOrderID" from-clause="/order index_from" expression="index_from.parentOrderID" type="range"/>
> </region>
> {noformat}
> - an Order object defining string fields for clOrdID, externalOrderID, orderID and parentOrderID and an enum field for externalOrderIDSource
>
> Here are some histograms showing the behavior:
>
> With indexes and no compression:
> {noformat}
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:        502136      562132008  [B
>    2:       2000000       48000000  org.apache.geode.internal.concurrent.CompactConcurrentHashSet2$Node
>    3:        499990       35999280  org.apache.geode.internal.cache.entries.VersionedThinDiskRegionEntryHeapStringKey2
>    4:        500004       24000192  org.apache.geode.internal.cache.DiskId$PersistenceWithIntOffset
>    5:          1503       15457776  [Lorg.apache.geode.internal.concurrent.CompactConcurrentHashSet2$Node;
>    6:        501508       12036192  java.util.concurrent.ConcurrentSkipListMap$Node
>    7:        501500       12036000  org.apache.geode.pdx.internal.PdxString
>    8:        500000        8000000  org.apache.geode.internal.cache.PreferBytesCachedDeserializable
>    9:        250680        6016320  java.util.concurrent.ConcurrentSkipListMap$Index
>   10:            80        4196104  [Lorg.apache.geode.internal.util.concurrent.CustomEntryConcurrentHashMap$HashEntry;
> Total       5474612      739440976
> {noformat}
>
> With indexes and compression:
> {noformat}
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:       1003644      643754184  [B
>    2:       2000000       48000000  org.apache.geode.internal.concurrent.CompactConcurrentHashSet2$Node
>    3:        499990       35999280  org.apache.geode.internal.cache.entries.VersionedThinDiskRegionEntryHeapStringKey2
>    4:        500004       24000192  org.apache.geode.internal.cache.DiskId$PersistenceWithIntOffset
>    5:          1503       15457776  [Lorg.apache.geode.internal.concurrent.CompactConcurrentHashSet2$Node;
>    6:        501508       12036192  java.util.concurrent.ConcurrentSkipListMap$Node
>    7:        501500       12036000  org.apache.geode.pdx.internal.PdxString
>    8:        250889        6021336  java.util.concurrent.ConcurrentSkipListMap$Index
>    9:            80        4196096  [Lorg.apache.geode.internal.util.concurrent.CustomEntryConcurrentHashMap$HashEntry;
>   10:         34380        3261936  [C
> Total       5476808      813096488
> {noformat}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
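
The comment's point, that an index key referencing a decompressed byte[] keeps that whole array on the heap while a String key holds only its own copy of the characters, can be illustrated with a minimal Java sketch. This is not Geode code: ByteSliceKey, fakeDecompressedValue and the sample order ID are hypothetical names made up for illustration, and the sketch only assumes that the key points into the value's byte array rather than copying out of it.

```java
import java.nio.charset.StandardCharsets;

public class IndexKeyRetention {

    /** Hypothetical stand-in for an index key that points into a value's byte[]. */
    static final class ByteSliceKey {
        final byte[] backing; // the whole decompressed value stays reachable through the key
        final int offset;
        final int length;

        ByteSliceKey(byte[] backing, int offset, int length) {
            this.backing = backing;
            this.offset = offset;
            this.length = length;
        }

        /** Bytes kept alive by this key: the full backing array, not just the slice. */
        int retainedBytes() {
            return backing.length;
        }

        /** Copies the slice into a String so the key no longer references the backing array. */
        String toStringKey() {
            return new String(backing, offset, length, StandardCharsets.UTF_8);
        }
    }

    /** Builds a fake "decompressed" value with the indexed field placed after some padding. */
    static byte[] fakeDecompressedValue(String field, int padding) {
        byte[] fieldBytes = field.getBytes(StandardCharsets.UTF_8);
        byte[] value = new byte[padding + fieldBytes.length];
        System.arraycopy(fieldBytes, 0, value, padding, fieldBytes.length);
        return value;
    }

    public static void main(String[] args) {
        // Hypothetical order ID stored at offset 1000 of a larger serialized value.
        byte[] decompressed = fakeDecompressedValue("ORD-000123", 1000);
        ByteSliceKey sliceKey = new ByteSliceKey(decompressed, 1000, 10);
        String stringKey = sliceKey.toStringKey();

        System.out.println("slice key retains " + sliceKey.retainedBytes() + " bytes");
        System.out.println("string key holds " + stringKey.length() + " chars: " + stringKey);
    }
}
```

In the uncompressed case a slice-style key can share the byte[] the region entry already holds, so it costs little extra. With compression the entry stores compressed bytes, so a slice-style key would pin a second, decompressed array per entry, which is consistent with the [B instance count roughly doubling in the "With indexes and compression" histogram.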