[jira] Updated: (PIG-1492) DefaultTuple and DefaultMemory understimate their memory footprint
[ https://issues.apache.org/jira/browse/PIG-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1492:
-------------------------------

    Status: Resolved  (was: Patch Available)
    Resolution: Fixed

> DefaultTuple and DefaultMemory understimate their memory footprint
> ------------------------------------------------------------------
>
>                 Key: PIG-1492
>                 URL: https://issues.apache.org/jira/browse/PIG-1492
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: PIG-1492.1.patch
>
>
> There are several places where we greatly underestimate the memory footprint.
> For example, for map datatypes, we don't account for the per-entry cost of
> the map container data structures. The estimated size of a tuple holding a map
> with 100 integer key-value entries is 3260 bytes with the current version of
> the code, while the observed size is around 6775 bytes. To verify the memory
> footprint, I checked free memory before and after creating multiple instances
> of the object, using code along the lines of
> http://www.javaspecialists.eu/archive/Issue029.html .
> A similar change was made in PIG-1443 to fix this for CHARARRAY.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
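The measurement approach mentioned in the issue description (comparing free memory before and after creating many instances of an object) can be sketched roughly as follows. This is a minimal illustration and not the code used for the issue: the class name `FootprintProbe`, its methods, and the instance count are all hypothetical, and the result is only approximate because `System.gc()` is a hint to the VM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of Pig: estimates the heap bytes per object
// by comparing used memory before and after allocating many instances.
final class FootprintProbe {

    // Heap currently in use, as reported by the runtime (approximate).
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Request garbage collection a few times so that leftover garbage
    // is less likely to distort the reading. System.gc() is only a hint.
    static void settle() {
        for (int i = 0; i < 4; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Average number of bytes per instance produced by the factory.
    static long bytesPerInstance(Supplier<?> factory, int count) {
        Object[] keep = new Object[count]; // allocated before the first reading
        settle();
        long before = usedMemory();
        for (int i = 0; i < count; i++) {
            keep[i] = factory.get();
        }
        settle();
        long after = usedMemory();
        // Read the array after the measurement so the instances stay reachable.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("factory returned null");
        }
        return Math.round((after - before) / (double) count);
    }

    // A map with n integer key-value entries, like the example in the issue.
    static Map<Integer, Integer> mapWithEntries(int n) {
        Map<Integer, Integer> m = new HashMap<>();
        for (int i = 0; i < n; i++) {
            m.put(i, i);
        }
        return m;
    }

    public static void main(String[] args) {
        long perMap = bytesPerInstance(() -> mapWithEntries(100), 2000);
        System.out.println("approx. bytes per map with 100 entries: " + perMap);
    }
}
```

The figure printed depends on the VM, its pointer size, and its settings, so it should not be expected to match the numbers quoted in the issue.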
[jira] Updated: (PIG-1492) DefaultTuple and DefaultMemory understimate their memory footprint
[ https://issues.apache.org/jira/browse/PIG-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1492:
-------------------------------

    Status: Patch Available  (was: Open)
    Affects Version/s: 0.8.0
[jira] Updated: (PIG-1492) DefaultTuple and DefaultMemory understimate their memory footprint
[ https://issues.apache.org/jira/browse/PIG-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1492:
-------------------------------

    Attachment: PIG-1492.1.patch

This patch updates the memory size calculations. The changes were made so that the estimated sizes are closer to what is seen in the 32-bit Java HotSpot(TM) Server VM (build 10.0-b19, mixed mode). They are based on some of the observations in http://www.javamex.com/tutorials/memory/string_memory_usage.shtm .

The header size of an object is taken to be 8 bytes, and object sizes are rounded to a multiple of 8 bytes. Some other adjustments, for the minimum size of the array in an ArrayList, were made based on observed size values.

The following tables show the estimated tuple sizes in bytes before and after the patch, together with the sizes actually observed, for the types whose calculation logic changed.

|| type || num of columns of this type in the tuple || before || patched || observed ||
| BYTEARRAY with 5 bytes | 10 | 254 | 504 | 495 |
| BYTEARRAY with 5 bytes | 1000 | 21044 | 44064 | 44127 |
| DOUBLE | 10 | 364 | 264 | 255 |
| DOUBLE | 1000 | 32044 | 20064 | 20127 |
| LONG | 10 | 284 | 264 | 255 |
| LONG | 1000 | 24044 | 20064 | 20127 |

|| Tuple containing a single - || before || patched || observed ||
| BAG with 10 empty tuples | 524 | 1092 | 1159 |
| BAG with 1000 empty tuples | 48044 | 100092 | 100211 |
| map with 10 integer key-value pairs | 380 | 824 | 775 |
| map with 1000 integer key-value pairs | 32060 | 64184 | 64346 |
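The sizing rules described in the patch comment (an 8-byte object header, sizes rounded to a multiple of 8 bytes) can be illustrated with a small sketch. Everything beyond those two rules is an assumption made for illustration and is not taken from PIG-1492.1.patch: the class `SizeEstimate`, the 4-byte reference and array-length sizes (plausible for a 32-bit VM), the wrapper-plus-array model for BYTEARRAY, and the fixed 64-byte per-tuple overhead. That overhead is read off the table, since 264 - 10 * 20 = 20064 - 1000 * 20 = 64. Under these assumptions the sketch happens to reproduce the "patched" column for LONG, DOUBLE, and BYTEARRAY.

```java
// Hypothetical model, not Pig's actual code: estimates tuple sizes on a
// 32-bit VM using an 8-byte object header and 8-byte size rounding.
final class SizeEstimate {
    static final long OBJECT_HEADER = 8;   // from the patch comment
    static final long REFERENCE = 4;       // assumption: 32-bit references
    static final long ARRAY_LENGTH = 4;    // assumption: array length field
    static final long TUPLE_OVERHEAD = 64; // assumption: inferred from the table

    // Round a size up to the next multiple of 8 bytes.
    static long roundToEight(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // A boxed Long or Double: header plus an 8-byte payload.
    static long boxedEightByteValue() {
        return roundToEight(OBJECT_HEADER + 8);
    }

    // Assumed model of a BYTEARRAY field: a wrapper object holding one
    // reference, plus the byte[] it points to.
    static long byteArrayField(int length) {
        long wrapper = roundToEight(OBJECT_HEADER + REFERENCE);
        long array = roundToEight(OBJECT_HEADER + ARRAY_LENGTH + length);
        return wrapper + array;
    }

    // A tuple with n fields: fixed overhead plus, for each field,
    // one reference slot and the field object itself.
    static long tuple(int n, long fieldSize) {
        return TUPLE_OVERHEAD + n * (REFERENCE + fieldSize);
    }

    public static void main(String[] args) {
        System.out.println(tuple(10, boxedEightByteValue()));   // prints 264
        System.out.println(tuple(1000, boxedEightByteValue())); // prints 20064
        System.out.println(tuple(10, byteArrayField(5)));       // prints 504
        System.out.println(tuple(1000, byteArrayField(5)));     // prints 44064
    }
}
```

The BAG and map rows are not modelled here, because their per-entry container costs cannot be derived from the comment alone.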
[jira] Updated: (PIG-1492) DefaultTuple and DefaultMemory understimate their memory footprint
[ https://issues.apache.org/jira/browse/PIG-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-1492:
--------------------------------

    Assignee: Thejas M Nair
    Fix Version/s: 0.8.0