Hi devs,

Last night I looked into what happens when a file is uploaded, and why 
that action requires such huge amounts of memory.

Whenever a file is uploaded, its content is loaded into memory in 
several places:
- as an XWikiAttachment as byte[] ~= filesize
- as an XWikiAttachmentArchive as Base64 encoded string ~= 2*4*filesize
- as Hibernate tokens that are sent to the database, clones of the 
XWikiAttachment and XWikiAttachmentArchive data ~= 9*filesize
- as cached attachments and attachment archives, clones of the same 2 
objects ~= 9*filesize

Total: ~27*filesize bytes in memory.

So, for a 10 MB file, we need at least 270 MB of memory.

Worse, if this is not the first version of the attachment, then the 
complete attachment history is loaded in memory, so add another 
24*versionsize*versions of memory needed during upload.
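
To make the arithmetic above concrete, here is a small sketch that simply 
adds up the multipliers quoted in this mail. The class and method names 
are hypothetical and not part of XWiki; the factors are the rough 
estimates given above, not measured values.

```java
// Rough upload memory estimate using the multipliers quoted in this mail.
// UploadMemoryEstimate and its methods are hypothetical helpers, not XWiki API.
final class UploadMemoryEstimate {
    static final long ATTACHMENT = 1;      // XWikiAttachment as byte[]
    static final long ARCHIVE = 2 * 4;     // XWikiAttachmentArchive as Base64 string
    static final long HIBERNATE = 9;       // clones sent to the database
    static final long CACHE = 9;           // clones kept in the cache

    // Memory needed while uploading the first version of a file.
    static long firstUploadBytes(long fileSize) {
        return (ATTACHMENT + ARCHIVE + HIBERNATE + CACHE) * fileSize;
    }

    // Extra memory for loading the existing history: 24*versionsize*versions.
    static long historyBytes(long versionSize, int versions) {
        return 24L * versionSize * versions;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 10 * mb;
        System.out.println(firstUploadBytes(fileSize) / mb + " MB for a first upload");
        System.out.println((firstUploadBytes(fileSize) + historyBytes(fileSize, 3)) / mb
                + " MB when 3 earlier versions of the same size exist");
    }
}
```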

After the upload is done, most of these are cleared; only the cached 
objects remain in memory.

However, a problem still remains with the cache. It is an LRU cache with 
a fixed capacity, so even if the memory is full, the cached attachments 
will not be released.

Things we can improve:
- Make the cache use References. This will allow cached attachments to 
be removed from memory when there's a need for more memory
- Build a better attachment archive system. I'm not sure it is a good 
idea to have diff-based versioning of attachments. In theory, it saves 
space when versions are much alike, but it does not really work in 
practice because it does a line-diff, and a base64 encoded string does 
not have newlines. What's more, the space gain would only pay off when 
there are many versions, as one version alone takes 4 times more space 
than a binary dump of the content.
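
As an illustration of the first point, here is a minimal sketch of an LRU 
cache that holds its values through java.lang.ref.SoftReference, so the 
garbage collector may reclaim them when memory runs low. The SoftLruCache 
class is hypothetical and is not the existing XWiki cache; it only shows 
the general technique.

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an LRU cache whose values are softly reachable.
final class SoftLruCache<K, V> {
    private final Map<K, SoftReference<V>> entries;

    SoftLruCache(final int capacity) {
        // An access-ordered LinkedHashMap evicts the least recently used
        // entry once the fixed capacity is exceeded.
        this.entries = new LinkedHashMap<K, SoftReference<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, SoftReference<V>> eldest) {
                return size() > capacity;
            }
        };
    }

    synchronized void put(K key, V value) {
        entries.put(key, new SoftReference<>(value));
    }

    // Returns null if the key is absent or the value was reclaimed by the GC.
    synchronized V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key); // drop the stale entry left by a cleared reference
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Callers of such a cache have to treat a null result as a cache miss and 
reload the attachment from the database, since a value can disappear at 
any time once nothing else references it.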

If we switch to a "one version per table row" layout for the attachment 
history, with a direct binary dump, the memory needed for uploading 
would be 6*filesize, which is much less.
-- 
Sergiu Dumitriu
http://purl.org/net/sergiu/
_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs
