It may have licensing catches (as any of these JCR efforts might), but I believe it is a sturdy way to expose streams of varying size. It would indeed need file-system storage, but surely that's a good thing, no?
Unfortunately I'm not really an expert there, but the last time I played with Jackrabbit it really seemed like a sturdy piece of software you could rely on.
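To make that concrete, here is a minimal sketch, assuming the plain JSR-170 (JCR) API and a Jackrabbit repository configured to spool binaries to its file-system data store. The class name, node path and parameters are made up for illustration, not an actual XWiki interface:

import java.io.InputStream;
import java.util.Calendar;

import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.jcr.Session;

public class JcrAttachmentStore
{
    /**
     * Streams an attachment into the repository as an nt:file node.
     * The repository consumes the InputStream directly, so the whole
     * content never has to sit in memory as a byte[] or Base64 string.
     */
    public void saveAttachment(Session session, String documentPath,
        String fileName, String mimeType, InputStream content)
        throws RepositoryException
    {
        Node document = session.getRootNode().getNode(documentPath);
        Node file = document.addNode(fileName, "nt:file");
        Node resource = file.addNode("jcr:content", "nt:resource");
        resource.setProperty("jcr:mimeType", mimeType);
        resource.setProperty("jcr:lastModified", Calendar.getInstance());
        // The InputStream overload lets the repository spool the value
        // to disk while reading it, keeping memory usage constant.
        resource.setProperty("jcr:data", content);
        session.save();
    }
}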
paul

On Mar 3, 2008, at 17:28, Sergiu Dumitriu wrote:
> Vincent Massol wrote:
>> Nice work Sergiu. We should transform this into a JIRA issue so we
>> don't forget it.
>
> We should vote on it first.
>
>> One other idea: store attachments on the file system and not in the DB.
>>
>> Thanks
>> -Vincent
>>
>> On Feb 27, 2008, at 3:48 PM, Sergiu Dumitriu wrote:
>>
>>> Hi devs,
>>>
>>> Last night I checked what happens when uploading a file, and why that
>>> action requires huge amounts of memory.
>>>
>>> Whenever a file is uploaded, there are several places where the file
>>> content is loaded into memory:
>>> - as an XWikiAttachment, as a byte[] ~= filesize
>>> - as an XWikiAttachmentArchive, as a Base64-encoded string ~= 2*4*filesize
>>> - as Hibernate tokens sent to the database, clones of the
>>>   XWikiAttachment and XWikiAttachmentArchive data ~= 9*filesize
>>> - as cached attachment and attachment archive objects, clones of the
>>>   same two objects ~= 9*filesize
>>>
>>> Total: ~27*filesize bytes in memory. So, for a 10M file, we need at
>>> least 270M of memory. Worse, if this is not the first version of the
>>> attachment, the complete attachment history is also loaded into
>>> memory, adding another 24*versionsize*versions of memory needed
>>> during the upload.
>>>
>>> After the upload is done most of this is released; only the cached
>>> objects remain in memory. However, a problem still remains with the
>>> cache: it is an LRU cache with a fixed capacity, so even when memory
>>> is full, the cached attachments will not be released.
>>>
>>> Things we can improve:
>>> - Make the cache use References, which would allow cached attachments
>>>   to be removed from memory when more memory is needed (sketched
>>>   below, after the quoted message).
>>> - Build a better attachment archive system. I'm not sure diff-based
>>>   versioning of attachments is a good idea. In theory it saves space
>>>   when versions are much alike, but it does not really work in
>>>   practice, because it does a line diff and a Base64-encoded string
>>>   has no newlines. What's more, the space gain would only pay off with
>>>   many versions, since a single version already takes four times more
>>>   space than a binary dump of the content.
>>>
>>> Suppose we switched to "one version per table row" for the attachment
>>> history, with a direct binary dump: the memory needed for uploading
>>> would then be about 6*filesize, which is much less (also sketched
>>> below).
>>>
>>> --
>>> Sergiu Dumitriu
>>> http://purl.org/net/sergiu/
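On the References idea quoted above: a minimal sketch of what a soft-reference LRU cache could look like. The class name and capacity are invented, and this is not the actual XWiki cache API; the point is that SoftReferences let the garbage collector reclaim cached attachment content under memory pressure, which a fixed-capacity LRU map never does on its own:

import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * LRU cache whose values are held through SoftReferences, so the GC may
 * reclaim cached attachment content when the heap runs low, even before
 * the LRU capacity is reached.
 */
public class SoftAttachmentCache
{
    private static final int CAPACITY = 100;

    private final Map<String, SoftReference<byte[]>> cache =
        new LinkedHashMap<String, SoftReference<byte[]>>(16, 0.75f, true)
        {
            @Override
            protected boolean removeEldestEntry(
                Map.Entry<String, SoftReference<byte[]>> eldest)
            {
                // Classic LRU eviction on top of the soft references.
                return size() > CAPACITY;
            }
        };

    public synchronized void put(String key, byte[] content)
    {
        cache.put(key, new SoftReference<byte[]>(content));
    }

    public synchronized byte[] get(String key)
    {
        SoftReference<byte[]> reference = cache.get(key);
        if (reference == null) {
            return null;
        }
        byte[] content = reference.get();
        if (content == null) {
            // The GC cleared the referent; drop the stale entry.
            cache.remove(key);
        }
        return content;
    }
}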
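And on the "one version per table row" proposal: a hypothetical sketch using plain JDBC (the table and column names are invented, and the real implementation would presumably go through Hibernate), showing how each version could be streamed into its own row as a raw binary dump, so saving a new version neither rewrites nor loads the older ones:

import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class RowPerVersionArchive
{
    public void saveVersion(Connection connection, long attachmentId,
        int version, InputStream content, int contentLength)
        throws SQLException
    {
        PreparedStatement statement = connection.prepareStatement(
            "INSERT INTO attachment_version (attachment_id, version, content)"
            + " VALUES (?, ?, ?)");
        try {
            statement.setLong(1, attachmentId);
            statement.setInt(2, version);
            // Stream the binary content straight to the database instead
            // of materializing a byte[] or a Base64 string in memory.
            statement.setBinaryStream(3, content, contentLength);
            statement.executeUpdate();
        } finally {
            statement.close();
        }
    }
}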

