Re: [Zope] Advice on Blob Storage?

2015-10-15 Thread Michael McFadden

On 09/29/2015 07:39 PM, Michael McFadden wrote:
> On 09/24/2015 08:47 AM, Jean Jordaan wrote:
>> If relstorage is growing for blob uploads, I would think something is
>> wrongly configured.
>
> I'm really thinking the same thing myself, but I wouldn't know the
> first place to look to configure this.



Solved.

Yep.  I found that at some point I had changed my content type so that
it stopped inheriting from ATBlob and went back to implementing
IFileContent.


"There's your problem"

Must have been a great idea at the time.

I've now spent the time to learn how schemaextender works, and the
content type is based on ATBlob again.

I'm still doing the storage tomfoolery, but working with blobs instead
of file data now.  This makes much more sense.

An added benefit is that I no longer take the file data and write it out
as a temp file.  plone.app.blob.utils.openBlob() now does that work for
me in a smarter fashion.


I have a slight worry that when I close the blob using the file object 
that openBlob gave me, then immediately call consumeFile on the 
ZODB.blob, garbage collection may not have time to destroy the weakref 
in ZODB.blob and I'll get a 'file opened' exception.


I'm not savvy enough with garbage collection and weakrefs in Python to 
really be sure about this.
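
On the weakref worry above, here is a minimal sketch, assuming CPython's
reference counting, of how a weakref callback behaves when a file handle
is closed or dropped.  ToyBlob and TrackedFile are hypothetical stand-ins
written for this illustration; they are not ZODB.blob's code, and ZODB's
real bookkeeping may differ.  In this toy model an explicit close()
removes the handle from the readers list synchronously, and dropping the
last strong reference fires the weakref callback right away rather than
at some later collection pass.  On interpreters without reference
counting (PyPy, for example) the second case may be delayed.

```python
import weakref


class TrackedFile:
    """Hypothetical stand-in for a file handle handed out by a blob."""

    def __init__(self, owner):
        self.owner = owner
        self.is_closed = False

    def close(self):
        # An explicit close notifies the owner immediately; no GC involved.
        if not self.is_closed:
            self.owner.forget(self)
            self.is_closed = True


class ToyBlob:
    """Hypothetical toy, NOT ZODB.blob.Blob: tracks readers via weakrefs."""

    def __init__(self):
        self.readers = []

    def open(self):
        handle = TrackedFile(self)

        def destroyed(ref, readers=self.readers):
            # Runs when the handle is finalized without an explicit close.
            try:
                readers.remove(ref)
            except ValueError:
                pass

        self.readers.append(weakref.ref(handle, destroyed))
        return handle

    def forget(self, handle):
        # Slice assignment keeps the same list object the callbacks hold.
        self.readers[:] = [r for r in self.readers if r() is not handle]

    def consume(self):
        if self.readers:
            raise RuntimeError("a reader is still open")
        return "consumed"


blob = ToyBlob()

# Case 1: explicit close, then consume straight away.
fh = blob.open()
fh.close()
result_after_close = blob.consume()

# Case 2: no close, but the last strong reference is dropped.
fh2 = blob.open()
del fh2
result_after_del = blob.consume()

# Case 3: a handle that is still open blocks consume().
fh3 = blob.open()
try:
    blob.consume()
except RuntimeError as exc:
    still_open_error = str(exc)
fh3.close()
```

If the real library behaves like case 1, closing the file object before
calling consumeFile would not depend on garbage collection timing at
all; that is an assumption to check against the ZODB.blob source.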


Thanks guys.

--
Mike McFadden
Radio Free Asia
Technical Operations Division
2025 M Street NW
Washington DC 20036 USA

This e-mail message is intended only for the use of the addressee and may 
contain information that is privileged and confidential.  Any unauthorized 
dissemination, distribution or copying is strictly prohibited.  If you receive 
this transmission in error, please contact netw...@rfa.org.


___
Zope maillist  -  Zope@zope.org
https://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists -
https://mail.zope.org/mailman/listinfo/zope-announce
https://mail.zope.org/mailman/listinfo/zope-dev )


Re: [Zope] Advice on Blob Storage?

2015-09-29 Thread Michael McFadden

On 09/24/2015 02:52 AM, Tom Russell wrote:


> Mike,
>
> First of all, kudos on your candor and being willing to share your "hack"
> (storage.py).

Thanks.  I'm humbly getting things done without a lot of Python
knowledge and no Zope experience.  I wouldn't recommend this approach to
anyone; it's just some cowboy hacking.


> My 1st thought is, why don't you create a content type and store it in the
> ZODB at the time the video is uploaded? The type would include the video
> metadata (vanilla RSS, Dublin Core, etc) and a link to the off-site content.
> Much more helpful than a "not here" message, yeah?

Yep, the content type is created - it's just the file field that has the
odd storage.  All other fields are normal.


> Secondly, I'm wondering why you're using SQL. Is it to interface with legacy
> system(s)? But that's probably just my purist streak talking. :-)

RelStorage for load balancing and replication.  That's about it.  I
inherited the setup (yeah, I know, bad excuse).  But rethinking a
Data.fs solution to the problem is probably not going to help anyway.




> IIRC, there are hooks in Zope like "manage_before_save()", "...after_save",
> etc. This would be ideal, as you could strip the blob from the request before
> doing an insert. Yeah?

I'm going to look into this stuff.
I stumbled on some docs that hinted you could call 'pack()' directly on
a single piece of storage, but I've yet to find them again.  This might
be a solution.


> Anyway, sorry I can't be more help w/ the specifics of your installation.

No worries.  Happy to get a reply.





Re: [Zope] Advice on Blob Storage?

2015-09-29 Thread Michael McFadden

On 09/24/2015 08:47 AM, Jean Jordaan wrote:


> If relstorage is growing for blob uploads, I would think something is
> wrongly configured.

I'm really thinking the same thing myself, but I wouldn't know the first
place to look to configure this.


This installation was a Plone 2 Archetypes-based build that was upgraded
to Plone 4, and that's when the RelStorage changeover was done.  I don't
know if that gives any hints.

We have so much Archetypes content with specialized code that we've
stuck with Archetypes.  I have a feeling that Archetypes is tightly
coupled with filestorage somehow.



>> Can this behavior be turned off for a specific field or content type?  So
>> undo logs are preserved for everything BUT this monster of a content type?
>>
>> Seems strange to do this tho.
>
> Yes, that seems like a plaster on top of a broken bone.

I hope we don't go down that path.




>> Going deeper down the rabbit hole, although I don't think it's relevant, is
>> the fact that I hacked and replaced the storage class for the field.
>> Instead of using AnnotationStorage
>
> This sounds dangerous to me ..

It's actually working perfectly, and was the original intent.  When I
did some quick maths to show how much blob storage would grow based on
how much video content we create, it became cost-prohibitive to store
the videos in blob storage.  The subclassed AnnotationStorage works.
However, I'll be looking into collective.xsendfile to see if I can make
things a bit better.


In a nutshell, Blob Storage is happy - the data are stored elsewhere
happily.  The upshot is that you cannot fetch the file from Plone, and
that's just fine for now.  If you do fetch it through Plone's download,
you get about 80 bytes that say "your file is not here".


The fact that relstorage grows with the upload (and shrinks back with 
the pack) is what's troubling.


My spelunking (which I enjoy) has gone deep enough to confirm that the
relstorage growth happens with the transaction's tpc_finish() call, and
I haven't gone deeper yet.  What's strange is that the data of the File
field has been replaced by then (in the dangerous manner mentioned
above) and I'm not sure where it's finding it.
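
As a toy model of the observation above, the sketch below shows why
growth measured at the finish step says little about which object
carried the data.  ToyStorage is hypothetical and uses simplified
signatures (real ZODB storage methods take transaction arguments); it
only assumes that writes staged during a transaction become part of the
committed state at finish.  The oversized payload under a second,
made-up record name is purely an assumption for illustration and is not
a diagnosis of this installation.

```python
class ToyStorage:
    """Hypothetical stand-in for a storage; NOT RelStorage's implementation."""

    def __init__(self):
        self.committed = {}   # what a backup or size check would see
        self._staged = None   # per-transaction buffer

    def tpc_begin(self):
        self._staged = {}

    def store(self, oid, data):
        # Staged only: nothing shows up in `committed` yet.
        self._staged[oid] = data

    def tpc_vote(self):
        # Last chance to refuse; this toy accepts any non-empty transaction.
        return bool(self._staged)

    def tpc_finish(self):
        # Everything staged since tpc_begin lands in one step.
        self.committed.update(self._staged)
        self._staged = None

    def tpc_abort(self):
        self._staged = None

    def size(self):
        return sum(len(data) for data in self.committed.values())


storage = ToyStorage()
storage.tpc_begin()
# The field value has already been swapped for a short placeholder...
storage.store("field-value", b"your file is not here")
# ...but some other record (hypothetical) still carries a large payload.
storage.store("other-record", b"x" * 2048)

size_before_finish = storage.size()
if storage.tpc_vote():
    storage.tpc_finish()
size_after_finish = storage.size()
```

In this model the size is unchanged until tpc_finish() and then jumps by
the sum of everything staged, so finding the growth at that call narrows
down when the data lands, not which object supplied it.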


All the tomfoolery can be seen on GitHub, where I replace the field
storage for the file field.


Thanks for the reply.



Re: [Zope] Advice on Blob Storage?

2015-09-24 Thread Tom Russell
On Mon, 21 Sep 12:24:45 PM Michael McFadden wrote:
> This may be more of a zodb / relstorage question - I hope it's ok to ask
> on the Zope list.
> 
> I'm seeing behavior using relstorage and blobs that I didn't expect:
> If I upload a large file, say 2 gigs, I am noticing that our SQL
> database also grows by 2 Gigs, along with the blob storage.
> After a pack, the space is reclaimed on the SQL side, and everyone is
> happy.
> FWIW - it's videos that are doing this.
> 
> I am pretty sure it's the undo log that's growing, based on the fact
> that a pack reclaims the space.
> 
> Can this behavior be turned off for a specific field or content type?
> So undo logs are preserved for everything BUT this monster of a content
> type?
> 
> Seems strange to do this tho.
> 
> Are there other alternatives, like calling .pack() directly on the
> field's storage after it's set?
> 
> Our problem is that our sql database grows to a huge size between our
> weekly packs, and backups of the sql dumps are becoming unmanageable.
> Our blob backups are ready to deal with this kind of size, but not the
> sql backups.
> 
> --
> Going deeper down the rabbit hole, although I don't think it's relevant,
> is the fact that I hacked and replaced the storage class for the field.
> Instead of using AnnotationStorage - which I found used as default for
> ImageField - I intercept the data during storage.set(), ship it out to a
> separate storage facility, and replace the data with a happy message
> "This is not where your data is" which is then written to the blobs.
> It works just great - keeping our blob storage growth from going
> crazy.  If you try to 'download' the file from Plone, you'll get the
> text file with the happy message.
> 
> Now that I've been shown that the Blob Storage is functioning just fine,
> but the SQL storage size is going off the charts, I hope I'm not back at
> square one.
> 
> The goal is to allow users to think they are uploading 4Gb videos into
> Plone, when under the covers, we're actually shipping the video files
> off to some fancy off-site storage. (Akamai)  So we don't have to store
> them and back them up on-site, and our blob directories remain
> manageable in size.
> 
> The storage hack can be seen here:
> https://github.com/RadioFreeAsia/rfa.kaltura/blob/master/rfa/kaltura/storage/storage.py
> 
> 
> I'm not proud of it, but it works.
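
The pattern described in the quoted message (intercept the value in
set(), ship it elsewhere, store a short placeholder) can be sketched
roughly as follows.  This is not the code from the linked storage.py:
DictStorage, OffloadingStorage and the uploader callable are
hypothetical names for illustration, and the simplified get/set
signatures are an assumption rather than the Archetypes storage API.

```python
PLACEHOLDER = b"This is not where your data is"


class DictStorage:
    """Hypothetical stand-in for a field storage with get/set."""

    def __init__(self):
        self._values = {}

    def set(self, name, instance, value):
        self._values[(id(instance), name)] = value

    def get(self, name, instance):
        return self._values[(id(instance), name)]


class OffloadingStorage(DictStorage):
    """Ships the real payload to `uploader` and keeps only a placeholder."""

    def __init__(self, uploader):
        super().__init__()
        self._uploader = uploader

    def set(self, name, instance, value):
        # Hand the full payload to the off-site uploader first...
        self._uploader(name, instance, value)
        # ...then store the small placeholder in its place.
        super().set(name, instance, PLACEHOLDER)


class Video:
    """Hypothetical content object."""


shipped_sizes = []
storage = OffloadingStorage(
    lambda name, instance, value: shipped_sizes.append(len(value))
)

video = Video()
storage.set("file", video, b"\0" * 4096)
stored = storage.get("file", video)
```

After the set() call, the local storage holds only the placeholder bytes
while the uploader has seen the full 4096-byte payload, which is the
behaviour the message describes for downloads from Plone.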

Mike,

First of all, kudos on your candor and being willing to share your "hack" 
(storage.py).

I've been out of the Zope loop for a while but just thought I'd pony up a 
response since your posting was interesting to me, regardless of how out of 
touch w/ reality my response might be. And being out of the loop, I don't have 
to worry any more about looking dumb!

My 1st thought is, why don't you create a content type and store it in the 
ZODB at the time the video is uploaded? The type would include the video 
metadata (vanilla RSS, Dublin Core, etc) and a link to the off-site content. 
Much more helpful than a "not here" message, yeah?

Secondly, I'm wondering why you're using SQL. Is it to interface with legacy 
system(s)? But that's probably just my purist streak talking. :-)

IIRC, there are hooks in Zope like "manage_before_save()", "...after_save", 
etc. This would be ideal, as you could strip the blob from the request before 
doing an insert. Yeah?

Anyway, sorry I can't be more help w/ the specifics of your installation.

Best,
-Tom



Re: [Zope] Advice on Blob Storage?

2015-09-24 Thread Jean Jordaan
Hi Michael

Without knowing anything about your setup let me chuck a few stones
into the bushes ..

> I'm seeing behavior using relstorage and blobs that I didn't expect:
> If I upload a large file, say 2 gigs, I am noticing that our SQL database
> also grows by 2 Gigs, along with the blob storage.

If relstorage is growing for blob uploads, I would think something is
wrongly configured.

> Can this behavior be turned off for a specific field or content type?  So
> undo logs are preserved for everything BUT this monster of a content type?
>
> Seems strange to do this tho.

Yes, that seems like a plaster on top of a broken bone.

> Going deeper down the rabbit hole, although I don't think it's relevant, is
> the fact that I hacked and replaced the storage class for the field.
> Instead of using AnnotationStorage

This sounds dangerous to me ..

> The goal is to allow users to think they are uploading 4Gb videos into
> Plone, when under the covers, we're actually shipping the video files off to
> some fancy off-site storage. (Akamai)

Configure caching such that client/CDN/varnish/nginx keeps all the big
files that they should.
Use collective.xsendfile to make file requests go directly to the
front-end server (but note "Blob handling in ZODB is very effective
already (async sockets, just like Apache or nginx would do). [...]
This add-on only removes the need to proxy the file data over socket
connection").
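
The general idea behind this kind of offloading can be sketched as
below: the application replies with an empty body plus a header naming
the file, and the front-end server streams the file itself.  The
X-Accel-Redirect (nginx) and X-Sendfile (Apache mod_xsendfile) header
names are real; the offload_response function and the example path are
hypothetical, and this does not show collective.xsendfile's own code or
configuration.

```python
def offload_response(internal_path, frontend="nginx"):
    """Return (headers, body) asking the front-end server to send the file.

    `internal_path` is whatever the front end expects: an internal URI for
    nginx, a filesystem path for Apache mod_xsendfile.
    """
    if frontend == "nginx":
        headers = {"X-Accel-Redirect": internal_path}
    elif frontend == "apache":
        headers = {"X-Sendfile": internal_path}
    else:
        raise ValueError("unsupported front end: %r" % (frontend,))
    # The body stays empty: the application never streams the file data.
    return headers, b""


# Hypothetical path, for illustration only.
nginx_headers, nginx_body = offload_response("/protected/blobs/video.mp4")
apache_headers, apache_body = offload_response(
    "/var/blobstorage/video.mp4", frontend="apache"
)
```

The application process stays out of the data path entirely, which is
the "removes the need to proxy the file data" point in the quoted note.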

See
  http://www.slideshare.net/jensens/2014-ploneconfbristolspeedupplone
for a lot of good tips.

-- 
jean  . ..  //\\\oo///\\