That patch broke signed CloudFront URLs, because the S3 content-disposition query string has to be part of the URL that gets signed. Sigh. So, there's a PR upstream at https://github.com/jschneier/django-storages/pull/1004 which resolves that. PR 1003 also lets the signing key work easily when it's passed in via an environment variable. There's a forked django-storages with both incorporated at https://github.com/Kong/django-storages/tree/release/kong-prod, should someone else need it before upstream integrates the changes. It's fairly easy to just `pip3 install "django-storages[boto3] @ git+https://github.com/Kong/django-storages@release/kong-prod"` instead of the usual documented step. Hope that helps someone. :)
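In case it helps to see why the ordering matters, here's a rough sketch. As far as I understand it, a CloudFront canned policy embeds the full resource URL, query string included, and the signature is computed over that policy. The sketch below is not django-storages code: the key, URL, and function names are made up for illustration, and HMAC stands in for the real RSA signature so it runs with only the standard library.

```python
import hashlib
import hmac
import json
from urllib.parse import urlencode

# Hypothetical stand-in secret; real CloudFront signing uses an RSA private key.
DEMO_KEY = b"not-a-real-cloudfront-key"


def canned_policy(resource_url: str, expires_epoch: int) -> str:
    """Build a CloudFront-style canned policy document for one resource URL."""
    policy = {
        "Statement": [
            {
                "Resource": resource_url,
                "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
            }
        ]
    }
    # Compact JSON with no extra whitespace.
    return json.dumps(policy, separators=(",", ":"))


def demo_sign(resource_url: str, expires_epoch: int) -> str:
    """Sign the policy. HMAC-SHA256 is an assumption standing in for RSA."""
    message = canned_policy(resource_url, expires_epoch).encode("utf-8")
    return hmac.new(DEMO_KEY, message, hashlib.sha256).hexdigest()


# Hypothetical artifact URL and filename, for illustration only.
base_url = "https://cdn.example.com/artifact/0123abcd"
query = urlencode(
    {"response-content-disposition": "attachment; filename=foo-1.0.rpm"}
)
full_url = f"{base_url}?{query}"
expires = 1700000000

# Signing before the query string is appended covers a different resource
# than the URL the client will actually request.
sig_without_query = demo_sign(base_url, expires)
sig_with_query = demo_sign(full_url, expires)

print(sig_without_query != sig_with_query)
```

The two signatures differ, so a URL that gets its content-disposition parameter appended after signing no longer matches what was signed. Appending the query string first and signing the result avoids that.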
--Danny

On Fri, Apr 16, 2021 at 11:35 AM Danny Sauer <[email protected]> wrote:

> FWIW, I worked around this by just patching pulp in our adjusted docker
> build.
>
> https://github.com/Kong/docker-pulp/blob/main/pulp-core/patches/content_parameter_filename_fix.patch
>
> Upstream patch hasn't been submitted yet because I'm still scrambling to
> get this implemented before our current hosted provider goes away. Which
> is also why it took a week to share the workaround I had in place last week
> (and why my documentation PRs still don't have issues associated). :D
>
> I know the project is preferring to go the Kube operator / Ansible route,
> but speaking of Docker and Kubernetes and a CDN, we do have a helm chart
> for this whole thing that I'm hoping we can open source soon as well.
> Someday...
>
> --Danny
>
> On Wed, Apr 7, 2021 at 10:39 AM David Davis <[email protected]> wrote:
>
>> Interesting. Keep us posted.
>>
>> David
>>
>>
>> On Tue, Apr 6, 2021 at 9:37 PM Danny Sauer <[email protected]>
>> wrote:
>>
>>> Thanks for following up. Yes, the query string *should* be there. I
>>> found this bug last week when I was looking into it, though (basically,
>>> telling django-storages to use CloudFront breaks the query string
>>> appending code). I'm back from away-from-keyboard vacation tomorrow, and
>>> should be able to get some patches sent upstream. :)
>>>
>>> https://github.com/jschneier/django-storages/issues/997
>>>
>>> --Danny
>>>
>>> On Tue, Apr 6, 2021, 2:07 PM David Davis <[email protected]> wrote:
>>>
>>>> Hi Danny,
>>>>
>>>> I don't know much about AWS logging but Pulp does set the filename in
>>>> the response-content-disposition[0]. Could that be used to determine the
>>>> filename for each request?
>>>>
>>>> If not, I'm looking at the boto3 docs for get_object[1] to see if
>>>> there's another parameter we could set to help you track the filename in
>>>> requests but I'm not seeing anything useful. My knowledge of S3 is a bit
>>>> limited so if you have a suggestion how we can construct a request to S3
>>>> that would help you to track the filenames of requests to S3, I could
>>>> probably look at how we could support it in Pulp 3.
>>>>
>>>> [0]
>>>> https://github.com/pulp/pulpcore/blob/f38f955425b185749b3c8d4d878a7e166cfc05b9/pulpcore/content/handler.py#L613-L614
>>>> [1]
>>>> https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.get_object
>>>>
>>>> David
>>>>
>>>>
>>>> On Tue, Mar 30, 2021 at 10:43 AM Danny Sauer <[email protected]>
>>>> wrote:
>>>>
>>>>> I've got Pulp set up to serve all the content from S3 behind
>>>>> CloudFront. This works really well, except for a minor issue: the content
>>>>> URLs are all the UUIDs for artifacts, not, for example, the pretty name of
>>>>> the RPM being downloaded. That's an issue in my situation because we'd
>>>>> really like to generate download analytics using off-the-shelf tools which
>>>>> consume the AWS CDN standard log format.
>>>>>
>>>>> My initial thought was that it might be easy to have the redirects
>>>>> include a query string in the generated URL which notes the original
>>>>> filename or relative path requested. But I don't have sufficiently
>>>>> developed Django skills to know the easiest way to do that (or if it's
>>>>> even reasonable to think that's easy). Using the content server's logs is
>>>>> another option, but I have some other content on the same S3 bucket which
>>>>> may not necessarily be reached solely through Pulp's content server, so
>>>>> that means two log locations, etc. If it was easy to make Django /
>>>>> Gunicorn log to an S3 bucket in a manner similar to CloudFront, that might
>>>>> also be ok. Post-processing logs with a series of API calls to work out
>>>>> what artifact maps to what repository content would ideally be a last
>>>>> resort.
>>>>>
>>>>> Anyone have some great insights which might help me out here? :) If
>>>>> it helps, I'm building my own Docker images which ultimately run in EKS.
>>>>> So patches / extra modules are an option, but I'd prefer to stay as close
>>>>> to vanilla upstream as possible with environment variable-based config
>>>>> adjustments.
>>>>>
>>>>> Thanks.
>>>>> --Danny
>>>>> _______________________________________________
>>>>> Pulp-list mailing list
>>>>> [email protected]
>>>>> https://listman.redhat.com/mailman/listinfo/pulp-list
