Hi Danny, I don't know much about AWS logging but Pulp does set the filename in the response-content-disposition[0]. Could that be used to determine the filename for each request?
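As a rough sketch of that idea (not tested against real CloudFront logs): if the query string of the redirected request ends up in the access log, the filename could be recovered from it with the standard library alone. The helper name `filename_from_query` and the sample query string are made up for illustration, and the value is assumed to have the form `attachment;filename=<name>`.

```python
from email.message import Message
from typing import Optional
from urllib.parse import parse_qs


def filename_from_query(query: str) -> Optional[str]:
    """Return the filename carried in a response-content-disposition
    query parameter, or None if the parameter is absent."""
    # parse_qs splits on "&" and percent-decodes each value.
    values = parse_qs(query).get("response-content-disposition")
    if not values:
        return None
    # Let the stdlib header parser handle quoting and extra parameters.
    header = Message()
    header["Content-Disposition"] = values[0]
    return header.get_filename()


# Hypothetical query string, shaped like what a log line might record.
sample = (
    "response-content-disposition=attachment%3Bfilename%3Dfoo-1.0-1.noarch.rpm"
    "&X-Amz-Expires=3600"
)
print(filename_from_query(sample))
```

Depending on how the log writer encodes the field, the query string may need an extra round of percent-decoding before it is passed in.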
If not, I'm looking at the boto3 docs for get_object[1] to see if there's another parameter we could set to help you track the filename in requests, but I'm not seeing anything useful. My knowledge of S3 is a bit limited, so if you have a suggestion for how we could construct a request to S3 that would help you track the filenames of requests, I could probably look at how we could support it in Pulp 3.

[0] https://github.com/pulp/pulpcore/blob/f38f955425b185749b3c8d4d878a7e166cfc05b9/pulpcore/content/handler.py#L613-L614
[1] https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.get_object

David

On Tue, Mar 30, 2021 at 10:43 AM Danny Sauer <[email protected]> wrote:
> I've got Pulp set up to serve all the content from S3 behind CloudFront.
> This works really well, except for a minor issue: the content URLs are all
> the UUIDs for artifacts, not, for example, the pretty name of the RPM being
> downloaded. That's an issue in my situation because we'd really like to
> generate download analytics using off-the-shelf tools which consume the AWS
> CDN standard log format.
>
> My initial thought was that it might be easy to have the redirects include
> a query string in the generated URL which notes the original filename or
> relative path requested. But I don't have sufficiently developed Django
> skills to know the easiest way to do that (or if it's even reasonable to
> think that's easy). Using the content server's logs is another option, but
> I have some other content on the same S3 bucket which may not necessarily
> be reached solely through Pulp's content server, so that means two log
> locations, etc. If it was easy to make Django / Gunicorn log to an S3
> bucket in a manner similar to Cloudfront, that might also be ok.
> Post-processing logs with a series of API calls to work out what artifact
> maps to what repository content would ideally be a last resort.
>
> Anyone have some great insights which might help me out here? :) If it
> helps, I'm building my own Docker images which ultimately run in EKS. So
> patches / extra modules are an option, but I'd prefer to stay as close to
> vanilla upstream as possible with environment variable-based config
> adjustments.
>
> Thanks.
> --Danny
> _______________________________________________
> Pulp-list mailing list
> [email protected]
> https://listman.redhat.com/mailman/listinfo/pulp-list
