Re: [Django] #23517: Collect static files in parallel

2014-11-21 Thread Django
#23517: Collect static files in parallel
------------------------------------+-------------------------------------
     Reporter:  thenewguy           |                   Owner:  nobody
         Type:  Uncategorized       |                  Status:  closed
    Component:  contrib.staticfiles |                 Version:  1.7
     Severity:  Normal              |              Resolution:  wontfix
     Keywords:                      |            Triage Stage:  Unreviewed
    Has patch:  0                   |
  Needs tests:  0                   |     Needs documentation:  0
Easy pickings:  0                   | Patch needs improvement:  0
                                    |                   UI/UX:  0
------------------------------------+-------------------------------------
Changes (by timgraham):

 * status:  new => closed
 * resolution:   => wontfix


Comment:

 I think Aymeric was trying to say that if Django has sufficient hooks for
 users to implement this on their own, then that's enough.  Maybe
 `StaticS3Storage` would like to include this in their code, but it's not
 obvious to me that we should include this in Django itself.

--
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To post to this group, send email to django-updates@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/067.8f148386e84da7fa45cf122df85b8fce%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.


Re: [Django] #23517: Collect static files in parallel

2014-11-08 Thread Django
#23517: Collect static files in parallel
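------------------------------------+-------------------------------------
     Reporter:  thenewguy           |                   Owner:  nobody
         Type:  Uncategorized       |                  Status:  new
    Component:  contrib.staticfiles |                 Version:  1.7
     Severity:  Normal              |              Resolution:
     Keywords:                      |            Triage Stage:  Unreviewed
    Has patch:  0                   |
  Needs tests:  0                   |     Needs documentation:  0
Easy pickings:  0                   | Patch needs improvement:  0
                                    |                   UI/UX:  0
------------------------------------+-------------------------------------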
Changes (by thenewguy):

 * cc: wgordonw1@… (added)




Re: [Django] #23517: Collect static files in parallel

2014-11-08 Thread Django
#23517: Collect static files in parallel
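------------------------------------+-------------------------------------
     Reporter:  thenewguy           |                   Owner:  nobody
         Type:  Uncategorized       |                  Status:  new
    Component:  contrib.staticfiles |                 Version:  1.7
     Severity:  Normal              |              Resolution:
     Keywords:                      |            Triage Stage:  Unreviewed
    Has patch:  0                   |
  Needs tests:  0                   |     Needs documentation:  0
Easy pickings:  0                   | Patch needs improvement:  0
                                    |                   UI/UX:  0
------------------------------------+-------------------------------------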
Changes (by thenewguy):

 * status:  closed => new
 * resolution:  needsinfo =>




Re: [Django] #23517: Collect static files in parallel

2014-11-08 Thread Django
#23517: Collect static files in parallel
------------------------------------+-------------------------------------
     Reporter:  thenewguy           |                   Owner:  nobody
         Type:  Uncategorized       |                  Status:  closed
    Component:  contrib.staticfiles |                 Version:  1.7
     Severity:  Normal              |              Resolution:  needsinfo
     Keywords:                      |            Triage Stage:  Unreviewed
    Has patch:  0                   |
  Needs tests:  0                   |     Needs documentation:  0
Easy pickings:  0                   | Patch needs improvement:  0
                                    |                   UI/UX:  0
------------------------------------+-------------------------------------

Comment (by thenewguy):

 Just wanted to post back on this.  I was able to write a quick 20-line
 proof of concept for this using the threading module.  The speedup was
 pretty significant, so I figured I would reopen this.  I could be wrong,
 but I imagine something like this would be beneficial to the general
 Django userbase.  Granted, I don't know if others get as restless as I do
 while waiting on static files to upload.

 I've quickly tested collectstatic with 957 static files.  All files are
 post-processed in some fashion (at least being hashed by
 ManifestFilesMixin), and a gzipped copy is also created if the saved file
 benefits from gzip compression.  The storage backend stored the files on
 AWS S3.  The AWS S3 console reported that 3254 files were deleted when I
 cleaned up after each test, so 3254 files were created during
 collectstatic in each case.

 The following times are generated by the command line and should not be
 interpreted as quality benchmarks... but they are good enough to show the
 significance.

 {{{
 set startTime=%time%
 python manage.py collectstatic --noinput
 echo Start Time:  %startTime%
 echo Finish Time: %time%
 }}}

 Times (keep in mind staticfiles collectstatic does not output the count
 for gzipped files, so there are roughly 957*2 more files than it reports)
 {{{
 957 static files copied, 957 post-processed.
 async using 100 threads (ParallelUploadStaticS3Storage)
 Start Time:  16:43:57.01
 Finish Time: 16:49:30.31
 Duration: 5.55500 minutes

 sync using regular s3 storage (StaticS3Storage)
 Start Time:  16:19:24.21
 Finish Time: 16:41:46.78
 Duration: 22.3761667 minutes
 }}}
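
As a quick check of the arithmetic, the durations above can be recomputed from the reported start and finish times. This is only a sketch: the helper name `minutes_between` is made up for illustration, and it assumes both timestamps fall on the same day. The result works out to roughly a fourfold speedup for the threaded run.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"  # wall-clock format printed by the batch script


def minutes_between(start, finish):
    """Return the elapsed minutes between two same-day timestamps."""
    delta = datetime.strptime(finish, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


# Timestamps copied from the two runs reported above.
parallel_minutes = minutes_between("16:43:57.01", "16:49:30.31")
serial_minutes = minutes_between("16:19:24.21", "16:41:46.78")
speedup = serial_minutes / parallel_minutes

print("parallel: %.3f min" % parallel_minutes)
print("serial:   %.3f min" % serial_minutes)
print("speedup:  %.1fx" % speedup)
```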


 This storage is derived from ManifestFilesMixin and a subclass of
 S3BotoStorage (django-storages) that creates gzipped copies and checks for
 file changes to keep reliable modification dates before saving:
 {{{
 import threading
 import time

 from django.core.files.base import ContentFile


 # StaticS3Storage is my own S3BotoStorage subclass (not shown here).
 class ParallelUploadStaticS3Storage(StaticS3Storage):
     """
     THIS STORAGE ASSUMES THAT UPLOADS ONLY OCCUR
     FROM CALLS TO THE COLLECTSTATIC MANAGEMENT
     COMMAND. SAVING TO THIS STORAGE DIRECTLY IS
     NOT RECOMMENDED BECAUSE THE UPLOAD THREADS
     ARE NOT JOINED UNTIL POST_PROCESS IS CALLED.
     """

     active_uploads = []
     thread_count = 100

     def remove_completed_uploads(self):
         # iterate in reverse so deleting by index does not shift later items
         for i, thread in reversed(list(enumerate(self.active_uploads))):
             if not thread.is_alive():
                 del self.active_uploads[i]

     def _save_content(self, key, content, **kwargs):
         # wait for a free slot so at most thread_count uploads run at once
         while len(self.active_uploads) >= self.thread_count:
             self.remove_completed_uploads()
             time.sleep(0.01)

         # copy the file to memory for the moment to get around file closed
         # errors -- BAD HACK FIXME FIX
         content = ContentFile(content.read(), name=content.name)

         f = super(ParallelUploadStaticS3Storage, self)._save_content
         thread = threading.Thread(target=f, args=(key, content),
                                   kwargs=kwargs)

         self.active_uploads.append(thread)
         thread.start()

     def post_process(self, *args, **kwargs):
         # perform post processing
         for post_processed in super(ParallelUploadStaticS3Storage,
                                     self).post_process(*args, **kwargs):
             yield post_processed

         # wait for the remaining uploads to finish
         print("Post processing completed. Now waiting for the remaining "
               "uploads to finish.")
         for thread in self.active_uploads:
             thread.join()
 }}}


Re: [Django] #23517: Collect static files in parallel

2014-09-28 Thread Django
#23517: Collect static files in parallel
------------------------------------+-------------------------------------
     Reporter:  thenewguy           |                   Owner:  nobody
         Type:  Uncategorized       |                  Status:  closed
    Component:  contrib.staticfiles |                 Version:  1.7
     Severity:  Normal              |              Resolution:  needsinfo
     Keywords:                      |            Triage Stage:  Unreviewed
    Has patch:  0                   |
  Needs tests:  0                   |     Needs documentation:  0
Easy pickings:  0                   | Patch needs improvement:  0
                                    |                   UI/UX:  0
------------------------------------+-------------------------------------
Changes (by aaugustin):

 * status:  new => closed
 * needs_docs:   => 0
 * resolution:   => needsinfo
 * needs_tests:   => 0
 * needs_better_patch:   => 0


Comment:

 I'm afraid we'll be reluctant to hardcode concurrent behavior in Django if
 there's another solution.

 You should be able to implement parallel upload in the storage backend
 with:

 - a `save` method that enqueues the operation for processing by a thread
 pool and returns immediately,
 - a `post_process` method that waits until the thread pool has completed
 all uploads.
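
The two bullets above might be sketched roughly as follows. This is a framework-free illustration, not Django code: `SlowStorage`, `ParallelSaveMixin`, `ParallelSlowStorage`, and `max_workers` are all hypothetical names, and the stand-in backend only mimics the shape of a storage's `save` and `post_process` methods. A real backend would also have to deal with the file object possibly being closed once `save` returns, so the content may need to be read into memory first.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait


class SlowStorage:
    """Hypothetical stand-in for a remote storage backend (not a Django class)."""

    def __init__(self):
        self.saved = {}
        self._lock = threading.Lock()

    def save(self, name, content):
        time.sleep(0.01)  # simulate the network latency of one upload
        with self._lock:
            self.saved[name] = content
        return name

    def post_process(self, paths):
        # Stand-in for hashing/compression: report each path as processed.
        for path in paths:
            yield path, path, True


class ParallelSaveMixin:
    """Queue saves on a thread pool; block in post_process until they finish."""

    max_workers = 8  # assumed value, tune for the backend

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._pool = ThreadPoolExecutor(max_workers=self.max_workers)
        self._pending = []

    def save(self, name, content):
        # Enqueue the real save and return immediately.
        self._pending.append(self._pool.submit(super().save, name, content))
        return name

    def post_process(self, paths):
        for result in super().post_process(paths):
            yield result
        # Wait for every queued upload, then re-raise the first failure, if any.
        pending, self._pending = self._pending, []
        wait(pending)
        for future in pending:
            future.result()


class ParallelSlowStorage(ParallelSaveMixin, SlowStorage):
    pass


storage = ParallelSlowStorage()
names = ["a.css", "b.js", "c.png"]
for name in names:
    storage.save(name, b"data")
# Exhausting the generator is what triggers the final wait.
processed = list(storage.post_process(names))
```

Because the waiting happens at the end of the `post_process` generator, the uploads are only guaranteed to be complete once the caller has consumed it fully.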

 Can you try that approach, and if it doesn't work, reopen this ticket?

 Thanks!
