[Bug-wget] [bug #52705] HTML assets embedding with --page-requisites

2018-11-12 Thread Darshit Shah
Update of bug #52705 (project wget):

       Status: None => Wont Fix
  Open/Closed: Open => Closed


___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




2017-12-21 Thread Darshit Shah
Follow-up Comment #2, bug #52705 (project wget):

While MHTML was a convenient way to create snapshots of pages, sadly it was
never properly standardized and most popular browsers no longer support it.

WARC has been almost standardized and is considered the de facto way of
archiving a web page or web site.

Wget supports saving into the WARC format, so you may want to look into using
that.
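
As a rough illustration of the WARC route, the sketch below builds a wget
command line that fetches a page together with its requisites and records the
traffic in a WARC file. --page-requisites and --warc-file are existing wget
options; the helper names build_warc_command and save_warc, and the URL and
archive name in the usage note, are made up for this example.

```python
import subprocess


def build_warc_command(url, warc_name):
    """Return an argv list for wget (hypothetical helper, not part of wget)."""
    return [
        "wget",
        "--page-requisites",         # also fetch images, stylesheets, scripts
        f"--warc-file={warc_name}",  # wget adds the .warc(.gz) extension itself
        url,
    ]


def save_warc(url, warc_name):
    """Run wget; raises CalledProcessError if wget exits with an error."""
    subprocess.run(build_warc_command(url, warc_name), check=True)
```

Calling save_warc("https://example.com/article.html", "article") should, with
wget's default settings, leave an article.warc.gz next to the downloaded files.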

Otherwise, implementing MHTML should not be too hard: it would just need some
postprocessing code in all the places where WARC data is stored. However, none
of the developers currently has time to work on a new feature. So, if you could
write a patch, we might review and accept it.
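
To give an idea of the postprocessing involved, here is a minimal sketch (not
wget code) that packs an already-downloaded page and its assets into an
MHTML-style multipart/related MIME message with Python's standard email
package. The function name build_mhtml, its parameters, and the shape of the
resources mapping are assumptions for illustration; whether a given browser
opens the result is not guaranteed.

```python
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mhtml(page_url, html, resources):
    """Pack a page into a multipart/related message (hypothetical helper).

    resources maps each asset URL to a (content_type, raw_bytes) tuple.
    """
    msg = MIMEMultipart("related", type="text/html")

    # The root part is the HTML document itself.
    root = MIMEText(html, "html", "utf-8")
    root["Content-Location"] = page_url
    msg.attach(root)

    # Every asset becomes a base64 part keyed by its original URL.
    for url, (content_type, data) in resources.items():
        maintype, subtype = content_type.split("/", 1)
        part = MIMEBase(maintype, subtype)
        part.set_payload(data)
        encoders.encode_base64(part)
        part["Content-Location"] = url
        msg.attach(part)

    return msg.as_bytes()
```

The Content-Location headers are what let a reader resolve the URLs in the HTML
against the parts stored in the same file.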

Implementing this as a plugin for Wget would however be easier and cleaner.





2017-12-21 Thread Dale Worley
Follow-up Comment #1, bug #52705 (project wget):

I believe that wget can save a page and all its assets into a directory
structure, which can be archived in a single file in many ways.
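
For instance, once wget has written the page and its requisites into a
directory, a few lines of Python can pack that directory into a single file.
This is only a sketch; the function name archive_directory and the paths in the
usage note are made up for illustration.

```python
import tarfile
from pathlib import Path


def archive_directory(directory, archive_path):
    """Pack a directory tree into one gzip-compressed tar file."""
    directory = Path(directory)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name
        tar.add(directory, arcname=directory.name)
    return Path(archive_path)
```

For example, archive_directory("example.com", "article.tar.gz") would bundle a
directory named example.com into article.tar.gz; the archive should be written
outside the directory being packed.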

Are there good, compatible ways to save all page assets embedded into one HTML
file?






2017-12-20 Thread anonymous
URL:
  

         Summary: HTML assets embedding with --page-requisites
         Project: GNU Wget
    Submitted by: None
    Submitted on: Wed 20 Dec 2017 10:16:35 AM UTC
        Category: Feature Request
        Severity: 3 - Normal
        Priority: 5 - Normal
          Status: None
         Privacy: Public
     Assigned to: None
 Originator Name: Artur Shayhutvinov
Originator Email: mirrorim...@yandex.ru
     Open/Closed: Open
 Discussion Lock: Any
         Release: None
Operating System: None
 Reproducibility: None
   Fixed Release: None
 Planned Release: None
      Regression: None
   Work Required: None
  Patch Included: None

___

Details:

It would be great to have an option that makes wget save all page assets
(images, styles, scripts) embedded in one HTML file.

The old proprietary MHTML format was very useful for saving and sharing copies
of articles from the internet, but most browsers no longer support it. I have
been working around this by printing pages to PDF documents, but that is not a
perfect solution, because some sites look broken in print mode and others may
lose important parts of their content. Since the HTML standard supports inline
images, the problem can be solved with fairly simple postprocessing.
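
A minimal sketch of such postprocessing, assuming the page and its images have
already been saved locally (for example with --page-requisites): each local
image reference is replaced by a base64 data: URI. The function names
to_data_uri and inline_images are hypothetical, the regular expression is a
deliberately crude stand-in for a real HTML parser, and styles and scripts are
not handled.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Crude match for <img ... src="..."> attributes; a real tool should use an
# HTML parser instead of a regular expression.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=)(["\'])(.*?)\2', re.IGNORECASE)


def to_data_uri(path):
    """Encode a local file as a base64 data: URI."""
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    payload = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"


def inline_images(html, base_dir):
    """Replace local <img src> references in html with data: URIs."""
    base = Path(base_dir)

    def replace(match):
        prefix, quote, src = match.groups()
        target = base / src
        # Leave remote, already-inlined, or missing images untouched.
        if src.startswith(("data:", "http:", "https:", "//")) or not target.is_file():
            return match.group(0)
        return f"{prefix}{quote}{to_data_uri(target)}{quote}"

    return IMG_SRC.sub(replace, html)
```

The resulting HTML grows by roughly a third of each image's size because of the
base64 encoding, but it can be shared as one self-contained file.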

Sorry for bad English. Thanks.




