[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread Tim Ruehsen
Follow-up Comment #8, bug #54825 (project wget):

If you have certain things in mind, issues at
https://gitlab.com/gnuwget/wget2/issues are appreciated.

___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[Bug-wget] [bug #54826] too much output on wget --version

2018-10-12 Thread Tim Ruehsen
Follow-up Comment #7, bug #54826 (project wget):

Oh good point, we forgot it so far :-)

At least we have to add the Copyright and bug report lines.

The Wgetrc/Locale/Compile/Link stuff should be ok with a different option,
e.g. --version in combination with --debug or so.





[Bug-wget] [bug #54826] too much output on wget --version

2018-10-12 Thread J
Follow-up Comment #6, bug #54826 (project wget):

Good link, thank you.

At least wget2 --version doesn't have this. Let's hope no one adds it to
wget2! ;-)





[Bug-wget] [bug #54826] too much output on wget --version

2018-10-12 Thread Tim Ruehsen
Follow-up Comment #5, bug #54826 (project wget):

The best reason to *keep* that output is that it was put like that ~10 years (or
maybe 20?) ago. There are production scripts out there that parse the output
of wget --version.
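
As an illustration of the kind of script meant here, the following is a
minimal sketch in Python, not taken from any real production script. It
assumes the layout of the `wget --version` output quoted elsewhere in this
thread (a "GNU Wget <version>" first line followed by a block of +/- feature
flags). The names SAMPLE and parse_version_output are made up for this
example.

```python
import re

# Sample text copied from the `wget --version` output quoted in this thread.
SAMPLE = """GNU Wget 1.19.4 built on linux-gnu.

-cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls
+ntlm +opie +psl +ssl/openssl

Wgetrc:
/etc/wgetrc (system)
"""


def parse_version_output(text):
    """Return (version, features) parsed from `wget --version`-style text.

    `features` maps a feature name to True ("+name") or False ("-name").
    """
    lines = text.splitlines()
    match = re.match(r"GNU Wget (\S+)", lines[0]) if lines else None
    version = match.group(1) if match else None

    features = {}
    for line in lines[1:]:
        tokens = line.split()
        if not tokens:
            if features:
                break  # a blank line after the feature block ends it
            continue   # skip blank lines before the feature block
        for token in tokens:
            if token[0] in "+-":
                features[token[1:]] = token[0] == "+"
    return version, features


version, features = parse_version_output(SAMPLE)
```

A script like this breaks as soon as the first line or the flag block changes
shape, which is the backward-compatibility concern raised above.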

Apart from that, you are right. And I wish we could just drop backward
compatibility. At least that would be more fun - but we have to be
responsible as well.

And LLVM is just a completely different approach to compilation / translation.
LowLevelVirtualMachine... gcc could translate into it as well... but the
ethics are different and thus the licenses.

And comparing clang / gcc as frontends to the C language... clang still refuses
to compile nested functions, which is a blocker for using it in my company. We
have used nested functions for 20 years; there is no way to rewrite the stable
code (that would make it much more complicated and would introduce bugs).

And gcc can do so much more, see
https://www.linux.com/blog/2018/10/gcc-optimizing-linux-internet-and-everything





[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread J
Follow-up Comment #7, bug #54825 (project wget):

Fair enough.

I would probably use the opportunity for wget2 to be more evolutionary, and
adopt some new defaults. Old users could of course use the old behaviours.

This is a bit like when G++ changes the default C++ language spec to compile
against. Can't hang on to old things forever.





[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread Tim Ruehsen
Follow-up Comment #6, bug #54825 (project wget):

You can make your own default and put it into ~/.wget2rc or any other file set
by $WGET2RC or by --config.

But the hard-coded default will be kept so as to make wget2 as compatible
with wget as possible. That way it can be used as a drop-in replacement for
wget, at least in most cases.






[Bug-wget] [bug #54826] too much output on wget --version

2018-10-12 Thread J
Follow-up Comment #4, bug #54826 (project wget):

There is no reason to cling to old output just because it was put that way
~10 years ago.

Evolution and rationalisation are vital for software. Unfortunately, in
software engineering there are engineers unwilling to change, and that really
holds back a lot of packages.


Look what happened when GCC wouldn't open up; that came back together as
2.95.

Again, look what happened when GCC again wouldn't open up: LLVM was born.

I've said my piece - I'm always positive about improving software :)





[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread J
Follow-up Comment #5, bug #54825 (project wget):

That is good. 
As wget2 is new, maybe it can be revolutionary and adopt that as default?





[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread Tim Ruehsen
Follow-up Comment #4, bug #54825 (project wget):

Just want to add that Wget2 has --keep-extension (commit
c9796a174d04099d07dd7dc70da47ea92d3ba3a6).

E.g. that generates index_1.html instead of index.html.1.
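
To make the difference concrete, here is a small Python sketch of the two
naming schemes. It is not Wget or Wget2 code. It only mirrors the
index.html.1 versus index_1.html example above. The function names are
invented for illustration, and how Wget2 actually treats names without an
extension or with several dots is not covered here.

```python
import os


def numbered_suffix(filename, n):
    """Classic Wget style: append the counter after the whole file name."""
    return f"{filename}.{n}"


def keep_extension(filename, n):
    """Style described above: put the counter before the extension."""
    stem, ext = os.path.splitext(filename)
    return f"{stem}_{n}{ext}"


def first_free_name(filename, existing, scheme):
    """Return the first name not in `existing`, trying counters 1, 2, ..."""
    if filename not in existing:
        return filename
    n = 1
    while scheme(filename, n) in existing:
        n += 1
    return scheme(filename, n)


old_style = first_free_name("index.html", {"index.html"}, numbered_suffix)
new_style = first_free_name("index.html", {"index.html"}, keep_extension)
```

With this sketch, old_style is "index.html.1" and new_style is
"index_1.html", so the second form still opens with the program associated
with the .html extension.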






[Bug-wget] [bug #54828] wget stalled download shows wrong speed per second

2018-10-12 Thread Darshit Shah
Update of bug #54828 (project wget):

      Status: None => Invalid
 Open/Closed: Open => Closed

___

Follow-up Comment #1:

This is actually two different issues.

Firstly, the speed being stuck is a technical limitation of Wget
being a single-threaded application. Wget uses only blocking network sockets,
and while it is blocked on a read() call, there is nothing it can do to update
the UI. Similarly, reporting that the download is stalled is not possible
without adding threading support, which is something that we do not intend to
do. These issues don't exist in Wget2, which has been designed with
multi-threading from the very beginning.

Regarding the "(Success)" string, that is not controlled by Wget at all. It is
in fact the string reported by the kernel for the last error. In your case, it
seems to be that the last socket operation was a success, but the connection
was still terminated. 
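
A rough way to see how a "(Success)" can end up in such a message is the
following Python sketch. It is not Wget's code, and it uses a plain local
socket pair instead of the TLS connection from the report. The helper name
read_until_eof and the byte counts are made up. The point is only that an
orderly close by the peer makes the read return no data without raising an
error, so the recorded error number stays 0, and turning 0 into text gives a
"no error" string (on glibc systems this is typically "Success").

```python
import errno
import os
import socket


def read_until_eof(sock, expected):
    """Read up to `expected` bytes; return (data, error_number)."""
    chunks = []
    received = 0
    err = 0  # stays 0 when the peer simply closes the connection
    while received < expected:
        try:
            chunk = sock.recv(4096)
        except OSError as exc:
            err = exc.errno or errno.EIO  # a real failure carries an errno
            break
        if not chunk:
            break  # orderly close: recv() returns b"" and raises nothing
        chunks.append(chunk)
        received += len(chunk)
    return b"".join(chunks), err


# Simulate a server that sends 100 of 250 promised bytes and then closes.
writer, reader = socket.socketpair()
writer.sendall(b"x" * 100)
writer.close()

data, err = read_until_eof(reader, 250)
reader.close()

# The text in parentheses is whatever the platform prints for error number 0.
message = f"Read error at byte {len(data)}/250 ({os.strerror(err)})."
```

Here the transfer is short (100 of 250 bytes) even though no call reported a
failure, which matches the "last operation was a success, but the connection
was still terminated" explanation above.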





[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread Darshit Shah
Update of bug #54825 (project wget):

      Status: None => Wont Fix
 Open/Closed: Open => Closed

___

Follow-up Comment #3:

EDIT: Will fix in Wget2.





[Bug-wget] [bug #54826] too much output on wget --version

2018-10-12 Thread Darshit Shah
Update of bug #54826 (project wget):

      Status: None => Wont Fix
 Open/Closed: Open => Closed

___

Follow-up Comment #3:

Sure, you've shown that a bunch of other programs don't dump the compile time
options. But that's not a rationale for removing it. I still don't see a
reason why I should remove it now that it is already in there? We've had this
output for ~10 years now and as a developer, I know it's been helpful on
occasion when debugging something for a user.

I'm closing this bug for now. If you have a good reason for why it should not
be there, apart from, "I don't like it", or "the others don't do it", please
feel free to re-open.





[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread J
Follow-up Comment #2, bug #54825 (project wget):

Thank you for your reply. Will check out wget2





[Bug-wget] [bug #54826] too much output on wget --version

2018-10-12 Thread J
Follow-up Comment #2, bug #54826 (project wget):

wget is a user application. No other GNU package clutters the --version output
in such a manner. What's the rationale for dumping the build config into the
--version output? I can't see any valid reason to keep build config output
there.

jonny@asus:~$ gcc --version
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

jonny@asus:~$ bash --version
GNU bash, version 4.4.19(1)-release (x86_64-pc-linux-gnu)
Copyright © 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
jonny@asus:~$ ls --version
ls (GNU coreutils) 8.28
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Richard M. Stallman and David MacKenzie.
jonny@asus:~$ cp --version
cp (GNU coreutils) 8.28
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Torbjorn Granlund, David MacKenzie and Jim Meyering.
jonny@asus:~$ ld --version
GNU ld (GNU Binutils for Ubuntu) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public Licence version 3 or (at your option) a later version.
This program has absolutely no warranty.
jonny@asus:~$ 






[Bug-wget] [bug #54828] wget stalled download shows wrong speed per second

2018-10-12 Thread J
URL:
  

 Summary: wget stalled download shows wrong speed per second
 Project: GNU Wget
Submitted by: now3d
Submitted on: Fri 12 Oct 2018 11:17:23 AM UTC
Category: None
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name: 
Originator Email: 
 Open/Closed: Open
 Discussion Lock: Any
 Release: 1.19.4
Operating System: None
 Reproducibility: None
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

When a download stalls, wget's display stays stuck on "1.32MB/s", as you can
see below. It should probably say "0MB/s (stalled)"?

It could probably also have a second counter, as it is otherwise ambiguous
whether the download is stalled; the stall lasts several minutes until wget
retries.

Also, I am surprised it says "(Success)" after a read error. Can that be
removed or replaced with "(Partial)"?

I've obscured the URL, but you can use your own test case to reproduce.



$ wget https://www.mydomain123.com/tmp/mydomain123.zip
--2018-10-12 10:49:39--  https://www.mydomain123.com/tmp/mydomain123.zip
Resolving www.mydomain123.com (www.mydomain123.com)... 12.34.56.78
Connecting to www.mydomain123.com (www.mydomain123.com)|12.34.56.78|:443...
connected.
HTTP request sent, awaiting response... 200 OK
Length: 4954873 (4.7M) [application/zip]
Saving to: ‘mydomain123.zip’

mydomain123  57%[==> ]   2.72M  1.40MB/s    in 1.9s

2018-10-12 11:04:41 (1.40 MB/s) - Read error at byte 2850816/4954873
(Success). Retrying.

--2018-10-12 11:04:42--  (try: 2) 
https://www.mydomain123.com/tmp/mydomain123.zip
Connecting to www.mydomain123.com (www.mydomain123.com)|12.34.56.78|:443...
connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 4954873 (4.7M), 2104057 (2.0M) remaining [application/zip]
Saving to: ‘mydomain123.zip’

mydomain123 100%[+++>]   4.72M  1.37MB/s    in 1.5s

2018-10-12 11:04:44 (1.37 MB/s) - ‘mydomain123.zip’ saved
[4954873/4954873]

$ 








Re: [Bug-wget] Hello again

2018-10-12 Thread michael


Hello Darshit Shah,

Converting a CMS to static HTML pages is not a solution that suits all.
Some sites which want to be 'dynamic' and retain "backward flik-flak" abilities
might not use wget2 and will keep their CMS or software behavior.

Many people creating a website use a CMS to generate the site because of its
ability to keep the website uniform and to apply every change made in the GUI
site-wide. Those people might want to have a static website, as it is faster
to download (a Google SEO factor) and much more secure: it hides the CMS
location and prevents login attempts.

If those people want to retain features such as RSS feeds, we might be able to
tell them how they can have it.

If a website contains some hidden pages that are connected by JavaScript code, 
the programmer might create a shell script calling wget2 specifying each hidden 
page location.
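
Such a script could look roughly like the sketch below. It is written in
Python rather than shell, and everything specific in it is an assumption for
illustration: the base URL, the list of hidden pages and the function names
are hypothetical, and wget2 is invoked with nothing but the URL.

```python
import subprocess
from urllib.parse import urljoin

# Hypothetical site and hypothetical pages that are only linked via JavaScript.
BASE_URL = "https://www.example.com/"
HIDDEN_PAGES = ["promo/summer.html", "archive/2017/index.html"]


def build_commands(base_url, pages, program="wget2"):
    """Return one argument list per hidden page, e.g. ["wget2", URL]."""
    return [[program, urljoin(base_url, page)] for page in pages]


def fetch_all(commands):
    """Run each command in turn; stop at the first one that fails."""
    for argv in commands:
        subprocess.run(argv, check=True)


commands = build_commands(BASE_URL, HIDDEN_PAGES)
# fetch_all(commands)  # uncomment to actually download the pages
```

Keeping the page list in one place means the site owner only has to maintain
that list when a JavaScript-linked page is added or removed.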

Have a good weekend!

Michael



-Original Message-
From: 'Darshit Shah'  
Sent: Thursday, 11 October, 2018 12:35 PM
To: mich...@cyber-dome.com
Cc: bug-wget@gnu.org
Subject: Re: [Bug-wget] Hello again

* mich...@cyber-dome.com  [181009 17:12]:
> 
> Hello Darshit Shah,
> 
> Thank you for your welcome message. I am glad to be part of your project!
> 
> I don't understand the term "javascript engine". AFAIK javascript is code that
> runs on the browser side, and we have no problem fetching it.
>
Exactly! Javascript is code that is executed on the client side and hence
requires a javascript engine which interprets the code and executes it.
However, Wget does not and will not package a javascript engine in order to run
those scripts. This means, sites where Javascript is used to create hyperlinks
won't work well when scraped through Wget.
> 
> There might be "ajax" issues with sites that rely on it. Ajax is used heavily by
> programmers, and they will have to take some action on their site to
> incorporate the engine.

Similarly, sites that use Javascript to show menus or create AJAX requests are
usually not amenable to being scraped as a static HTML page.
> 
> POST requests to comments and mail will need to be taken care of so they will
> work on a static site. One solution is to use a hosted supplier that will carry
> out the task and deliver spam removal as well.
> I think I will be able to write a howto document on that.
> 
> Michael
> 
> -Original Message-
> From: Darshit Shah  
> Sent: Tuesday, 9 October, 2018 2:52 PM
> To: mich...@cyber-dome.com
> Cc: bug-wget@gnu.org
> Subject: Re: [Bug-wget] Hello again
> 
> Hi Michael,
> 
> Nice to hear from you again. I vaguely remember a mention of someone who wanted
> to work on this feature. When deciding to make this work, please remember that
> any of this can only work if the site does not rely on Javascript; which given
> Wordpress is a difficult thing. The reason for this is that we do _not_ intend
> to ship a javascript engine along with Wget2. It is too large, unwieldy and too
> much of a maintenance nightmare. However, if the site can work without
> Javascript, then I would assume that Wget2 can already handle making a static
> copy. If it can't handle something, please let us know / file a bug report
> about it.
> 
> Of course, I welcome you to work on Wget2 as you see fit. And we would love to
> look at any contributions you can make. We will also try and help you out as
> much as possible when dealing with the codebase.
> 
> About the dev setup, I only use vim and gdb to work with Wget. As Tim has
> already mentioned, he uses Netbeans and might be able to help you out.
> 
> You also mentioned something about the lib/ directory. That is an
> auto-generated dir with compatibility libs that you don't need to care about.
> All the code for Wget2 is in src/ and the code for the library is in libwget/.
> Those are the two main directories you need to care about. And of course
> tests/ for the tests.
> 
> * mich...@cyber-dome.com  [181008 21:22]:
> > 
> > Hello again,
> > 
> > My name is Michael. I approached you about a year ago.
> > 
> > I am interested in making wget2 a tool that can convert content management
> > systems (like WordPress) output to HTML. This actually limits the content
> > management system to generate the website every time it is changed, and the
> > presentation is done using the HTTP server only.
> > 
> > This is an important feature as it prevents a security risk: a hacker
> > penetrating the site and installing viruses or stealing data.
> > It also allows the website to be delivered much faster as no PHP code needs
> > to run in order to deliver the content. Google already announced that site
> > download speed is a factor in its SEO evaluation.
> > 
> > I will be able to work for 3 hours every week on the project. I do need some
> > guidance from you.
> > 
> > I have started to configure the Netbeans IDE, as using a debugger can help me
> > delve into the code much faster. There are some issues with Netbeans. Do
> > you use an IDE? Which one?
> > 
> > Best regards,
> > 
> > 

[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread Darshit Shah
Follow-up Comment #1, bug #54825 (project wget):

I kind of agree here. Over time Wget has had the --no-clobber option do many
things and trying to preserve backwards compatibility has only complicated
everything. It would be ideal to have a --force option which causes Wget to
overwrite the file. You can currently do that by explicitly specifying the
filename using -O.


However, all new features are currently being added only to Wget2, which is
the next version of Wget, with (almost) complete command-line parity. There,
we are indeed making some backwards-incompatible modifications, which make
changes like the one you've proposed a lot easier. Please take a look at the
source available here: https://www.gitlab.com/gnuwget/wget2.git. It is also
available on Savannah and has been packaged for Debian already.





[Bug-wget] [bug #54826] too much output on wget --version

2018-10-12 Thread Darshit Shah
Update of bug #54826 (project wget):

    Severity: 3 - Normal => 1 - Wish

___

Follow-up Comment #1:

Why exactly? As a developer I like that information in the --version output
since it gives me clear information about the build when trying to debug an
issue remotely.

And I don't see how a little extra information in the --version output harms
anything at all. Do you have any concrete reasons apart from a personal
preference?





[Bug-wget] [bug #54826] too much output on wget --version

2018-10-12 Thread J
URL:
  

 Summary: too much output on wget --version
 Project: GNU Wget
Submitted by: now3d
Submitted on: Fri 12 Oct 2018 09:55:35 AM UTC
Category: None
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name: 
Originator Email: 
 Open/Closed: Open
 Discussion Lock: Any
 Release: 1.19.4
Operating System: None
 Reproducibility: None
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

Hello

I noticed that, contrary to other GNU tools, wget --version has a lot of extra
output about the compiler build. Can that be removed?

If needed, wget --build-info could be added?

Thanks, Jonny


$ wget --version
GNU Wget 1.19.4 built on linux-gnu.

-cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls 
+ntlm +opie +psl +ssl/openssl 

Wgetrc: 
/etc/wgetrc (system)
Locale: 
/usr/share/locale 
Compile: 
gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc" 
-DLOCALEDIR="/usr/share/locale" -I. -I../../src -I../lib 
-I../../lib -Wdate-time -D_FORTIFY_SOURCE=2 -DHAVE_LIBSSL -DNDEBUG 
-g -O2 -fdebug-prefix-map=/build/wget-LB8XFP/wget-1.19.4=. 
-fstack-protector-strong -Wformat -Werror=format-security 
-DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall 
Link: 
gcc -DHAVE_LIBSSL -DNDEBUG -g -O2 
-fdebug-prefix-map=/build/wget-LB8XFP/wget-1.19.4=. 
-fstack-protector-strong -Wformat -Werror=format-security 
-DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall -Wl,-Bsymbolic-functions 
-Wl,-z,relro -Wl,-z,now -lpcre -luuid -lidn2 -lssl -lcrypto -lpsl 
ftp-opie.o openssl.o http-ntlm.o ../lib/libgnu.a 

Copyright (C) 2015 Free Software Foundation, Inc.
Licence GPLv3+: GNU GPL version 3 or later
.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic .
Please send bug reports and questions to .









[Bug-wget] [bug #54825] unexpected wget appends .1 after the file extension

2018-10-12 Thread J
URL:
  

 Summary: unexpected wget appends .1 after the file extension
 Project: GNU Wget
Submitted by: now3d
Submitted on: Fri 12 Oct 2018 09:54:13 AM UTC
Category: None
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name: 
Originator Email: 
 Open/Closed: Open
 Discussion Lock: Any
 Release: 1.19.4
Operating System: GNU/Linux
 Reproducibility: None
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

Hello

Can wget not append the .1 after the file extension? It means manual changes
are needed by the user. If Firefox or Chrome downloads the same file twice, it
does not do this.

GNU Wget 1.19.4 built on linux-gnu.
ubuntu


2018-10-12 10:49:24 (582 KB/s) - ‘e.zip.1’ saved [491905/491905]




