Re: wget: unable to resolve host address

2022-02-16 Thread Seymour J Metz
Given that RFCs 3490-3492 came out in 2003 and RFCs 5890-5895 came out in 2010, I 
would have expected IDNA support by now. Does anybody know for sure?
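
For reference, here is a minimal sketch of the hostname conversion that RFC 3490 calls ToASCII, using Python's standard-library "idna" codec. That codec follows the 2003 RFCs rather than the 2010 revision. The hostname below is an illustrative placeholder, not the site from the report, and the helper names are made up for this example; nothing here describes what wget does internally.

```python
def to_ace(hostname: str) -> str:
    """Convert an internationalized hostname to its ASCII-compatible (xn--) form."""
    # The stdlib "idna" codec applies nameprep and Punycode label by label.
    return hostname.encode("idna").decode("ascii")


def to_unicode(ace_hostname: str) -> str:
    """Inverse direction: turn xn-- labels back into Unicode."""
    return ace_hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    host = "пример.рф"  # illustrative placeholder hostname (assumption)
    ace = to_ace(host)
    print(ace)
    print(to_unicode(ace))
```

A resolver only ever sees the xn-- form, so the label quoted in an error message can be decoded to check which characters were actually submitted.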


From: Bug-wget  on behalf of 
pythonomor...@gmail.com 
Sent: Tuesday, February 8, 2022 1:26 PM
To: bug-wget@gnu.org
Subject: wget: unable to resolve host address

Hello,

I am trying to download from a list of files (jpeg images). The website
utilizes Cyrillic in its URL. I get the following error message: wget:
unable to resolve host address 'xn--h-xubc'
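
The label in that error message can be decoded to see what wget was asked to resolve. This is a minimal sketch using Python's standard "punycode" and "cp1251" codecs. The link it draws to the encoding of the list file is an inference from the decoded bytes, not something confirmed in this thread.

```python
import codecs

label = "xn--h-xubc"  # host name taken from the error message above

# Strip the ACE prefix and decode the Punycode remainder.
decoded = label[len("xn--"):].encode("ascii").decode("punycode")
print(decoded)  # яюh

# Re-encode those characters as Windows-1251 to see the underlying bytes.
raw = decoded.encode("cp1251")
print(raw)  # b'\xff\xfeh'

# The first two bytes equal the UTF-16 little-endian byte order mark.
print(raw[:2] == codecs.BOM_UTF16_LE)  # True
```

If that reading is right, url-list.txt may have been saved as UTF-16 with a byte order mark, so that the start of "http" was read as Cyrillic text followed by a NUL byte. Re-saving the list as UTF-8 without a BOM, or as Windows-1251, would be worth trying before changing any wget options.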

I've checked the links manually and they do work.

I am enclosing a shortened version of the file list.

I've tried different commands to no avail:

wget.exe -i C:\dl_files\url-list.txt --secure-protocol=auto
--remote-encoding=Windows-1251 -nc -c -P C:\dl_files\

I've used Windows-1251 as I did not see a list of encoding names in the
manual 
https://www.gnu.org/software/wget/manual/wget.html#Wgetrc-Commands

wget.exe -i C:\dl_files\url-list.txt --secure-protocol=auto -nc -c -P
C:\dl_files\



Apparently the problem is caused by Cyrillic characters. I have an inkling that
I am not using the correct options for the program.

I would appreciate it if you could give me a hint on how to solve the problem.



Regards,

Max








RE: [bug #58354] Wget doesn't parse URIs starting with http:/

2020-05-12 Thread Seymour J Metz
I've got code for parsing broken URLs at 
http://mason.gmu.edu/~smetz3/source/unobfuscate.zip if that's of any use to 
you. 


--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3


From: Bug-wget [bug-wget-bounces+smetz3=gmu@gnu.org] on behalf of Luca 
Bernardi [invalid.nore...@gnu.org]
Sent: Tuesday, May 12, 2020 6:57 AM
To: Luca Bernardi; gscriv...@gnu.org; tim.rueh...@gmx.de; bug-wget@gnu.org; 
dar...@gnu.org
Subject: [bug #58354] Wget doesn't parse URIs starting with http:/

Follow-up Comment #1, bug #58354 (project wget):

PS: This bug occurred when trying to crawl a website that uses the default
WordPress template.
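
To illustrate the kind of input the report describes, here is a minimal sketch that repairs a URL whose scheme is followed by the wrong number of slashes. It is not the code from the unobfuscate.zip archive mentioned above, and it is not how wget parses URLs; the function name `repair_scheme` and the sample URLs are hypothetical.

```python
import re

# "http:" or "https:" followed by one or more slashes at the start of the string.
_SCHEME_SLASHES = re.compile(r"^(https?):/+", re.IGNORECASE)


def repair_scheme(url: str) -> str:
    """Normalize 'http:/host' or 'http:///host' to 'http://host'."""
    # Only the scheme prefix is rewritten; the rest of the URL is left untouched.
    return _SCHEME_SLASHES.sub(lambda m: m.group(1).lower() + "://", url, count=1)


if __name__ == "__main__":
    for candidate in ("http:/example.com/a", "https://example.com/b", "ftp:/example.com"):
        print(candidate, "->", repair_scheme(candidate))
```

Whether a crawler should guess like this or reject the link outright is a policy question; browsers tend to be lenient, which is why such links go unnoticed in templates.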

___

Reply to this item at:

  
<https://savannah.gnu.org/bugs/?58354>

___
  Message sent via Savannah
  
https://savannah.gnu.org/






RE: [bug #57884] wget reveals my operating system to the server

2020-02-24 Thread Seymour J Metz
Which raises far more serious security concerns than reporting browser 
capabilities.


--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3


From: Bug-wget [bug-wget-bounces+smetz3=gmu@gnu.org] on behalf of Bruno 
Haible [br...@clisp.org]
Sent: Monday, February 24, 2020 6:42 AM
To: ge...@mweb.co.za; Tim Ruehsen
Cc: bug-wget
Subject: Re: [bug #57884] wget reveals my operating system to the server

ge...@mweb.co.za wrote:
> I wonder about the reason given: "To avoid compatibility issues."
> That was - if I recall correctly - the reason for having the string
> to start with: So that servers can format pages to suit the capabilities
> of the browser and version used.

That was how web applications were written 15-20 years ago. 10 years ago, the
browser capabilities were queried by the JavaScript toolkit [1][2]. Nowadays,
developers prefer feature detection in JavaScript.
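
As an illustration of the older server-side approach, here is a minimal sketch that splits a User-Agent value of the general "product/version (comment)" shape into its parts. The sample strings and the function name `sniff` are assumptions for this example, not output captured from wget.

```python
import re

# product/version, optionally followed by a parenthesised comment.
_UA_RE = re.compile(
    r"^(?P<product>[^/\s]+)/(?P<version>\S+)(?:\s+\((?P<comment>[^)]*)\))?"
)


def sniff(user_agent: str) -> dict:
    """Return the product, version and comment found in a User-Agent value."""
    match = _UA_RE.match(user_agent)
    if match is None:
        return {"product": None, "version": None, "comment": None}
    return match.groupdict()


if __name__ == "__main__":
    # Illustrative value; the comment part is where an OS name would show up.
    print(sniff("Wget/1.20.3 (linux-gnu)"))
```

Anything placed in the comment part is visible to every server contacted, which is the privacy concern raised in the bug report.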

Bruno

[1] 
https://dojotoolkit.org/reference-guide/1.7/quickstart/browser-sniffing.html
[2] 
https://api.jquery.com/jQuery.browser/






Re: [bug #57356] Don't use smart quotes in output messages

2019-12-04 Thread Seymour J Metz
If the code tailors the delimiters to the locale then I see nothing wrong with 
“text” or ‘text’. OTOH, if it's hardwired then I agree that only ASCII 
characters should be used.
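
A minimal sketch of what tailoring the delimiters could look like: use curly quotes only when the output encoding can represent them, and fall back to ASCII otherwise. This keys on the encoding rather than the full locale, and it is not taken from the wget sources; the names `pick_quotes` and `quote` are hypothetical.

```python
import locale
from typing import Tuple


def pick_quotes(encoding: str) -> Tuple[str, str]:
    """Return (left, right) quote characters that the given encoding can represent."""
    try:
        "\u2018\u2019".encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # Unknown encoding, or one without curly quotes: stay within ASCII.
        return ("'", "'")
    return ("\u2018", "\u2019")


def quote(text: str, encoding: str) -> str:
    """Wrap text in the quote characters chosen for the encoding."""
    left, right = pick_quotes(encoding)
    return f"{left}{text}{right}"


if __name__ == "__main__":
    # Use whatever encoding the current environment reports.
    print("Redirecting output to " + quote("wget-log", locale.getpreferredencoding(False)))
```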

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3



From: Bug-wget  on behalf of anonymous 

Sent: Wednesday, December 4, 2019 10:03 AM
To: gscriv...@gnu.org; tim.rueh...@gmx.de; bug-wget@gnu.org; dar...@gnu.org
Subject: [bug #57356] Don't use smart quotes in output messages

URL:
  
<https://savannah.gnu.org/bugs/?57356>

 Summary: Don't use smart quotes in output messages
 Project: GNU Wget
Submitted by: None
Submitted on: Wed 04 Dec 2019 03:03:56 PM UTC
Category: User Interface
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name:
Originator Email:
 Open/Closed: Open
 Discussion Lock: Any
 Release: trunk
Operating System: None
 Reproducibility: Every Time
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

The redirect message _Redirecting output to ‘wget-log’_ uses Unicode smart
quotes.

Expected: use normal typed ' ' quote characters.

Programs used by programmers / technical users should NEVER display smart
quotes.

1.20.3 homebrew
macOS 10.14.6





___

Reply to this item at:

  
<https://savannah.gnu.org/bugs/?57356>

___
  Message sent via Savannah
  
https://savannah.gnu.org/






Re: [Bug-wget] Standard cookie file extension

2019-10-23 Thread Seymour J Metz
RFC 6265 does not define cookie files; the way that a browser stores cookies is 
up to the browser.
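
The wget manual describes the files used by --save-cookies and --load-cookies as the textual format of Netscape's cookies.txt. Python's `http.cookiejar.MozillaCookieJar` reads and writes that same format. The sketch below writes a jar under an arbitrary extension and reads it back, to show that the parser relies on the file's header line and contents rather than its name. The cookie values, the ".cookiejar" suffix and the helper names are made up for the example.

```python
import http.cookiejar
import os
import tempfile


def make_cookie(name: str, value: str, domain: str) -> http.cookiejar.Cookie:
    """Build a persistent cookie with placeholder attributes."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=4102444800, discard=False,  # expiry: 2100-01-01 UTC
        comment=None, comment_url=None, rest={},
    )


def save_and_reload(path: str) -> list:
    """Save one cookie in cookies.txt format at path, then load it back."""
    jar = http.cookiejar.MozillaCookieJar()
    jar.set_cookie(make_cookie("session", "abc123", "example.org"))
    jar.save(path)
    reloaded = http.cookiejar.MozillaCookieJar()
    reloaded.load(path)
    return sorted(cookie.name for cookie in reloaded)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The extension is arbitrary; only the contents matter to the parser.
        print(save_and_reload(os.path.join(tmp, "site.cookiejar")))
```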


--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3



From: Bug-wget  on behalf of Peng Yu 

Sent: Wednesday, October 23, 2019 9:49 AM
To: bug-wget
Subject: [Bug-wget] Standard cookie file extension

Hi, I am wondering if there is a standard file extension for
cookie files written by wget. So far, I have only seen filenames like
cookie.txt and cookies.txt, so the extension is just .txt. But .txt is not
specific to cookie files. I'd like an extension dedicated
to cookie files for disambiguation purposes. Thanks.

--
Regards,
Peng