I tried the 3.9.1 version and I get "Could not get Weather Underground data. Exiting."
Curl still works, even for yesterday's data. Tracing the script, it doesn't
appear to be actually attempting the download.

On Wednesday, May 22, 2019 at 7:03:49 PM UTC-6, Leon Shaner wrote:
>
> Say, we need a tester who is still on 3.9.1 or thereabouts to try this
> out:
>
> https://raw.githubusercontent.com/UberEclectic/weewx/master/bin/wunderfixer
>
> Can't do anything to work around WU's sporadic 404 and 503 errors, but at
> least the 403 error should be gone.
>
> I was able to test the 4.0 / development version myself on both Python2
> and 3, so hopefully Tom will merge that soon. It's over here if you are
> impatient. =D
>
> https://raw.githubusercontent.com/UberEclectic/weewx/development/bin/wunderfixer
>
> Regards,
> \Leon
> --
> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>
> On May 22, 2019, at 7:24 PM, Leon Shaner <[email protected]> wrote:
>
> Hey WeeWX'ers!!! =D
>
> I have a fix in the hopper.
>
> There's nothing that can be done for the occasional HTTP 404, or even
> 503's I am now seeing, but the HTTP 403 was due to a change on WU's part
> where they are rejecting certain HTTP User-Agent strings. The fact that
> they are putting Akamai in the middle is almost certainly a great thing re:
> their scalability issues; however, they probably inherited some default
> settings that filter "bots" and malware and such, which is likely why the
> HTTP User-Agent now matters.
>
> I have set the User-Agent to "CURL" and it works.
> I have set it to "Mozilla" and it works. I'm going with that one, since
> it means Mosaic Killer, both of which were among the very first
> User-Agents I ever worked with, circa 1993 back before there was such a
> thing as Netscape. =D
>
> /ye-olde-farte mode off ;-)
>
> My testing has so far been under Python3, but coincidentally (and not a
> causation), WU started throwing HTTP 503's around the time that I tried
> validating my code also under Python2.
>
> Everything is working against today's date.
> It's when I go after yesterday's date that I get the HTTP server error 503.
>
> I expect the 404's and 503's to go away eventually, but at least for now I
> have a fix for the 403 (forbidden)'s, just based on the User-Agent string.
>
> I'll submit a change for wunderfixer both to the 3.9.x "master" and 4.0.x
> "development" branches in a moment and reply back with direct links for
> anyone who wants a fix sooner. =D
>
> Isn't this fun? =D
>
> Regards,
> \Leon
> --
> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>
> On May 22, 2019, at 4:20 PM, Leon Shaner <[email protected]> wrote:
>
> I'm still working on this.
> CURL is telling me they are not only using https, but also TLSv1.2.
> Here is a transcript, in case one of y'all beats me to the fix. =D
>
> --
> You received this message because you are subscribed to the Google Groups
> "weewx-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/weewx-user/DA01E425-B99A-4959-8FB2-B564A61B3E77%40isylum.org.
> For more options, visit https://groups.google.com/d/optout.
>
> <wu.txt>
>
> Working from here:
> https://docs.python.org/2/library/ssl.html
>
> So far I have tried this, to no avail.
> Really just doing the "import ssl" and using https in the URL, and adding
> context=ssl_context to the urllib.request.
>
> A snippet of that looks as follows, but still getting 403 forbidden. :-(
>
> # For new WU interface which uses SSL and TLSv1.2
> import ssl
>
> ...
>
> _url = "https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=%s" \
>        "&month=%d&day=%d&year=%d&format=1" % (self.station,
>                                                dayRequested_tt[1],
>                                                dayRequested_tt[2],
>                                                dayRequested_tt[0])
>
> # specify TLSv1.2 and SSLv2, but not SSLv3
> ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
> ssl_context.options |= ssl.PROTOCOL_SSLv23
> ssl_context.options |= ssl.OP_NO_SSLv3
>
> try :
>     # Hit the weather underground site:
>     _wudata = urllib.request.urlopen(_url, context=ssl_context)
>
> Regards,
> \Leon
> --
> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>
> On May 22, 2019, at 2:42 PM, Leon Shaner <[email protected]> wrote:
>
> Jarom,
>
> CURL is pretty sophisticated in its ability to emulate browser state in
> pretty much any way but JavaScript. When it worked this morning, I saw
> some cookies were involved.
> It may well be that the python way isn't handling that part.
> I don't know enough about how python fetches pages to work that out, but I
> am very familiar with CURL, so if I can find a path that works
> consistently, then I'll go back to the python to see about how to implement
> same.
>
> I was getting 404's in the browser even, when I looked at it earlier.
>
> I'll keep working on it, but not too hard, so as to not get on their radar
> in any unwanted sort of way. ;-)
>
> Regards,
> \Leon
> --
> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>
> On May 22, 2019, at 2:04 PM, Jarom Hatch <[email protected]> wrote:
>
> Interesting, using curl sometimes I can get it fine, but wunderfixer is
> always getting a 403 Forbidden, as if it is actively being blocked... When
> it doesn't work in curl I get `HTTP/1.1 404 Not Found` and when it does
> work I get `HTTP/1.1 200 OK`. Curl never gets a 403 error.
>
> On Wednesday, May 22, 2019 at 11:48:08 AM UTC-6, Jarom Hatch wrote:
>>
>> I was able to get it to work twice in my web browser, but as you said, it
>> is sporadic.
>> I don't ever recall them using Akamai before so that may very
>> well be a contributing factor.
>>
>> I wonder if we can find out the origin address and see what happens if we
>> can bypass Akamai...
>>
>> On Wednesday, May 22, 2019 at 7:35:18 AM UTC-6, Leon Shaner wrote:
>>>
>>> For one thing, the URL of this form:
>>>
>>> http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=SOMESTATION&month=5&day=22&year=2019&format=1
>>>
>>> Is now redirecting to one using HTTPS:
>>>
>>> https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=SOMESTATION&month=5&day=22&year=2019&format=1
>>>
>>> Also, the redirect itself takes an excruciatingly long time.
>>> So I just changed the URL to https directly...
>>>
>>> The first time I tried any of the above using CURL this morning it
>>> worked, but then after that I started getting:
>>>
>>> An error occurred while processing your request.
>>>
>>> Reference #30.6f451160.1558531514.16ced4f6
>>>
>>> It looks as if they've put some kind of Akamai proxy in the middle,
>>> which is fine for static content, but not so fine for a query of this
>>> nature. Strange that it worked for me the very first time. It's almost as
>>> if the Akamai "farm" has lost some "state" information and not all nodes
>>> have the same content, so if you get stuck going through a bad node you get
>>> a bogus response.
>>>
>>> Attached is a transcript of a failed attempt. I put SOMESTATION there
>>> only after the fact. The actual query was for my actual station, which
>>> used to work.
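For anyone following along, the User-Agent change described in the quoted messages can be sketched roughly as below. This is only a minimal sketch and not the actual wunderfixer patch: the helper name build_wu_request is made up for illustration, it assumes Python 3 (urllib.request), and only the URL format and the "Mozilla" User-Agent string come from the thread. It builds the request without sending it, so it says nothing about the sporadic 404/503 responses.

```python
import urllib.request

# URL format as quoted in the thread (https form).
WU_URL = ("https://www.wunderground.com/weatherstation/WXDailyHistory.asp"
          "?ID=%s&month=%d&day=%d&year=%d&format=1")


def build_wu_request(station, year, month, day, user_agent="Mozilla"):
    """Hypothetical helper: build (but do not send) the daily-history request.

    The explicit User-Agent header is the part the thread identifies as
    avoiding the HTTP 403 response.
    """
    url = WU_URL % (station, month, day, year)
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = build_wu_request("SOMESTATION", 2019, 5, 22)
    print(req.full_url)
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
    # To actually fetch the data (may still hit 404/503 on WU's side):
    #     with urllib.request.urlopen(req) as resp:
    #         data = resp.read()
```

Passing the resulting request object to urllib.request.urlopen in place of the bare URL string is all that would change at the call site.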
