Sad, now I'm getting the 503. I'll keep trying.

On Thursday, May 23, 2019 at 8:27:37 AM UTC-6, Leon Shaner wrote:
>
> Jarom,
>
> Thanks so much! I see what I did wrong and I was able to make a stubbed
> down version for basic testing to prove it's at least trying to connect.
>
> Same location (for WeeWX 3.9.1):
>
> https://raw.githubusercontent.com/UberEclectic/weewx/master/bin/wunderfixer
>
> The thing is, I'm still constantly getting 404 (Not Found) even with CURL,
> and just a bit ago the site started throwing 503 (Service Unavailable).
> So... It's kinda hard to test under these conditions. But as long as
> you don't get a 403, then at least my User-Agent "hack" will be "proven."
> :-/
>
> Regards,
> \Leon
> --
> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>
> On May 23, 2019, at 1:57 AM, Jarom Hatch <[email protected]> wrote:
>
> I tried the 3.9.1 version and I get "Could not get Weather Underground
> data. Exiting."
>
> Curl still works, even for yesterday's data. Tracing the script, it
> doesn't appear to be actually attempting the download.
>
>
> On Wednesday, May 22, 2019 at 7:03:49 PM UTC-6, Leon Shaner wrote:
>>
>> Say, we need a tester who is still on 3.9.1 or thereabouts to try this
>> out:
>>
>> https://raw.githubusercontent.com/UberEclectic/weewx/master/bin/wunderfixer
>>
>> Can't do anything to work around WU's sporadic 404 and 503 errors, but at
>> least the 403 error should be gone.
>>
>> I was able to test the 4.0 / development version myself on both Python2
>> and 3, so hopefully Tom will merge that soon. It's over here if you are
>> impatient. =D
>>
>> https://raw.githubusercontent.com/UberEclectic/weewx/development/bin/wunderfixer
>>
>> Regards,
>> \Leon
>> --
>> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>>
>> On May 22, 2019, at 7:24 PM, Leon Shaner <[email protected]> wrote:
>>
>> Hey WeeWX'ers!!! =D
>>
>> I have a fix in the hopper.
>>
>> There's nothing that can be done for the occasional HTTP 404, or even
>> the 503's I am now seeing, but the HTTP 403 was due to a change on WU's
>> part where they are rejecting certain HTTP User-Agent strings. The fact
>> that they are putting Akamai in the middle is almost certainly a great
>> thing re: their scalability issues; however, they probably inherited some
>> default settings that filter "bots" and malware and such, which is likely
>> why the HTTP User-Agent now matters.
>>
>> I have set the User-Agent to "CURL" and it works.
>> I have set it to "Mozilla" and it works. I'm going with that one, since
>> it means Mosaic Killer, both of which were among the very first
>> User-Agents I ever worked with, circa 1993 back before there was such a
>> thing as Netscape. =D
>>
>> /ye-olde-farte mode off ;-)
>>
>> My testing has so far been under Python3, but coincidentally (and not a
>> causation), WU started throwing HTTP 503's around the time that I tried
>> validating my code also under Python2.
>>
>> Everything is working against today's date.
>> It's when I go after yesterday's date that I get the HTTP server error
>> 503.
>>
>> I expect the 404's and 503's to go away eventually, but at least for now
>> I have a fix for the 403 (Forbidden)'s, just based on the User-Agent string.
>>
>> I'll submit a change for wunderfixer both to the 3.9.x "master" and 4.0.x
>> "development" branches in a moment and reply back with direct links for
>> anyone who wants a fix sooner. =D
>>
>> Isn't this fun? =D
>>
>> Regards,
>> \Leon
>> --
>> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>>
>> On May 22, 2019, at 4:20 PM, Leon Shaner <[email protected]> wrote:
>>
>> I'm still working on this.
>> CURL is telling me they are not only using https, but also TLSv1.2.
>> Here is a transcript, in case one of y'all beats me to the fix. =D
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "weewx-user" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/weewx-user/DA01E425-B99A-4959-8FB2-B564A61B3E77%40isylum.org
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>> <wu.txt>
>>
>> Working from here:
>> https://docs.python.org/2/library/ssl.html
>>
>> So far I have tried this, to no avail.
>> Really just doing the "import ssl" and using https in the URL, and adding
>> context=ssl_context to the urllib.request.
>>
>> A snippet of that looks as follows, but still getting 403 forbidden. :-(
>>
>> # For new WU interface which uses SSL and TLSv1.2
>> import ssl
>>
>> ...
>>
>> _url = "https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=%s" \
>>        "&month=%d&day=%d&year=%d&format=1" % (self.station,
>>                                               dayRequested_tt[1],
>>                                               dayRequested_tt[2],
>>                                               dayRequested_tt[0])
>>
>> # pin TLSv1.2 and rule out SSLv2 and SSLv3
>> ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
>> ssl_context.options |= ssl.OP_NO_SSLv2
>> ssl_context.options |= ssl.OP_NO_SSLv3
>>
>> try:
>>     # Hit the weather underground site:
>>     _wudata = urllib.request.urlopen(_url, context=ssl_context)
>>
>>
>> Regards,
>> \Leon
>> --
>> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>>
>> On May 22, 2019, at 2:42 PM, Leon Shaner <[email protected]> wrote:
>>
>> Jarom,
>>
>> CURL is pretty sophisticated in its ability to emulate browser state in
>> pretty much any way but JavaScript. When it worked this morning, I saw
>> some cookies were involved.
>> It may well be that the Python way isn't handling that part.
>> I don't know enough about how Python fetches pages to work that out, but
>> I am very familiar with CURL, so if I can find a path that works
>> consistently, then I'll go back to the Python to see about how to
>> implement the same.
>>
>> I was getting 404's in the browser even, when I looked at it earlier.
>>
>> I'll keep working on it, but not too hard, so as to not get on their
>> radar in any unwanted sort of way. ;-)
>>
>> Regards,
>> \Leon
>> --
>> Leon Shaner :: Dearborn, Michigan (iPad Pro)
>>
>> On May 22, 2019, at 2:04 PM, Jarom Hatch <[email protected]> wrote:
>>
>> Interesting, using curl sometimes I can get it fine, but wunderfixer is
>> always getting a 403 Forbidden, as if it is actively being blocked... When
>> it doesn't work in curl I get `HTTP/1.1 404 Not Found` and when it does
>> work I get `HTTP/1.1 200 OK`. Curl never gets a 403 error.
>>
>> On Wednesday, May 22, 2019 at 11:48:08 AM UTC-6, Jarom Hatch wrote:
>>>
>>> I was able to get it to work twice in my web browser, but as you said,
>>> it is sporadic. I don't ever recall them using Akamai before so that may
>>> very well be a contributing factor.
>>>
>>> I wonder if we can find out the origin address and see what happens if
>>> we can bypass Akamai...
>>>
>>> On Wednesday, May 22, 2019 at 7:35:18 AM UTC-6, Leon Shaner wrote:
>>>>
>>>> For one thing, the URL of this form:
>>>>
>>>> http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=SOMESTATION&month=5&day=22&year=2019&format=1
>>>>
>>>> Is now redirecting to one using HTTPS:
>>>>
>>>> https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=SOMESTATION&month=5&day=22&year=2019&format=1
>>>>
>>>> Also, the redirect itself takes an excruciatingly long time.
>>>> So I just changed the URL to https directly...
>>>>
>>>> The first time I tried any of the above using CURL this morning it
>>>> worked, but then after that I started getting:
>>>>
>>>> An error occurred while processing your request.
>>>>
>>>> Reference #30.6f451160.1558531514.16ced4f6
>>>> It looks as if they've put some kind of Akamai proxy in the middle,
>>>> which is fine for static content, but not so fine for a query of this
>>>> nature. Strange that it worked for me the very first time. It's almost
>>>> as if the Akamai "farm" has lost some "state" information and not all
>>>> nodes have the same content, so if you get stuck going through a bad
>>>> node you get a bogus response.
>>>>
>>>> Attached is a transcript of a failed attempt. I put SOMESTATION there
>>>> only after the fact. The actual query was for my actual station, which
>>>> used to work.
--
You received this message because you are subscribed to the Google Groups "weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-user/3f05e4eb-67d5-4dfe-94bd-3d95e1cc7eb9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
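
For anyone reading this thread later, here is a minimal sketch of the User-Agent workaround discussed above, assuming Python 3 and only the standard library. The names `WU_URL`, `build_request`, and `fetch_day` are made up for illustration and are not taken from wunderfixer, so treat this as one possible way to express the idea rather than the actual patch. It requests the https URL directly (to skip the slow redirect) and sends "Mozilla" as the User-Agent, which is the value reported in the thread as avoiding the 403.

```python
import ssl
import urllib.request

# URL form quoted in the thread, using https directly instead of relying on
# the redirect from http.
WU_URL = ("https://www.wunderground.com/weatherstation/WXDailyHistory.asp"
          "?ID=%s&month=%d&day=%d&year=%d&format=1")


def build_request(station, year, month, day, user_agent="Mozilla"):
    """Build the request for one day of station history.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    url = WU_URL % (station, month, day, year)
    # Send an explicit User-Agent instead of urllib's default one; the thread
    # reports that "Mozilla" and "CURL" both avoided the 403.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_day(station, year, month, day):
    """Fetch the raw response text for one day.

    Any urllib.error.HTTPError (for example the sporadic 404 and 503
    responses mentioned in the thread) is left for the caller to handle.
    """
    request = build_request(station, year, month, day)
    # Default TLS settings with certificate verification; no protocol
    # version is pinned here.
    context = ssl.create_default_context()
    with urllib.request.urlopen(request, context=context, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_day("SOMESTATION", 2019, 5, 22)` would attempt the download. Going by the reports above, the User-Agent header only addresses the 403; the 404 and 503 responses can still show up and would need to be caught or retried by the caller.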
