Some Travis tests show that this patch still breaks on the Russian locale. However, all tests pass without this patch. While I don't see anything obvious that is causing the breakage, it remains a fact that the test suite is not passing.
The issue however *may* just be a Valgrind bug. The failure is caused due to Valgrind, and the output is: ==79821== Conditional jump or move depends on uninitialised value(s) ==79821== at 0x5D8F0BC: memrchr (memrchr.S:299) ==79821== by 0x41B3E0: extract_param (in /home/travis/build/darnir/wget/wget-UNKNOWN/_build/src/wget) ==79821== by 0x40A0CB: parse_set_cookie.constprop.6 (in /home/travis/build/darnir/wget/wget-UNKNOWN/_build/src/wget) ==79821== by 0x40A587: cookie_handle_set_cookie (in /home/travis/build/darnir/wget/wget-UNKNOWN/_build/src/wget) ==79821== by 0x41D152: gethttp (in /home/travis/build/darnir/wget/wget-UNKNOWN/_build/src/wget) ==79821== by 0x420284: http_loop (in /home/travis/build/darnir/wget/wget-UNKNOWN/_build/src/wget) ==79821== by 0x42AB3E: retrieve_url (in /home/travis/build/darnir/wget/wget-UNKNOWN/_build/src/wget) ==79821== by 0x406CA0: main (in /home/travis/build/darnir/wget/wget-UNKNOWN/_build/src/wget) ==79821== Uninitialised value was created by a stack allocation ==79821== at 0x41C8A3: gethttp (in /home/travis/build/darnir/wget/wget-UNKNOWN/_build/src/wget) ==79821== While we wait on Kushagra to finish writing the tests, we should find a way to identify the root cause of this issue. On 29 January 2016 at 09:32, Darshit Shah <[email protected]> wrote: > Looks good now. Would like to see tests for the same though. > > On 29 January 2016 at 09:19, Kushagra Singh > <[email protected]> wrote: >> Hi, >> >> >> >> On Thu, Jan 28, 2016 at 2:02 AM, Darshit Shah <[email protected]> wrote: >>> >>> On 27 January 2016 at 20:52, Kushagra Singh >>> <[email protected]> wrote: >>> > >>> > >>> > On Wed, Jan 27, 2016 at 5:06 PM, Tim Ruehsen <[email protected]> wrote: >>> >> >>> >> > > What about the '#ifdef HAVE_SSL' ? Don't we need the check always ? >>> >> >>> >> Sorry for my irritating text. What I tried to ask/say was "Do we need >>> >> the >>> >> #ifdef in cookie_handle_set_cookie() at all ?". >>> >> >>> >> And btw, do we need it in parse_set_cookie() ? >>> >> >>> > >>> > I think it is required in parse_set_cookie(). It does not create a >>> > secure >>> > only cookie in case the connection is insecure. Now this can happen >>> > because >>> > of two reasons, (i) communication over simple HTTP despite wget >>> > configured >>> > with SSL, (ii) wget configured with the "--without-ssl" option. The log >>> > output in both the cases should be different, right? >>> >>> I don't see the point of it. Why should it be any different? In fact, >>> why does the end user, who probably installed a distro-packaged >>> version of Wget care about the configure options? >>> Every #ifdef statement you add increases the complexity of the code >>> since it changes what portion of the code is compiled. As long as the >>> connection is insecure, Wget refuses to set a secure cookie, period. >>> Don't overengineer the situation. >>> >> >> I have made the changes accordingly, not checking using preprocessor >> directives now >> >>> > >>> >> Darshit said it with clearer words (and I agree with him): >>> >> "When a user loads a file backed cookie jar, they expect it to work >>> >> according to the RFC, irrespective of whether the client supports SSL >>> >> or not. And especially since support for this does not depend on the >>> >> actual linking of any SSL library, it shouldn't be hard to implement." >>> >> >>> > >>> > In this case, then can we simply remove the #ifdef check, and and the if >>> > else statement check whether (scheme == SCHEME_HTTP) and not (scheme != >>> > SCHEME_HTTPS), since they would essentially mean the same. This should >>> > take >>> > care of the problem you mention. I have attached a patch with these >>> > changes. >>> >>> Seems okay to me right now. >>> >>> Please prefer to not move functions around. Adding a prototype to the >>> top of the file is a better option. Moving a function around like this >>> causes things like git blame to not work very well. >>> >>> On a sidenote, I think the find_chains_of_host() method could use some >>> refactoring. >>> >>> One is that count_char could use a library function like strchr() >>> instead of trying to run a pass manually. >> >> >> Something like the way I have done in this patch? >> >>> >>> Apart from that, I think some parts could use help from Libpsl. >>> @Tim: When progressing to less specific domains, I think Libpsl could >>> provide a way to test if we're moving into a new TLD? >>> >>> There's also the section with a while(1) loop that exits using a >>> simple condition and a break statement. This is bad programming >>> practice. We should avoid using such a pattern since it obscures the >>> actual loop condition. Maybe we can refactor it slightly? >>> >>> > >>> > A question about the way things are done in the Wget project, should I >>> > attach a patch that should be applied in continuation to the last patch >>> > I >>> > sent, or one generated by all the commits? The patch I have attached is >>> > the >>> > one generated of the last commit only. >>> > >>> You should resend the entire patch again. So that everyone has context >>> and we can simply apply the final version directly. There is no point >>> in keeping a history of personal edits on the master branch. >>> When you have a large change that can be logically split into >>> different commits, you should have multiple patches for each of the >>> commit. Think of it this way, once you're done implementing a feature >>> or a bug fix, what version of your code do you want the rest of the >>> world to see? The one where you made a hundred stupid errors and later >>> fixed it, or just the final clean version? >> >> >> That makes a lot of sense, thanks a lot for that! >> >>> >>> > >>> > Kush >>> >>> >>> >>> -- >>> Thanking You, >>> Darshit Shah >> >> >> >> Thanks, >> Kush >> >> >> On Thu, Jan 28, 2016 at 2:02 AM, Darshit Shah <[email protected]> wrote: >>> >>> On 27 January 2016 at 20:52, Kushagra Singh >>> <[email protected]> wrote: >>> > >>> > >>> > On Wed, Jan 27, 2016 at 5:06 PM, Tim Ruehsen <[email protected]> wrote: >>> >> >>> >> > > What about the '#ifdef HAVE_SSL' ? Don't we need the check always ? >>> >> >>> >> Sorry for my irritating text. What I tried to ask/say was "Do we need >>> >> the >>> >> #ifdef in cookie_handle_set_cookie() at all ?". >>> >> >>> >> And btw, do we need it in parse_set_cookie() ? >>> >> >>> > >>> > I think it is required in parse_set_cookie(). It does not create a >>> > secure >>> > only cookie in case the connection is insecure. Now this can happen >>> > because >>> > of two reasons, (i) communication over simple HTTP despite wget >>> > configured >>> > with SSL, (ii) wget configured with the "--without-ssl" option. The log >>> > output in both the cases should be different, right? >>> >>> I don't see the point of it. Why should it be any different? In fact, >>> why does the end user, who probably installed a distro-packaged >>> version of Wget care about the configure options? >>> Every #ifdef statement you add increases the complexity of the code >>> since it changes what portion of the code is compiled. As long as the >>> connection is insecure, Wget refuses to set a secure cookie, period. >>> Don't overengineer the situation. >>> >>> > >>> >> Darshit said it with clearer words (and I agree with him): >>> >> "When a user loads a file backed cookie jar, they expect it to work >>> >> according to the RFC, irrespective of whether the client supports SSL >>> >> or not. And especially since support for this does not depend on the >>> >> actual linking of any SSL library, it shouldn't be hard to implement." >>> >> >>> > >>> > In this case, then can we simply remove the #ifdef check, and and the if >>> > else statement check whether (scheme == SCHEME_HTTP) and not (scheme != >>> > SCHEME_HTTPS), since they would essentially mean the same. This should >>> > take >>> > care of the problem you mention. I have attached a patch with these >>> > changes. >>> >>> Seems okay to me right now. >>> >>> Please prefer to not move functions around. Adding a prototype to the >>> top of the file is a better option. Moving a function around like this >>> causes things like git blame to not work very well. >>> >>> On a sidenote, I think the find_chains_of_host() method could use some >>> refactoring. >>> >>> One is that count_char could use a library function like strchr() >>> instead of trying to run a pass manually. >>> Apart from that, I think some parts could use help from Libpsl. >>> @Tim: When progressing to less specific domains, I think Libpsl could >>> provide a way to test if we're moving into a new TLD? >>> >>> There's also the section with a while(1) loop that exits using a >>> simple condition and a break statement. This is bad programming >>> practice. We should avoid using such a pattern since it obscures the >>> actual loop condition. Maybe we can refactor it slightly? >>> >>> > >>> > A question about the way things are done in the Wget project, should I >>> > attach a patch that should be applied in continuation to the last patch >>> > I >>> > sent, or one generated by all the commits? The patch I have attached is >>> > the >>> > one generated of the last commit only. >>> > >>> You should resend the entire patch again. So that everyone has context >>> and we can simply apply the final version directly. There is no point >>> in keeping a history of personal edits on the master branch. >>> When you have a large change that can be logically split into >>> different commits, you should have multiple patches for each of the >>> commit. Think of it this way, once you're done implementing a feature >>> or a bug fix, what version of your code do you want the rest of the >>> world to see? The one where you made a hundred stupid errors and later >>> fixed it, or just the final clean version? >>> > >>> > Kush >>> >>> >>> >>> -- >>> Thanking You, >>> Darshit Shah >> >> > > > > -- > Thanking You, > Darshit Shah -- Thanking You, Darshit Shah
