Re: [Bug-wget] Google Summer of Code 2016
Hi Ander,

that would really be a lot of work, since many of the dependencies would have to be implemented first. But maybe at some point in time we will have all the dependencies handled by libwget!? At that point, I would suggest a separate tool. Maybe Wget evolves into an ecosystem of web tools one day, who knows.

It is likely that the WebDriver spec will change in many places before it becomes final.

Tim

On Sunday 06 March 2016 22:20:28 Ander Juaristi wrote:
> I just wanted to share with you another idea I have been thinking about for some time now: WebDriver [1].
>
> It's basically a protocol/API to communicate with UAs. It's intended to be UA-agnostic, so any client should be able to use WebDriver to communicate with a compliant UA. From the standard:
>
> "WebDriver is a remote control interface that enables introspection and control of user agents. It provides a platform- and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behaviour of web browsers."
>
> There are some requirements not at all supported in wget, such as XPath DOM traversal, so at first glance I can't give an estimate of how much time would be needed for this. It would certainly not be a short project, and it might be too big for a GSoC.
>
> Regards,
> - AJ
>
> [1] https://www.w3.org/TR/webdriver/
Re: [Bug-wget] Google Summer of Code 2016
I just wanted to share with you another idea I have been thinking about for some time now: WebDriver [1].

It's basically a protocol/API to communicate with UAs. It's intended to be UA-agnostic, so any client should be able to use WebDriver to communicate with a compliant UA. From the standard:

"WebDriver is a remote control interface that enables introspection and control of user agents. It provides a platform- and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behaviour of web browsers."

There are some requirements not at all supported in wget, such as XPath DOM traversal, so at first glance I can't give an estimate of how much time would be needed for this. It would certainly not be a short project, and it might be too big for a GSoC.

Regards,
- AJ

[1] https://www.w3.org/TR/webdriver/

On 03/03/2016 at 11:21, Tim Ruehsen wrote:
> Just more ideas for you, Kushagra:
>
> There are many command line options from Wget still missing in Wget2. You should have a look at https://github.com/rockdaboot/wget2/wiki anyway - feel free to work on the wiki yourself (e.g. fork the wiki pages: https://help.github.com/articles/adding-and-editing-wiki-pages-locally/ or let me know and I'll give you write access).
>
> You can search the Wget bug tracker (https://savannah.gnu.org/bugs/?group=wget) for wishlist items. My favorite is https://savannah.gnu.org/bugs/?45803. Special popen(2|3) functions/code is already in the libwget/ directory. E.g., that would allow Wget2 to be used as part of a recursive website malware checker.
>
> The authorization code in the test suite is not complete/not implemented - I once tested authorization (MD5, MD5-sess) 'by hand' with my local Apache. But an automated test is badly needed.
>
> We thought of a statistics module (very basic code exists) for spider mode to output very detailed diagnostics: missing pages, response times, server load (e.g. using the RTT/ping time), etc.
>
> Tim
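To make the "wire protocol" in the quoted definition more concrete, here is a minimal sketch of three WebDriver commands expressed as plain HTTP requests. It only builds the requests and sends nothing. The endpoint shapes follow the WebDriver specification as it was later published, so the 2016 draft discussed in this thread may have differed; the function names are hypothetical and are not part of wget or libwget.

```python
import json


def new_session_request(browser_name):
    """Build the 'New Session' command: POST /session."""
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return "POST", "/session", json.dumps(body)


def navigate_request(session_id, url):
    """Build the 'Navigate To' command for an existing session."""
    return "POST", f"/session/{session_id}/url", json.dumps({"url": url})


def find_element_by_xpath_request(session_id, xpath):
    """Build a 'Find Element' command using the XPath locator strategy.

    This is the kind of command that needs XPath DOM traversal on the
    user agent side.
    """
    body = {"using": "xpath", "value": xpath}
    return "POST", f"/session/{session_id}/element", json.dumps(body)
```

A client would send each (method, path, body) triple to the endpoint exposed by the remote end and read a JSON response back. A tool acting as the user agent would have to implement the receiving side, which is where the XPath requirement comes from.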
Re: [Bug-wget] Google Summer of Code 2016
> You mentioned FTPS... Ander Juaristi implemented this for Wget during GSOC
> 2015. Wget2 currently is lacking FTP and FTPS support (I just added some code
> for the test suite - tested only with Wget).

Yes, I wrote FTPS in wget, albeit not completely. There are some FTPS commands, such as CCC, that were impossible to implement with the current wget SSL/TLS API. Implementing them would require enhancing the SSL/TLS API. I have some notes at home about how to do that, and promised I would show them to you, but still haven't. My fault. I'll try to do it tomorrow, since today I'm in a hotel in the center of Madrid, and won't be able to.

Right now, wget2 lacks both FTP and FTPS support. So I guess you have to first implement FTP in order to have FTPS. Well, in theory, there is no technical impediment to implementing FTPS directly, but it makes more sense to have FTP first, since FTPS just extends it by tunnelling its traffic through TLS.

Regards,
- AJ
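The point that FTPS sits on top of FTP can be sketched as a list of control-channel commands. This is only an illustration of the explicit FTPS sequence described in RFC 4217, not wget or wget2 code; the function names are hypothetical, and placing CCC last is an assumption made for the example.

```python
def plain_ftp_login(user, password):
    """Control-channel commands for a plain FTP login."""
    return [f"USER {user}", f"PASS {password}"]


def explicit_ftps_session(user, password, clear_control_channel=False):
    """Control-channel commands for an explicit FTPS login.

    The plain FTP login is reused unchanged; FTPS only adds commands
    around it.
    """
    steps = ["AUTH TLS"]  # ask to upgrade the control connection to TLS
    # (the TLS handshake on the control connection happens here)
    steps += plain_ftp_login(user, password)
    steps += [
        "PBSZ 0",  # protection buffer size, always 0 for TLS
        "PROT P",  # protect the data connections with TLS as well
    ]
    if clear_control_channel:
        # CCC returns the control connection to cleartext after login.
        # The client then has to drop TLS on a socket that stays open.
        steps.append("CCC")
    return steps
```

Everything in plain_ftp_login() appears unchanged inside the FTPS sequence, which is why having FTP first makes sense. CCC is the unusual step, because the TLS layer must be shut down while the underlying connection keeps being used.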
Re: [Bug-wget] Google Summer of Code 2016
Just more ideas for you, Kushagra:

There are many command line options from Wget still missing in Wget2. You should have a look at https://github.com/rockdaboot/wget2/wiki anyway - feel free to work on the wiki yourself (e.g. fork the wiki pages: https://help.github.com/articles/adding-and-editing-wiki-pages-locally/ or let me know and I'll give you write access).

You can search the Wget bug tracker (https://savannah.gnu.org/bugs/?group=wget) for wishlist items. My favorite is https://savannah.gnu.org/bugs/?45803. Special popen(2|3) functions/code is already in the libwget/ directory. E.g., that would allow Wget2 to be used as part of a recursive website malware checker.

The authorization code in the test suite is not complete/not implemented - I once tested authorization (MD5, MD5-sess) 'by hand' with my local Apache. But an automated test is badly needed.

We thought of a statistics module (very basic code exists) for spider mode to output very detailed diagnostics: missing pages, response times, server load (e.g. using the RTT/ping time), etc.

Tim

On Wednesday 02 March 2016 10:51:02 Kushagra Singh wrote:
> Hi,
>
> Thanks for the quick reply. I went through the repository and the issues, and found a couple of things I would like to work on.
>
> I have a couple of questions about Wget2. Is it a complete rewrite of the Wget project, available at git://git.savannah.gnu.org/wget.git, or are we using existing code and extending functionality? I guess it is the second one because I saw `libwget` in the repo. However, if that is the case, then how do we change existing functions in wget? For example, implementing [2] would require making changes to the file cookies.c, which is present in /src in the wget repo, but not in /src in the wget2 repo.
>
> I was looking at #43 [1], and have already submitted a patch for consideration for the first suggestion [2]. The second suggestion mentioned [3] is one of the things I'd like to work on; however, this is not something which will take three months :)
>
> Another project I am interested in is implementing FTPS. I saw this listed under one of the ideas of GSoC 2015, but I'm not sure whether it was implemented, as I didn't see it under 'Development Status' in the wget2 readme on GitHub.
>
> Also, in #67 [4], we are talking about adhering to some specific parts of RFC 7230. I'm not sure which parts would be right, as the discussion thread mentions that it won't be good to stick to each point of the RFC. WDYT?
>
> [1] https://github.com/rockdaboot/wget2/issues/43
> [2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
> [3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
> [4] https://github.com/rockdaboot/wget2/issues/67
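For the automated authorization test mentioned above, the expected value of an HTTP Digest "response" field can be computed independently of the client under test. The following is a minimal sketch for qop=auth with the MD5 and MD5-sess algorithms; it is not the wget2 test-suite code, and the function names are hypothetical. For MD5-sess it assumes the hex form of the inner hash, which is how the later RFC 7616 spells it out.

```python
import hashlib


def md5_hex(text):
    """Lowercase hex MD5 of a string, as used throughout Digest auth."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth", algorithm="MD5"):
    """Compute the Digest 'response' value for qop=auth."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    if algorithm.lower() == "md5-sess":
        # Session variant: bind HA1 to the server and client nonces.
        ha1 = md5_hex(f"{ha1}:{nonce}:{cnonce}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
```

With the example values from RFC 2617, section 3.5 (user "Mufasa", realm "testrealm@host.com", password "Circle Of Life", GET /dir/index.html, nonce "dcd98b7102dd2f0e8b11d0f600bfb0c093", nc "00000001", cnonce "0a4f113b"), the MD5 variant yields 6629fae49393a05397450978507c4ef1. A test server could compare the Authorization header it receives against a value computed this way.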
Re: [Bug-wget] Google Summer of Code 2016
Hi Kushagra,

I can only add a few things to Darshit's answer.

Wget2/libwget has been completely written from scratch. We just moved the code/project to Savannah as part of GNU Wget, transferred the copyrights to FSF and integrated gnulib. Wget2 has not been released yet, but there are not many blockers right now (mainly documentation).

None of us found much time to work out the details needed to turn issues into GSoC projects. But we are going to, and we appreciate any help of course. So, if you want to work on any of the issues as a GSoC project (or if you have your own idea), let us know and we are happy to work together with you on a detailed specification.

You mentioned FTPS... Ander Juaristi implemented this for Wget during GSoC 2015. Wget2 currently is lacking FTP and FTPS support (I just added some code for the test suite - tested only with Wget).

Maybe you could take one or two of the smaller issues as a warm-up to get familiar with the new code!?

Feel free to ask/discuss any questions with us - we enjoy working together with other devs.

Tim

On Wednesday 02 March 2016 10:51:02 Kushagra Singh wrote:
> Hi,
>
> Thanks for the quick reply. I went through the repository and the issues, and found a couple of things I would like to work on.
>
> I have a couple of questions about Wget2. Is it a complete rewrite of the Wget project, available at git://git.savannah.gnu.org/wget.git, or are we using existing code and extending functionality? I guess it is the second one because I saw `libwget` in the repo. However, if that is the case, then how do we change existing functions in wget? For example, implementing [2] would require making changes to the file cookies.c, which is present in /src in the wget repo, but not in /src in the wget2 repo.
>
> I was looking at #43 [1], and have already submitted a patch for consideration for the first suggestion [2]. The second suggestion mentioned [3] is one of the things I'd like to work on; however, this is not something which will take three months :)
>
> Another project I am interested in is implementing FTPS. I saw this listed under one of the ideas of GSoC 2015, but I'm not sure whether it was implemented, as I didn't see it under 'Development Status' in the wget2 readme on GitHub.
>
> Also, in #67 [4], we are talking about adhering to some specific parts of RFC 7230. I'm not sure which parts would be right, as the discussion thread mentions that it won't be good to stick to each point of the RFC. WDYT?
>
> [1] https://github.com/rockdaboot/wget2/issues/43
> [2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
> [3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
> [4] https://github.com/rockdaboot/wget2/issues/67
Re: [Bug-wget] Google Summer of Code 2016
On 03/02, Kushagra Singh wrote:
> Hi,
>
> Thanks for the quick reply. I went through the repository and the issues, and found a couple of things I would like to work on.
>
> I have a couple of questions about Wget2. Is it a complete rewrite of the Wget project, available at git://git.savannah.gnu.org/wget.git, or are we using existing code and extending functionality? I guess it is the second one because I saw `libwget` in the repo. However, if that is the case, then how do we change existing functions in wget? For example, implementing [2] would require making changes to the file cookies.c, which is present in /src in the wget repo, but not in /src in the wget2 repo.

Wget2 is a complete rewrite of GNU Wget. It is also available on the Savannah server as its own repository at [1]. Wget2 is meant to be a modern (almost) drop-in replacement for Wget. It strives to maintain backward-compatible command line options and behaviour as far as it makes sense. The codebases of the two projects have diverged significantly, and hence new features need to be implemented separately for each.

> I was looking at #43 [1], and have already submitted a patch for consideration for the first suggestion [2]. The second suggestion mentioned [3] is one of the things I'd like to work on; however, this is not something which will take three months :)

You submitted a patch for Wget. This is the Wget2 repository. Anyway, I already have a working patch for most of that issue; I got sidetracked when writing the tests and eventually forgot about it. I think I'll spend some time on it this week and have that patch merged. Don't spend time on that part.

Another thing to remember is that not all GitHub issues are valid GSoC projects. Since there are only a few issues, it is easy to scout out the larger ones. Some issues are pretty tiny and just need someone willing to spend time working on them.

> Another project I am interested in is implementing FTPS. I saw this listed under one of the ideas of GSoC 2015, but I'm not sure whether it was implemented, as I didn't see it under 'Development Status' in the wget2 readme on GitHub.

Wget2, as far as I'm aware, is still lacking FTPS support. Remember that Wget and Wget2 are two different projects.

> Also, in #67 [4], we are talking about adhering to some specific parts of RFC 7230. I'm not sure which parts would be right, as the discussion thread mentions that it won't be good to stick to each point of the RFC. WDYT?

This is a minor grievance I raised. We stick to most of it anyway. As Tim points out, being completely RFC compliant may make the tool unusable thanks to the number of bad servers out there. If anything, that issue needs to be split into multiple smaller issues about specific parts of the RFC that we want to adhere to.

Open projects I currently see are:

1. FTP / FTPS support
2. SOCKS5 Proxy support (This may be too small.)
3. Progress Bar implementation (Looks deceptively simple, isn't)
4. WARC support and tests
5. Brotli compression (May be too small)

The README file also has more pointers on features not implemented in Wget2. You may get some ideas from there. Request pipelining and DNSSEC are two features I'd be interested in seeing implemented. Moreover, you are always welcome to submit your own ideas for either Wget or Wget2.

Tim can add more details or comment on whether something is too small to work on for a GSoC project.

[1]: git://git.savannah.gnu.org/wget/wget2.git

> [1] https://github.com/rockdaboot/wget2/issues/43
> [2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
> [3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
> [4] https://github.com/rockdaboot/wget2/issues/67

--
Thanking You,
Darshit Shah
Re: [Bug-wget] Google Summer of Code 2016
Hi,

Thanks for the quick reply. I went through the repository and the issues, and found a couple of things I would like to work on.

I have a couple of questions about Wget2. Is it a complete rewrite of the Wget project, available at git://git.savannah.gnu.org/wget.git, or are we using existing code and extending functionality? I guess it is the second one because I saw `libwget` in the repo. However, if that is the case, then how do we change existing functions in wget? For example, implementing [2] would require making changes to the file cookies.c, which is present in /src in the wget repo, but not in /src in the wget2 repo.

I was looking at #43 [1], and have already submitted a patch for consideration for the first suggestion [2]. The second suggestion mentioned [3] is one of the things I'd like to work on; however, this is not something which will take three months :)

Another project I am interested in is implementing FTPS. I saw this listed under one of the ideas of GSoC 2015, but I'm not sure whether it was implemented, as I didn't see it under 'Development Status' in the wget2 readme on GitHub.

Also, in #67 [4], we are talking about adhering to some specific parts of RFC 7230. I'm not sure which parts would be right, as the discussion thread mentions that it won't be good to stick to each point of the RFC. WDYT?

[1] https://github.com/rockdaboot/wget2/issues/43
[2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
[3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
[4] https://github.com/rockdaboot/wget2/issues/67

On Tue, Mar 1, 2016 at 9:57 PM, Giuseppe Scrivano wrote:
> Kushagra Singh writes:
>
> > Hi,
> >
> > Will we be taking part in GSoC this year? I would really like to work on a
> > project related to Wget this summer. Any specific ideas that are of
> > importance to the community presently?
>
> Yes, we will be taking part in GSoC. I think we would like to see more work happening on wget2. At the moment there is a list of issues on GitHub that can be useful to you to pick some ideas to work on:
>
> https://github.com/rockdaboot/wget2/issues
>
> Could you take a look at it? Do you see anything interesting that you would like to work on?
>
> Regards,
> Giuseppe
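The cookie-prefix draft cited as [3] above is small enough to sketch, which may explain the remark that it would not fill three months. The following is a rough illustration of the acceptance rule for the two prefixes as described in that draft. It is not code from cookies.c or libwget, and the function and parameter names are hypothetical.

```python
def cookie_prefix_ok(name, secure, domain, path, from_secure_origin):
    """Return True if a cookie may be accepted under the prefix rules.

    name               -- cookie name as sent in Set-Cookie
    secure             -- True if the Secure attribute was present
    domain             -- value of the Domain attribute, or None if absent
    path               -- value of the Path attribute, or None if absent
    from_secure_origin -- True if the response came over a secure scheme
    """
    if name.startswith("__Secure-"):
        # Must carry Secure and must be set from a secure origin.
        return bool(secure and from_secure_origin)
    if name.startswith("__Host-"):
        # As above, plus: no Domain attribute and Path must be exactly "/".
        return bool(secure and from_secure_origin
                    and domain is None and path == "/")
    # Names without a prefix are not affected by this rule.
    return True
```

A cookie jar would run a check like this before storing a cookie and silently drop any cookie that fails it.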
Re: [Bug-wget] Google Summer of Code 2016
Kushagra Singh writes:

> Hi,
>
> Will we be taking part in GSoC this year? I would really like to work on a
> project related to Wget this summer. Any specific ideas that are of
> importance to the community presently?

Yes, we will be taking part in GSoC. I think we would like to see more work happening on wget2. At the moment there is a list of issues on GitHub that can be useful to you to pick some ideas to work on:

https://github.com/rockdaboot/wget2/issues

Could you take a look at it? Do you see anything interesting that you would like to work on?

Regards,
Giuseppe
[Bug-wget] Google Summer of Code 2016
Hi,

Will we be taking part in GSoC this year? I would really like to work on a project related to Wget this summer. Any specific ideas that are of importance to the community presently?

A quick introduction: I'm Kushagra Singh, a second-year student at IIIT Delhi, India. My major is Computer Science. I have gone through a particular chunk of wget's source code and understand it well, and have submitted a patch for consideration. I successfully completed GSoC last summer, working with lmonade.

Looking forward to coding on this project this summer!

Thank you,
Kushagra Singh