Re: [Bug-wget] Google Summer of Code 2016

2016-03-07 Thread Tim Ruehsen
Hi Ander,

that would really be a lot of work, since many of the dependencies have to be 
implemented first. But maybe at some point in time we will have all the 
dependencies handled by libwget. At that point, I would suggest a separate 
tool. Maybe Wget evolves into an ecosystem of web tools one day, who knows.

It is likely that the WebDriver spec will change in many places before it 
becomes final.

Tim

On Sunday 06 March 2016 22:20:28 Ander Juaristi wrote:
> I just wanted to share with you another idea I was thinking on some time
> now: WebDriver [1].
> 
> It's basically a protocol/API to communicate with UAs. It's intended to
> be UA-agnostic, so any client should be able to use WebDriver to
> communicate with a compliant UA. From the standard:
> 
>  "WebDriver is a remote control interface that enables introspection
> and control of user agents. It provides a platform- and language-neutral
> wire protocol as a way for out-of-process programs to remotely instruct
> the behaviour of web browsers."
> 
> There are some requirements not at all supported in wget, such as XPath
> DOM traversal, so at first glance I can't give an estimate on whether
> how much time would be needed for this. It will not be too short, sure,
> but might be too big for a GSoC.
> 
> Regards,
> - AJ
> 
> [1] https://www.w3.org/TR/webdriver/
> 
> On 03/03/2016 at 11:21, Tim Ruehsen wrote:
> > Just more ideas for you, Kushagra:
> > 
> > There are many command line options from Wget still missing in Wget2, you
> > should have a look at
> > https://github.com/rockdaboot/wget2/wiki anyways - feel free to work on
> > the
> > wiki yourself (e.g. fork the wiki pages:
> > https://help.github.com/articles/adding-and-editing-wiki-pages-locally/ or
> > let me know and I'll give you write access).
> > 
> > You can search the Wget bug tracker
> > (https://savannah.gnu.org/bugs/?group=wget) for wishlist items.
> > My favorite is https://savannah.gnu.org/bugs/?45803.
> > Special popen(2|3) functions/code is already in libwget/ directory.
> > E.g., that would allow Wget2 to be used as part of a recursive website
> > malware checker.
> > 
> > The authorization code in the test suite is not complete/not implemented -
> > I once tested authorization (MD5, MD5-sess) 'by hand' with my local
> > Apache. But a automated test is badly needed.
> > 
> > We thought of a statistic module (very basic code exists) for spider mode
> > to output diagnostics very detailed. Missing pages, response times,
> > server load (e.g. using the RTT/ping time), etc.
> > 
> > Tim
> > 
> > On Wednesday 02 March 2016 10:51:02 Kushagra Singh wrote:
> >> Hi,
> >> 
> >> Thanks for the quick reply. I went through the repository and the issues,
> >> and found a couple of things I would like to work on.
> >> 
> >> I have a couple of questions about Wget2. Is it a complete rewrite of the
> >> Wget project, available at git://git.savannah.gnu.org/wget.git, or are we
> >> using existing code and extending functionality? I guess it is the second
> >> one because I saw `libwget` in the repo. However if such is the case,
> >> then
> >> how do we change existing functions in wget? For example, implementing
> >> [2]
> >> would require making changes to the file cookies.c, which is present in
> >> /src in the wget repo, but not in /src in the wget2 repo.
> >> 
> >> I was looking at #43 [1], and have already submitted a patch for
> >> consideration for the first suggestion [2]. The second suggestion
> >> mentioned
> >> [3] is one of the things I'd like to work on, however this is not
> >> something
> >> which will take three months :)
> >> 
> >> Another project I am interested in, is implementing FTPS. I saw this
> >> listed
> >> under one of the ideas of GSoC 2015, but I'm not sure whether it was
> >> implemented, as I didn't see it under 'Development Status' in the wget2
> >> readme on Github.
> >> 
> >> Also, in #67 [4], we are talking about adhering to some specific parts of
> >> RFC 7230. I'm not sure which all parts would be right, as the discussion
> >> thread mentions that it won't be good to stick to each point of the RFC.
> >> WDYT?
> >> 
> >> 
> >> [1] https://github.com/rockdaboot/wget2/issues/43
> >> [2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
> >> [3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
> >> [4] https://github.com/rockdaboot/wget2/issues/67
> >> 
> >> On Tue, Mar 1, 2016 at 9:57 PM, Giuseppe Scrivano wrote:
> >>> Kushagra Singh writes:
> >>>> Hi,
> >>>>
> >>>> Will we be taking part in GSoC this year? I would really like to work
> >>>> on a project related to Wget this summer. Any specific ideas that are
> >>>> of importance to the community presently?
> >>>
> >>> yes, we will be take part in GSoC.  I think we would like to see more
> >>> work happening on wget2, at the moment there is a list of issues on
> >>> github that can be useful

Re: [Bug-wget] Google Summer of Code 2016

2016-03-06 Thread Ander Juaristi
I just wanted to share with you another idea I have been thinking about for 
some time now: WebDriver [1].


It's basically a protocol/API to communicate with UAs. It's intended to 
be UA-agnostic, so any client should be able to use WebDriver to 
communicate with a compliant UA. From the standard:


"WebDriver is a remote control interface that enables introspection 
and control of user agents. It provides a platform- and language-neutral 
wire protocol as a way for out-of-process programs to remotely instruct 
the behaviour of web browsers."


There are some requirements not supported at all in wget, such as XPath 
DOM traversal, so at first glance I can't estimate how much time this 
would need. It will certainly not be a short project, and it might be 
too big for a GSoC.


Regards,
- AJ

[1] https://www.w3.org/TR/webdriver/




Re: [Bug-wget] Google Summer of Code 2016

2016-03-04 Thread Ander Juaristi
> You mentioned FTPS... Ander Juaristi implemented this for Wget during GSOC
> 2015. Wget2 currently is lacking FTP and FTPS support (I just added some code
> for the test suite - tested only with Wget).

Yes, I implemented FTPS in wget, albeit not completely.

There are some FTPS commands, such as CCC, that were impossible to implement 
with the current wget SSL/TLS API. Implementing them would require enhancing 
the SSL/TLS API. I have some notes at home about how to do that, and promised I 
would show them to you, but still haven't. My fault. I'll try to do it 
tomorrow, since today I'm in a hotel in the center of Madrid, and won't be able 
to.

Right now, wget2 lacks both FTP and FTPS support. So I guess you have to first 
implement FTP in order to have FTPS. In theory, nothing technical prevents 
implementing FTPS directly, but it makes more sense to have FTP first, 
since FTPS just extends it by tunnelling its traffic through TLS.
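
To illustrate why FTP comes first, here is a minimal sketch of the login command sequence for explicit FTPS as described in RFC 4217, built on top of the plain FTP sequence. The function names and the `TLS_HANDSHAKE` marker are hypothetical and only for illustration; this is not wget or wget2 code, and it does not open any connection.

```python
from typing import List

# Placeholder for the point where the control connection is upgraded to TLS.
# (Hypothetical marker, not a real FTP command.)
TLS_HANDSHAKE = "<TLS handshake on control connection>"


def ftp_login_commands(user: str, password: str) -> List[str]:
    """Plain FTP login: just USER and PASS."""
    return [f"USER {user}", f"PASS {password}"]


def ftps_login_commands(user: str, password: str,
                        protect_data: bool = True) -> List[str]:
    """Explicit FTPS login: the FTP login wrapped in extra TLS-related steps."""
    steps = ["AUTH TLS", TLS_HANDSHAKE]          # upgrade the control channel
    steps += ftp_login_commands(user, password)  # unchanged FTP login
    # PBSZ 0 is required before PROT; PROT P encrypts the data channel,
    # PROT C leaves it in the clear.
    steps += ["PBSZ 0", "PROT P" if protect_data else "PROT C"]
    return steps


if __name__ == "__main__":
    for step in ftps_login_commands("anonymous", "guest"):
        print(step)
```

The FTP part is reused unchanged; FTPS only adds commands around it. A command such as CCC would additionally have to switch the control connection back to plaintext after login, which is the kind of step that needs support from the TLS layer.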
 
Regards,

- AJ



Re: [Bug-wget] Google Summer of Code 2016

2016-03-03 Thread Tim Ruehsen
Just more ideas for you, Kushagra:

Many command line options from Wget are still missing in Wget2; you 
should have a look at
https://github.com/rockdaboot/wget2/wiki anyways - feel free to work on the 
wiki yourself (e.g. fork the wiki pages: 
https://help.github.com/articles/adding-and-editing-wiki-pages-locally/ or let 
me know and I'll give you write access).

You can search the Wget bug tracker 
(https://savannah.gnu.org/bugs/?group=wget) for wishlist items.
My favorite is https://savannah.gnu.org/bugs/?45803.
Special popen(2|3) functions are already in the libwget/ directory.
That would, e.g., allow Wget2 to be used as part of a recursive website 
malware checker.

The authorization code in the test suite is not complete/not implemented - I 
once tested authorization (MD5, MD5-sess) 'by hand' with my local Apache. But 
an automated test is badly needed.
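
As a starting point for such a test, here is a minimal sketch of how an HTTP Digest response is computed for the MD5 and MD5-sess algorithms with qop=auth, following RFC 2617. The function names are hypothetical and this is not the Wget2 test suite code; an automated test could compare the client's Authorization header against a value computed like this.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the lowercase hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str,
                    nc: str, cnonce: str,
                    qop: str = "auth", algorithm: str = "MD5") -> str:
    """Compute the 'response' value for HTTP Digest auth (qop=auth only)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    if algorithm.lower() == "md5-sess":
        # MD5-sess mixes the server nonce and client nonce into HA1.
        ha1 = md5_hex(f"{ha1}:{nonce}:{cnonce}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


if __name__ == "__main__":
    # Values taken from the worked example in RFC 2617, section 3.5.
    print(digest_response("Mufasa", "testrealm@host.com", "Circle Of Life",
                          "GET", "/dir/index.html",
                          "dcd98b7102dd2f0e8b11d0f600bfb0c093",
                          "00000001", "0a4f113b"))
```

The only difference between the two algorithms is the extra hashing step for HA1, so one test server could exercise both by switching the `algorithm` value it advertises.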

We thought of a statistics module (very basic code exists) for spider mode to 
output very detailed diagnostics: missing pages, response times, server load 
(e.g. using the RTT/ping time), etc.

Tim

On Wednesday 02 March 2016 10:51:02 Kushagra Singh wrote:
> Hi,
> 
> Thanks for the quick reply. I went through the repository and the issues,
> and found a couple of things I would like to work on.
> 
> I have a couple of questions about Wget2. Is it a complete rewrite of the
> Wget project, available at git://git.savannah.gnu.org/wget.git, or are we
> using existing code and extending functionality? I guess it is the second
> one because I saw `libwget` in the repo. However if such is the case, then
> how do we change existing functions in wget? For example, implementing [2]
> would require making changes to the file cookies.c, which is present in
> /src in the wget repo, but not in /src in the wget2 repo.
> 
> I was looking at #43 [1], and have already submitted a patch for
> consideration for the first suggestion [2]. The second suggestion mentioned
> [3] is one of the things I'd like to work on, however this is not something
> which will take three months :)
> 
> Another project I am interested in, is implementing FTPS. I saw this listed
> under one of the ideas of GSoC 2015, but I'm not sure whether it was
> implemented, as I didn't see it under 'Development Status' in the wget2
> readme on Github.
> 
> Also, in #67 [4], we are talking about adhering to some specific parts of
> RFC 7230. I'm not sure which all parts would be right, as the discussion
> thread mentions that it won't be good to stick to each point of the RFC.
> WDYT?
> 
> 
> [1] https://github.com/rockdaboot/wget2/issues/43
> [2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
> [3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
> [4] https://github.com/rockdaboot/wget2/issues/67
> 
> On Tue, Mar 1, 2016 at 9:57 PM, Giuseppe Scrivano wrote:
> > Kushagra Singh writes:
> > > Hi,
> > > 
> > > Will we be taking part in GSoC this year? I would really like to work on a
> > > project related to Wget this summer. Any specific ideas that are of
> > > importance to the community presently?
> > 
> > yes, we will be take part in GSoC.  I think we would like to see more
> > work happening on wget2, at the moment there is a list of issues on
> > github that can be useful to you to pick some ideas to work on:
> > 
> >   https://github.com/rockdaboot/wget2/issues
> > 
> > Could you take a look at it?  Do you see anything interesting that you
> > would like to work on?
> > 
> > Regards,
> > Giuseppe



Re: [Bug-wget] Google Summer of Code 2016

2016-03-02 Thread Tim Ruehsen
Hi Kushagra,

I can only add a few things to Darshit's answer.

Wget2/libwget has been completely written from scratch.
We just moved the code/project to Savannah as part of GNU Wget, transferred 
the copyrights to FSF and integrated gnulib.

Wget2 has not been released yet, but there are not many blockers right now 
(mainly documentation).

None of us has found much time to work out the details of turning issues into 
GSoC projects. But we are going to, and we appreciate any help of course.
So, if you want to work on any of the issues as a GSoC project (or if you have 
your own idea), let us know and we are happy to work together with you on a 
detailed specification.

You mentioned FTPS... Ander Juaristi implemented this for Wget during GSOC 
2015. Wget2 currently is lacking FTP and FTPS support (I just added some code 
for the test suite - tested only with Wget).

Maybe you could take one or two of the smaller issues as a warm-up to get 
familiar with the new code? Feel free to ask/discuss any questions with us - 
we enjoy working together with other devs.

Tim

On Wednesday 02 March 2016 10:51:02 Kushagra Singh wrote:
> Hi,
> 
> Thanks for the quick reply. I went through the repository and the issues,
> and found a couple of things I would like to work on.
> 
> I have a couple of questions about Wget2. Is it a complete rewrite of the
> Wget project, available at git://git.savannah.gnu.org/wget.git, or are we
> using existing code and extending functionality? I guess it is the second
> one because I saw `libwget` in the repo. However if such is the case, then
> how do we change existing functions in wget? For example, implementing [2]
> would require making changes to the file cookies.c, which is present in
> /src in the wget repo, but not in /src in the wget2 repo.
> 
> I was looking at #43 [1], and have already submitted a patch for
> consideration for the first suggestion [2]. The second suggestion mentioned
> [3] is one of the things I'd like to work on, however this is not something
> which will take three months :)
> 
> Another project I am interested in, is implementing FTPS. I saw this listed
> under one of the ideas of GSoC 2015, but I'm not sure whether it was
> implemented, as I didn't see it under 'Development Status' in the wget2
> readme on Github.
> 
> Also, in #67 [4], we are talking about adhering to some specific parts of
> RFC 7230. I'm not sure which all parts would be right, as the discussion
> thread mentions that it won't be good to stick to each point of the RFC.
> WDYT?
> 
> 
> [1] https://github.com/rockdaboot/wget2/issues/43
> [2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
> [3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
> [4] https://github.com/rockdaboot/wget2/issues/67
> 
> On Tue, Mar 1, 2016 at 9:57 PM, Giuseppe Scrivano wrote:
> > Kushagra Singh writes:
> > > Hi,
> > > 
> > > Will we be taking part in GSoC this year? I would really like to work on a
> > > project related to Wget this summer. Any specific ideas that are of
> > > importance to the community presently?
> > 
> > yes, we will be take part in GSoC.  I think we would like to see more
> > work happening on wget2, at the moment there is a list of issues on
> > github that can be useful to you to pick some ideas to work on:
> > 
> >   https://github.com/rockdaboot/wget2/issues
> > 
> > Could you take a look at it?  Do you see anything interesting that you
> > would like to work on?
> > 
> > Regards,
> > Giuseppe



Re: [Bug-wget] Google Summer of Code 2016

2016-03-01 Thread Darshit Shah

On 03/02, Kushagra Singh wrote:

> Hi,
>
> Thanks for the quick reply. I went through the repository and the issues,
> and found a couple of things I would like to work on.
>
> I have a couple of questions about Wget2. Is it a complete rewrite of the
> Wget project, available at git://git.savannah.gnu.org/wget.git, or are we
> using existing code and extending functionality? I guess it is the second
> one because I saw `libwget` in the repo. However if such is the case, then
> how do we change existing functions in wget? For example, implementing [2]
> would require making changes to the file cookies.c, which is present in
> /src in the wget repo, but not in /src in the wget2 repo.


Wget2 is a complete rewrite of GNU Wget. It is also available on the 
savannah server as its own repository at [1]. Wget2 is meant to be a 
modern (almost) drop-in replacement for Wget. It strives to maintain 
backward compatible command line options and behaviour as far as it 
makes sense. The codebases of the two projects have diverged 
significantly, and hence new features need to be implemented 
separately for each.


> I was looking at #43 [1], and have already submitted a patch for
> consideration for the first suggestion [2]. The second suggestion mentioned
> [3] is one of the things I'd like to work on, however this is not something
> which will take three months :)


You submitted a patch for Wget. This is the Wget2 repository. Anyways, I 
already have a working patch for most of that issue, got sidetracked 
when writing the tests and eventually forgot about it. I think I'll 
spend some time on it this week and have that patch merged. Don't spend 
time on that part.


Another thing to remember is that not all GitHub issues are valid GSoC 
projects. Since there are only a few issues, it is easy to scout out the 
larger ones. Some issues are pretty tiny and just need someone willing to 
spend time working on them.




> Another project I am interested in, is implementing FTPS. I saw this listed
> under one of the ideas of GSoC 2015, but I'm not sure whether it was
> implemented, as I didn't see it under 'Development Status' in the wget2
> readme on Github.

Wget2 as far as I'm aware is still lacking FTPS support. Remember that 
Wget and Wget2 are two different projects.


> Also, in #67 [4], we are talking about adhering to some specific parts of
> RFC 7230. I'm not sure which all parts would be right, as the discussion
> thread mentions that it won't be good to stick to each point of the RFC.
> WDYT?


This is a minor grievance I raised. We stick to most of it anyway. As 
Tim points out, being completely RFC compliant may make the tool 
unusable thanks to the number of bad servers out there. If anything, 
that issue needs to be split into multiple smaller issues about specific 
parts of the RFC that we want to adhere to.


Open projects I currently see are:
1. FTP / FTPS support
2. SOCKS5 Proxy support (This may be too small.)
3. Progress Bar implementation (Looks deceptively simple, but isn't)
4. WARC support and tests
5. Brotli compression (May be too small)

The README file also has more pointers on features not implemented in 
Wget2. You may get some ideas from there. Request pipelining and DNSSEC 
are two features I'd be interested in seeing implemented.


Moreover, you are always welcome to submit your own ideas for either 
Wget or Wget2.


Tim can add more details or comment on whether something is too small to 
work on for a GSoC project.


[1]: git://git.savannah.gnu.org/wget/wget2.git


> [1] https://github.com/rockdaboot/wget2/issues/43
> [2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
> [3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
> [4] https://github.com/rockdaboot/wget2/issues/67
>
> On Tue, Mar 1, 2016 at 9:57 PM, Giuseppe Scrivano wrote:
>
> > Kushagra Singh writes:
> >
> > > Hi,
> > >
> > > Will we be taking part in GSoC this year? I would really like to work on a
> > > project related to Wget this summer. Any specific ideas that are of
> > > importance to the community presently?
> >
> > yes, we will be take part in GSoC.  I think we would like to see more
> > work happening on wget2, at the moment there is a list of issues on
> > github that can be useful to you to pick some ideas to work on:
> >
> >   https://github.com/rockdaboot/wget2/issues
> >
> > Could you take a look at it?  Do you see anything interesting that you
> > would like to work on?
> >
> > Regards,
> > Giuseppe



--
Thanking You,
Darshit Shah




Re: [Bug-wget] Google Summer of Code 2016

2016-03-01 Thread Kushagra Singh
Hi,

Thanks for the quick reply. I went through the repository and the issues,
and found a couple of things I would like to work on.

I have a couple of questions about Wget2. Is it a complete rewrite of the
Wget project, available at git://git.savannah.gnu.org/wget.git, or are we
using existing code and extending functionality? I guess it is the second
one because I saw `libwget` in the repo. However if such is the case, then
how do we change existing functions in wget? For example, implementing [2]
would require making changes to the file cookies.c, which is present in
/src in the wget repo, but not in /src in the wget2 repo.

I was looking at #43 [1], and have already submitted a patch for
consideration for the first suggestion [2]. The second suggestion mentioned
[3] is one of the things I'd like to work on, however this is not something
which will take three months :)
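
For reference, the rule set in the cookie-prefixes draft [3] is small. A minimal sketch of the acceptance check, assuming the cookie has already been parsed (the function and parameter names are hypothetical, not taken from the wget or wget2 sources, and the prefix match is treated as case-sensitive here):

```python
from typing import Optional


def prefix_allows_cookie(name: str, secure_attr: bool,
                         from_secure_origin: bool,
                         domain_attr: Optional[str],
                         path_attr: Optional[str]) -> bool:
    """Return True if the cookie satisfies the rules implied by its name prefix."""
    if name.startswith("__Secure-"):
        # Must carry the Secure attribute and be set by a secure origin.
        return secure_attr and from_secure_origin
    if name.startswith("__Host-"):
        # Same as __Secure-, plus: no Domain attribute and Path must be "/".
        return (secure_attr and from_secure_origin
                and domain_attr is None and path_attr == "/")
    # Cookies without a special prefix are not restricted by this draft.
    return True


if __name__ == "__main__":
    print(prefix_allows_cookie("__Host-sid", True, True, None, "/"))
    print(prefix_allows_cookie("__Host-sid", True, True, "example.com", "/"))
```

A cookie that fails the check would simply be rejected when it is received, so the change stays local to the cookie-parsing code.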

Another project I am interested in is implementing FTPS. I saw this listed
under one of the ideas of GSoC 2015, but I'm not sure whether it was
implemented, as I didn't see it under 'Development Status' in the wget2
readme on Github.

Also, in #67 [4], we are talking about adhering to some specific parts of
RFC 7230. I'm not sure which parts we should adhere to, as the discussion
thread mentions that it would not be good to stick to every point of the RFC.
WDYT?


[1] https://github.com/rockdaboot/wget2/issues/43
[2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
[3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
[4] https://github.com/rockdaboot/wget2/issues/67

On Tue, Mar 1, 2016 at 9:57 PM, Giuseppe Scrivano  wrote:

> Kushagra Singh  writes:
>
> > Hi,
> >
> > Will we be taking part in GSoC this year? I would really like to work on a
> > project related to Wget this summer. Any specific ideas that are of
> > importance to the community presently?
>
> yes, we will be take part in GSoC.  I think we would like to see more
> work happening on wget2, at the moment there is a list of issues on
> github that can be useful to you to pick some ideas to work on:
>
>   https://github.com/rockdaboot/wget2/issues
>
> Could you take a look at it?  Do you see anything interesting that you
> would like to work on?
>
> Regards,
> Giuseppe
>


Re: [Bug-wget] Google Summer of Code 2016

2016-03-01 Thread Giuseppe Scrivano
Kushagra Singh writes:

> Hi,
>
> Will we be taking part in GSoC this year? I would really like to work on a
> project related to Wget this summer. Any specific ideas that are of
> importance to the community presently?

Yes, we will take part in GSoC.  I think we would like to see more
work happening on wget2; at the moment there is a list of issues on
GitHub that may help you pick some ideas to work on:

  https://github.com/rockdaboot/wget2/issues

Could you take a look at it?  Do you see anything interesting that you
would like to work on?

Regards,
Giuseppe



[Bug-wget] Google Summer of Code 2016

2016-03-01 Thread Kushagra Singh
Hi,

Will we be taking part in GSoC this year? I would really like to work on a
project related to Wget this summer. Any specific ideas that are of
importance to the community presently?

A quick introduction: I'm Kushagra Singh, a second-year student at IIIT
Delhi, India, majoring in Computer Science. I have gone through a
particular chunk of wget's source code and understand it well, and have
submitted a patch for consideration. I successfully completed GSoC last
summer, working with lmonade.

Looking forward to coding on this project this summer!

Thank you,
Kushagra Singh