Re: [Bug-wget] Wget not finding image references in javascript source

2016-03-03 Thread Zane Staggs
Sounds reasonable, Darshit; thanks for the explanation.  Rather than
actually parsing javascript (or using a headless browser, etc.), I was
thinking wget could use a regex for the simplest case: an image path
with a jpg/png/gif extension embedded in a javascript string.  I do
realize that this adds overhead, and that there are many edge cases
in how a javascript string might be built dynamically, so it may be
too risky to attempt in general.  It might still be workable if it is
limited to the specific case of a valid absolute/relative path to an
image.
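
A minimal sketch of the regex idea, written in Python only for
illustration (wget itself is C, and nothing like this exists in wget).
The names IMAGE_STRING_RE and find_image_paths are hypothetical. It only
recognizes a complete quoted string that ends in .jpg, .png or .gif:

```python
import re

# Hypothetical pattern: a quoted string literal whose entire content looks
# like a path ending in an image extension.
IMAGE_STRING_RE = re.compile(
    r"""
    (["'])                    # opening quote
    (?P<path>
        [^"'\s\\]+?           # path characters (no quotes, whitespace, backslashes)
        \.(?:jpg|png|gif)     # image extension
    )
    \1                        # the same quote closes the string
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_image_paths(js_source):
    """Return the image-looking string literals found in js_source, in order."""
    return [m.group("path") for m in IMAGE_STRING_RE.finditer(js_source)]


if __name__ == "__main__":
    sample = 'var a = "/path/to/my/image.jpg"; var c = "/x/" + name + ".gif";'
    print(find_image_paths(sample))
```

The second assignment in the sample is deliberately not found: a path
assembled by string concatenation is exactly the kind of edge case a
plain regex cannot handle.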

-- Zane


On Mon, Feb 29, 2016 at 10:59 PM, Darshit Shah wrote:
> Hi Zane,
>
> The question of supporting links and images embedded via javascript crops up
> fairly often. JS is a dynamic scripting language and the code path taken
> depends on the user's interaction with the page. To simulate this, we would
> need a full JS engine inside of Wget. Apart from being large and clumsy this
> would also be impossible for us to maintain. As a result, we do not and have
> no plans to support parsing JS code in Wget in the near future.
>
> If you have any ideas that would help implement this without needing a full
> JS engine, do let us know. We'd be interested in hearing and evaluating new
> options.
>
> On 02/29, Zane Staggs wrote:
>>
>> It seems wget ignores image paths that appear in javascript source,
>> e.g. in a simple path string like "/path/to/my/image.jpg".  I realize it's
>> probably not easy to parse every js string for an image path, but I am
>> wondering if there are ways to make it work or plans to implement it.
>> I got around it for now by creating a dummy hidden img element with
>> the src so wget could find it in the dom.  Thanks.
>>
>
> --
> Thanking You,
> Darshit Shah



Re: [Bug-wget] Google Summer of Code 2016

2016-03-03 Thread Tim Ruehsen
Just more ideas for you, Kushagra:

Many command line options from Wget are still missing in Wget2. You
should have a look at https://github.com/rockdaboot/wget2/wiki anyway. Feel
free to work on the wiki yourself (e.g. fork the wiki pages:
https://help.github.com/articles/adding-and-editing-wiki-pages-locally/ or let
me know and I'll give you write access).

You can search the Wget bug tracker 
(https://savannah.gnu.org/bugs/?group=wget) for wishlist items.
My favorite is https://savannah.gnu.org/bugs/?45803.
Special popen(2|3) functions/code are already in the libwget/ directory.
For example, that would allow Wget2 to be used as part of a recursive website
malware checker.

The authorization code in the test suite is incomplete or not implemented. I
once tested authorization (MD5, MD5-sess) 'by hand' with my local Apache, but
an automated test is badly needed.
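
As a starting point for such a test, here is a rough Python sketch of the
value a client has to send for Digest authentication with qop=auth, for
both MD5 and MD5-sess, following the formulas in RFC 2617. The function
names are hypothetical and this is not code from the Wget test suite:

```python
import hashlib


def _md5_hex(text):
    """Hex MD5 digest of a string, as used throughout RFC 2617."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, algorithm="MD5"):
    """Compute the Digest 'response' value for qop=auth (RFC 2617)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    if algorithm.lower() == "md5-sess":
        # MD5-sess mixes the server nonce and client nonce into HA1.
        ha1 = _md5_hex(f"{ha1}:{nonce}:{cnonce}")
    ha2 = _md5_hex(f"{method}:{uri}")
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:auth:{ha2}")
```

A test server could compute the same value from its stored credentials
and compare it with the response field of the Authorization header sent
by the client.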

We thought of a statistics module (very basic code exists) for spider mode to
output very detailed diagnostics: missing pages, response times, server load
(e.g. using the RTT/ping time), etc.

Tim

On Wednesday 02 March 2016 10:51:02 Kushagra Singh wrote:
> Hi,
> 
> Thanks for the quick reply. I went through the repository and the issues,
> and found a couple of things I would like to work on.
> 
> I have a couple of questions about Wget2. Is it a complete rewrite of the
> Wget project, available at git://git.savannah.gnu.org/wget.git, or are we
> using existing code and extending functionality? I guess it is the second
> one because I saw `libwget` in the repo. However if such is the case, then
> how do we change existing functions in wget? For example, implementing [2]
> would require making changes to the file cookies.c, which is present in
> /src in the wget repo, but not in /src in the wget2 repo.
> 
> I was looking at #43 [1], and have already submitted a patch for
> consideration for the first suggestion [2]. The second suggestion mentioned
> [3] is one of the things I'd like to work on, however this is not something
> which will take three months :)
> 
> Another project I am interested in is implementing FTPS. I saw this listed
> under one of the ideas of GSoC 2015, but I'm not sure whether it was
> implemented, as I didn't see it under 'Development Status' in the wget2
> readme on Github.
> 
> Also, in #67 [4], we are talking about adhering to some specific parts of
> RFC 7230. I'm not sure which parts would be right to follow, as the discussion
> thread mentions that it won't be good to stick to each point of the RFC.
> WDYT?
> 
> 
> [1] https://github.com/rockdaboot/wget2/issues/43
> [2] https://tools.ietf.org/html/draft-west-leave-secure-cookies-alone-04
> [3] https://tools.ietf.org/html/draft-west-cookie-prefixes-05
> [4] https://github.com/rockdaboot/wget2/issues/67
> 
> On Tue, Mar 1, 2016 at 9:57 PM, Giuseppe Scrivano wrote:
> > Kushagra Singh writes:
> > > Hi,
> > > 
> > > Will we be taking part in GSoC this year? I would really like to work on a
> > > project related to Wget this summer. Any specific ideas that are of
> > > importance to the community presently?
> > 
> > Yes, we will take part in GSoC.  I think we would like to see more
> > work happening on wget2. At the moment there is a list of issues on
> > github that can be useful to you to pick some ideas to work on:
> >   https://github.com/rockdaboot/wget2/issues
> > 
> > Could you take a look at it?  Do you see anything interesting that you
> > would like to work on?
> > 
> > Regards,
> > Giuseppe



Re: [Bug-wget] buildbot failure in OpenCSW Buildbot on wget-solaris10-sparc

2016-03-03 Thread Tim Ruehsen
The problem is not related to the latest commit(s).
It is SSLv2-related stuff on the build farm:

Running Test HSTS basic test
Traceback (most recent call last):
  File "./Test-hsts.py", line 75, in <module>
    test.setup()
  File "/home/rockdaboot/wget/testenv/test/http_test.py", line 30, in setup
    self.server_setup()
  File "/home/rockdaboot/wget/testenv/test/base_test.py", line 85, in server_setup
    instance = self.instantiate_server_by(protocol)
  File "/home/rockdaboot/wget/testenv/test/http_test.py", line 51, in instantiate_server_by
    HTTPS: HTTPSd}[protocol]()
  File "/home/rockdaboot/wget/testenv/server/http/http_server.py", line 470, in __init__
    self.server_inst = self.server_class(addr, self.handler)
  File "/home/rockdaboot/wget/testenv/server/http/http_server.py", line 38, in __init__
    import ssl
  File "/opt/csw/lib/python3.3/ssl.py", line 60, in <module>
    import _ssl # if we can't import it, let the error propagate
ImportError: ld.so.1: python3.3: fatal: relocation error: file /opt/csw/lib/python3.3/lib-dynload/_ssl.so: symbol SSLv2_method: referenced symbol not found
FAIL Test-hsts.py (exit status: 1)


I *guess* that it has to do with the latest SSLv2 vulnerability and that the
underlying OpenSSL library has been exchanged without taking care of the
Python module.
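
A quick way to check this on an affected machine, independent of the Wget
test suite, is to try importing the ssl module directly. The sketch below
is only a diagnostic aid; the function name check_ssl_module is made up
for this example:

```python
import importlib


def check_ssl_module():
    """Try to import the standard ssl module and report the outcome.

    Returns (True, OpenSSL version string) on success, or
    (False, error message) if the import fails, for example with a
    relocation error like the one in the traceback above.
    """
    try:
        ssl = importlib.import_module("ssl")
    except ImportError as exc:
        return False, str(exc)
    return True, ssl.OPENSSL_VERSION


if __name__ == "__main__":
    ok, detail = check_ssl_module()
    print("ssl import OK:" if ok else "ssl import FAILED:", detail)
```

If the import fails here as well, the problem lies in the Python/OpenSSL
installation on the build host and not in the Wget sources.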

Tim

On Thursday 03 March 2016 10:13:55 build...@opencsw.org wrote:
> The Buildbot has detected a new failure on builder wget-solaris10-sparc
> while building wget. Full details are available at:
>  https://buildfarm.opencsw.org/buildbot/builders/wget-solaris10-sparc/builds/131
> 
> Buildbot URL: https://buildfarm.opencsw.org/buildbot/
> 
> Buildslave for this Build: unstable10s
> 
> Build Reason: scheduler
> Build Source Stamp: [branch master] 44aedd832197e32abbb4cb9582774c2ca8b8fa43
> Blamelist: Giuseppe Scrivano, Maks Orlovich
> 
> 
> BUILD FAILED: failed shell_3
> 
> sincerely,
>  -The Buildbot



[Bug-wget] buildbot failure in OpenCSW Buildbot on wget-solaris10-sparc

2016-03-03 Thread buildbot
The Buildbot has detected a new failure on builder wget-solaris10-sparc while 
building wget.
Full details are available at:
 https://buildfarm.opencsw.org/buildbot/builders/wget-solaris10-sparc/builds/131

Buildbot URL: https://buildfarm.opencsw.org/buildbot/

Buildslave for this Build: unstable10s

Build Reason: scheduler
Build Source Stamp: [branch master] 44aedd832197e32abbb4cb9582774c2ca8b8fa43
Blamelist: Giuseppe Scrivano, Maks Orlovich


BUILD FAILED: failed shell_3

sincerely,
 -The Buildbot






[Bug-wget] buildbot failure in OpenCSW Buildbot on wget-solaris10-i386

2016-03-03 Thread buildbot
The Buildbot has detected a new failure on builder wget-solaris10-i386 while 
building wget.
Full details are available at:
 https://buildfarm.opencsw.org/buildbot/builders/wget-solaris10-i386/builds/126

Buildbot URL: https://buildfarm.opencsw.org/buildbot/

Buildslave for this Build: unstable10x

Build Reason: scheduler
Build Source Stamp: [branch master] 44aedd832197e32abbb4cb9582774c2ca8b8fa43
Blamelist: Giuseppe Scrivano, Maks Orlovich


BUILD FAILED: failed shell_3

sincerely,
 -The Buildbot






Re: [Bug-wget] Patch for understanding srcset= on img tags.

2016-03-03 Thread Giuseppe Scrivano
Maksim Orlovich writes:

>> should the condition be (c == ')' && in_paren)  ?
>
> Indeed.
>
> Thanks,
> Maks

Thanks for the changes, I am going to push it shortly.
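
For readers without the patch at hand: the condition quoted above concerns
tracking parentheses while scanning a srcset value. The following Python
sketch is only an illustration of that kind of scan. It is not the patch
(which is C code in Wget), the name parse_srcset is hypothetical, and it
simplifies the parsing rules of the HTML specification:

```python
def parse_srcset(value):
    """Split a srcset attribute value into (url, descriptor) pairs."""
    candidates = []
    i, n = 0, len(value)
    while i < n:
        # Skip whitespace and stray commas between candidates.
        while i < n and (value[i].isspace() or value[i] == ","):
            i += 1
        if i >= n:
            break
        # The URL is a run of non-whitespace characters.
        start = i
        while i < n and not value[i].isspace():
            i += 1
        url = value[start:i]
        descriptor = ""
        if url.endswith(","):
            # A trailing comma ends the candidate; there is no descriptor.
            url = url.rstrip(",")
        else:
            # Read the descriptor up to the next comma outside parentheses.
            start = i
            in_paren = False
            while i < n:
                c = value[i]
                if c == "(":
                    in_paren = True
                elif c == ")" and in_paren:
                    in_paren = False
                elif c == "," and not in_paren:
                    break
                i += 1
            descriptor = value[start:i].strip()
            i += 1  # step over the comma (harmless at end of input)
        if url:
            candidates.append((url, descriptor))
    return candidates
```

Only the URL part of each pair would matter to a downloader; the
descriptor (such as 2x or 480w) just has to be skipped correctly.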

Regards,
Giuseppe