trouble loading and installing wget
Hello,

I'm trying to install wget on my itanium 11.23 system and getting the following error when executing it:

    [EMAIL PROTECTED]:/tmp$ wget http://hpux.cs.utah.edu/hppd/cgi-bin/redirect?hpux/Gnu/wget-1.10.2/wget-1.10.2-ia64-11.23.depot.gz
    --08:31:36-- http://hpux.cs.utah.edu/hppd/cgi-bin/redirect?hpux/Gnu/wget-1.10.2/wget-1.10.2-ia64-11.23.depot.gz
               => `redirect?hpux%2FGnu%2Fwget-1.10.2%2Fwget-1.10.2-ia64-11.23.depot.gz'
    Resolving hpux.cs.utah.edu... 155.98.64.90
    /usr/lib/hpux32/dld.so: Unsatisfied code symbol '__umodsi3' in load module '/usr/local/bin/wget'.
    Killed

If I use the source code and run the configure script, then do a 'make install' I get the following error:

    [EMAIL PROTECTED]:/tmp/wget-1.10.2$ make install
    cd src make CC='gcc' CPPFLAGS='-O' DEFS='-DHAVE_CONFIG_H -DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\ -DLOCALEDIR=\/usr/local/share/locale\' CFLAGS='-O' LDFLAGS='' LIBS='-ldl -L/usr/local/lib/hpux32 /usr/local/lib/hpux32/libssl.so /usr/local/lib/hpux32/libcrypto.so' prefix='/usr/local' exec_prefix='/usr/local' bindir='/usr/local/bin' infodir='/usr/local/info' mandir='/usr/local/man' manext='1' install.bin
    gcc -I. -I. -O -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\ -DLOCALEDIR=\/usr/local/share/locale\ -O -c cmpt.c
    gcc -I. -I. -O -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\ -DLOCALEDIR=\/usr/local/share/locale\ -O -c connect.c
    In file included from connect.c:41:
    /usr/include/sys/socket.h:535: error: static declaration of 'sendfile' follows non-static declaration
    /usr/include/sys/socket.h:506: error: previous declaration of 'sendfile' was here
    /usr/include/sys/socket.h:536: error: static declaration of 'sendpath' follows non-static declaration
    /usr/include/sys/socket.h:508: error: previous declaration of 'sendpath' was here
    connect.c: In function 'bind_local':
    connect.c:457: warning: passing argument 3 of 'getsockname' from incompatible pointer type
    connect.c: In function 'accept_connection':
    connect.c:507: warning: passing argument 3 of 'accept' from incompatible pointer type
    connect.c: In function 'socket_ip_address':
    connect.c:528: warning: passing argument 3 of 'getsockname' from incompatible pointer type
    connect.c:530: warning: passing argument 3 of 'getpeername' from incompatible pointer type

Any idea's and assistance would be greatly appreciated. Thank you very much.

-Kashif
Re: wget css parsing, updated to trunk
On 12/5/06, Ted Mielczarek [EMAIL PROTECTED] wrote:

> Hello all,
>
> I have updated my CSS parsing code for wget to trunk, the results are here:
> http://ted.mielczarek.org/code/wget-modified/trunk/
>
> I will submit the patch to wget-patches shortly.
>
> My original posting:
> http://www.mail-archive.com/wget@sunsite.dk/msg09142.html

Is there any interest in this? I'm using it for a private project, and I've had some off-list interest in it, but nothing on-list. Is it worth my time to pursue this, or should I just consider it a private fork?

Regards,
-Ted
directory-prefix bug in Win32
Hi! When I use the -P or --directory-prefix command-line switch in v1.11 Beta 1, or in the later v1.11 Beta 1 (with spider patch), wget ignores it and saves files in the current directory. Wget v1.10.2 worked correctly. I hope this bug won't live long :).
Re: wget css parsing, updated to trunk
On Mon, 11 Dec 2006 09:14:11 -0500 Ted Mielczarek wrote:

> Is there any interest in this? I'm using it for a private project, and I've had some off-list interest in it, but nothing on-list. Is it worth my time to pursue this, or should I just consider it a private fork?

As a user, I'm certainly interested and think css parsing is important.

- Richard
--
Richard Kimber
http://www.psr.keele.ac.uk/
RE: ERROR 500 problem
Maybe the server limits hits from the same IP address over a time period. 1400 pages is a lot; maybe they got mad at you and sent you all 500s from then on :)

Ranjit Sandhu
703.803.1755
SRA

-----Original Message-----
From: Yoav Atzmony [mailto:[EMAIL PROTECTED]
Sent: Monday, December 11, 2006 12:24 PM
To: [EMAIL PROTECTED]
Subject: ERROR 500 problem

Hi,

I hope someone can shed light on this problem. I am trying to crawl a particular site, and am getting strange results. I had crawled it successfully in the past, but lately I am only able to crawl about 1400 of the 8000 pages. I am constantly getting (as reported in a verbose log file):

    HTTP request sent, awaiting response... 500 Internal Server Error
    11:04:18 ERROR 500: Internal Server Error.

This error happens intermittently on some pages during the first 1400 pages, e.g.

    http://www.ryland.com/find-your-new-home/29-northern-kentucky/1115-claiborne/11777-shenandoah.html

But then this is what happens in the log file, and subsequently ALL files receive the error 500 (as shown in a non-verbose log file):

    WARNING: Certificate verification error for www.ryland.com: unable to get local issuer certificate
    16:58:03 URL:https://www.ryland.com/home/contact-us/29-1361-community-and-floor-plan-information.html [257903/257903] - files/www.ryland.com/home/contact-us/29-1361-community-and-floor-plan-information.html [1]
    16:58:19 URL:http://www.ryland.com/home/29-1034-contact-us.html [115640/115640] - files/www.ryland.com/home/29-1034-contact-us.html [1]
    http://www.ryland.com/find-your-new-home/29-northern-kentucky/1034-frenc [EMAIL PROTECTED]/driving-directions.html:
    16:58:23 ERROR 500: Internal Server Error.
    http://www.ryland.com/find-your-new-home/29-northern-kentucky/1034-french-quarter-orleans/11256-summit.html:
    16:58:27 ERROR 500: Internal Server Error.

Now if I was to call wget only on the page that failed with ERROR 500, it would crawl just fine.
Here are settings I am using:

    wget www.ryland.com -o LogRyland3.txt -t 5 --random-wait -v

I have run wget numerous times on this site, and I receive ERROR 500 on different pages before it reaches the point where all pages fail from then on, as shown above. And I have appended the INI file which has more settings (below). Again, crawling this site used to work fine. And crawling failed pages works when I crawl them individually. Any help would be GREATLY appreciated!

INI FILE STARTS HERE---
# Rewrote the wgetrc / wget.ini file from scratch, based on the manual for version 1.9
logfile = log.txt
tries = 1
timeout = 30
wait = 1
randomwait = on
quota = 5000m
restrict_file_names = windows
add_hostdir = on
span_hosts = off
dir_prefix = files
cache = off
recursive = on
use_proxy = off
robots = off
verbose = off
keep-session-cookies = on
save-cookies = sw_cookies.txt
check_certificate = off
#reclevel = 7
reject = GIF,jpg,JPG,jpeg,JPEG,bmp,BMP,pdf,PDF,css,CSS,js,JS,mpeg,MPEG,mov,MOV,avi,AVI,wmv,WMV,doc,DOC,ppt,PPT,csv,CSV,xls,XLS,txt,TXT,png,PNG,ra,RA,ram,RAM,tif,TIF,zip,ZIP,rar,RAR,class,CLASS,swf,SWF,pl,xml,XML,mp3,MP3,sid,SID,ivr,IVR,psd,PSD,rft,RTF,dwf,DWF,abk,acl,acm,acp,act,acv,ad,adb,add,adm,adp,adr,af2,af3,afm,ai,aif,alb,all,ams,anc,ani,ans,api,apr,aps,arc,arj,art,asa,asc,asd,asf,asm,ast,asx,att,avi,awd,b4,bak,bas,bat,bfc,bg,bi,bif,bin,bk,bks,bm1,bmk,bmp,brx,bs1,bsp,btm,cab,cal,cas,cat,cb,ccb,ccf,cch,ccm,cda,cdf,cdi,cdr,cdt,cdx,cel,cfb,cfg,cgm,ch,chk,chp,cil,cim,cin,ck1,ck2,ck3,ck4,ck5,ck6,cla,clp,cls,cmd,cmf,cmp,cmv,cnf,cnm,cnq,cnt,cob,cod,com,cpd,cpe,cpi,cpl,cpp,cpr,cpt,cpx,crd,crp,crt,csc,csp,css,csv,ct,ctl,cue,cur,cut,cv,cwk,cws,cxx,dat,dbf,dbx,dcr,dcs,dcx,ddf,def,der,dib,dic,dif,dir,diz,dlg,dll,dmf,dmg,doc,dot,dpr,drv,drw,dsg,dsm,dsp,dsq,dsw,dwg,dxf,emf,enc,eps,er1,erx,evy,ewl,exe,f77,f90,far,fav,fax,fh3,fif,fit,flc,fli,flt,fmb,fmt,fmx,fog,fon,for,fot,fp,fp1,fp3,fpx,frm,frx,gal,gcp,ged,gem,gen,gfc,gfi,gfx,gid,gif,gim,gix,gna,gnx,gra,grd,grp,gt2,gtk,gwx,gwz,gz,hed,hel,hex,hgl,hlp,hog,hpj,hpp,hqx,hst,ht,htx,ica,icb,icm,ico,idd,idq,iff,igf,iif,ima,img,inc,inf,ini,inp,ins,iso,isp,isu,it,iw,jar,jav,jbf,jff,jif,jmp,jn1,jpe,jpg,js,jtf,kdc,kfx,kye,lbm,ldb,leg,lha,lib,lis,log,lpd,lrc,lst,lwo,lwp,lzh,lzs,m3d,mad,maf,mak,mam,map,maq,mar,mas,mat,max,maz,mb1,mcc,mcs,mcw,mda,mdb,mde,mdl,mdn,mdw,mdz,med,mer,met,mi,mic,mid,mmf,mmm,mod,mov,mp3,mpe,mpg,mpp,msg,msi,msn,msp,mtm,mus,mvb,mwp,nap,ncb,nsf,nst,ntf,obd,obj,obz,ocx,ofn,oft,okt,olb,ole,opt,or2,or3,org,p10,p65,pab,pak,pal,pat,pbk,pbm,pcd,pcl,pcs,pct,pcx,pdf,pdq,pfa,pfb,pfc,pfm,pgl,pgm,pic,pif,pig,pin,pix,pj,pkg,pl,plt,pm5,pm6,png,pnt,pot,pp4,ppa,ppm,pps,ppt,pre,prf,prn,prs,prz,ps,psd,pst,ptm,pub,pwd,pwz,pxl,qad,qbw,qdt,qlb,qry,qt,qtm,qxd,ra,ram,ras,raw,rc,rec,reg,res,rft,rle,rm,rmi,rov,rpt,rtf,rtm,s3m,sam,sav,sc2,scc,scd,sch,scn,scp,scr,sct,sdl,sdr,sdt,sea,sep,shb,shg,shs,shw,sit,slk,snd,sqc,sqr,sty,svx,sys,t2t,tar,taz,tex,tga,tgz,the,thn,t
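One way to test the rate-limiting guess in this thread is to tally the log and see whether the failures end in one unbroken run. The following is only a rough sketch: it assumes each log entry sits on its own line in the timestamped shape quoted above, the function name `summarize_log` is made up for illustration, and the example.com URLs in the usage note are placeholders.

```python
import re

# Assumed line shapes, based on the log excerpts quoted in this thread:
#   "16:58:19 URL:http://... [115640/115640] ..."   -> a saved page
#   "16:58:23 ERROR 500: Internal Server Error."    -> a failed request
SAVED = re.compile(r"^\d{2}:\d{2}:\d{2} URL:(\S+) \[")
FAILED = re.compile(r"^\d{2}:\d{2}:\d{2} ERROR (\d{3}):")


def summarize_log(lines):
    """Count saved pages, errors, and the run of errors at the end of the log."""
    saved = 0
    errors = 0
    trailing_errors = 0  # consecutive errors since the last saved page
    for line in lines:
        line = line.strip()
        if SAVED.match(line):
            saved += 1
            trailing_errors = 0  # a success breaks the run
        elif FAILED.match(line):
            errors += 1
            trailing_errors += 1
    return {"saved": saved, "errors": errors, "trailing_errors": trailing_errors}


if __name__ == "__main__":
    # Hypothetical log lines for illustration only.
    sample = [
        '16:58:19 URL:http://www.example.com/a.html [115640/115640] -> "files/a.html" [1]',
        "http://www.example.com/b.html:",
        "16:58:23 ERROR 500: Internal Server Error.",
    ]
    print(summarize_log(sample))
```

If `trailing_errors` is large and the switch from mixed results to pure failures happens at a similar page count on every run, that would be consistent with a per-IP limit on the server, though it does not prove one.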
Re: trouble loading and installing wget
From: Siddiqui, Kashif

> I'm trying to install wget on my itanium 11.23 system [...]

I assume that that's HP-UX 11.23, as in:

    [EMAIL PROTECTED] uname -a
    HP-UX td176 B.11.23 U ia64 1928826293 unlimited-user license

> /usr/lib/hpux32/dld.so: Unsatisfied code symbol '__umodsi3' in load module '/usr/local/bin/wget'.

And where did you get _that_ copy of wget?

> If I use the source code and run the configure script, then do a 'make install' I get the following error:
> [...]
> gcc -I. -I. -O -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\ -DLOCALEDIR=\/usr/local/share/locale\ -O -c connect.c
> In file included from connect.c:41:
> /usr/include/sys/socket.h:535: error: static declaration of 'sendfile' follows non-static declaration
> [...]

Complaints about header files are often caused by a bad GCC installation (or an OS upgrade which confuses GCC). I just tried building my VMS-oriented 1.10.2c kit using GCC on one of the HP TestDrive systems, and I had some trouble ('ld: Unsatisfied symbol libintl_gettext in file getopt.o'), but that's much later than compiling connect.c, which got only the (usual) warnings about the pointers. That's with:

    http://antinode.org/dec/sw/wget.html
    http://antinode.org/ftp/wget/wget-1_10_2c_vms/wget-1_10_2c_vms.zip

    [EMAIL PROTECTED] gcc --version
    gcc (GCC) 3.4.3
    [...]

And I have no idea whether the GCC installation there is good or bad. (But it seems to be better than yours.)

I also tried it using HP's C compiler (CC=cc ./configure):

    [EMAIL PROTECTED] cc -V
    cc: HP C/aC++ B3910B A.06.12 [Aug 17 2006]

Here, the make ran to an apparently successful completion, but real testing is not convenient on the TestDrive systems, so I can't say whether it would actually work better than what you have.

    [EMAIL PROTECTED] ./src/wget -V
    GNU Wget 1.10.2c built on hpux11.23.
    [...]

So, I'd suggest using HP's C compiler, or else re-installing GCC. After that, I'd suggest using the ITRC HP-UX forum:

    http://forums1.itrc.hp.com/service/forums/familyhome.do?familyId=117

> Any idea's and assistance [...]

That's ideas, by the way.

Steven M. Schweda          [EMAIL PROTECTED]
382 South Warwick Street   (+1) 651-699-9818
Saint Paul MN 55105-2547
More detail on bug
When I use the -P or --directory-prefix command-line switch in v1.11 Beta 1, or in the later v1.11 Beta 1 (with spider patch), wget ignores it and saves files in the current directory. Wget v1.10.2 worked correctly. This incorrect behaviour appears only if the server's HTTP response contains a Content-Disposition header. Looking forward to the developers' comments!
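To make the expected behaviour concrete, here is a small sketch of how a save path could be composed when both a directory prefix and a Content-Disposition header are present. It is not wget's code; `expected_save_path` is a hypothetical helper, the file names are invented, and the fallback to `index.html` simply mirrors wget's usual default name for an empty path.

```python
import posixpath
from email.message import Message


def expected_save_path(directory_prefix, url_path, content_disposition=None):
    """Return the path a download would be expected to land at (illustration only)."""
    name = None
    if content_disposition:
        # Let the standard library parse the header's filename parameter.
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
    if not name:
        # No usable header: fall back to the last component of the URL path.
        name = posixpath.basename(url_path)
    # Keep only the final component so the server cannot choose a directory.
    name = posixpath.basename(name.replace("\\", "/")) or "index.html"
    # The prefix applies regardless of where the file name came from.
    return posixpath.join(directory_prefix, name)


if __name__ == "__main__":
    header = 'attachment; filename="report.pdf"'
    print(expected_save_path("files", "/download.php", header))
    print(expected_save_path("files", "/docs/page.html"))
```

The point of the sketch is the last line of the function: the name taken from the header should still be joined onto the prefix, which is the step that the reported behaviour suggests is being skipped.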
Wget in 1.11 beta 1 found
Hi, dear developers!

When I use the -P or --directory-prefix command-line switch in v1.11 Beta 1, or in the later v1.11 Beta 1 (with spider patch), wget ignores it and saves files in the current directory. This incorrect behaviour appears only if the server's HTTP response contains a Content-Disposition header. Wget v1.10.2 worked correctly. I hope this bug won't live long :).

--
denis mailto:[EMAIL PROTECTED]