Re: [freenet-support] need a program to crawl links in freenet
On 11-Mar-2004 Ian Clarke wrote:
> Toad wrote:
> | Unfortunately crawling freenet via HTTP will have the main effect of
> | DoSing your freenet node, because every web download takes up a thread,
> | and we therefore limit parallel HTTP downloads to 24-36. Ideally you'd
> | want a real FCP spider; there must be one out there somewhere.
>
> You can download something which claims to do this from:
>
> http://127.0.0.1:/[EMAIL PROTECTED]/spider/5//
>
> I tried it (in a sandbox Linux account, which is absolutely the minimum
> precaution anyone should take if running code downloaded from an
> untrusted anonymous source) and it seems to work pretty nicely.

Yes, this is the same spider I use to generate DFI. Actually, I'm using an
earlier version that I've hacked on quite a bit. :-)

-- 
Conrad Sabatier <[EMAIL PROTECTED]> - "In Unix veritas"
Re: [freenet-support] need a program to crawl links in freenet
Thanks for the info.

> [Original Message]
> From: Toad <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> Date: 3/12/2004 10:58:04 AM
> Subject: Re: [freenet-support] need a program to crawl links in freenet
>
> On Thu, Mar 11, 2004 at 05:09:09PM -0500, Nicholas Sturm wrote:
> > Please provide reference to a good glossary.
> >
> > > I tried it (in a sandbox Linux account, which is absolutely the minimum
> > > precaution anyone should take if running code downloaded from an
> > > untrusted anonymous source) and it seems to work pretty nicely.
> >
> > Is sandbox just a Linux term or does it have broader application?
>
> It basically means a sort of virtual computer within the computer, the
> idea being to limit the damage that can be done by untrusted code. An
> example would be: if you download some software that might be useful but
> you don't know whether it is safe, you might run it on a spare PC that
> isn't connected to the others and doesn't do anything else. A sandbox
> does the same thing but in software: you can run untrusted code, in a
> box, where it can't inflict too much harm on the rest of the system.
> User Mode Linux is a popular way to do this on Linux, and is used for
> some hosting systems. On Windows, VMware would be an option, perhaps.
>
> --
> Matthew J Toseland - [EMAIL PROTECTED]
> Freenet Project Official Codemonkey - http://freenetproject.org/
> ICTHUS - Nothing is impossible. Our Boss says so.
Re: [freenet-support] need a program to crawl links in freenet
On Thu, Mar 11, 2004 at 05:09:09PM -0500, Nicholas Sturm wrote:
> Please provide reference to a good glossary.
>
> > I tried it (in a sandbox Linux account, which is absolutely the minimum
> > precaution anyone should take if running code downloaded from an
> > untrusted anonymous source) and it seems to work pretty nicely.
>
> Is sandbox just a Linux term or does it have broader application?

It basically means a sort of virtual computer within the computer, the
idea being to limit the damage that can be done by untrusted code. An
example would be: if you download some software that might be useful but
you don't know whether it is safe, you might run it on a spare PC that
isn't connected to the others and doesn't do anything else. A sandbox
does the same thing but in software: you can run untrusted code, in a
box, where it can't inflict too much harm on the rest of the system.
User Mode Linux is a popular way to do this on Linux, and is used for
some hosting systems. On Windows, VMware would be an option, perhaps.

-- 
Matthew J Toseland - [EMAIL PROTECTED]
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
Re: [freenet-support] need a program to crawl links in freenet
On 12/03/2004, at 11:09 AM, Nicholas Sturm wrote:
> Please provide reference to a good glossary.
>
> > I tried it (in a sandbox Linux account, which is absolutely the minimum
> > precaution anyone should take if running code downloaded from an
> > untrusted anonymous source) and it seems to work pretty nicely.
>
> Is sandbox just a Linux term or does it have broader application?

It refers to any environment that's been secured so that only necessary
applications are available to anything running as that user. Unnecessary
things, such as access to system password files, access to compilers, and
any setuid binaries, are not allowed. Any untrusted code should be run in
such an account, so it can't screw up your system.

Java itself runs in a sandbox of sorts, especially applets.

-- 
Phillip Hutchings
[EMAIL PROTECTED]
http://www.sitharus.com/
Re: [freenet-support] need a program to crawl links in freenet
Nicholas Sturm wrote:
> Please provide reference to a good glossary.
>
> > I tried it (in a sandbox Linux account, which is absolutely the minimum
> > precaution anyone should take if running code downloaded from an
> > untrusted anonymous source) and it seems to work pretty nicely.
>
> Is sandbox just a Linux term or does it have broader application?

A "sandbox" is an area of limited functionality where one can control a
program's behavior. Think of it more like a "jail". Java applets (web
applets, not Freenet) run in a sandbox. This way, if the program is
malicious (or badly written), it can't do any damage outside the
"sandbox" (in theory, anyway). If a Java applet tries to do something not
allowed by the security policy (write a file, open a network connection,
change the security policy, etc.), Java will raise an exception. Note
that ActiveX controls do NOT run in a sandbox.

For Linux, there was a project, Subterfugue, which could create a
"sandbox" for a program, but it's not currently maintained. There is also
User-mode Linux (UML), which lets you run Linux in Linux - everything run
in the UML environment is "trapped" and can't do any damage outside its
environment.
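To make the "box in software" idea concrete, here is a minimal Python
sketch of one ingredient of such a jail: running an untrusted program
under tight POSIX resource limits. The script name and the specific
limits are illustrative assumptions, not anything from this thread, and
on its own this is far weaker than a dedicated account, Subterfugue, or
UML.

import resource
import subprocess

def run_untrusted(cmd):
    """Run cmd with tight CPU, memory, and file-size ceilings."""
    def limit():
        # Applied in the child just before exec: one second of CPU time,
        # 64 MB of address space, and no file larger than 1 MB. Crude,
        # but it bounds what a runaway or hostile program can do.
        resource.setrlimit(resource.RLIMIT_CPU, (1, 1))
        resource.setrlimit(resource.RLIMIT_AS, (64 * 2**20, 64 * 2**20))
        resource.setrlimit(resource.RLIMIT_FSIZE, (2**20, 2**20))
    return subprocess.run(cmd, preexec_fn=limit, capture_output=True)

# Hypothetical untrusted program; any downloaded binary would do.
result = run_untrusted(["./maybe-hostile-spider"])
print("exit status:", result.returncode)

Resource limits say nothing about what the program may read or write,
which is why the advice above to use a separate account (or a full UML
instance) still stands.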
Re: [freenet-support] need a program to crawl links in freenet
Please provide reference to a good glossary.

> I tried it (in a sandbox Linux account, which is absolutely the minimum
> precaution anyone should take if running code downloaded from an
> untrusted anonymous source) and it seems to work pretty nicely.

Is sandbox just a Linux term or does it have broader application?
Re: [freenet-support] need a program to crawl links in freenet
Toad wrote:
| Unfortunately crawling freenet via HTTP will have the main effect of
| DoSing your freenet node, because every web download takes up a thread,
| and we therefore limit parallel HTTP downloads to 24-36. Ideally you'd
| want a real FCP spider; there must be one out there somewhere.

You can download something which claims to do this from:

http://127.0.0.1:/[EMAIL PROTECTED]/spider/5//

I tried it (in a sandbox Linux account, which is absolutely the minimum
precaution anyone should take if running code downloaded from an
untrusted anonymous source) and it seems to work pretty nicely.

If you download it, and it inserts your credit card details into Freenet
and emails your mother with pictures of hard core porn, all before it
deletes your hard disk - don't blame me, you run this entirely at your
own risk.

Ian.
Re: [freenet-support] need a program to crawl links in freenet
Unfortunately crawling freenet via HTTP will have the main effect of
DoSing your freenet node, because every web download takes up a thread,
and we therefore limit parallel HTTP downloads to 24-36. Ideally you'd
want a real FCP spider; there must be one out there somewhere.

On Fri, Oct 31, 2003 at 05:58:54PM +1300, David McNab wrote:
> On Fri, 2003-10-31 at 17:11, tripolar wrote:
> > Hello all
> >
> > I need a program to crawl links in freenet to get sites in cache before
> > I need them. I get frustrated with the speed of freenet and dead links.
> > I have used freenet on windows & linux and just installed freenet last
> > night on this winbox. I spent hours clicking on links just trying to
> > pull them in.
> > Anything I can do other than manually clicking on all links?
>
> A winbox, eh?
>
> Well, there are dozens of freeware and shareware (crack available)
> windoze programs to recursively download websites. Sites like
> www.tucows.com, www.nonags.com, www.shareware.com etc. list them by the
> score.
>
> You might need to try a few until you hit on one which doesn't molest
> the '//' in freenet URIs.
>
> Once you choose a site downloader program, you'd need to point it at:
> - http://localhost:/[EMAIL PROTECTED]/TFE//
> - http://localhost:/[EMAIL PROTECTED]/YoYo//
> or point it at whatever site(s) you need, observing the above URL
> syntax. Be sure to disable the timeout (or extend it to an hour or so),
> because these crawler progs are used to web performance.
>
> One last thing - these crawler progs won't be able to tell one freesite
> from another, since they will perceive all freesites as part of the same
> 'site' at http://localhost:/. So take care to set a pattern match
> requirement (unless you want the crawler to suck in the whole freenet).
>
> Cheers
> David

-- 
Matthew J Toseland - [EMAIL PROTECTED]
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
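Toad's thread-count warning translates directly into code: a polite
prefetcher should cap its own concurrency well below the node's 24-36
parallel-download limit. Here is a minimal Python sketch; the fproxy port
8888 is an assumed default (the archive elides it), the freesite keys are
placeholders, and the cap of 8 is an illustrative choice.

import concurrent.futures
import urllib.request

FPROXY = "http://localhost:8888"  # assumed default fproxy port
# Placeholder URIs; substitute real freesite keys.
URLS = [FPROXY + "/SSK@.../TFE//", FPROXY + "/SSK@.../YoYo//"]

def fetch(url):
    # Freenet retrievals are slow; give each request a generous timeout.
    with urllib.request.urlopen(url, timeout=3600) as resp:
        return url, len(resp.read())

# Keep at most 8 requests in flight, leaving plenty of headroom under
# the node's 24-36 parallel HTTP download limit.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    for url, size in pool.map(fetch, URLS):
        print(url, size, "bytes")

The same cap applies to any site downloader you point at fproxy: if it
lets you limit simultaneous connections, set that limit below the node's.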
Re: [freenet-support] need a program to crawl links in freenet
On Thu, Oct 30, 2003 at 10:11:23PM -0600, tripolar wrote:
> Hello all
>
> I need a program to crawl links in freenet to get sites in cache before
> I need them. I get frustrated with the speed of freenet and dead links.
> I have used freenet on windows & linux and just installed freenet last
> night on this winbox. I spent hours clicking on links just trying to
> pull them in.
> Anything I can do other than manually clicking on all links?
> I do hope this makes sense.
>
> One other thing - I would like to dedicate more bandwidth to freenet, so
> that I can request several freenet sites at the same time without one or
> more of them practically stalling.

That's not a problem with Freenet, it's a problem with your browser. You
need to tell it to use more simultaneous connections. Details are in the
README...

> Thanks

-- 
Matthew J Toseland - [EMAIL PROTECTED]
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
Re: [freenet-support] need a program to crawl links in freenet
On Fri, 2003-10-31 at 17:11, tripolar wrote:
> Hello all
>
> I need a program to crawl links in freenet to get sites in cache before
> I need them. I get frustrated with the speed of freenet and dead links.
> I have used freenet on windows & linux and just installed freenet last
> night on this winbox. I spent hours clicking on links just trying to
> pull them in.
> Anything I can do other than manually clicking on all links?

A winbox, eh?

Well, there are dozens of freeware and shareware (crack available)
windoze programs to recursively download websites. Sites like
www.tucows.com, www.nonags.com, www.shareware.com etc. list them by the
score.

You might need to try a few until you hit on one which doesn't molest
the '//' in freenet URIs.

Once you choose a site downloader program, you'd need to point it at:
- http://localhost:/[EMAIL PROTECTED]/TFE//
- http://localhost:/[EMAIL PROTECTED]/YoYo//
or point it at whatever site(s) you need, observing the above URL syntax.
Be sure to disable the timeout (or extend it to an hour or so), because
these crawler progs are used to web performance.

One last thing - these crawler progs won't be able to tell one freesite
from another, since they will perceive all freesites as part of the same
'site' at http://localhost:/. So take care to set a pattern match
requirement (unless you want the crawler to suck in the whole freenet).

Cheers
David
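For readers without a Windows downloader to hand, David's recipe also
fits in a few lines of Python: crawl breadth-first from one freesite,
leave the '//' in freenet URIs alone, and apply his pattern-match
restriction so the crawl stays inside one site instead of sucking in the
whole freenet. The port 8888, the placeholder key, the timeout, and the
page cap are illustrative assumptions, not values from the thread.

import urllib.parse
import urllib.request
from html.parser import HTMLParser

FPROXY = "http://localhost:8888"   # assumed default fproxy port
PREFIX = FPROXY + "/SSK@.../TFE//"  # placeholder freesite key

class LinkParser(HTMLParser):
    """Collect href targets from anchor tags."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start, max_pages=50):
    seen, queue = set(), [start]
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            # Long timeout: Freenet fetches are far slower than the web.
            with urllib.request.urlopen(url, timeout=3600) as resp:
                html = resp.read().decode("utf-8", "replace")
        except OSError:
            continue  # dead link; move on
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # urljoin resolves relative links without collapsing the
            # doubled slash in freenet URIs.
            target = urllib.parse.urljoin(url, href)
            # The pattern-match restriction David describes: only follow
            # links that stay inside this freesite's prefix.
            if target.startswith(PREFIX):
                queue.append(target)

crawl(PREFIX)

A real FCP spider, as Toad suggests earlier in the thread, would still be
gentler on the node than any HTTP crawl, this one included.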