While going back to correct the problem of seednodes.ref being backed up in both addnodes.sh and dedupe.sh (which meant the final seednodes.ref.bak was actually a copy of the raw concatenation of noderefs.txt and seednodes.ref, not of the original seednodes.ref), I refined these two scripts even further and added a *lot* of bulletproofing. :-) I also added more "echo" statements to keep the user informed of their progress.
They should both be quite safe to use now, and are working just as I wanted them to. :-)

Jeez, between this and all the work I've been doing modifying the spider, I'm whipped! I think I'll just *play* with the computer my remaining time off (yeah, right!) :-)

Enjoy!

On 17-Feb-2004 Conrad Sabatier wrote:
> One more correction to addnodes.sh:
>
> Rather than copying noderefs.txt to the end of seednodes.ref before deduping
> seednodes.ref, put it at the beginning. This way the (presumably) newer nodes
> from the freshly fetched noderefs.txt are guaranteed to win out over any older
> ones in seednodes.ref.
>
> Also, backup seednodes.ref in addnodes.sh before doing anything.
>
> Meant to do this on the last go-round, but it slipped my mind. :-)
>
> Enjoy!
>
> On 17-Feb-2004 Conrad Sabatier wrote:
>> D'oh! That's what I get for coding into the wee hours of the morning on too
>> much coffee and not enough food. :-)
>>
>> I've attached corrections to the addnodes.sh and dedupe.sh scripts. It
>> occurred to me later that if addnodes.sh is run periodically without *really*
>> eliminating duplicate noderefs using the same version, then the seednodes.ref
>> file is going to just grow and grow and grow.
>>
>> I've solved the problem by a sort of "brute force" method for now. In the
>> case of duplicate noderefs using the same version, only one is saved, the
>> others are mindlessly tossed in the bit bucket.
>>
>> I've also improved the code in a few other places, removing some unnecessary
>> stuff, simplifying and speeding it up a good bit.
>>
>> Enjoy!
>>
>> On 17-Feb-2004 Conrad Sabatier wrote:
>>> OK, I've finally come up with a deduping script. It's not perfect, in that
>>> I didn't know what to do with duplicate nodes with the same version, so I
>>> just left them alone, but it's a start. Hopefully, over time, it'll work
>>> out OK, the idea being that as later versions of nodes are added, the older
>>> dupes will drop out.
>>>
>>> For this reason, I've modified the addnodes.sh script to just add
>>> everything in noderefs.txt (after first deduping it) to seednodes.ref, and
>>> then deduping seednodes.ref. The problem with the earlier version of
>>> addnodes.sh was that if a node was already present in seednodes.ref, the
>>> one in noderefs.txt would never get added, even if it was newer.
>>>
>>> I've also cleaned up and "normalized" all the other scripts, cleaning up
>>> the headers and adding usage tips, as well as made them safer by having
>>> them backup files before modifying them.
>>>
>>> Hope you'll find them useful.
>>>
>>> Conrad (my God, it's nearly 5 a.m.!)

--
Conrad Sabatier <[EMAIL PROTECTED]> - "In Unix veritas"
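
For anyone reading along without the attachments, the merge order described above (back up once, put the fresh refs first, keep only the first copy of each node) can be sketched roughly as follows. This is not the code from addnodes.sh or dedupe.sh. It assumes, purely for illustration, that each noderef sits on one line with the node identity in the first whitespace-separated field, which may not match the real seednodes.ref layout, and the function name merge_refs is made up.

```shell
#!/bin/sh
# Illustrative sketch only, NOT the attached scripts.
# Assumption: one noderef per line, node identity in field 1.
# The real seednodes.ref layout may differ.

# merge_refs FRESH SEED
# Puts the refs from FRESH ahead of those in SEED and keeps only the
# first occurrence of each identity, so the fresh copy wins.
merge_refs() {
    fresh=$1
    seed=$2

    # Back up exactly once, before anything is modified, so the .bak
    # file is a copy of the untouched original.
    cp "$seed" "$seed.bak" || return 1

    # awk reads FRESH first, then the backup; a line is printed only
    # the first time its field 1 is seen.
    awk '!seen[$1]++' "$fresh" "$seed.bak" > "$seed.tmp" || return 1

    mv "$seed.tmp" "$seed"
}
```

Calling `merge_refs noderefs.txt seednodes.ref` would leave the original in seednodes.ref.bak. Because the backup is taken in one place only, a later dedupe pass cannot overwrite it with the concatenated file.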
Attachments: dedupe.sh, addnodes.sh