> From: Rob Dixon <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Re: compare multiple lines
> Date: 20 Jan 2003 11:55:32 +0000
>
> Mertens Bram wrote:
[snip]
> > No, folders always have the NAME, CREATED and ORDER fields. URL's
> > always have the NAME, URL, CREATED, VISITED and ORDER fields.
> > The URL's should be compared based on the values of NAME and URL.
>
> Are you sure? The naming is arbitrary (although it defaults to the HTML
> title) and identical URLs could be named differently.

Hmm, perhaps comparing only the URLs would indeed be a better idea. The
reason I specified both is that the URLs are currently sorted by NAME.
When I first tried to write a script for this, I only meant to compare
two adjacent lines...

> > If I somehow assigned two name's to the same URL I don't mind deleting
> > those manually afterwards. I still have to go through the file
> > manually later anyhow to put some of the URL's into other folders.
>
> It would be easier to use Opera's Manage bookmarks facility to drag and
> drop them into place.

That is what I meant by 'reorganise manually'...

> > Right now I would like to remove the duplicates per folder.
>
> Per folder? That means you don't mind duplicate URLs across folders?

Indeed, per folder: e.g. certain URLs are stored both in the 'personal'
bar and in another subfolder, and I would like to keep this.

> > Rob's suggestion works fine but it doesn't preserve the syntax of the
> > bookmark file.
>
> It preserves it OK, its problem is that it doesn't touch the file at
> all!

Indeed, all I meant was that I can't use its output as a bookmark file
for Opera. Other than that the script is great!

> A solution which edits the file for you may be a few hours work. Are
> there so many duplicates that you don't want to edit them by hand, or
> will you want to do this again many times in the future? If not then I
> suggest that you stick to manual editing.

There really are that many duplicates, especially in a folder named
'unsorted'. I have already removed quite a few of those manually, which
kept me busy for several hours.

Originally I hoped to be able to alter the bookmark file's syntax so that
'uniq' would remove the duplicates. Unfortunately all folder settings
got screwed up.

Perhaps it's easier to take a look at the scripts I used to see where
things went wrong?

I started by resetting the ORDER fields:

    #!/usr/bin/perl
    my $record = '';
    while (<>) {
        s/(^\tORDER=)\d*/$1/;
        $record .= $_;
    }
    print "$record\n" if $record;

The problem here is that it also removes the ORDER from #FOLDER entries.
I was hoping this wouldn't matter if the structure of the file was
preserved.

Then I changed the CREATED and VISITED fields to a fixed value:

    while (<>) {
        s/(^\tCREATED)=\d*/$1=1042903422/;
        # same for VISITED
        $record .= $_;
    }
    print "$record\n" if $record;

I was hoping to avoid this, but if I want to use 'uniq' the lines have to
match exactly...

Then I ran the following several times to get all fields on one line
(this will definitely prove I am still learning):

    while (<>) {
        s/(^#URL.*)\n/$1/;
        $record .= $_;
    }
    print "$record\n" if $record;

Then I wanted to change '.com' into '.com/' since the trailing slash was
omitted here and there (other :

    while (<>) {
        s/\.com\t/.com\/\t/;
        $record .= $_;
    }
    print "$record\n" if $record;

Here I noticed that something had already gone wrong, so I ran the
following:

    while (<>) {
        s/=#/=\n#/g;
        $record .= $_;
    }
    print "$record\n" if $record;

Some #FOLDER and #URL lines had been moved to the end of the previous
line...

Then I ran the file through 'uniq' and wanted to convert it back to the
correct syntax:

    while (<>) {
        if ($_ =~ /^#URL/i) {
            s/\t/\n\t/g;
        }
        $record .= $_;
    }
    print "$record\n" if $record;

But this file still wasn't what I wanted...

If editing these scripts gets the job done, I have no need for one big
script... Perhaps the output can be run through a Perl script instead of
'uniq' before converting back to a bookmark file; that way the CREATED
and VISITED fields don't have to be reset.

TIA
-- 
# Mertens Bram "M8ram" <[EMAIL PROTECTED]>   Linux User #249103 #
# Red Hat Linux release 7.3 (Valhalla) kernel 2.4.18-3 i686 128MB RAM #
# 6:15pm up 9 days, 22:29, 1 user, load average: 0.29, 0.09, 0.03 #
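
A minimal sketch of the "Perl script instead of 'uniq'" idea raised above,
offered as one possible approach rather than a tested solution. It leaves
every line untouched and only drops a #URL entry whose URL value has
already been seen in the same folder, so CREATED, VISITED and ORDER need
no resetting. The join step above probably misbehaved because it removes
the newline of any line starting with #URL on every pass, so a later pass
also swallows the line break in front of the next #FOLDER or #URL. The
layout assumed here is not confirmed in this thread: each entry starts
with a "#FOLDER" or "#URL" line, its fields follow on tab-indented lines,
and a line holding only "-" closes the current folder. The name
dedupe_bookmarks and the file names in the usage comment are made up for
illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only. Assumed layout (not confirmed in this thread): every entry
# starts with a "#FOLDER" or "#URL" line, its fields follow on tab-indented
# lines, and a line holding only "-" closes the current folder.

# Takes the lines of a bookmark file and returns them with duplicate
# #URL entries removed, judged by the URL field, separately per folder.
sub dedupe_bookmarks {
    my @lines = @_;
    my @seen  = ({});   # stack of URL => count hashes, one per open folder
    my (@out, @entry);  # output lines; lines of the entry being collected

    # Emit the collected entry unless it repeats a URL in this folder.
    my $flush = sub {
        return unless @entry;
        my $keep = 1;
        if ($entry[0] =~ /^#URL/) {
            my ($url) = map { /^\tURL=(.*)/ ? $1 : () } @entry;
            $keep = 0 if defined $url && $seen[-1]{$url}++;
        }
        push @out, @entry if $keep;
        @entry = ();
    };

    for my $line (@lines) {
        if ($line =~ /^#(?:FOLDER|URL)/) {   # a new entry starts
            $flush->();
            push @seen, {} if $line =~ /^#FOLDER/;
            @entry = ($line);
        }
        elsif ($line =~ /^-\s*$/) {          # assumed end-of-folder marker
            $flush->();
            pop @seen if @seen > 1;
            push @out, $line;
        }
        elsif (@entry) {                     # field or blank line of the entry
            push @entry, $line;
        }
        else {                               # file header and the like
            push @out, $line;
        }
    }
    $flush->();
    return @out;
}

# Runs only when a file name is given (names are placeholders), e.g.
#   perl dedupe.pl bookmarks.adr > bookmarks.new
print dedupe_bookmarks(<>) if @ARGV;
```

Because each folder gets its own hash, a URL kept in the 'personal' bar and
again in a subfolder survives in both places. URLs are compared as plain
strings, so 'foo.com' and 'foo.com/' still count as different; stripping a
trailing slash from the hash key would be a small change if that is wanted.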