> From: Rob Dixon <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Re: compare multiple lines
> Date: 20 Jan 2003 11:55:32 +0000
> 
> Mertens Bram wrote:
[snip]
> > No, folders always have the NAME, CREATED and ORDER fields.  URL's
> > always have the NAME, URL, CREATED, VISITED and ORDER fields.
> > The URL's should be compared based on the values of NAME and URL.
> 
> Are you sure? The naming is arbitrary (although it defaults to the HTML
> title)
> and identical URLs could be named differently.

Hmm, perhaps comparing only the URL's might be a better idea indeed. 
The reason I specified this is because the URL's are sorted by NAME
right now.  When I first tried to write a script for this I meant only
to compare two adjacent lines...

> > If I somehow assigned two name's to the same URL I don't mind deleting
> > those manually afterwards.  I still have to go through the file
> > manually later anyhow to put some of the URL's into other folders.
> 
> It would be easier to use Opera's Manage bookmarks facility to drag and
> drop them into place.

That is how I meant 'reorganise manually'...

> > Right now I would like to remove the duplicates per folder.
> 
> Per folder? That means you don't mind duplicate URLs across folders?

Indeed per folder, e.g. certain URL's are stored both in the 'personal'
bar and in another subfolder, I would like to keep this.

> > Rob's suggestion works fine but it doesn't preserve the syntax of the
> > bookmark file.
> 
> It preserves it OK, its problem is that it doesn't touch the file at
> all!

Indeed, all I meant was that I can't use its output as a bookmark-file
for opera.  Other than that the script is great!

> A solution which edits the file for you may be a few hours work. Are
> there so many duplicates that you don't want to edit them by hand, or
> will you want to do this again many times in the future? If not then I
> suggest that you stick to manual editing.

There really are that many duplicates, especially in a folder named
'unsorted'.  I have already removed quite a few of those manually which
kept me busy for several hours.
Originally I hoped to be able to alter the bookmark-file's syntax so
that 'uniq' would remove the duplicates.  Unfortunately all folder
settings got screwed up.

Perhaps it's easier to take a look at the scripts I used to see where
things went wrong?

I started by resetting the ORDER-fields:
#!/usr/bin/perl
my $record = '';
while (<>) {
    s/(^\tORDER=)\d*/$1/;
    $record .= $_;
}
print "$record\n" if $record;

The problem here is that it also removes the ORDER from #FOLDER
entries.  I was hoping this wouldn't matter if the structure of the file
was preserved.
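
A minimal sketch of how the ORDER reset could be limited to #URL
entries. It assumes that every record starts with a line beginning with
'#' (#URL, #FOLDER) and that its fields follow on tab-indented lines,
as the regexes above suggest. The sub name and the sample values are
made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Blank the ORDER value only inside #URL records; #FOLDER records keep theirs.
# Assumption: a record is a '#...' header line followed by tab-indented fields.
sub reset_url_order {
    my @lines  = @_;
    my $in_url = 0;
    for my $line (@lines) {
        if ($line =~ /^#/) {
            # New record header: remember whether it is a #URL record.
            $in_url = ($line =~ /^#URL/) ? 1 : 0;
        }
        elsif ($in_url) {
            $line =~ s/^(\tORDER=)\d*/$1/;
        }
    }
    return @lines;
}

# Small in-memory sample (hypothetical values) instead of a real bookmark file.
my @sample = (
    "#FOLDER\n", "\tNAME=unsorted\n", "\tORDER=3\n",
    "#URL\n", "\tNAME=Example\n", "\tURL=http://www.example.com\n", "\tORDER=17\n",
);
print reset_url_order(@sample);
```

To run it on a real file, the last line could become
print reset_url_order(<>);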

Then I changed the CREATED and VISITED fields to a fixed value:
while (<>) {
    s/(^\tCREATED)=\d*/$1=1042903422/;  # same for VISITED
    $record .= $_;
}
print "$record\n" if $record;

I was hoping to avoid this but if I want to use 'uniq' the lines have to
match exactly...

Then I ran the following several times to get all fields on one line
(this will definitely prove I am still learning):
while (<>) {
    s/(^#URL.*)\n/$1/;
    $record .= $_;
}
print "$record\n" if $record;
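
Each run of the substitution above joins only one more line onto the
#URL line, and one run too many would also pull the next record's
header onto it. A sketch of a single-pass alternative, assuming the
fields of a record are exactly the tab-indented lines that follow its
'#' header (the sub name is hypothetical):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Join every #URL record onto one line in a single pass.
# Assumption: a record's fields are the tab-indented lines after its header.
sub join_url_records {
    my @out;
    my $in_url = 0;
    for my $line (@_) {
        if ($in_url && @out && $line =~ /^\t/) {
            # Field of the current #URL record: glue it onto the record's line.
            chomp $out[-1];
            $out[-1] .= $line;
        }
        else {
            # Anything else starts a new output line; only a #URL header
            # switches joining on.
            $in_url = ($line =~ /^#URL/) ? 1 : 0;
            push @out, $line;
        }
    }
    return @out;
}

# Hypothetical sample data.
my @sample = (
    "#FOLDER\n", "\tNAME=unsorted\n",
    "#URL\n", "\tNAME=Example\n", "\tURL=http://www.example.com\n",
);
print join_url_records(@sample);
```

Because joining stops at the first line that is not tab-indented, a
following #FOLDER or #URL header stays on its own line.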

Then I wanted to change '.com' into '.com/' since the trailing slash was
omitted here and there:
while (<>) {
    s/\.com\t/.com\/\t/;
    $record .= $_;
}
print "$record\n" if $record;
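
The substitution above only catches '.com' directly followed by a tab.
A sketch of a more general normalisation, applied to the URL value
itself: it appends a slash whenever the URL is just a scheme and a host.
The sub name and the example.com address are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Append the missing trailing slash to a URL that has a scheme and a host
# but no path, query or fragment. Other URLs are returned unchanged.
sub normalize_url {
    my ($url) = @_;
    $url =~ s{^(\w+://[^/\s?#]+)$}{$1/};
    return $url;
}

print normalize_url("http://www.example.com"), "\n";
print normalize_url("http://www.example.com/page.html"), "\n";
```

This could be used on the value captured from a URL= line before two
entries are compared, so the file itself does not have to be changed.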

Here I noticed that something had gone wrong already so I ran the
following:
while (<>) {
    s/=#/=\n#/g;
    $record .= $_;
}
print "$record\n" if $record;

Some #FOLDER and #URL had been moved to the end of the previous line...

Then I ran the file through 'uniq' and wanted to convert it back to the
correct syntax:
while (<>) {
    if ($_ =~ /^#URL/i) {
        s/\t/\n\t/g;
    }
    $record .= $_;
}
print "$record\n" if $record;

But this file still wasn't what I want...

If editing these scripts gets the job done, I have no need for one big
script...
Perhaps the output can be run through a perl-script instead of 'uniq'
before converting back to a bookmark-file, that way the CREATED and
VISITED fields don't have to be reset.
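
A rough sketch of what such a perl-script could look like if it worked
on the original file directly, so that nothing has to be joined, reset
or converted back. It compares only the URL field and keeps the first
entry of each URL per folder. Assumptions that are not confirmed
anywhere in this thread: a record is a '#' header line plus the
tab-indented lines after it, blank lines after a record belong to that
record, and a line holding only '-' closes the current folder. The sub
name and sample data are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Drop #URL records whose URL was already seen in the same folder.
# Assumptions: '#...' header + tab-indented fields form a record, and a
# line containing only '-' closes the current folder.
sub dedupe_per_folder {
    my @seen = ({});    # stack of "URLs seen" hashes, one per open folder
    my (@out, @record);

    # Emit the buffered record unless it is a #URL with a repeated URL.
    my $flush = sub {
        return unless @record;
        my $keep = 1;
        if ($record[0] =~ /^#URL/) {
            my ($url) = map { /^\tURL=(.*)/ ? $1 : () } @record;
            $keep = 0 if defined $url && $seen[-1]{$url}++;
        }
        push @out, @record if $keep;
        push @seen, {} if $record[0] =~ /^#FOLDER/;    # folder opened
        @record = ();
    };

    for my $line (@_) {
        if ($line =~ /^#/) {                       # header starts a record
            $flush->();
            @record = ($line);
        }
        elsif (@record && $line =~ /^(?:\t|\s*$)/) {
            push @record, $line;                   # field or trailing blank
        }
        else {                                     # '-', file header, ...
            $flush->();
            pop @seen if $line =~ /^-\s*$/ && @seen > 1;    # folder closed
            push @out, $line;
        }
    }
    $flush->();
    return @out;
}

# Hypothetical sample: the same URL twice in one folder, once outside it.
my @sample = (
    "#FOLDER\n", "\tNAME=unsorted\n", "\n",
    "#URL\n", "\tNAME=first\n",  "\tURL=http://www.example.com/\n", "\n",
    "#URL\n", "\tNAME=second\n", "\tURL=http://www.example.com/\n", "\n",
    "-\n", "\n",
    "#URL\n", "\tNAME=third\n",  "\tURL=http://www.example.com/\n", "\n",
);
print dedupe_per_folder(@sample);
```

For a real file the last line could become
print dedupe_per_folder(<>);
with the output redirected to a new file, so the original stays intact
until the result has been checked in Opera.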

TIA
-- 
 #  Mertens Bram "M8ram"  <[EMAIL PROTECTED]>  Linux User #249103  #
 # Red Hat Linux release 7.3 (Valhalla) kernel 2.4.18-3 i686 128MB RAM #
 #  6:15pm up 9 days, 22:29, 1 user, load average: 0.29, 0.09, 0.03 #

