Hi Lux,
Here are two Perl text filters.
Copy them to ~/Library/Application Support/BBEdit/Text Filters.
You can test them from the Text > Apply Text Filter menu.
This one (tags_and_categories_to_list_pl.pl) is for your initial case (CSV
-> one hyphenated item per line):
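#!/usr/bin/env perl
use v5.14;
use strict;
use warnings;
# Turn "tags: a, b, c" (or "categories: ...") into one hyphenated item per line.
while (my $line = <>) {
    # Non-greedy (.+?) keeps trailing whitespace out of the last item.
    if (my ($field, $items) = $line =~ /^(tags|categories):\s*(.+?)\s*$/i) {
        # s///r (Perl 5.14+) returns the modified copy and leaves $items alone.
        my $yaml_items = $items =~ s/\s*,\s*/\n- /gr;
        print "${field}:\n- ${yaml_items}\n";
    } else {
        # Every other line passes through unchanged.
        print $line;
    }
}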
=for test
I got a huge number of posts from an old static blog. Inside the header
there's a "tags" line, made this way:
tags: Steve Jobs, Steve Throughton-Smith, T'Bone, Better to Be a Pirate Than Join the Navy, Ben & Jerry, NBA75
The actual number of tags in the line can be any. Comma act as a
delimiter and, after that, any character can be used inside a tag, spaces
included.
categories: Education
=cut
This other one (tags_and_categories_to_array_pl.pl) is for your second case
(hyphenated item per line -> array):
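#!/usr/bin/env perl
use v5.14;
use strict;
use warnings;
# Slurp the whole input at once so the regex can match across lines.
$/ = undef;
# Turn one "field:\n- a\n- b\n" block into "field: [a, b]\n".
sub replace {
    my $match = shift;
    # The first chunk is "field:", the remaining chunks are the items.
    my @splitted = split /[[:blank:]]*\n[[:blank:]]*-[[:blank:]]/, $match;
    my $field = shift @splitted;
    s/^\s+|\s+$//g for @splitted;    # trim each item in place
    my $joined = join ", ", @splitted;
    return "${field} [${joined}]\n";
}
my $text = <> // '';
# A blank is required after each dash, so the closing "---" line of the
# header is not swallowed as if it were a list item.
# /m anchors ^ at each line start, /e evaluates replace($1) as code.
print $text =~ s/^((?:tags|categories)[[:blank:]]*:[[:blank:]]*\n(?:[[:blank:]]*-[[:blank:]][^\n]*\n)+)/replace($1)/grime;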
=for test
title: "Più uguali degli altri"
date: 2022-02-24T01:43:23+01:00
draft: false
toc: false
comments: false
categories:
- Education
tags:
- MacBook Pro
- iPad mini
- Apple Pencil
- Bowdoin
=cut
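About your worry of catching lists in the body text: both filters look at the whole document, so a line starting with "tags:" or "categories:" in the body would be converted too. If that can happen in your posts, here is a rough sketch of a variant that only touches the header. It assumes the header is the very first thing in the file and sits between two "---" lines, as in your example. The name front_matter_to_arrays and the sample text are made up for the illustration, and I have not run it against your real posts.

```perl
#!/usr/bin/env perl
use v5.14;
use strict;
use warnings;

# Hypothetical helper: convert hyphenated "tags:" / "categories:" lists
# to arrays, but only inside the front matter.
sub front_matter_to_arrays {
    my ($text) = @_;
    # Assumption: the header is the first block, between two "---" lines.
    my ($open, $header, $rest) =
        $text =~ /\A(---[[:blank:]]*\n)(.*?\n)(---[[:blank:]]*(?:\n.*)?)\z/s
        or return $text;    # no front matter found: return the text unchanged
    $header =~ s{^((?:tags|categories)[[:blank:]]*:)[[:blank:]]*\n((?:[[:blank:]]*-[[:blank:]][^\n]*\n)+)}{
        # Save the captures before the inner s///r below resets them.
        my ($field, $list) = ($1, $2);
        # One item per line: drop the leading dash and surrounding blanks.
        my @items = map { s/^[[:blank:]]*-[[:blank:]]+|\s+$//gr } split /\n/, $list;
        "$field [" . join(", ", @items) . "]\n";
    }gime;
    return $open . $header . $rest;
}

# Small demonstration: the last three lines belong to the body text.
my $sample = <<'END';
---
draft: false
categories:
- Education
tags:
- MacBook Pro
- iPad mini
---
Body text with a list that must stay as it is:
tags:
- not a header item
END
print front_matter_to_arrays($sample);
```

With the sample above, the header lines become "categories: [Education]" and "tags: [MacBook Pro, iPad mini]", while the "tags:" list in the body is left alone. To use it as a text filter, replace the demonstration part with: print front_matter_to_arrays(do { local $/; <> } // '');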
HTH,
Jean Jourdain
On Friday, February 25, 2022 at 5:05:02 AM UTC+1 lux wrote:
> On Wednesday, February 23, 2022 at 6:30:31 PM UTC+1 [email protected]
> wrote:
>
>> When you state header, this implies that there is more data than what you
>> present. Could you provide a small selection of what the data file actually
>> looks like? ;)
>>
>
> This is an example of a complete header:
>
> *((( header begins )))*
> ---
> title: "Più uguali degli altri"
> date: 2022-02-24T01:43:23+01:00
> draft: false
> toc: false
> comments: false
> categories:
> - Education
> tags:
> - MacBook Pro
> - iPad mini
> - Apple Pencil
> - Bowdoin
> ---
> *((( header ends )))*
>
> Right under the header, the body text begins.
>
> The count for categories and tags items can be zero, one, or more than one.
>
> My main issue is to refrain from mistakenly capturing sentences inside the
> body text that share the same structure (i.e., lists beginning with dashes).
>
> Working on the problem, I could somewhat simplify the problem. I'd like
> now to convert the former header in the following one:
>
> *((( header begins )))*
> ---
> title: "Più uguali degli altri"
> date: 2022-02-24T01:43:23+01:00
> draft: false
> toc: false
> comments: false
> categories: [Education]
> tags: [MacBook Pro, iPad mini, Apple Pencil, Bowdoin]
> ---
> *((( header ends )))*
>
> Again, thanks very much for the attention. :-)
>
> lux
>
--
This is the BBEdit Talk public discussion group. If you have a feature request
or need technical support, please email "[email protected]" rather than
posting here. Follow @bbedit on Twitter: <https://twitter.com/bbedit>