Miguel,

If you are using Automator and AppleScript, "System Events" has some useful 
XML parsing functionality.

See:
https://developer.apple.com/library/archive/documentation/LanguagesUtilities/Conceptual/MacAutomationScriptingGuide/WorkwithXML.html

Your input XML could be parsed with something like this:

```applescript
    tell application "System Events"
        tell XML file "~/Downloads/MA_NO_2021_05_011.xml"
            -- "ejemplar" is the root element of the document
            tell XML element "ejemplar"
                set vSecciones to every XML element whose name = "seccion"
                repeat with vSeccion in vSecciones
                    tell vSeccion
                        set vFichas to (every XML element whose name = "ficha")
                        repeat with vFicha in vFichas
                            tell vFicha
                                -- each "campo" holds one clave/valor (key/value) pair
                                set vCampos to (every XML element whose name = "campo")
                                repeat with vCampo in vCampos
                                    tell vCampo
                                        set vClave to value of XML element "clave"
                                        set vValor to value of XML element "valor"
                                        log {clave:vClave, valor:vValor}
                                    end tell
                                end repeat
                            end tell
                        end repeat
                    end tell
                end repeat
            end tell
        end tell
    end tell
```
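
For the download and rename steps later in your workflow, here is a minimal 
sketch that could go in a "Run Shell Script" action. The function names are 
made up for illustration, and it assumes that wget is installed and on the 
PATH, that the URL list has one URL per line, and that every downloaded file 
really is a GIF image. It uses only the `-P` and `-i` options from your 
original wget command.

```shell
# Sketch only: function names are illustrative, not taken from any tool.

# Download every URL listed in a text file into a target folder.
#   $1: text file with one URL per line
#   $2: target folder (created if missing)
download_images() {
    mkdir -p "$2" && wget -P "$2" -i "$1"
}

# Append ".gif" to every regular file under a folder that does not
# already end in .gif or .GIF. Assumes all files there are GIF images.
#   $1: folder to scan (subfolders included)
add_gif_extension() {
    find "$1" -type f ! -name '*.gif' ! -name '*.GIF' \
        -exec sh -c 'for f in "$@"; do mv -- "$f" "$f.gif"; done' sh {} +
}
```

With the placeholder paths from your message, it would be called as 
`download_images /users/USERNAME/URLSLIST.txt /users/USERNAME/TARGETFOLDER` 
followed by `add_gif_extension /users/USERNAME/TARGETFOLDER`.
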
HTH,

Jean Jourdain
On Wednesday, May 19, 2021 at 3:30:03 PM UTC+2 Miguel Perez wrote:

> Hi!
>
> I have a question regarding Automator and BBEdit.
>
> *Context:*
>
> On a daily basis I get an XML file. This file contains information about 
> some dossiers. I need to extract two elements from each dossier: (1) a URL 
> to download associated images, and (2) the dossier's name.
>
> Here's an example of such XML files: 
> https://www.icloud.com/iclouddrive/0uq0GozmzGusqe09WNAmUJuow#MA_NO_2021_05_011
>
> Information in the file is in Spanish.
>
> *What I currently do:*
>
> I open the XML file in BBEdit and use Grep search to extract the 
> information. My Grep patterns are:
>
> To extract the URLs:
> <clave><!\[CDATA\[Imagen\]\]></clave>\n\s+<valor><!\[CDATA\[(.+?)\]
>
> To extract the dossier's name:
> <clave><!\[CDATA\[Denominación\]\]></clave>\n\s+<valor><!\[CDATA\[(.+?)\]
>
> I "replace" these Grep patterns with \1 to extract everything, and it works 
> like a charm.
>
> Both pieces of information get saved in their own plain text files.
>
> Then I download the images using some wget magic:
> wget -E -H -k -K -p -e robots=off -P /users/USERNAME/TARGETFOLDER -i 
> /users/USERNAME/URLSLIST.txt
>
> As a final touch to my workflow, I run a batch rename on all files to add 
> the filetype *.GIF on all images and I'm ready to work.
>
> *What I want to do:*
>
> I want to further automate the process.
>
> Using Automator I created a Service (Quick action) that uses files as 
> input in Finder.
>
> What I have in mind is:
> ➤ Run the service on the XML file
> ➤ Read the contents of the file
> ➤ Use BBEdit's Automator action called "Extract lines containing" in Grep 
> mode to extract the URLs
> ➤ Use a shell script to download all images
> ➤ Use a batch rename action to add the *.GIF filetype
>
> For the life of me I can't get "Extract lines containing" to work. I'm 
> using BBEdit 13.5.6 and Big Sur 11.3.1.
>
> Any ideas?
>
> Does anybody know if BBEdit's Automator actions still work?
>

-- 
This is the BBEdit Talk public discussion group. If you have a feature request 
or need technical support, please email "[email protected]" rather than 
posting here. Follow @bbedit on Twitter: <https://twitter.com/bbedit>
--- 
You received this message because you are subscribed to the Google Groups 
"BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/bbedit/c52a7b24-d919-4474-850e-0ad189301e2an%40googlegroups.com.
