I want to parse an XBEL file into J boxed arrays. An example:

<?xml version="1.0"?>
<!DOCTYPE xbel PUBLIC "+//IDN python.org//DTD XML Bookmark Exchange Language 1.1//EN//XML" "http://pyxml.sourceforge.net/topics/dtds/xbel-1.1.dtd">
<xbel>
  <title>Bookmarks</title>
  <desc>Bookmarks</desc>
  <folder id="rdf:#$FvPhC3" folded="no">
    <title>Bookmarks Toolbar Folder</title>
    <desc>Add bookmarks to this folder to see them displayed on the Bookmarks 
Toolbar
    </desc>
    <bookmark href="http://www.mozilla.com/products/firefox/central.html">
      <title>Getting Started</title>
    </bookmark>
    <bookmark href="http://fxfeeds.mozilla.com/" modified="1209052290">
      <title>Latest Headlines</title>
    </bookmark>
  </folder>
  <bookmark href="http://www.jsoftware.com/" added="1146880810" visited="1209017433">
    <title>J Home</title>
  </bookmark>
  <bookmark href="http://www.vector.org.uk/" added="1160402450" visited="1208656644">
    <title>Vector Online | Vector - the array languages</title>
    <desc>The Journal of the British APL Association
    </desc>
  </bookmark>
  <bookmark href="http://www.jsoftware.com/forumsearch" added="1159366136" visited="1207757057">
    <title>Forum Search</title>
  </bookmark>
  <bookmark href="http://www.ubuntu.com/" added="1208996707" visited="1209047897">
    <title>Ubuntu Home Page | Ubuntu</title>
  </bookmark>
</xbel>

I can ignore the folder hierarchy. Only the bookmark href and title
are needed, so the output should be two boxed character arrays (or
their equivalent), as follows:

http://www.mozilla.com/products/firefox/central.html
http://fxfeeds.mozilla.com/
http://www.jsoftware.com/
http://www.vector.org.uk/
http://www.jsoftware.com/forumsearch
http://www.ubuntu.com/

Getting Started
Latest Headlines
J Home
Vector Online | Vector - the array languages
Forum Search
Ubuntu Home Page | Ubuntu

This can probably be done with the regex or sax addons, but I don't know how to do it.
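
To make the desired result concrete, here is a rough sketch of the extraction in Python (not J), using the standard xml.etree.ElementTree module rather than regex or SAX. The sample string is a shortened copy of the file above, and the function name extract_bookmarks is only an illustrative choice. What I am looking for is the J equivalent.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XBEL example (DOCTYPE omitted for brevity).
SAMPLE = """<?xml version="1.0"?>
<xbel>
  <title>Bookmarks</title>
  <folder id="rdf:#$FvPhC3" folded="no">
    <title>Bookmarks Toolbar Folder</title>
    <bookmark href="http://fxfeeds.mozilla.com/" modified="1209052290">
      <title>Latest Headlines</title>
    </bookmark>
  </folder>
  <bookmark href="http://www.jsoftware.com/" added="1146880810" visited="1209017433">
    <title>J Home</title>
  </bookmark>
</xbel>
"""


def extract_bookmarks(xml_text):
    """Return (hrefs, titles) for every <bookmark>, ignoring folder nesting."""
    root = ET.fromstring(xml_text)
    hrefs, titles = [], []
    # iter() walks the whole tree in document order, so bookmarks inside
    # folders are picked up as well; folder titles are skipped because
    # only the <title> child of each <bookmark> is read.
    for bm in root.iter("bookmark"):
        hrefs.append(bm.get("href", ""))
        titles.append((bm.findtext("title") or "").strip())
    return hrefs, titles


hrefs, titles = extract_bookmarks(SAMPLE)
print("\n".join(hrefs))
print()
print("\n".join(titles))
```

The two lists correspond to the two boxed arrays described above: one item per bookmark, in document order.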

FYI, XBEL is an XML bookmark format for synchronizing bookmarks across
different browsers over the internet.

-- 
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
唐詩156 李商隱  蟬
    本以高難飽  徒勞恨費聲  五更疏欲斷  一樹碧無情
    薄宦梗猶汎  故園蕪已平  煩君最相警  我亦舉家清