On Wed, Jan 05, 2005 at 04:32:07PM -0500, William Ballard wrote:
echo '</Long-Description></entry></packages>'
^^^
Should have closed the CDATA tag here. The short description tag should probably be wrapped in CDATA too. If any package description contains "]]>", the CDATA section will end early and the XML will break.
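A minimal sketch of one way around the "]]>" problem, assuming real `<![CDATA[ ... ]]>` sections and a POSIX sed. The function name `cdata_safe` is made up for illustration. It splits every "]]>" across two adjacent CDATA sections, so a parser reassembles the original text:

```shell
#!/bin/bash
# Hypothetical helper: make text safe to place inside a CDATA section.
# Each "]]>" becomes "]]" + end of section + start of new section + ">".
cdata_safe() {
    sed 's/]]>/]]]]><![CDATA[>/g'
}

# Example: filter one line that contains the terminator sequence.
printf '%s\n' 'a ]]> b' | cdata_safe
```

The filter would have to run on the description text before the CDATA wrapper is added, not on the finished XML.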
I was able to successfully turn the sarge/contrib (i386) Packages file into a valid XML file with the following modified version of your script. It is still definitely a hack, though, especially the way it escapes non-ASCII characters.
Since it contains a few long lines, I attached it. It's under 1k in size.
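For reference, a small sketch of the markup escaping on its own, assuming only the three characters &, < and > need handling. The name `xml_escape` is hypothetical. The order matters: & has to be replaced first, otherwise the ampersands introduced by the other two substitutions would be escaped a second time.

```shell
#!/bin/bash
# Hypothetical helper: escape the XML markup characters in a text stream.
# In a sed replacement a bare & means "the matched text", so a literal
# ampersand has to be written as \&.
xml_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Example: escape one line of text.
printf '%s\n' 'libfoo (>= 1.0) & <bar>' | xml_escape
```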
hth
cu, sven
#!/bin/bash
PACKAGES=$1
CAT=cat
if [[ ! -f ${PACKAGES} ]]; then
echo "${PACKAGES} not found" >&2
exit 1
fi
if file ${PACKAGES} | grep -q gzip ; then
CAT=zcat
fi
echo '<packages><entry>'
${CAT} ${PACKAGES} \
| grep-dctrl . \
| sed -r \
-e 's/&/\&amp;/g;s/</\&lt;/g;s/>/\&gt;/g;s/�/\&#361;/g;s/�/\&#351;/g;s/�/\&#355;/g' \
-e 's/(Description): (.+)/<\1><Short-Description>\2<\/Short-Description><Long-Description><CDATA>/' \
-e 's/^([^: ]+): (.+)/<\1><CDATA>\2<\/CDATA><\/\1>/' \
-e 's/^$/<\/CDATA><\/Long-Description><\/Description><\/entry><entry>/' \
| head -n-1
echo '</CDATA></Long-Description></Description></entry></packages>'
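
The file/grep check for gzip could probably be dropped. This is only a sketch, assuming GNU gzip, whose -f option together with -c copies input it does not recognize unchanged to stdout. The function name `read_maybe_gzipped` is made up:

```shell
#!/bin/bash
# Hypothetical helper: print a file whether or not it is gzip-compressed.
# -c writes to stdout, -d decompresses, -f passes non-gzip input through.
read_maybe_gzipped() {
    gzip -cdf "$1"
}

# Example: the same stanza read from a plain and a compressed file.
tmp=$(mktemp -d)
printf 'Package: demo\n' > "$tmp/Packages"
gzip -c "$tmp/Packages" > "$tmp/Packages.gz"
read_maybe_gzipped "$tmp/Packages"
read_maybe_gzipped "$tmp/Packages.gz"
rm -rf "$tmp"
```

With something like this the script would not need the CAT variable at all.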

