On Sat, 2018-10-06 at 12:45 +0200, Bernd Vogelgesang wrote:
Hi,

We work a lot with GPX files created with the Locus app on Android.

Unfortunately, the "desc" field is created with HTML tags (for whatever
reason), so it is quite tedious to extract the plain-text
information from it.

Does anyone know a way to get rid of the HTML and preserve only the
plain-text information?

Example:

<!-- desc_gen:start -->
<font color="#ff000000"><table width="100%"><tr><td width="100%"
align="center">
<!-- desc_user:start -->
This is the information I would like to keep
<!-- desc_user:end -->
</td></tr><tr><td><table width="100%"></table></td></tr></

   A regular expression like "<[^>]+>" should match each tag, that is,
everything from a "<" up to the next ">". Depending on where the
expression is used, it may be necessary to escape some of its symbols
to avoid misinterpretation.

   Avoid a regular expression like "<.*>": it is greedy and will match
everything from the first "<" to the last ">", including any other
"<" and ">" characters and the text between them.
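
   As a rough illustration of the difference between the two patterns, here is a small sketch in Python (for example in the QGIS Python console). The sample string is a shortened, made-up stand-in for a "desc" value, and Python's `re` module is only assumed to behave like the regex engine behind the QGIS expression.

```python
import re

# Shortened, hypothetical stand-in for one "desc" value.
sample = '<td align="center">This is the information I would like to keep</td>'

# Negated character class: each match stops at the first ">".
tag = re.compile(r"<[^>]+>")
# Greedy pattern: a single match runs from the first "<" to the last ">".
greedy = re.compile(r"<.*>")

stripped = tag.sub("", sample)
over_stripped = greedy.sub("", sample)

print(repr(stripped))       # only the tags are gone
print(repr(over_stripped))  # the text between the tags is gone as well
```

   With the greedy pattern the wanted text disappears along with the tags, which is why the negated character class is the safer choice here.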

   HTH
Hi Fernando,
many thanks for your hint. Regex is definitely the way to go, if only it were a little more intuitive.

 regexp_replace( "desc",'<[^>]+>','')

in the field calculator did the trick for all entries with well-formed HTML, so only a few entries with broken HTML are left to process manually.
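
For the remaining entries with incomplete markup, something along these lines might help. It is only a sketch in Python, not something from the field calculator: `strip_markup` is a hypothetical helper name, and the rule "drop a dangling `<...` at the end of the text" is an assumption based on the truncated example earlier in this thread.

```python
import re

def strip_markup(desc):
    """Hypothetical helper: remove comments, complete tags and a dangling tag."""
    # Comments first, in case one contains a ">" in its body.
    text = re.sub(r"<!--.*?-->", "", desc, flags=re.DOTALL)
    # Complete tags, same pattern as in the field calculator expression.
    text = re.sub(r"<[^>]+>", "", text)
    # Assumption: a "<" with no closing ">" before the end is a cut-off tag.
    text = re.sub(r"<[^>]*$", "", text)
    # Collapse the whitespace and line breaks the tags leave behind.
    return " ".join(text.split())

# The example from the original question, including its truncated end.
sample = (
    '<!-- desc_gen:start -->\n'
    '<font color="#ff000000"><table width="100%"><tr><td width="100%"\n'
    'align="center">\n'
    '<!-- desc_user:start -->\n'
    'This is the information I would like to keep\n'
    '<!-- desc_user:end -->\n'
    '</td></tr><tr><td><table width="100%"></table></td></tr></'
)
result = strip_markup(sample)
print(result)
```

Note that a literal "<" in the description text itself (as in "a < b") would also be treated as the start of a tag, so the output is worth a quick check.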

Thanx a lot,
Bernd


Is there, e.g., a way to search for < and > and then delete them and all
text within programmatically?


Cheers,

Bernd

_______________________________________________
Qgis-user mailing list
[email protected]
List info: https://lists.osgeo.org/mailman/listinfo/qgis-user
Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-user

   Roxo

