Re: [WSG] Questions about the new european parliament web site
* Matthew Cruickshank <[EMAIL PROTECTED]> [2005-09-13 19:39]:

> > Maybe it would be more educational if someone could describe how these
> > tags might have been built. I'm assuming they are using a .NET platform
> > that has been horribly hacked. Maybe I shouldn't throw blame immediately
> > at .NET, but I have noticed similar things with it. Should the CMS have
> > translated all of the XML stuff into an action or content? Is it just
> > bad code?
>
> It's just bad code. They're almost certainly using Apache Cocoon to
> aggregate a bunch of XML formats together, but they haven't written their
> XSLT to strip namespaces or tags. This was the same problem that
> tvnz.co.nz faced when it launched its Apache Cocoon site.
>
> The problem they're facing is that they've got XML coming from all kinds
> of sources. E.g., they might have a content repository providing XML like:
>
>   <content xmlns="c_repo">
>     <section>
>       <title>Some title</title>
>       <para>Some content</para>
>     </section>
>   </content>
>
> And they'll combine that with a couple of RSS feeds, some XHTML, and
> maybe some search engine results, and end up with a lot of XML namespaces
> and non-XHTML tags. This is no bad thing, provided they go to the effort
> of cleaning it up into XHTML. They could put another stage in their
> pipeline to filter out the namespaces and such with some XSLT, using
> <xsl:stylesheet exclude-result-prefixes="..."> ...
>
> Cleaning the result up into XHTML as the final stage in the pipeline,
> though, is messy, and generally it's better to write wrappers around all
> the XML you're aggregating, convert this to XHTML, and then keep a simple
> sitemap that just aggregates these wrappers together.

I'm following a process to transform the incoming feed into clean Atom 1.0,
and to JTidy the content into XHTML. Then I can take a standard set of
transforms to convert into RSS 2.0, HTML, and whatever else down the road.
Getting it clean coming in makes it much easier to keep it clean going out.
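The namespace-stripping stage described above would normally be an XSLT step in the Cocoon pipeline (via exclude-result-prefixes); as a runnable illustration of the same idea, here is a sketch in Python using the standard-library ElementTree. The input document is the hypothetical content-repository example quoted above, not anything from the actual site:

```python
# Sketch: strip namespaces from aggregated XML so that only plain
# (XHTML-style) element names survive. Illustrative only -- on a real
# Cocoon site this cleanup would be an XSLT pipeline stage.
import xml.etree.ElementTree as ET

def strip_namespaces(xml_text):
    root = ET.fromstring(xml_text)
    for el in root.iter():
        # ElementTree encodes a namespaced tag as "{uri}localname";
        # keep just the local name.
        if el.tag.startswith("{"):
            el.tag = el.tag.split("}", 1)[1]
        # Drop attributes that live in a namespace (xmlns declarations
        # themselves are already folded into the tag names by ET).
        for name in list(el.attrib):
            if name.startswith("{"):
                del el.attrib[name]
    return ET.tostring(root, encoding="unicode")

# The hypothetical content-repository fragment from the message above.
aggregated = ('<content xmlns="c_repo"><section><title>Some title</title>'
              '<para>Some content</para></section></content>')
print(strip_namespaces(aggregated))
```

Running this emits the same tree with the `c_repo` default namespace removed, which is what the missing cleanup stage should have done for the `xmlns:xalan` and `xmlns:pe` residue on the live pages.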
--
Alan Gutierrez - [EMAIL PROTECTED]
- http://engrm.com/blogometer/index.html
- http://engrm.com/blogometer/rss.2.0.xml

**
The discussion list for http://webstandardsgroup.org/
See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
**
Re: [WSG] Questions about the new european parliament web site
It is run on Apache, so the Xalan mention there looks like it's a product of some bad XSLT transformation. And it's not .NET, it's Java.

--
Jan Brasna aka JohnyB :: www.alphanumeric.cz | www.janbrasna.com
Re: [WSG] Questions about the new european parliament web site
It definitely looks like generator/CMS code that isn't being converted, but I don't know for sure.

On 9/13/05, Drake, Ted C. <[EMAIL PROTECTED]> wrote:
> As many of you probably noticed via the WaSP web site, a new web site was
> launched for the European Parliament and it is a mess.
>
> Now, I'm the first to admit that I don't know everything and graciously
> ask for advice. When I looked at the source code, I saw all sorts of
> bizarre mixtures of XML, HTML, and what seems to be CMS placeholder
> information. Could people describe some of these bits of code? Are they
> at all valid?
>
> 1. <content></content>
>
> 2. <table xmlns:RT="http://www.europarl.eu.int/publication.engineRT"
>    xmlns:pe="http://www.europarl.eu.int/publication.engine"
>    xmlns:hp="ec.ep.webpub.refeur.service.publication.util.Utils"
>    xmlns:psa="ec.ep.webpub.refeur.service.publication.runtime.PublicationServiceAccessor"
>    xmlns:xalan="http://xml.apache.org/xalan"
>    class="alerttable" cellspacing="0" cellpadding="5" border="0" width="100%">
>
> 3. <contenu></contenu> - I'm assuming this is the French equivalent of
>    <content>
>
> 4. <br xmlns:RT="http://www.europarl.eu.int/publication.engineRT"
>    xmlns:pe="http://www.europarl.eu.int/publication.engine"
>    xmlns:hp="ec.ep.webpub.refeur.service.publication.util.Utils"
>    xmlns:psa="ec.ep.webpub.refeur.service.publication.runtime.PublicationServiceAccessor"
>    xmlns:xalan="http://xml.apache.org/xalan" />
>    (imagine 25 of these in a paragraph! :o )
>
> Maybe it would be more educational if someone could describe how these
> tags might have been built. I'm assuming they are using a .NET platform
> that has been horribly hacked. Maybe I shouldn't throw blame immediately
> at .NET, but I have noticed similar things with it.
>
> Should the CMS have translated all of the XML stuff into an action or
> content? Is it just bad code?
>
> <content> is not a valid XHTML 1.0 Transitional tag. Was it ever? Will it
> be in the future?
>
> Thanks
> Ted
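One way to quantify the residue Ted lists above is to parse the markup and flag every element that sits outside the XHTML namespace. A minimal sketch, using a hypothetical fragment modelled on the quoted snippets (not the live page, which would first need to be fetched and tidied into well-formed XML):

```python
# Sketch: list elements that are namespaced but not XHTML -- i.e. the
# kind of publication-engine leftovers quoted in the message above.
# The input below is a made-up fragment for illustration.
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def foreign_elements(xml_text):
    """Return the tags of elements outside the XHTML namespace."""
    foreign = []
    for el in ET.fromstring(xml_text).iter():
        if el.tag.startswith("{") and not el.tag.startswith("{%s}" % XHTML_NS):
            foreign.append(el.tag)
    return foreign

page = ('<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        '<contenu xmlns="urn:example:cms">stale placeholder</contenu>'
        '</body></html>')
print(foreign_elements(page))  # -> ['{urn:example:cms}contenu']
```

A non-empty result is exactly the symptom Ted describes: CMS placeholder elements like `<contenu>` surviving into the delivered page instead of being transformed away.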
Re: [WSG] Questions about the new european parliament web site
> Maybe it would be more educational if someone could describe how these
> tags might have been built. I'm assuming they are using a .NET platform
> that has been horribly hacked. Maybe I shouldn't throw blame immediately
> at .NET, but I have noticed similar things with it. Should the CMS have
> translated all of the XML stuff into an action or content? Is it just
> bad code?

It's just bad code. They're almost certainly using Apache Cocoon to aggregate a bunch of XML formats together, but they haven't written their XSLT to strip namespaces or tags. This was the same problem that tvnz.co.nz faced when it launched its Apache Cocoon site.

The problem they're facing is that they've got XML coming from all kinds of sources. E.g., they might have a content repository providing XML like:

  <content xmlns="c_repo">
    <section>
      <title>Some title</title>
      <para>Some content</para>
    </section>
  </content>

And they'll combine that with a couple of RSS feeds, some XHTML, and maybe some search engine results, and end up with a lot of XML namespaces and non-XHTML tags. This is no bad thing, provided they go to the effort of cleaning it up into XHTML. They could put another stage in their pipeline to filter out the namespaces and such with some XSLT, using <xsl:stylesheet exclude-result-prefixes="..."> ...

Cleaning the result up into XHTML as the final stage in the pipeline, though, is messy, and generally it's better to write wrappers around all the XML you're aggregating, convert this to XHTML, and then keep a simple sitemap that just aggregates these wrappers together.

--
Matthew Cruickshank
http://holloway.co.nz/
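The "wrappers first, aggregate last" approach described above can be sketched in Python: each source format gets its own wrapper that emits clean XHTML, and the final aggregation stage only glues already-clean fragments together. The function names and the `c_repo` input are illustrative (in Cocoon the wrappers would be XSLT stages wired up in the sitemap):

```python
# Sketch of the wrapper approach: normalise each XML source into clean
# XHTML *before* aggregation, so the final stage stays trivially simple.
import xml.etree.ElementTree as ET

def wrap_repo_content(xml_text):
    """Wrapper for the hypothetical content-repository format: emit an
    XHTML <div> with no foreign namespaces."""
    src = ET.fromstring(xml_text)
    div = ET.Element("div", {"class": "repo-section"})
    ns = "{c_repo}"  # namespace used in the example above
    for section in src.iter(ns + "section"):
        h2 = ET.SubElement(div, "h2")
        h2.text = section.findtext(ns + "title")
        p = ET.SubElement(div, "p")
        p.text = section.findtext(ns + "para")
    return div

def aggregate(fragments):
    """Final stage: just concatenate already-clean XHTML fragments."""
    body = ET.Element("body")
    body.extend(fragments)
    return ET.tostring(body, encoding="unicode")

repo = ('<content xmlns="c_repo"><section><title>Some title</title>'
        '<para>Some content</para></section></content>')
print(aggregate([wrap_repo_content(repo)]))
```

Because every fragment is converted before aggregation, no `exclude-result-prefixes` cleanup is needed at the end -- which is the point being made about keeping the sitemap simple.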