Re: [WSG] Questions about the new european parliament web site

2005-09-14 Thread Alan Gutierrez
* Matthew Cruickshank [EMAIL PROTECTED] [2005-09-13 19:39]:

> > Maybe it would be more educational if someone could describe how
> > these tags might have been built.
> >
> > I'm assuming they are using a .NET platform that has been horribly
> > hacked. Maybe I shouldn't throw blame immediately at .NET, but I
> > have noticed similar things with them.
> >
> > Should the CMS have translated all of the XML stuff into an action
> > or content? Is it just bad code?
>
> It's just bad code. They're almost certainly using Apache Cocoon to
> aggregate a bunch of XML formats together, but they haven't written
> their XSLT to strip namespaces or tags. This was the same problem that
> tvnz.co.nz faced when it launched its Apache Cocoon site.
>
> The problem they're facing is that they've got XML coming from all
> kinds of sources. E.g., they might have a content repository providing
> XML like:
>
>   <content xmlns="c_repo">
>     <section>
>       <title>Some title</title>
>       <para>Some content</para>
>     </section>
>   </content>
>
> And they'll combine that with a couple of RSS feeds, some XHTML, and
> maybe some search engine results, and end up with a lot of XML
> namespaces and non-XHTML tags. This is no bad thing, provided they go
> to the effort of cleaning it up into XHTML.
>
> They could put another stage in their pipeline to filter the
> namespaces and such with some XSLT, using <xsl:stylesheet
> exclude-result-prefixes="..." >.
>
> Cleaning the result up into XHTML as the final stage in the pipeline,
> though, is messy; generally it's better to write wrappers around all
> the XML you're aggregating, convert each to XHTML, and then keep a
> simple sitemap that just aggregates these wrappers together.

I'm following a process to transform the incoming feed into
clean Atom 1.0, and to JTidy the content into XHTML. Then I can
apply a standard set of transforms to convert into RSS 2.0, HTML,
and whatever else down the road.

Getting it clean coming in makes it much easier to keep it clean
going out.
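
For instance, the Atom-to-RSS leg can be a single short stylesheet.
A minimal sketch (the field mapping here is illustrative, not my
actual transform):

    <?xml version="1.0"?>
    <!-- Sketch: turn a clean Atom 1.0 feed into RSS 2.0. -->
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:atom="http://www.w3.org/2005/Atom"
        exclude-result-prefixes="atom">

      <xsl:template match="/atom:feed">
        <rss version="2.0">
          <channel>
            <title><xsl:value-of select="atom:title"/></title>
            <link><xsl:value-of
                select="atom:link[@rel='alternate']/@href"/></link>
            <description><xsl:value-of select="atom:subtitle"/></description>
            <xsl:apply-templates select="atom:entry"/>
          </channel>
        </rss>
      </xsl:template>

      <xsl:template match="atom:entry">
        <item>
          <title><xsl:value-of select="atom:title"/></title>
          <link><xsl:value-of
              select="atom:link[@rel='alternate']/@href"/></link>
          <description><xsl:value-of select="atom:summary"/></description>
        </item>
      </xsl:template>

    </xsl:stylesheet>

Note the exclude-result-prefixes="atom" here too; without it the Atom
namespace declaration would leak onto <rss>, the same disease as the
Parliament site.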

--
Alan Gutierrez - [EMAIL PROTECTED]
- http://engrm.com/blogometer/index.html
- http://engrm.com/blogometer/rss.2.0.xml
**
The discussion list for  http://webstandardsgroup.org/

 See http://webstandardsgroup.org/mail/guidelines.cfm
 for some hints on posting to the list & getting help
**



Re: [WSG] Questions about the new european parliament web site

2005-09-13 Thread Jan Brasna
It is run on Apache, so the Xalan mentioned there looks like the 
product of some bad XSLT transformation. And it's not .NET, it's Java.


--
Jan Brasna aka JohnyB :: www.alphanumeric.cz | www.janbrasna.com



Re: [WSG] Questions about the new european parliament web site

2005-09-13 Thread Christian Montoya
It definitely looks like generator/CMS code that isn't being converted,
but I don't know for sure.

On 9/13/05, Drake, Ted C. [EMAIL PROTECTED] wrote:
> As many of you probably noticed via the WASP web site, a new web site
> was launched for the European Parliament and it is a mess.
>
> Now, I'm the first to admit that I don't know everything and
> graciously ask for advice. When I looked at the source code, I saw
> all sorts of bizarre mixtures of XML, HTML, and what seems to be CMS
> placeholder information. Could people describe some of these bits of
> code? Are they at all valid?
>
> 1. <content></content>
>
> 2. <table xmlns:RT="http://www.europarl.eu.int/publication.engineRT"
>    xmlns:pe="http://www.europarl.eu.int/publication.engine"
>    xmlns:hp="ec.ep.webpub.refeur.service.publication.util.Utils"
>    xmlns:psa="ec.ep.webpub.refeur.service.publication.runtime.PublicationServiceAccessor"
>    xmlns:xalan="http://xml.apache.org/xalan" class="alerttable"
>    cellspacing="0" cellpadding="5" border="0" width="100%">
>
> 3. <contenu></contenu> - I'm assuming this is the French equivalent
>    of <content>
>
> 4. <br xmlns:RT="http://www.europarl.eu.int/publication.engineRT"
>    xmlns:pe="http://www.europarl.eu.int/publication.engine"
>    xmlns:hp="ec.ep.webpub.refeur.service.publication.util.Utils"
>    xmlns:psa="ec.ep.webpub.refeur.service.publication.runtime.PublicationServiceAccessor"
>    xmlns:xalan="http://xml.apache.org/xalan" /> (imagine 25 of these
>    in a paragraph! :o )
>
> Maybe it would be more educational if someone could describe how
> these tags might have been built.
>
> I'm assuming they are using a .NET platform that has been horribly
> hacked. Maybe I shouldn't throw blame immediately at .NET, but I have
> noticed similar things with them.
>
> Should the CMS have translated all of the XML stuff into an action or
> content? Is it just bad code?
>
> <content> is not a valid XHTML 1.0 Transitional tag. Was it ever?
> Will it be in the future?
>
> Thanks
> Ted


Re: [WSG] Questions about the new european parliament web site

2005-09-13 Thread Matthew Cruickshank


> Maybe it would be more educational if someone could describe how
> these tags might have been built.
>
> I'm assuming they are using a .NET platform that has been horribly
> hacked. Maybe I shouldn't throw blame immediately at .NET, but I
> have noticed similar things with them.
>
> Should the CMS have translated all of the XML stuff into an action
> or content? Is it just bad code?
 

It's just bad code. They're almost certainly using Apache Cocoon to
aggregate a bunch of XML formats together, but they haven't written
their XSLT to strip namespaces or tags. This was the same problem that
tvnz.co.nz faced when it launched its Apache Cocoon site.

The problem they're facing is that they've got XML coming from all kinds
of sources. E.g., they might have a content repository providing XML
like:

  <content xmlns="c_repo">
    <section>
      <title>Some title</title>
      <para>Some content</para>
    </section>
  </content>

And they'll combine that with a couple of RSS feeds, some XHTML, and
maybe some search engine results, and end up with a lot of XML
namespaces and non-XHTML tags. This is no bad thing, provided they go to
the effort of cleaning it up into XHTML.

They could put another stage in their pipeline to filter the namespaces
and such with some XSLT, using <xsl:stylesheet
exclude-result-prefixes="..." >.
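
Something like this minimal sketch would do it, reusing the c_repo
example above (the "repo" prefix and the XHTML mapping are mine, for
illustration only):

    <?xml version="1.0"?>
    <!-- Cleanup stage: rewrite repository markup as XHTML and keep -->
    <!-- the c_repo namespace out of the output.                    -->
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:repo="c_repo"
        exclude-result-prefixes="repo">

      <xsl:template match="repo:section">
        <div>
          <h2><xsl:value-of select="repo:title"/></h2>
          <xsl:apply-templates select="repo:para"/>
        </div>
      </xsl:template>

      <xsl:template match="repo:para">
        <p><xsl:apply-templates/></p>
      </xsl:template>

    </xsl:stylesheet>

Without exclude-result-prefixes, every literal result element like
<div> would carry an xmlns:repo="c_repo" declaration into the output,
which is exactly the kind of clutter you see on the Parliament site.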

Cleaning the result up into XHTML as the final stage in the pipeline,
though, is messy; generally it's better to write wrappers around all
the XML you're aggregating, convert each to XHTML, and then keep a
simple sitemap that just aggregates these wrappers together.
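
In Cocoon 2 terms the sitemap end of that might look roughly like this
(the pattern names, sources, and stylesheet paths are hypothetical):

    <!-- Each source gets its own wrapper pipeline that emits clean -->
    <!-- XHTML; the page itself just stitches the parts together.   -->
    <map:pipeline xmlns:map="http://apache.org/cocoon/sitemap/1.0">

      <!-- wrapper: repository XML to XHTML -->
      <map:match pattern="part/news">
        <map:generate src="repo/news.xml"/>
        <map:transform src="stylesheets/repo2xhtml.xsl"/>
        <map:serialize type="xml"/>
      </map:match>

      <!-- wrapper: an external RSS feed to XHTML -->
      <map:match pattern="part/headlines">
        <map:generate src="http://example.org/feed.rss"/>
        <map:transform src="stylesheets/rss2xhtml.xsl"/>
        <map:serialize type="xml"/>
      </map:match>

      <!-- the page aggregates the already-clean parts -->
      <map:match pattern="index.html">
        <map:aggregate element="page">
          <map:part src="cocoon:/part/news"/>
          <map:part src="cocoon:/part/headlines"/>
        </map:aggregate>
        <map:transform src="stylesheets/page2html.xsl"/>
        <map:serialize type="html"/>
      </map:match>

    </map:pipeline>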



.Matthew Cruickshank
http://holloway.co.nz/
