Re: Language Negotiation

2006-07-26 Thread James M Snell



David Powell wrote:
> [snip]
> Hosted services (e.g. Bloglines) won't work unless they store multiple
> separate lists of entries and feed data for each set of request
> headers that the request varies over; this seems very optimistic.
> 

This is perhaps the single most important reason for using client-driven
negotiation (CDN), at least for now.

> If you want to use a user's language preference to provide them with a
> localized feed, I'd do it at the HTML level - use accept headers to
> provide the correct autodiscovery link, or even better, to sort the
> list of multiple links in a suitable order.
> 

And within feed documents in the form of language-qualified alternate
links (e.g., , , etc)

> 
> What are you going to do about ids BTW? I'd probably mint new ids for
> the translated entries and feeds, and employ some sort of
> link/extension if you need to be able to associate them.
> 

I'd also lean towards minting new ids for translated resources.

- James



Re: Language Negotiation

2006-07-26 Thread David Powell


Wednesday, July 26, 2006, 8:33:55 PM, James M Snell wrote:

> Now imagine that we start to apply machine translation to entries, so
> that we can say, give me all entries, but translate them to French, or
> English, etc.  Would that be best done using conneg or separate URIs?

I'd go for separate URLs. Server-driven conneg (SDN) has its uses, but
for a protocol like atom (-syntax), the benefits are very limited, and
the risks and costs are high. With client-driven negotiation (CDN),
the client and server both have a better understanding of what they
are asking for and what is available - there is less risk of anything
going wrong.

The only advantage of SDN is that it selects what it thinks is the
best feed without any user interaction - but that is something that
only needs to be done once anyway.

With SDN, you'll need to employ the Vary header, which can adversely affect
caching.

Hosted services (e.g. Bloglines) won't work unless they store multiple
separate lists of entries and feed data for each set of request
headers that the request varies over; this seems very optimistic.
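To make the caching cost concrete, here is a minimal sketch of how a cache or hosted aggregator might key stored responses once a feed sends `Vary: Accept-Language`. The `cache_key` helper and the feed URL are hypothetical, and a real cache may normalize header values before comparing them.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus every request header named in Vary."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in sorted(names))


# Three readers who all prefer French, each with a slightly different header.
requests = [
    {"Accept-Language": "fr"},
    {"Accept-Language": "fr, en;q=0.5"},
    {"Accept-Language": "fr-FR, fr;q=0.9"},
]
url = "http://example.org/dashboard.atom"  # hypothetical feed URL

varying = {cache_key(url, headers, "Accept-Language") for headers in requests}
fixed = {cache_key(url, headers, "") for headers in requests}
print(len(varying), len(fixed))
```

Under these assumptions the varying feed needs one stored copy per distinct header value, while a feed with its own URL per language needs only one.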

If you want to use a user's language preference to provide them with a
localized feed, I'd do it at the HTML level - use accept headers to
provide the correct autodiscovery link, or even better, to sort the
list of multiple links in a suitable order.
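A rough sketch of the link-sorting idea, assuming the page generator has a list of (hreflang, href) pairs for its autodiscovery links. The function names and URLs are illustrative only, and the language matching is deliberately simplified compared with full HTTP language-range matching.

```python
def language_preferences(accept_language):
    """Map each language tag in an Accept-Language header to its q-value."""
    prefs = {}
    for item in accept_language.split(","):
        tag, _, param = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        q = 1.0
        param = param.strip()
        if param.startswith("q="):
            try:
                q = float(param[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not wanted"
        prefs[tag] = q
    return prefs


def sort_feed_links(accept_language, links):
    """Order (hreflang, href) pairs so the reader's preferred languages come first."""
    prefs = language_preferences(accept_language)

    def weight(link):
        lang = link[0].lower()
        # Fall back from "fr-ca" to "fr", then to the "*" wildcard, then to 0.
        return prefs.get(lang, prefs.get(lang.split("-")[0], prefs.get("*", 0.0)))

    # sorted() is stable, so links with equal weight keep their original order.
    return sorted(links, key=lambda link: -weight(link))


links = [
    ("en", "http://example.org/feed.en.atom"),  # hypothetical feed URLs
    ("fr", "http://example.org/feed.fr.atom"),
    ("ja", "http://example.org/feed.ja.atom"),
]
ordered = sort_feed_links("fr, en;q=0.5", links)
```

Every link stays on the page, so the reader can still pick another language; only the order changes.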


What are you going to do about ids BTW? I'd probably mint new ids for
the translated entries and feeds, and employ some sort of
link/extension if you need to be able to associate them.
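One possible way to mint such ids, sketched under the assumption that a stable id derived from the original id and the target language is acceptable. The `mint_translation_id` helper, the naming scheme, and the example id are all hypothetical; nothing in the thread prescribes them.

```python
import uuid


def mint_translation_id(original_id, lang):
    """Derive a stable urn:uuid id for a translation of an entry."""
    # Assumed naming scheme: original id plus a language suffix.
    name = f"{original_id}#translation-{lang.lower()}"
    return uuid.uuid5(uuid.NAMESPACE_URL, name).urn


original = "tag:example.org,2006:entry-42"  # hypothetical entry id
french = mint_translation_id(original, "fr")
german = mint_translation_id(original, "de")

# Keep the association separately, for whatever link or extension is chosen.
translations = {original: {"fr": french, "de": german}}
```

Because the id is derived rather than random, translating the same entry twice yields the same id.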

-- 
Dave



Language Negotiation

2006-07-26 Thread James M Snell

Quick question regarding language negotiation with feeds...

By way of example, IBM's internal blogging infrastructure supports
bloggers in every locale IBM does business around the world, meaning
that there are posts in many different languages (Japanese, French,
German, Chinese, etc).  We have a dashboard/planet view that lists all
of the most recent posts across the entire system.  The content of each
post is presented in its original language, but the metadata in the
feed is always in English, so we end up with things like...

  
  
Dashboard

  ...


  ...


  ...

  

So, given this, what (c|sh)ould be the expected behavior if a client
includes an Accept-Language: fr header, for instance, when GET'ing the
dashboard feed?

Now imagine that we start to apply machine translation to entries, so
that we can say, give me all entries, but translate them to French, or
English, etc.  Would that be best done using conneg or separate URIs?

- James



Re: clarification: "escaped"

2006-07-26 Thread James Holderness


Antone Roundy wrote:

Converting & to &amp; and < to &lt; is sufficient


People keep missing this so I'm going to point it out one more time: there 
are certain rare circumstances when a right angle bracket (>) MUST be 
escaped so if you're just doing ampersands and left angle brackets that 
WON'T always be sufficient. To be safe it's best to always encode all three.
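A minimal sketch of encoding all three characters. The case where `>` matters is the sequence `]]>`, which XML does not allow to appear literally in character data. `escape_for_xml` is an illustrative name, not a library function.

```python
def escape_for_xml(text):
    """Escape the three characters that can matter in XML character data."""
    # Ampersands go first so the entities added below are not escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


sample = "a < b && c ]]> d"
escaped = escape_for_xml(sample)
print(escaped)
```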


As for CDATA sections, it's worth noting that you wouldn't have been able to 
syndicate this message thread if you always escaped everything with CDATA.


Regards
James



Re: clarification: "escaped"

2006-07-26 Thread A. Pagaltzis

* Antone Roundy <[EMAIL PROTECTED]> [2006-07-26 16:45]:
> Or you put the whole thing in a CDATA block.

Which is the easiest option, so long as you remember the edge
case of having to turn any `]]>` sequences in the input into
`]]>]]&gt;<![CDATA[`.
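The `]]>` edge case can be sketched as follows. This uses a common variant that splits the terminator across two CDATA sections rather than emitting an escaped `>` between them; `wrap_cdata` is a hypothetical helper, and the round trip through Python's ElementTree is only a sanity check.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


payload = "<p>A CDATA section ends with ]]> in XML</p>"
document = '<content type="html">' + wrap_cdata(payload) + "</content>"

# The parser joins the adjacent sections back into the original string.
parsed = ET.fromstring(document).text
```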

Re: clarification: "escaped"

2006-07-26 Thread Antone Roundy


On Jul 26, 2006, at 3:19 AM, Bill de hÓra wrote:

A. Pagaltzis wrote:

* Robert Sayre <[EMAIL PROTECTED]> [2006-07-26 01:45]:

On 7/25/06, Bill de hÓra <[EMAIL PROTECTED]> wrote:

And I didn't know whether Atom code could get away with
escaping < and &.

 hmm

that is an XML fatal error, no doubt, as the ampersand before
"nbsp" must be escaped.

But he did say “escaping < and &”, so it would be. I’m not sure
what Bill’s question even is.


What do I escape, so I know what to unescape?


The point is that after your XML parser has unescaped the content of  
the element, it should be suitable for handling as HTML.  Escape  
whatever you have to ensure that the consumer gets HTML from their  
XML parser.  Converting & to &amp; and < to &lt; is sufficient  
(assuming that you've started with HTML--if you've started with plain  
text, then you need to double escape, but in that case, you should be  
using type="text" anyway to save yourself the trouble).  You could  
also convert > to &gt;, " to &quot;, ' to &apos; and any other  
characters to numeric character references.  Or you put the whole  
thing in a CDATA block.
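The round trip described here can be checked with a short sketch using Python's standard library. The `atom_content` helper is illustrative and omits namespaces and everything else a real Atom serializer would handle.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape  # escapes &, < and >


def atom_content(value, content_type):
    """Serialize a bare content element, escaping the value once for XML."""
    return f'<content type="{content_type}">{escape(value)}</content>'


# Starting from HTML: one level of escaping, and the consumer gets HTML back.
html = "<p>Fish &amp; chips &lt; 5 quid</p>"
from_html = ET.fromstring(atom_content(html, "html")).text

# Starting from plain text sent as type="html": escape to HTML, then for XML.
plain = "Fish & chips < 5 quid"
double = atom_content(escape(plain), "html")

# Starting from plain text sent as type="text": a single escape is enough.
single = atom_content(plain, "text")
```

In the `type="html"` case the XML parser hands back `Fish &amp; chips &lt; 5 quid`, which is valid HTML for the original text; in the `type="text"` case it hands back the plain text directly.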




Re: clarification: "escaped"



A. Pagaltzis wrote:

* Robert Sayre <[EMAIL PROTECTED]> [2006-07-26 01:45]:

On 7/25/06, Bill de hÓra <[EMAIL PROTECTED]> wrote:

And I didn't know whether Atom code could get away with
escaping < and &.

 hmm

that is an XML fatal error, no doubt, as the ampersand before
"nbsp" must be escaped.


But he did say “escaping < and &”, so it would be. I’m not sure
what Bill’s question even is.


What do I escape, so I know what to unescape?

cheers
Bill