Re: Language Negotiation
David Powell wrote:
> [snip]
> Hosted services (eg Bloglines) won't work unless they store multiple
> separate lists of entries and feed data for each set of request
> headers that the request varies over; this seems very optimistic.
>

This is perhaps the single most important reason for using CDN, at least for now.

> If you want to use a user's language preference to provide them with a
> localized feed, I'd do it at the HTML level - use accept headers to
> provide the correct autodiscovery link, or even better, to sort the
> list of multiple links in a suitable order.
>

And within feed documents in the form of language-qualified alternate links (e.g., , , etc)

> What are you going to do about ids BTW? I'd probably mint new ids for
> the translated entries and feeds, and employ some sort of
> link/extension if you need to be able to associate them.
>

I'd also lean towards minting new ids for translated resources.

- James
Re: Language Negotiation
Wednesday, July 26, 2006, 8:33:55 PM, James M Snell wrote:

> Now imagine that we start to apply machine translation to entries, so
> that we can say, give me all entries, but translate them to French, or
> English, etc. Would that be best done using conneg or separate URIs?

I'd go for separate URLs.

Server-driven conneg (SDN) has its uses, but for a protocol like atom (-syntax), the benefits are very limited, and the risks and costs are high.

With client-driven negotiation (CDN), the client and server both have a better understanding of what they are asking for and what is available - there is less risk of anything going wrong. The only advantage of SDN is that it selects what it thinks is the best feed without any user interaction - but that is something that only needs to be done once anyway.

With SDN, you'll need to employ the Vary header, which can adversely affect caching. Hosted services (eg Bloglines) won't work unless they store multiple separate lists of entries and feed data for each set of request headers that the request varies over; this seems very optimistic.

If you want to use a user's language preference to provide them with a localized feed, I'd do it at the HTML level - use accept headers to provide the correct autodiscovery link, or even better, to sort the list of multiple links in a suitable order.

What are you going to do about ids BTW? I'd probably mint new ids for the translated entries and feeds, and employ some sort of link/extension if you need to be able to associate them.

-- Dave
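The suggestion to sort the list of autodiscovery links by the user's language preference can be sketched roughly as below. This is only an illustration under stated assumptions: the link dictionaries, the `/feed/...` URLs, and the function names are hypothetical, and the Accept-Language parsing is simplified (it only looks at the `q` parameter and picks the longest matching language range).

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (language-range, q) pairs."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a range without an explicit q counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q as "not wanted"
        prefs.append((tag.strip().lower(), q))
    return prefs


def quality_for(lang: str, prefs: list[tuple[str, float]]) -> float:
    """Return the q of the most specific range matching lang (0.0 if none)."""
    lang = lang.lower()
    best_len, best_q = -1, 0.0
    for tag, q in prefs:
        if tag == "*":
            matched_len = 0
        elif lang == tag or lang.startswith(tag + "-"):
            matched_len = len(tag)
        else:
            continue
        if matched_len > best_len:
            best_len, best_q = matched_len, q
    return best_q


def sort_links(links: list[dict], accept_language: str) -> list[dict]:
    """Order feed links so the preferred languages come first.

    sorted() is stable, so links with equal preference keep their order.
    """
    prefs = parse_accept_language(accept_language)
    return sorted(links, key=lambda link: -quality_for(link["hreflang"], prefs))


# Hypothetical per-language feed URLs, one separate URL per translation.
links = [
    {"hreflang": "en", "href": "/feed/en"},
    {"hreflang": "fr", "href": "/feed/fr"},
    {"hreflang": "ja", "href": "/feed/ja"},
]

for link in sort_links(links, "fr, en;q=0.5"):
    print(link["hreflang"], link["href"])
```

Because every language keeps its own URL, the feed responses themselves do not vary on request headers; only the HTML page that lists the links does.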
Language Negotiation
Quick question regarding language negotiation with feeds...

By way of example, IBM's internal blogging infrastructure supports bloggers in every locale IBM does business around the world, meaning that there are posts in many different languages (Japanese, French, German, Chinese, etc). We have a dashboard/planet view that lists all of the most recent posts across the entire system. The content of each post is presented in its original language, but the metadata in the feed is always in English, so we end up with things like...

Dashboard ... ... ...

So, given this, what (c|sh)ould be the expected behavior if a client includes an Accept-Language: fr header, for instance, when GET'ing the dashboard feed?

Now imagine that we start to apply machine translation to entries, so that we can say, give me all entries, but translate them to French, or English, etc. Would that be best done using conneg or separate URIs?

- James
Re: clarification: "escaped"
Antone Roundy wrote:
> Converting & to &amp; and < to &lt; is sufficient

People keep missing this, so I'm going to point it out one more time: there are certain rare circumstances when a right angle bracket (>) MUST be escaped, so if you're just doing ampersands and left angle brackets, that WON'T always be sufficient. To be safe, it's best to always encode all three.

As for CDATA sections, it's worth noting that you wouldn't have been able to syndicate this message thread if you always escaped everything with CDATA.

Regards
James
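As a minimal sketch of "always encode all three": in XML 1.0 the right angle bracket has to be escaped when it appears in the sequence `]]>` in ordinary character data, which is easy to hit with code samples. Python's `xml.sax.saxutils.escape` already replaces `&`, `<` and `>`; the wrapper name and the sample string below are made up for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def escape_for_xml(text: str) -> str:
    """Escape &, < and > so the text is safe as XML character data."""
    # saxutils.escape handles "&" first, so entities are not double-mangled.
    return escape(text)


# Sample text containing "]]>" outside of any CDATA section.
snippet = "if (a[b[0]]>1 && c<2) return;"
doc = '<content type="text">' + escape_for_xml(snippet) + "</content>"

# The parser hands back exactly the original text.
assert ET.fromstring(doc).text == snippet
print(doc)
```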
Re: clarification: "escaped"
* Antone Roundy <[EMAIL PROTECTED]> [2006-07-26 16:45]:
> Or you put the whole thing in a CDATA block.

Which is the easiest option, so long as you remember the edge case of having to turn any `]]>` sequences in the input into `]]>]]>
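One common way to handle the `]]>` edge case is to split the sequence across two adjacent CDATA sections. The sketch below uses that split as an assumption; it is not necessarily the exact replacement string the message had in mind, and the function name and sample payload are hypothetical.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any "]]>" so it cannot end the section early."""
    # "]]" closes out one section and ">" begins the next one.
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"


# HTML payload that happens to contain the CDATA terminator.
payload = "<p>arr[idx[0]]>0 &amp; more</p>"
doc = '<content type="html">' + wrap_in_cdata(payload) + "</content>"

# The parser joins the adjacent sections back into the original payload.
assert ET.fromstring(doc).text == payload
print(doc)
```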
Re: clarification: "escaped"
On Jul 26, 2006, at 3:19 AM, Bill de hÓra wrote:
> A. Pagaltzis wrote:
> > * Robert Sayre <[EMAIL PROTECTED]> [2006-07-26 01:45]:
> > > On 7/25/06, Bill de hÓra <[EMAIL PROTECTED]> wrote:
> > > > And I didn't know whether Atom code could get away with escaping < and &.
> > >
> > > hmm that is an XML fatal error, no doubt, as the ampersand before "nbsp" must be escaped.
> >
> > But he did say “escaping < and &”, so it would be. I’m not sure what Bill’s question even is.
>
> What do I escape, so I know what to unescape?

The point is that after your XML parser has unescaped the content of the element, it should be suitable for handling as HTML. Escape whatever you have to ensure that the consumer gets HTML from their XML parser.

Converting & to &amp; and < to &lt; is sufficient (assuming that you've started with HTML--if you've started with plain text, then you need to double escape, but in that case, you should be using type="text" anyway to save yourself the trouble). You could also convert > to &gt;, " to &quot;, ' to &apos; and any other characters to numeric character references.

Or you put the whole thing in a CDATA block.
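To make the single versus double escaping concrete, here is a rough sketch. It assumes bare `content` elements without the Atom namespace, and the helper names and sample strings are hypothetical. HTML is escaped once for type="html", plain text is escaped once for type="text", and plain text sent as type="html" has to be escaped twice.

```python
import html
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def html_content(html_fragment: str) -> str:
    """Embed an HTML fragment: one level of XML escaping."""
    return '<content type="html">' + escape(html_fragment) + "</content>"


def text_content(plain_text: str) -> str:
    """Embed plain text as type="text": one level of XML escaping."""
    return '<content type="text">' + escape(plain_text) + "</content>"


def plain_text_as_html(plain_text: str) -> str:
    """Embed plain text as type="html": escape to HTML, then again for XML."""
    return html_content(escape(plain_text))


fragment = "<p>Fish &amp; chips&nbsp;&lt; 5 pounds</p>"
plain = "1 < 2 & 3"

# After the XML parser unescapes once, the consumer holds the HTML again.
assert ET.fromstring(html_content(fragment)).text == fragment

# type="text": one unescape by the parser gives the plain text directly.
assert ET.fromstring(text_content(plain)).text == plain

# type="html" with plain text: the parser yields HTML, and one more
# HTML-level unescape is needed to get back to the plain text.
as_html = ET.fromstring(plain_text_as_html(plain)).text
assert as_html == "1 &lt; 2 &amp; 3"
assert html.unescape(as_html) == plain
```

The extra HTML-level step in the last case is the trouble that type="text" avoids.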
Re: clarification: "escaped"
A. Pagaltzis wrote:
> * Robert Sayre <[EMAIL PROTECTED]> [2006-07-26 01:45]:
> > On 7/25/06, Bill de hÓra <[EMAIL PROTECTED]> wrote:
> > > And I didn't know whether Atom code could get away with escaping < and &.
> >
> > hmm that is an XML fatal error, no doubt, as the ampersand before "nbsp" must be escaped.
>
> But he did say “escaping < and &”, so it would be. I’m not sure what Bill’s question even is.

What do I escape, so I know what to unescape?

cheers
Bill