While implementing mod_sift for Prosody, I noticed some possible improvements and a few open issues. Some of them follow.

1. Remove disallowed child elements for filtered messages and presence.

Here's a typical identi.ca message:

<message from="[email protected]/xmpp001daemon" to="[email protected]" type="chat">
  <body>evan: RT @sil doom. the Shuttle computer I'm setting up for dad can't read the hard drive. Won't boot from USB, has no CD drive, I have no USB ... [23931040]</body>
  <html xmlns="http://jabber.org/protocol/xhtml-im">
    <body xmlns="http://www.w3.org/1999/xhtml">
      : RT @doom. the Shuttle computer I'm setting up for dad can't read the hard drive. Won't boot from USB, has no CD drive, I have no USB ...
      <a href="http://identi.ca/evan">evan</a>
      <span class="vcard">
        <a title="Stuart Langridge" class="url" href="http://identi.ca/user/279">
          <span class="fn nickname">sil</span>
        </a>
      </span>
      <a href="http://identi.ca/conversation/24011046#notice-23931040">[23931040]</a>
    </body>
  </html>
  <entry xmlns="http://www.w3.org/2005/Atom">
    <source>
      <title>evan - Identi.ca</title>
      <link href="http://identi.ca/evan" />
      <link rel="self" type="application/atom+xml" href="http://identi.ca/evan" />
      <link rel="license" href="http://creativecommons.org/licenses/by/3.0/" />
      <icon>http://avatar.identi.ca/1-96-20090819204503.jpeg</icon>
    </source>
    <title>RT @sil doom. the Shuttle computer I'm setting up for dad can't read the hard drive. Won't boot from USB, has no CD drive, I have no USB ...</title>
    <author>
      <name>evan</name>
      <uri>http://identi.ca/user/1</uri>
    </author>
    <actor xmlns="http://activitystrea.ms/spec/1.0/">
      <object-type>http://activitystrea.ms/schema/1.0/person</object-type>
      <id xmlns="http://www.w3.org/2005/Atom">http://identi.ca/user/1</id>
      <title xmlns="http://www.w3.org/2005/Atom">Evan Prodromou</title>
      <link rel="alternate" type="text/html" href="http://identi.ca/evan" xmlns="http://www.w3.org/2005/Atom" />
      <link rel="avatar" type="image/jpeg" xmlns:ns1="http://purl.org/syndication/atommedia" ns1:height="353" xmlns:ns2="http://purl.org/syndication/atommedia" ns2:width="353" href="http://avatar.identi.ca/1-353-20090819204502.jpeg" xmlns="http://www.w3.org/2005/Atom" />
      <link rel="avatar" type="image/jpeg" xmlns:ns1="http://purl.org/syndication/atommedia" ns1:height="96" xmlns:ns2="http://purl.org/syndication/atommedia" ns2:width="96" href="http://avatar.identi.ca/1-96-20090819204503.jpeg" xmlns="http://www.w3.org/2005/Atom" />
      <link rel="avatar" type="image/jpeg" xmlns:ns1="http://purl.org/syndication/atommedia" ns1:height="48" xmlns:ns2="http://purl.org/syndication/atommedia" ns2:width="48" href="http://avatar.identi.ca/1-48-20090819204503.jpeg" xmlns="http://www.w3.org/2005/Atom" />
      <link rel="avatar" type="image/jpeg" xmlns:ns1="http://purl.org/syndication/atommedia" ns1:height="24" xmlns:ns2="http://purl.org/syndication/atommedia" ns2:width="24" href="http://avatar.identi.ca/1-24-20090819204503.jpeg" xmlns="http://www.w3.org/2005/Atom" />
      <point xmlns="http://www.georss.org/georss">45.5088375 -73.587809</point>
      <preferredUsername xmlns="http://portablecontacts.net/spec/1.0">evan</preferredUsername>
      <displayName xmlns="http://portablecontacts.net/spec/1.0">Evan Prodromou</displayName>
      <note xmlns="http://portablecontacts.net/spec/1.0">Montreal hacker and entrepreneur. Founder of identi.ca, lead developer of StatusNet, CEO of StatusNet Inc.</note>
      <address xmlns="http://portablecontacts.net/spec/1.0">
        <formatted>Montreal, Quebec, Canada</formatted>
      </address>
      <urls xmlns="http://portablecontacts.net/spec/1.0">
        <type>homepage</type>
        <value>http://evan.prodromou.name/</value>
        <primary>true</primary>
      </urls>
    </actor>
    <link rel="alternate" type="text/html" href="http://identi.ca/notice/23931040" />
    <id>http://identi.ca/notice/23931040</id>
    <published>2010-03-06T20:01:22+00:00</published>
    <updated>2010-03-06T20:01:22+00:00</updated>
    <link rel="ostatus:conversation" href="http://identi.ca/conversation/24011046" />
    <forward ref="http://identi.ca/notice/23928915" href="http://identi.ca/notice/23928915" xmlns="http://ostatus.org/schema/1.0" />
    <content type="html">RT @<span class="vcard"><a href="http://identi.ca/user/279" class="url" title="Stuart Langridge"><span class="fn nickname">sil</span></a></span> doom. the Shuttle computer I'm setting up for dad can't read the hard drive. Won't boot from USB, has no CD drive, I have no USB ...</content>
  </entry>
</message>

Look at the size of that. Should I laugh or cry? This should be reduced to:

<message from="[email protected]/xmpp001daemon" to="[email protected]" type="chat">
  <body>evan: RT @sil doom. the Shuttle computer I'm setting up for dad can't read the hard drive. Won't boot from USB, has no CD drive, I have no USB ... [23931040]</body>
</message>

for mobile clients. That's roughly 6% of the original (~4,257 bytes reduced to ~262 bytes). I think without this behavior, message filtering is pretty useless.

Useless fact: Watching offline messages from identi.ca using up bandwidth in slow motion (slow, expensive GPRS with payment based on bandwidth usage) is what got mod_sift for Prosody started.

2. Offline messages.

A SIFT message filter which has some <allow/> elements doesn't scale well for large numbers of offline messages.
Currently a server with an SQL backend may do something like this:

  1. resource becomes available
  2. SELECT * FROM offline_messages WHERE JID = ${account_jid}
  3. loop over the resultset and send all messages to the newly available resource
  4. DELETE FROM offline_messages WHERE JID = ${account_jid}

With per-message filtering, steps 3 and 4 change to something like this:

  3. loop over the resultset and send all _allowed_ messages to the newly available resource
  4. for each sent message, DELETE FROM offline_messages WHERE JID = ${account_jid} AND MESSAGEID = ${unique_message_id}

This could be optimized somewhat, but would still be relatively complex. I've added this here to get comments from other server developers. How significant is this overhead in your opinion?

3. Automatic IQ responses.

Currently SIFT allows blocking IQs based on payload. The server auto-responds with an error. It would be interesting if the server could be made to reply with an IQ result preset by the client. Maybe something along these lines:

<sift xmlns='urn:xmpp:sift:1'>
  <iq>
    <reply name='query' ns='http://jabber.org/protocol/disco#info'>
      [...] service discovery reply payload here [...]
    </reply>
  </iq>
</sift>

The above example has some issues (think service discovery nodes), but the approach is worth considering regardless. It would fit version replies and similar static responses well.

4. mod_sift for Prosody

Our implementation is a work in progress, but it does the basics. Hopefully we'll have some implementation experience soon. Now if only those client developers would hurry up.

Docs: http://code.google.com/p/prosody-modules/wiki/mod_sift
Source: http://code.google.com/p/prosody-modules/source/browse/mod_sift/mod_sift.lua

--
Waqas Hussain
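
Appendix to item 2: to make the per-message offline flow easier to discuss, here is a minimal Python sketch of it. It is not how mod_sift or any particular server does it. The table layout (jid, message_id, stanza), the function names, and the is_allowed callback (a stand-in for evaluating the account's SIFT filter) are all assumptions made for illustration, and an in-memory SQLite database stands in for the real backend.

```python
import sqlite3
from typing import Callable, List, Tuple


def deliver_offline(
    conn: sqlite3.Connection,
    account_jid: str,
    is_allowed: Callable[[str], bool],
    send: Callable[[str], None],
) -> int:
    """Send only the allowed offline messages and delete exactly those."""
    rows = conn.execute(
        "SELECT message_id, stanza FROM offline_messages "
        "WHERE jid = ? ORDER BY message_id",
        (account_jid,),
    ).fetchall()

    delivered: List[Tuple[str, int]] = []
    for message_id, stanza in rows:
        if is_allowed(stanza):  # hypothetical SIFT filter check
            send(stanza)
            delivered.append((account_jid, message_id))

    # Delete per delivered message in one transaction; messages the
    # filter rejected stay in the store for a later, less picky resource.
    with conn:
        conn.executemany(
            "DELETE FROM offline_messages WHERE jid = ? AND message_id = ?",
            delivered,
        )
    return len(delivered)


def make_demo_store() -> sqlite3.Connection:
    """Build a small in-memory store (assumed schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE offline_messages ("
        "jid TEXT NOT NULL, message_id INTEGER NOT NULL, "
        "stanza TEXT NOT NULL, PRIMARY KEY (jid, message_id))"
    )
    conn.executemany(
        "INSERT INTO offline_messages (jid, message_id, stanza) VALUES (?, ?, ?)",
        [
            ("user@example.com", 1, "<message><body>hello</body></message>"),
            ("user@example.com", 2, "<message><event/></message>"),
            ("user@example.com", 3, "<message><body>again</body></message>"),
            ("other@example.com", 1, "<message><body>not yours</body></message>"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    store = make_demo_store()
    outbox: List[str] = []
    # Crude stand-in filter: only messages carrying a <body/> are allowed.
    count = deliver_offline(
        store, "user@example.com", lambda s: "<body>" in s, outbox.append
    )
    print(count, "delivered;", len(outbox), "stanzas in outbox")
```

Batching the deletes into a single transaction is one of the possible optimizations; the cost that remains is that the server has to track which rows were actually sent instead of wiping the whole account in one statement.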
