Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-27 Thread Timothée Jaussoin
Hi everyone,

Just bringing this discussion back up. I'd like to see what we can decide at
the moment based on the information that we have here.
I'll add a couple of pieces of information; these are simple technical
limitations that will guide our decisions regarding this problem.

ejabberd restricts the size of JIDs, node IDs and many other things to
varchar(191) for MySQL. I'm doing something similar in Movim because of the
key size limit in MySQL (it's less of a problem for the other SQL databases).
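
For readers wondering where 191 comes from: the message doesn't say, but the usual explanation is InnoDB's 767-byte index key prefix limit (older row formats) combined with utf8mb4 using up to 4 bytes per character. A minimal sketch under that assumption (the constants and function names are illustrative, not taken from ejabberd or Movim):

```python
# Assumption: 767-byte index prefix limit and utf8mb4 (max 4 bytes per
# character). Neither number is stated in this thread.
INDEX_PREFIX_LIMIT_BYTES = 767
MAX_BYTES_PER_CHAR_UTF8MB4 = 4


def max_indexable_chars(limit_bytes: int = INDEX_PREFIX_LIMIT_BYTES,
                        bytes_per_char: int = MAX_BYTES_PER_CHAR_UTF8MB4) -> int:
    """Largest varchar length whose worst-case encoding fits in the index."""
    return limit_bytes // bytes_per_char


def fits_in_column(identifier: str, max_chars: int = 191) -> bool:
    """True if the identifier fits in a varchar(max_chars) column.

    len() counts Unicode code points, which is how varchar lengths are
    counted for a utf8mb4 column.
    """
    return len(identifier) <= max_chars
```

With these assumptions, `max_indexable_chars()` gives the 191 mentioned above.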

So we already have in the wild some servers that will not accept those long 
JIDs and IDs.

Some web apps that use XMPP as a backend, like Movim or SàT (afaik), map Pubsub
resources to URLs. Here's an example:
https://nl.movim.eu/?node/pubsub.movim.eu/Movim/a-new-release-is-coming-help-up-with-the-translations-WM4Yrf
On my side I'm slugifying things to make those node and item ids easier to
read, but I expect to run into escaping problems in some cases.
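
To illustrate the escaping concern, here is a small sketch of percent-encoding each identifier before placing it in a URL path. The URL layout only mirrors the example above; `pubsub_url` and `parse_pubsub_path` are hypothetical helpers, not Movim's actual code.

```python
from typing import List
from urllib.parse import quote, unquote


def pubsub_url(base: str, server: str, node: str, item: str) -> str:
    """Build a URL from Pubsub identifiers, percent-encoding each one."""
    # safe="" also encodes "/", so an ID that contains a slash cannot be
    # mistaken for a path separator.
    parts = [quote(part, safe="") for part in (server, node, item)]
    return base + "?node/" + "/".join(parts)


def parse_pubsub_path(path: str) -> List[str]:
    """Split an encoded path and decode each segment back to the raw ID."""
    return [unquote(segment) for segment in path.split("/")]
```

Encoding per segment keeps the mapping reversible even when a node or item id contains "/", "?" or spaces; slugifying alone would lose that information.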

Related to that, we have Bookmarks 2 under discussion:
https://xmpp.org/extensions/inbox/bookmarks2.html. This XEP states that
"Each item SHALL have, as item id, the Room JID of the chatroom". Does this
mean that Pubsub item ids have the same definition as JIDs?

On my side I'd propose restricting JIDs to something shorter (like 128 UTF-8
characters) to be sure that they can be stored and indexed properly in
databases, and defining all the Pubsub/PEP IDs as having the same definition
as JIDs.

Regards,

Timothée Jaussoin

On Wednesday, 7 March 2018 at 09:20 -0700, Peter Saint-Andre wrote:
> On 3/6/18 1:02 AM, Jonas Wielicki wrote:
> > Hi Peter,
> > 
> > Thank you very much for the clarification, comments inline.
> > 
> > On Dienstag, 6. März 2018 02:59:04 CET Peter Saint-Andre wrote:
> > > On 3/5/18 12:17 AM, Jonas Wielicki wrote:
> > > > On Sonntag, 4. März 2018 19:42:39 CET Peter Saint-Andre wrote:
> > > > > On 3/4/18 10:54 AM, Jonas Wielicki wrote:
> > > > > > On Sonntag, 4. März 2018 17:02:07 CET Peter Saint-Andre wrote:
> > > > > > > If we want to specify this, I would recommend the 
> > > > > > > UsernameCaseMapped
> > > > > > > profile defined in RFC 8265.
> > > > > > > 
> > > > > > > However, there's a twist: if a node ID can be a full JID, then do 
> > > > > > > we
> > > > > > > want to apply the normal rules of RFC 7622 to all the JID parts,
> > > > > > > instead
> > > > > > > of one uniform profile such as UsernameCaseMapped to the entire 
> > > > > > > node
> > > > > > > ID?
> > > > > > > For instance, the resourcepart of a JID is allowed to contain a 
> > > > > > > much
> > > > > > > wider range of Unicode characters than is allowed by the
> > > > > > > UsernameCaseMapped profile of the PRECIS IdentifierClass (which 
> > > > > > > we use
> > > > > > > for the localpart).
> > > > > > > 
> > > > > > > Given that a node ID can be used for authorization decisions, I 
> > > > > > > think
> > > > > > > it's better to be conservative in what we accept (specifically, 
> > > > > > > not
> > > > > > > allow the wider range of characters in a resourcepart because
> > > > > > > developers, and attackers, could get too "creative").
> > > > > > 
> > > > > > I would argue that adding those restrictions / any kind of string
> > > > > > prepping
> > > > > > to XEP-0060 or XEP-0030 nodes is (a) too late and (b) ambiguous at
> > > > > > least,
> > > > > > as you mentioned (depending on the data).
> > > > > 
> > > > > I would argue that not specifying normalization rules is a security 
> > > > > hole
> > > > > (e.g., allowing an attacker to gain unauthorized access to a node). 
> > > > > Just
> > > > > because we should've done this years ago doesn't mean we can fix it 
> > > > > now.
> > > > 
> > > > Hm, okay, I don’t seem to understand the attack vector. Could you spell 
> > > > it
> > > > out more clearly to me?
> > > 
> > > Here's a true, non-XMPP example: I have the account stpe...@gmail.com.
> > > However, Google ignores "." in the localpart. Therefore I receive some
> > > email messages intended for st.pe...@gmail.com. I could probably reset
> > > passwords (via email-based authentication) and take over other accounts
> > > associated with st.pe...@gmail.com.
> > > 
> > > Similarly, let's say you create a node "foo2" at pubsub.example.com. If
> > > I know that this service decomposes superscript characters to their
> > > compatibility equivalents, I could create a node "foo²" (the last
> > > character is U+00B2 = SUPERSCRIPT TWO) and the service would consider it
> > > to be the same as "foo2". Now I can publish notifications to your node
> > > without ever trying to take over your account - I just use my "foo²" node.
> > 
> > Okay, that all makes sense, but it seems to me that this is due to the 
> > *presence* of a normalization, not the absence. 
> 
> Actually, incomplete or incorrect normalization.
> 
> > That’s where my confusion came 
> > from. I think the absence of a normalization (or specifying that absence) 

Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-07 Thread Peter Saint-Andre
On 3/6/18 1:02 AM, Jonas Wielicki wrote:
> Hi Peter,
> 
> Thank you very much for the clarification, comments inline.
> 
> On Dienstag, 6. März 2018 02:59:04 CET Peter Saint-Andre wrote:
>> On 3/5/18 12:17 AM, Jonas Wielicki wrote:
>>> On Sonntag, 4. März 2018 19:42:39 CET Peter Saint-Andre wrote:
>>>> On 3/4/18 10:54 AM, Jonas Wielicki wrote:
>>>>> On Sonntag, 4. März 2018 17:02:07 CET Peter Saint-Andre wrote:
>>>>>> If we want to specify this, I would recommend the UsernameCaseMapped
>>>>>> profile defined in RFC 8265.
>>>>>>
>>>>>> However, there's a twist: if a node ID can be a full JID, then do we
>>>>>> want to apply the normal rules of RFC 7622 to all the JID parts,
>>>>>> instead
>>>>>> of one uniform profile such as UsernameCaseMapped to the entire node
>>>>>> ID?
>>>>>> For instance, the resourcepart of a JID is allowed to contain a much
>>>>>> wider range of Unicode characters than is allowed by the
>>>>>> UsernameCaseMapped profile of the PRECIS IdentifierClass (which we use
>>>>>> for the localpart).
>>>>>>
>>>>>> Given that a node ID can be used for authorization decisions, I think
>>>>>> it's better to be conservative in what we accept (specifically, not
>>>>>> allow the wider range of characters in a resourcepart because
>>>>>> developers, and attackers, could get too "creative").
>>>>>
>>>>> I would argue that adding those restrictions / any kind of string
>>>>> prepping
>>>>> to XEP-0060 or XEP-0030 nodes is (a) too late and (b) ambiguous at
>>>>> least,
>>>>> as you mentioned (depending on the data).
>>>>
>>>> I would argue that not specifying normalization rules is a security hole
>>>> (e.g., allowing an attacker to gain unauthorized access to a node). Just
>>>> because we should've done this years ago doesn't mean we can fix it now.
>>>
>>> Hm, okay, I don’t seem to understand the attack vector. Could you spell it
>>> out more clearly to me?
>>
>> Here's a true, non-XMPP example: I have the account stpe...@gmail.com.
>> However, Google ignores "." in the localpart. Therefore I receive some
>> email messages intended for st.pe...@gmail.com. I could probably reset
>> passwords (via email-based authentication) and take over other accounts
>> associated with st.pe...@gmail.com.
>>
>> Similarly, let's say you create a node "foo2" at pubsub.example.com. If
>> I know that this service decomposes superscript characters to their
>> compatibility equivalents, I could create a node "foo²" (the last
>> character is U+00B2 = SUPERSCRIPT TWO) and the service would consider it
>> to be the same as "foo2". Now I can publish notifications to your node
>> without ever trying to take over your account - I just use my "foo²" node.
> 
> Okay, that all makes sense, but it seems to me that this is due to the 
> *presence* of a normalization, not the absence. 

Actually, incomplete or incorrect normalization.

> That’s where my confusion came 
> from. I think the absence of a normalization (or specifying that absence) is 
> not going to do us harm. 

Never assume that harm can't happen when computers are involved. :-)
Especially when internationalized characters are used. If we said that a
node could only use characters from the ASCII range then we'd be safe,
but that's not the case - people want to use JIDs as nodes, which means
we're inheriting everything from internationalized domain names (please
read RFC 5890), internationalized usernames (please read RFC 7613), and
internationalized "free-form" strings (please read RFC 7613 again), and
their combination in XMPP (please read RFC 7622). Handling all of those
strings correctly requires normalization of some kind, end of story.

> That is what I was trying to say when I said that 
> "I’d also argue that nodes aren’t shown or typed into a field by users 
> normally, so I would not worry about that kind of normalization here.": Since 
> users aren’t confronted with them, lookalikes etc. should not be an issue and 
> do not need to be normalized.

This is not just about user-facing "confusable characters", but
machine-generated and machine-processed characters as well. And in any
case do you think that a pubsub application will *never* show the node
name to an end user? These things inevitably leak out to userland (e.g.,
for a user to manage subscriptions, for a node owner to manage users, etc.).

> If we’re going to specify that "node names etc. need to be taken as-is and 
> compared codepoint-by-codepoint [I can’t look up the name of that collation 
> right now] and must not be normalized in any way by the service", that makes 
> sense to me; 

There's your problem: you think this internationalization stuff makes
sense. :-) Abandon hope, all ye who enter here! If I had more time, I'd
write a book entitled "Internationalization: A Guide for the Perplexed".

Comparing two strings for an octet-for-octet match is the last step, but
if you don't properly enforce various rules before then (including
normalization), bad things will happen. Especially if we're 

Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-06 Thread Jonas Wielicki
Hi Peter,

Thank you very much for the clarification, comments inline.

On Tuesday, 6 March 2018 02:59:04 CET Peter Saint-Andre wrote:
> On 3/5/18 12:17 AM, Jonas Wielicki wrote:
> > On Sonntag, 4. März 2018 19:42:39 CET Peter Saint-Andre wrote:
> >> On 3/4/18 10:54 AM, Jonas Wielicki wrote:
> >>> On Sonntag, 4. März 2018 17:02:07 CET Peter Saint-Andre wrote:
> >>>> If we want to specify this, I would recommend the UsernameCaseMapped
> >>>> profile defined in RFC 8265.
> >>>>
> >>>> However, there's a twist: if a node ID can be a full JID, then do we
> >>>> want to apply the normal rules of RFC 7622 to all the JID parts,
> >>>> instead
> >>>> of one uniform profile such as UsernameCaseMapped to the entire node
> >>>> ID?
> >>>> For instance, the resourcepart of a JID is allowed to contain a much
> >>>> wider range of Unicode characters than is allowed by the
> >>>> UsernameCaseMapped profile of the PRECIS IdentifierClass (which we use
> >>>> for the localpart).
> >>>>
> >>>> Given that a node ID can be used for authorization decisions, I think
> >>>> it's better to be conservative in what we accept (specifically, not
> >>>> allow the wider range of characters in a resourcepart because
> >>>> developers, and attackers, could get too "creative").
> >>> 
> >>> I would argue that adding those restrictions / any kind of string
> >>> prepping
> >>> to XEP-0060 or XEP-0030 nodes is (a) too late and (b) ambiguous at
> >>> least,
> >>> as you mentioned (depending on the data).
> >> 
> >> I would argue that not specifying normalization rules is a security hole
> >> (e.g., allowing an attacker to gain unauthorized access to a node). Just
> >> because we should've done this years ago doesn't mean we can fix it now.
> > 
> > Hm, okay, I don’t seem to understand the attack vector. Could you spell it
> > out more clearly to me?
> 
> Here's a true, non-XMPP example: I have the account stpe...@gmail.com.
> However, Google ignores "." in the localpart. Therefore I receive some
> email messages intended for st.pe...@gmail.com. I could probably reset
> passwords (via email-based authentication) and take over other accounts
> associated with st.pe...@gmail.com.
> 
> Similarly, let's say you create a node "foo2" at pubsub.example.com. If
> I know that this service decomposes superscript characters to their
> compatibility equivalents, I could create a node "foo²" (the last
> character is U+00B2 = SUPERSCRIPT TWO) and the service would consider it
> to be the same as "foo2". Now I can publish notifications to your node
> without ever trying to take over your account - I just use my "foo²" node.

Okay, that all makes sense, but it seems to me that this is due to the 
*presence* of a normalization, not the absence. That’s where my confusion came 
from. I think the absence of a normalization (or specifying that absence) is 
not going to do us harm. That is what I was trying to say when I said that 
"I’d also argue that nodes aren’t shown or typed into a field by users 
normally, so I would not worry about that kind of normalization here.": Since 
users aren’t confronted with them, lookalikes etc. should not be an issue and 
do not need to be normalized.

If we’re going to specify that "node names etc. need to be taken as-is and 
compared codepoint-by-codepoint [I can’t look up the name of that collation 
right now] and must not be normalized in any way by the service", that makes 
sense to me; I think most services, if not all, already operate this way.
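
As an illustration of what "taken as-is and compared codepoint-by-codepoint" implies, the sketch below compares two renderings of the same visible string, once raw and once after NFC normalization. The helper names are made up for this example, and NFC is only one possible normalization; nothing here is mandated by XEP-0060.

```python
import unicodedata

composed = "caf\u00e9"      # "é" as a single code point (U+00E9)
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_node_as_is(a: str, b: str) -> bool:
    """Compare node names code point by code point, without preparation."""
    return a == b


def same_node_nfc(a: str, b: str) -> bool:
    """Compare node names after applying NFC to both sides."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Under the "don't touch this" rule the two strings name two different nodes, which is unambiguous for the server but means clients must agree on one form; under NFC they name the same node.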

Otherwise, I think we’ll have to think hard about the implications of 
introducing a normalization/preparation method this far into deployment and 
how to handle unnormalized input [1]. XEP-0030 is Final and used ~everywhere, 
XEP-0060 is Draft and a key dependency to a few modern features (via PEP). 
Having the ecosystem move from "no preparation" to "some preparation" feels 
like it’s bound to introduce exactly the type of bugs you were talking about.

Add to that the trickiness if we want to use JIDs as node names, I’d argue 
that a "don’t touch this" directive to the server makes sense. If a protocol 
has specific requirements for node names specifically in PubSub, I think it 
could still specify that.

Does this make sense?

kind regards,
Jonas

   [1]: Given the lack of even resourceprep validation in current servers,
I’d also not put my money on "servers will validate and reject any
invalid node names".

___
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
___


Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-05 Thread Peter Saint-Andre
On 3/5/18 12:17 AM, Jonas Wielicki wrote:
> On Sonntag, 4. März 2018 19:42:39 CET Peter Saint-Andre wrote:
>> On 3/4/18 10:54 AM, Jonas Wielicki wrote:
>>> On Sonntag, 4. März 2018 17:02:07 CET Peter Saint-Andre wrote:
>>>> If we want to specify this, I would recommend the UsernameCaseMapped
>>>> profile defined in RFC 8265.
>>>>
>>>> However, there's a twist: if a node ID can be a full JID, then do we
>>>> want to apply the normal rules of RFC 7622 to all the JID parts, instead
>>>> of one uniform profile such as UsernameCaseMapped to the entire node ID?
>>>> For instance, the resourcepart of a JID is allowed to contain a much
>>>> wider range of Unicode characters than is allowed by the
>>>> UsernameCaseMapped profile of the PRECIS IdentifierClass (which we use
>>>> for the localpart).
>>>>
>>>> Given that a node ID can be used for authorization decisions, I think
>>>> it's better to be conservative in what we accept (specifically, not
>>>> allow the wider range of characters in a resourcepart because
>>>> developers, and attackers, could get too "creative").
>>>
>>> I would argue that adding those restrictions / any kind of string prepping
>>> to XEP-0060 or XEP-0030 nodes is (a) too late and (b) ambiguous at least,
>>> as you mentioned (depending on the data).
>>
>> I would argue that not specifying normalization rules is a security hole
>> (e.g., allowing an attacker to gain unauthorized access to a node). Just
>> because we should've done this years ago doesn't mean we can fix it now.
> 
> Hm, okay, I don’t seem to understand the attack vector. Could you spell it 
> out 
> more clearly to me?

Here's a true, non-XMPP example: I have the account stpe...@gmail.com.
However, Google ignores "." in the localpart. Therefore I receive some
email messages intended for st.pe...@gmail.com. I could probably reset
passwords (via email-based authentication) and take over other accounts
associated with st.pe...@gmail.com.

Similarly, let's say you create a node "foo2" at pubsub.example.com. If
I know that this service decomposes superscript characters to their
compatibility equivalents, I could create a node "foo²" (the last
character is U+00B2 = SUPERSCRIPT TWO) and the service would consider it
to be the same as "foo2". Now I can publish notifications to your node
without ever trying to take over your account - I just use my "foo²" node.
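
The "foo²" case can be reproduced with Unicode compatibility normalization (NFKC), which is one way such a mapping can arise. This is a toy sketch: `create_node` and the registry dict are hypothetical and do not model any real pubsub service.

```python
import unicodedata


def nfkc(value: str) -> str:
    """Apply Unicode compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", value)


victim = "foo2"
attacker = "foo\u00b2"  # last character is U+00B2 SUPERSCRIPT TWO


def create_node(registry: dict, node: str, owner: str) -> None:
    """Register a node, applying the same normalization used for lookups.

    Normalizing at creation time means a colliding name is rejected
    instead of silently aliasing an existing node.
    """
    key = nfkc(node)
    if key in registry:
        raise ValueError("node already exists after normalization")
    registry[key] = owner
```

The two raw strings differ, yet `nfkc(attacker)` equals `victim`. A service that normalizes on lookup but not on creation (or the other way round) is exposed to the aliasing described above; applying one rule consistently at every step closes that gap.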

Here is a real-world example (using an old version of XMPP nodeprep, no
less!):

https://labs.spotify.com/2013/06/18/creative-usernames/

Let me know if the attack vector is still not clear. :-)

Peter






Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-04 Thread Jonas Wielicki
On Sunday, 4 March 2018 19:42:39 CET Peter Saint-Andre wrote:
> On 3/4/18 10:54 AM, Jonas Wielicki wrote:
> > On Sonntag, 4. März 2018 17:02:07 CET Peter Saint-Andre wrote:
> >> If we want to specify this, I would recommend the UsernameCaseMapped
> >> profile defined in RFC 8265.
> >> 
> >> However, there's a twist: if a node ID can be a full JID, then do we
> >> want to apply the normal rules of RFC 7622 to all the JID parts, instead
> >> of one uniform profile such as UsernameCaseMapped to the entire node ID?
> >> For instance, the resourcepart of a JID is allowed to contain a much
> >> wider range of Unicode characters than is allowed by the
> >> UsernameCaseMapped profile of the PRECIS IdentifierClass (which we use
> >> for the localpart).
> >> 
> >> Given that a node ID can be used for authorization decisions, I think
> >> it's better to be conservative in what we accept (specifically, not
> >> allow the wider range of characters in a resourcepart because
> >> developers, and attackers, could get too "creative").
> > 
> > I would argue that adding those restrictions / any kind of string prepping
> > to XEP-0060 or XEP-0030 nodes is (a) too late and (b) ambiguous at least,
> > as you mentioned (depending on the data).
> 
> I would argue that not specifying normalization rules is a security hole
> (e.g., allowing an attacker to gain unauthorized access to a node). Just
> because we should've done this years ago doesn't mean we can fix it now.

Hm, okay, I don’t seem to understand the attack vector. Could you spell it out 
more clearly to me?

kind regards,
Jonas



Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-04 Thread Florian Schmaus
On 04.03.2018 17:02, Peter Saint-Andre wrote:
> On 3/4/18 1:27 AM, Florian Schmaus wrote:
>> On 04.03.2018 02:30, Peter Saint-Andre wrote:
>>>> On Mar 3, 2018, at 2:36 PM, Timothée Jaussoin wrote:
>>>> Thanks for the answers. I'm fine for the 3071 limitation, so we can set it 
>>>> both for the Pubsub nodes id and Pubsub items it?
>>>> If yes I'm ok to do a PR on the 0060 to specify that. I'm also wondering 
>>>> if there is a specific way of declaring such string
>>>> limitations, are you aware of any other XEPs that specify such things?
>>>
>>> As mentioned, I think this belongs in XEP-0030 but I suppose it can be 
>>> defined in XEP-0060.
>>
>> Could you elaborate why you think that it belongs in xep30? Are xep30
>> item node's related to xep60 nodes? How fit xep60 item IDs into this?
> 
> The concept of a node was originally defined in XEP-0030, and the usage
> in XEP-0060 borrowed from XEP-0030. The former is more fundamental and
> thus I think it would be good to specify this in XEP-0030 (so that
> "node" means the same thing across all protocol extensions). Or we could
> resurrect XEP-0271:
> 
> https://xmpp.org/extensions/xep-0271.html

Whatever we do, there should be a prominent pointer from the affected
XEPs to the place where we specify the limitations on node values.

>> Related: I wonder if we should specify string preparation for xep60 node
>> and item ID strings. Same goes for the strings used by xep30, e.g.
>> <feature/>'s 'var' attribute value. Is any Unicode string a valid value
>> for those? Or is this already specified somewhere and I just missed it?
> 
> If we want to specify this, I would recommend the UsernameCaseMapped
> profile defined in RFC 8265.
> 
> However, there's a twist: if a node ID can be a full JID, then do we
> want to apply the normal rules of RFC 7622 to all the JID parts, instead
> of one uniform profile such as UsernameCaseMapped to the entire node ID?
> For instance, the resourcepart of a JID is allowed to contain a much
> wider range of Unicode characters than is allowed by the
> UsernameCaseMapped profile of the PRECIS IdentifierClass (which we use
> for the localpart).

I believe the following requirements are sensible:
1.) If we specify a profile, then it must be applied to the whole value
2.) Since it is a common use case to put JIDs into node ID values, we
must ensure that distinct (full?) JIDs do not map to the same node ID.

And if I'm not mistaken, resourceparts are case preserving. As a result,
which PRECIS profiles we need to consider depends on whether or not we deem
support for full JIDs in node IDs worthwhile.
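
To make requirement 2 concrete, here is a rough sketch of how a single case-mapping rule applied to the whole value can merge two distinct full JIDs. `str.lower()` stands in for the case mapping step only; it is not a PRECIS implementation, and both function names are hypothetical.

```python
def whole_value_case_mapped(value: str) -> str:
    """Stand-in for applying one case-mapping profile to the entire node ID."""
    return value.lower()


def per_part_case_mapped(jid: str) -> str:
    """Lowercase only the bare JID and keep the resourcepart untouched.

    The resourcepart is everything after the first "/", since neither the
    localpart nor the domainpart may contain that character.
    """
    bare, sep, resource = jid.partition("/")
    return bare.lower() + sep + resource


jid_a = "room@muc.example.org/Alice"
jid_b = "room@muc.example.org/alice"
```

The two JIDs are distinct because resourceparts preserve case, but the whole-value mapping collapses them into one node ID, while the per-part variant keeps them apart at the cost of not being one uniform profile.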

- Florian





Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-04 Thread Peter Saint-Andre
On 3/4/18 10:54 AM, Jonas Wielicki wrote:
> On Sonntag, 4. März 2018 17:02:07 CET Peter Saint-Andre wrote:
>> If we want to specify this, I would recommend the UsernameCaseMapped
>> profile defined in RFC 8265.
>>
>> However, there's a twist: if a node ID can be a full JID, then do we
>> want to apply the normal rules of RFC 7622 to all the JID parts, instead
>> of one uniform profile such as UsernameCaseMapped to the entire node ID?
>> For instance, the resourcepart of a JID is allowed to contain a much
>> wider range of Unicode characters than is allowed by the
>> UsernameCaseMapped profile of the PRECIS IdentifierClass (which we use
>> for the localpart).
>>
>> Given that a node ID can be used for authorization decisions, I think
>> it's better to be conservative in what we accept (specifically, not
>> allow the wider range of characters in a resourcepart because
>> developers, and attackers, could get too "creative").
> 
> I would argue that adding those restrictions / any kind of string prepping to 
> XEP-0060 or XEP-0030 nodes is (a) too late and (b) ambiguous at least, as you 
> mentioned (depending on the data).

I would argue that not specifying normalization rules is a security hole
(e.g., allowing an attacker to gain unauthorized access to a node). Just
because we should've done this years ago doesn't mean we can't fix it now.

> I’d also argue that nodes aren’t shown or typed into a field by users 
> normally, so I would not worry about that kind of normalization here.

So that only automated attackers can succeed? :-)

> If a specific XEP-0030/XEP-0060-based protocol needs more guarantees, I think 
> those can be defined there.

No, this needs to be done at the lowest level we can manage. Pushing
this off to extensions just means we'll have inconsistent approaches.

Peter






Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-04 Thread Jonas Wielicki
On Sunday, 4 March 2018 17:02:07 CET Peter Saint-Andre wrote:
> If we want to specify this, I would recommend the UsernameCaseMapped
> profile defined in RFC 8265.
> 
> However, there's a twist: if a node ID can be a full JID, then do we
> want to apply the normal rules of RFC 7622 to all the JID parts, instead
> of one uniform profile such as UsernameCaseMapped to the entire node ID?
> For instance, the resourcepart of a JID is allowed to contain a much
> wider range of Unicode characters than is allowed by the
> UsernameCaseMapped profile of the PRECIS IdentifierClass (which we use
> for the localpart).
> 
> Given that a node ID can be used for authorization decisions, I think
> it's better to be conservative in what we accept (specifically, not
> allow the wider range of characters in a resourcepart because
> developers, and attackers, could get too "creative").

I would argue that adding those restrictions / any kind of string prepping to 
XEP-0060 or XEP-0030 nodes is (a) too late and (b) ambiguous at least, as you 
mentioned (depending on the data).

I’d also argue that nodes aren’t shown or typed into a field by users 
normally, so I would not worry about that kind of normalization here.

If a specific XEP-0030/XEP-0060-based protocol needs more guarantees, I think 
those can be defined there.

kind regards,
Jonas




Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-04 Thread Peter Saint-Andre
On 3/4/18 1:27 AM, Florian Schmaus wrote:
> On 04.03.2018 02:30, Peter Saint-Andre wrote:
>>> On Mar 3, 2018, at 2:36 PM, Timothée Jaussoin  wrote:
>>> Hi,
>>>
>>> Thanks for the answers. I'm fine for the 3071 limitation, so we can set it 
>>> both for the Pubsub nodes id and Pubsub items it?
>>> If yes I'm ok to do a PR on the 0060 to specify that. I'm also wondering if 
>>> there is a specific way of declaring such string
>>> limitations, are you aware of any other XEPs that specify such things?
>>
>> As mentioned, I think this belongs in XEP-0030 but I suppose it can be 
>> defined in XEP-0060.
> 
> Could you elaborate why you think that it belongs in xep30? Are xep30
> item node's related to xep60 nodes? How fit xep60 item IDs into this?

The concept of a node was originally defined in XEP-0030, and the usage
in XEP-0060 borrowed from XEP-0030. The former is more fundamental and
thus I think it would be good to specify this in XEP-0030 (so that
"node" means the same thing across all protocol extensions). Or we could
resurrect XEP-0271:

https://xmpp.org/extensions/xep-0271.html

> Related: I wonder if we should specify string preparation for xep60 node
> and item ID strings. Same goes for the strings used by xep30, e.g.
> <feature/>'s 'var' attribute value. Is any Unicode string a valid value
> for those? Or is this already specified somewhere and I just missed it?

If we want to specify this, I would recommend the UsernameCaseMapped
profile defined in RFC 8265.

However, there's a twist: if a node ID can be a full JID, then do we
want to apply the normal rules of RFC 7622 to all the JID parts, instead
of one uniform profile such as UsernameCaseMapped to the entire node ID?
For instance, the resourcepart of a JID is allowed to contain a much
wider range of Unicode characters than is allowed by the
UsernameCaseMapped profile of the PRECIS IdentifierClass (which we use
for the localpart).

Given that a node ID can be used for authorization decisions, I think
it's better to be conservative in what we accept (specifically, not
allow the wider range of characters in a resourcepart because
developers, and attackers, could get too "creative").

Peter





Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-04 Thread Florian Schmaus
On 04.03.2018 02:30, Peter Saint-Andre wrote:
>> On Mar 3, 2018, at 2:36 PM, Timothée Jaussoin  wrote:
>> Hi,
>>
>> Thanks for the answers. I'm fine for the 3071 limitation, so we can set it 
>> both for the Pubsub nodes id and Pubsub items it?
>> If yes I'm ok to do a PR on the 0060 to specify that. I'm also wondering if 
>> there is a specific way of declaring such string
>> limitations, are you aware of any other XEPs that specify such things?
> 
> As mentioned, I think this belongs in XEP-0030 but I suppose it can be 
> defined in XEP-0060.

Could you elaborate on why you think that it belongs in XEP-0030? Are XEP-0030
item nodes related to XEP-0060 nodes? How do XEP-0060 item IDs fit into this?

Related: I wonder if we should specify string preparation for XEP-0060 node
and item ID strings. The same goes for the strings used by XEP-0030, e.g.
<feature/>'s 'var' attribute value. Is any Unicode string a valid value
for those? Or is this already specified somewhere and I just missed it?

- Florian






Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-03 Thread Peter Saint-Andre


> On Mar 3, 2018, at 2:36 PM, Timothée Jaussoin  wrote:
> 
>> Le jeudi 01 mars 2018 à 07:10 -0700, Peter Saint-Andre a écrit :
>>> On 3/1/18 1:07 AM, Jonas Wielicki wrote:
>>>> On Donnerstag, 1. März 2018 08:52:29 CET Florian Schmaus wrote:
>>>>> On 01.03.2018 01:17, Peter Saint-Andre wrote:
>>>>>> On 2/28/18 3:18 PM, Timothée Jaussoin wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I came across a database limitation while implementing Pubsub in Movim.
>>>>>> 
>>>>>> I'd like to know if we have a limitation for the size of the node and
>>>>>> items ids in Pubsub (like we have for the JIDs). Also do we have some
>>>>>> specific forbid characters, basically what is the format of such
>>>>>> attributes? If noting is already specificed I think that it would be
>>>>>> wise to update the 0060 to do so.
>>>>> 
>>>>> My inclination is to specify a length of 1023 octets
>>>> 
>>>> Which would break applications and protocols using JIDs as node or item
>>>> identifier. This includes for example MIX. If we want to allow this, we
>>>> need at least (3x1023)+2 octets, and then I would probably go for 4096
>>>> octets.
>>> 
>>> This is bikeshedding territory. But given that databases have limits on the 
>>> size of keys, using as many as needed and as few as possible octets (the 
>>> 3071 
>>> you quoted) is probably sensible.
>>> 
>>> Do those protocols use bare or full JIDs? If they only use bare and if we 
>>> agree that full JIDs (due to their transience) do not make sense, the limit 
>>> could conceivably be as low as 2047, which is probably comfortable for 
>>> databases to handle.
>> 
>> A full, especially non-client JID need not be transient, so I suppose
>> we'd set it to 3071 (not sure why we'd need 4096 other than the fact
>> it's a power of 2):
>> 
>> https://tools.ietf.org/html/rfc7622#section-3.1
>> 
>> Peter
>> 
> 
> Hi,
> 
> Thanks for the answers. I'm fine with the 3071 limit, so can we set it for
> both the Pubsub node ids and the Pubsub item ids?
> If yes, I'm OK to do a PR on XEP-0060 to specify that. I'm also wondering if
> there is a specific way of declaring such string limits; are you aware of
> any other XEPs that specify such things?

As mentioned, I think this belongs in XEP-0030 but I suppose it can be defined 
in XEP-0060.

See appendix A.7 of RFC 6120 for an example of length limits in XML Schema. RFC 
has similar text as well.

HTH,

Peter




Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-03 Thread Timothée Jaussoin
On Thursday, 01 March 2018 at 07:10 -0700, Peter Saint-Andre wrote:
> On 3/1/18 1:07 AM, Jonas Wielicki wrote:
> > On Thursday, 1 March 2018 08:52:29 CET Florian Schmaus wrote:
> > > On 01.03.2018 01:17, Peter Saint-Andre wrote:
> > > > On 2/28/18 3:18 PM, Timothée Jaussoin wrote:
> > > > > Hi,
> > > > > 
> > > > > I came across a database limitation while implementing Pubsub in 
> > > > > Movim.
> > > > > 
> > > > > I'd like to know if we have a limit on the size of the node and item
> > > > > ids in Pubsub (like we have for JIDs). Also, do we have any specific
> > > > > forbidden characters? Basically, what is the format of such
> > > > > attributes? If nothing is already specified, I think it would be
> > > > > wise to update XEP-0060 to do so.
> > > > 
> > > > My inclination is to specify a length of 1023 octets
> > > 
> > > Which would break applications and protocols using JIDs as node or item
> > > identifier. This includes for example MIX. If we want to allow this, we
> > > need at least (3x1023)+2 octets, and then I would probably go for 4096
> > > octets.
> > 
> > This is bikeshedding territory. But given that databases have limits on the
> > size of keys, using as many as needed and as few as possible octets (the
> > 3071 you quoted) is probably sensible.
> > 
> > Do those protocols use bare or full JIDs? If they only use bare and if we 
> > agree that full JIDs (due to their transience) do not make sense, the limit 
> > could conceivably be as low as 2047, which is probably comfortable for 
> > databases to handle.
> 
> A full, especially non-client JID need not be transient, so I suppose
> we'd set it to 3071 (not sure why we'd need 4096 other than the fact
> it's a power of 2):
> 
> https://tools.ietf.org/html/rfc7622#section-3.1
> 
> Peter
> 

Hi,

Thanks for the answers. I'm fine with the 3071 limit, so can we set it for
both the Pubsub node ids and the Pubsub item ids?
If yes, I'm OK to do a PR on XEP-0060 to specify that. I'm also wondering if
there is a specific way of declaring such string limits; are you aware of
any other XEPs that specify such things?

Regards,

Timothée


Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-01 Thread Peter Saint-Andre
On 3/1/18 1:07 AM, Jonas Wielicki wrote:
> On Thursday, 1 March 2018 08:52:29 CET Florian Schmaus wrote:
>> On 01.03.2018 01:17, Peter Saint-Andre wrote:
>>> On 2/28/18 3:18 PM, Timothée Jaussoin wrote:
>>>> Hi,
>>>> 
>>>> I came across a database limitation while implementing Pubsub in Movim.
>>>> 
>>>> I'd like to know if we have a limit on the size of the node and item
>>>> ids in Pubsub (like we have for JIDs). Also, do we have any specific
>>>> forbidden characters? Basically, what is the format of such
>>>> attributes? If nothing is already specified, I think it would be
>>>> wise to update XEP-0060 to do so.
>>> My inclination is to specify a length of 1023 octets
>>
>> Which would break applications and protocols using JIDs as node or item
>> identifier. This includes for example MIX. If we want to allow this, we
>> need at least (3x1023)+2 octets, and then I would probably go for 4096
>> octets.
> 
> This is bikeshedding territory. But given that databases have limits on the 
> size of keys, using as many as needed and as few as possible octets (the 3071 
> you quoted) is probably sensible.
> 
> Do those protocols use bare or full JIDs? If they only use bare and if we 
> agree that full JIDs (due to their transience) do not make sense, the limit 
> could conceivably be as low as 2047, which is probably comfortable for 
> databases to handle.

A full, especially non-client JID need not be transient, so I suppose
we'd set it to 3071 (not sure why we'd need 4096 other than the fact
it's a power of 2):

https://tools.ietf.org/html/rfc7622#section-3.1

Peter





Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-01 Thread Kevin Smith
On 1 Mar 2018, at 08:07, Jonas Wielicki wrote:
> 
> On Thursday, 1 March 2018 08:52:29 CET Florian Schmaus wrote:
>> On 01.03.2018 01:17, Peter Saint-Andre wrote:
>>> On 2/28/18 3:18 PM, Timothée Jaussoin wrote:
>>>> Hi,
>>>> 
>>>> I came across a database limitation while implementing Pubsub in Movim.
>>>> 
>>>> I'd like to know if we have a limit on the size of the node and item
>>>> ids in Pubsub (like we have for JIDs). Also, do we have any specific
>>>> forbidden characters? Basically, what is the format of such
>>>> attributes? If nothing is already specified, I think it would be
>>>> wise to update XEP-0060 to do so.
>>> My inclination is to specify a length of 1023 octets
>> 
>> Which would break applications and protocols using JIDs as node or item
>> identifier. This includes for example MIX. If we want to allow this, we
>> need at least (3x1023)+2 octets, and then I would probably go for 4096
>> octets.
> 
> This is bikeshedding territory. But given that databases have limits on the 
> size of keys, using as many as needed and as few as possible octets (the 3071 
> you quoted) is probably sensible.
> 
> Do those protocols use bare or full JIDs? If they only use bare and if we 
> agree that full JIDs (due to their transience) do not make sense, the limit 
> could conceivably be as low as 2047, which is probably comfortable for 
> databases to handle.

If we’re really octet-counting, that 2047 can be 2046 :)

/K


Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-03-01 Thread Jonas Wielicki
On Thursday, 1 March 2018 08:52:29 CET Florian Schmaus wrote:
> On 01.03.2018 01:17, Peter Saint-Andre wrote:
> > On 2/28/18 3:18 PM, Timothée Jaussoin wrote:
> >> Hi,
> >> 
> >> I came across a database limitation while implementing Pubsub in Movim.
> >> 
> >> I'd like to know if we have a limit on the size of the node and item
> >> ids in Pubsub (like we have for JIDs). Also, do we have any specific
> >> forbidden characters? Basically, what is the format of such
> >> attributes? If nothing is already specified, I think it would be
> >> wise to update XEP-0060 to do so.
> > My inclination is to specify a length of 1023 octets
> 
> Which would break applications and protocols using JIDs as node or item
> identifier. This includes for example MIX. If we want to allow this, we
> need at least (3x1023)+2 octets, and then I would probably go for 4096
> octets.

This is bikeshedding territory. But given that databases have limits on the 
size of keys, using as many as needed and as few as possible octets (the 3071 
you quoted) is probably sensible.

Do those protocols use bare or full JIDs? If they only use bare and if we 
agree that full JIDs (due to their transience) do not make sense, the limit 
could conceivably be as low as 2047, which is probably comfortable for 
databases to handle.
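
To make the bare versus full distinction concrete, here is a rough Python sketch. It assumes a simplified split on the first '/' and then the first '@', with no PRECIS or other validation; the function names and the example.org addresses are made up for illustration and are not taken from any XEP or library.

```python
from typing import NamedTuple

# Per-part limit discussed in this thread (localpart, domainpart, resourcepart).
PART_MAX_OCTETS = 1023


class JidParts(NamedTuple):
    localpart: str
    domainpart: str
    resourcepart: str


def split_jid(jid: str) -> JidParts:
    """Naive split: resourcepart after the first '/', localpart before the first '@'."""
    rest, _, resourcepart = jid.partition("/")
    localpart, at, domainpart = rest.partition("@")
    if not at:
        # No '@' present, so the whole remainder is the domainpart.
        localpart, domainpart = "", rest
    return JidParts(localpart, domainpart, resourcepart)


def is_bare(jid: str) -> bool:
    """A JID without a resourcepart is treated as bare here."""
    return split_jid(jid).resourcepart == ""


def max_octets(bare_only: bool) -> int:
    """Upper bound for an id that must be able to hold a JID."""
    parts = 2 if bare_only else 3
    separators = 1 if bare_only else 2
    return parts * PART_MAX_OCTETS + separators


def parts_within_limit(jid: str) -> bool:
    """Check each part's UTF-8 length against the per-part limit."""
    return all(len(p.encode("utf-8")) <= PART_MAX_OCTETS for p in split_jid(jid))
```

With these assumptions, `max_octets(bare_only=True)` gives 2047 and `max_octets(bare_only=False)` gives 3071, which are the two figures being weighed above.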

kind regards,
Jonas



Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-02-28 Thread Florian Schmaus
On 01.03.2018 01:17, Peter Saint-Andre wrote:
> On 2/28/18 3:18 PM, Timothée Jaussoin wrote:
>> Hi,
>>
>> I came across a database limitation while implementing Pubsub in Movim.
>>
>> I'd like to know if we have a limit on the size of the node and item
>> ids in Pubsub (like we have for JIDs).
>> Also, do we have any specific forbidden characters? Basically, what is the
>> format of such attributes?
>> If nothing is already specified, I think it would be wise to update
>> XEP-0060 to do so.
> 
> My inclination is to specify a length of 1023 octets

Which would break applications and protocols using JIDs as node or item
identifier. This includes for example MIX. If we want to allow this, we
need at least (3x1023)+2 octets, and then I would probably go for 4096
octets.
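
For reference, the arithmetic behind (3x1023)+2 can be written out as a small Python sketch. It assumes the limit is counted in UTF-8 octets rather than characters; the helper names are hypothetical and only illustrate the check, they are not part of any XEP or server implementation.

```python
# Each of localpart, domainpart and resourcepart may be up to 1023 octets.
PART_MAX_OCTETS = 1023

# Full JID: localpart + "@" + domainpart + "/" + resourcepart.
FULL_JID_MAX_OCTETS = 3 * PART_MAX_OCTETS + 2  # 3071


def octet_length(identifier: str) -> int:
    """Length of the identifier once encoded as UTF-8."""
    return len(identifier.encode("utf-8"))


def id_fits(identifier: str, limit: int = FULL_JID_MAX_OCTETS) -> bool:
    """True if the identifier is non-empty and at most `limit` octets long."""
    return 1 <= octet_length(identifier) <= limit
```

Counting octets matters for non-ASCII ids: a string of 1536 "é" characters is 3072 octets in UTF-8, so it would exceed a 3071-octet limit even though it is far shorter in characters.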

- Florian





Re: [Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-02-28 Thread Peter Saint-Andre
On 2/28/18 3:18 PM, Timothée Jaussoin wrote:
> Hi,
> 
> I came across a database limitation while implementing Pubsub in Movim.
> 
> I'd like to know if we have a limit on the size of the node and item
> ids in Pubsub (like we have for JIDs).
> Also, do we have any specific forbidden characters? Basically, what is the
> format of such attributes?
> If nothing is already specified, I think it would be wise to update
> XEP-0060 to do so.

My inclination is to specify a length of 1023 octets, just like the
localpart, domainpart, and resourcepart in RFC 7622. Also I would specify
it in XEP-0030 and let XEP-0060 inherit from there.

Peter






[Standards] What is the size limit of node and item ids in XEP-0060: Publish-Subscribe?

2018-02-28 Thread Timothée Jaussoin
Hi,

I came across a database limitation while implementing Pubsub in Movim.

I'd like to know if we have a limit on the size of the node and item ids
in Pubsub (like we have for JIDs).
Also, do we have any specific forbidden characters? Basically, what is the
format of such attributes?
If nothing is already specified, I think it would be wise to update
XEP-0060 to do so.

Regards,

Timothée Jaussoin