Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Mon, Feb 14, 2011 at 11:38 PM, Pablo Castro wrote: > (sorry for my random out-of-timing previous email on this thread. please see > below for an actually up to date reply) > > -Original Message- > From: Jonas Sicking [mailto:jo...@sicking.cc] > Sent: Monday, February 07, 2011 3:31 PM > > On Mon, Feb 7, 2011 at 3:07 PM, Jeremy Orlow wrote: >> On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking wrote: >>> >>> On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow wrote: >>> > On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking wrote: >>> >> >>> >> On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow >>> >> wrote: >>> >> > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher >>> >> > wrote: >>> >> >> >>> >> >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote: >>> >> >>> >>> >> >>> My current thinking is that we should have some relatively large >>> >> >>> limitmaybe on the order of 64k? It seems like it'd be very >>> >> >>> difficult >>> >> >>> to >>> >> >>> hit such a limit with any sort of legitimate use case, and the >>> >> >>> chances >>> >> >>> of >>> >> >>> some subtle data-dependent error would be much less. But a 1GB key >>> >> >>> is >>> >> >>> just >>> >> >>> not going to work well in any implementation (if it doesn't simply >>> >> >>> oom >>> >> >>> the >>> >> >>> process!). So despite what I said earlier, I guess I think we >>> >> >>> should >>> >> >>> have >>> >> >>> some limit...but keep it an order of magnitude or two larger than >>> >> >>> what >>> >> >>> we >>> >> >>> expect any legitimate usage to hit just to keep the system as >>> >> >>> flexible >>> >> >>> as >>> >> >>> possible. >>> >> >>> >>> >> >>> Does that sound reasonable to people? >>> >> >> >>> >> >> Are we thinking about making this a MUST requirement, or a SHOULD? >>> >> >> I'm >>> >> >> hesitant to spec an exact size as a MUST given how technology has a >>> >> >> way >>> >> >> of >>> >> >> changing in unexpected ways that makes old constraints obsolete. 
>>> >> >> But >>> >> >> then, >>> >> >> I may just be overly concerned about this too. >>> >> > >>> >> > If we put a limit, it'd be a MUST for sure. Otherwise people would >>> >> > develop >>> >> > against one of the implementations that don't place a limit and then >>> >> > their >>> >> > app would break on the others. >>> >> > The reason that I suggested 64K is that it seems outrageously big for >>> >> > the >>> >> > data types that we're looking at. But it's too small to do much with >>> >> > base64 >>> >> > encoding binary blobs into it or anything else like that that I could >>> >> > see >>> >> > becoming rather large. So it seems like a limit that'd avoid major >>> >> > abuses >>> >> > (where someone is probably approaching the problem wrong) but would >>> >> > not >>> >> > come >>> >> > close to limiting any practical use I can imagine. >>> >> > With our architecture in Chrome, we will probably need to have some >>> >> > limit. >>> >> > We haven't decided what that is yet, but since I remember others >>> >> > saying >>> >> > similar things when we talked about this at TPAC, it seems like it >>> >> > might >>> >> > be >>> >> > best to standardize it--even though it does feel a bit dirty. >>> >> >>> >> One problem with putting a limit is that it basically forces >>> >> implementations to use a specific encoding, or pay a hefty price. For >>> >> example if we choose a 64K limit, is that of UTF8 data or of UTF16 >>> >> data? If it is of UTF8 data, and the implementation uses something >>> >> else to store the date, you risk having to convert the data just to >>> >> measure the size. Possibly this would be different if we measured size >>> >> using UTF16 as javascript more or less enforces that the source string >>> >> is UTF16 which means that you can measure utf16 size on the cheap, >>> >> even if the stored data uses a different format. >>> > >>> > That's a very good point. What's your suggestion then? 
Spec unlimited >>> > storage and have non-normative text saying that >>> > most implementations will >>> > likely have some limit? Maybe we can at least spec a minimum limit in >>> > terms >>> > of a particular character encoding? (Implementations could translate >>> > this >>> > into the worst case size for their own native encoding and then ensure >>> > their >>> > limit is higher.) >>> >>> I'm fine with relying on UTF16 encoding size and specifying a 64K >>> limit. Like Shawn points out, this API is fairly geared towards >>> JavaScript anyway (and I personally don't think that's a bad thing). >>> One thing that I just thought of is that even if implementations use >>> other encodings, you can in the vast majority of cases do a worst-case >>> estimate and easily see that the key that is used is below 64K. >>> >>> That said, does having a 64K limit really help anyone? In SQLite we >>> can easily store vastly more than that, enough that we don't have to >>> specify a limit. And my understanding is that in the Microsoft >>> implementation, the limits for what they can store without resorting >>> to various tricks, is much lower. So since that implementation will >>> have to implement special handling of long keys anyway, is there a >>> difference between saying a 64K limit vs. saying unlimited?
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
(sorry for my random out-of-timing previous email on this thread. please see below for an actually up to date reply) -Original Message- From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Monday, February 07, 2011 3:31 PM On Mon, Feb 7, 2011 at 3:07 PM, Jeremy Orlow wrote: > On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking wrote: >> >> On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow wrote: >> > On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking wrote: >> >> >> >> On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow >> >> wrote: >> >> > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher >> >> > wrote: >> >> >> >> >> >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote: >> >> >>> >> >> >>> My current thinking is that we should have some relatively large >> >> >>> limitmaybe on the order of 64k? It seems like it'd be very >> >> >>> difficult >> >> >>> to >> >> >>> hit such a limit with any sort of legitimate use case, and the >> >> >>> chances >> >> >>> of >> >> >>> some subtle data-dependent error would be much less. But a 1GB key >> >> >>> is >> >> >>> just >> >> >>> not going to work well in any implementation (if it doesn't simply >> >> >>> oom >> >> >>> the >> >> >>> process!). So despite what I said earlier, I guess I think we >> >> >>> should >> >> >>> have >> >> >>> some limit...but keep it an order of magnitude or two larger than >> >> >>> what >> >> >>> we >> >> >>> expect any legitimate usage to hit just to keep the system as >> >> >>> flexible >> >> >>> as >> >> >>> possible. >> >> >>> >> >> >>> Does that sound reasonable to people? >> >> >> >> >> >> Are we thinking about making this a MUST requirement, or a SHOULD? >> >> >> I'm >> >> >> hesitant to spec an exact size as a MUST given how technology has a >> >> >> way >> >> >> of >> >> >> changing in unexpected ways that makes old constraints obsolete. >> >> >> But >> >> >> then, >> >> >> I may just be overly concerned about this too. >> >> > >> >> > If we put a limit, it'd be a MUST for sure. 
Otherwise people would >> >> > develop >> >> > against one of the implementations that don't place a limit and then >> >> > their >> >> > app would break on the others. >> >> > The reason that I suggested 64K is that it seems outrageously big for >> >> > the >> >> > data types that we're looking at. But it's too small to do much with >> >> > base64 >> >> > encoding binary blobs into it or anything else like that that I could >> >> > see >> >> > becoming rather large. So it seems like a limit that'd avoid major >> >> > abuses >> >> > (where someone is probably approaching the problem wrong) but would >> >> > not >> >> > come >> >> > close to limiting any practical use I can imagine. >> >> > With our architecture in Chrome, we will probably need to have some >> >> > limit. >> >> > We haven't decided what that is yet, but since I remember others >> >> > saying >> >> > similar things when we talked about this at TPAC, it seems like it >> >> > might >> >> > be >> >> > best to standardize it--even though it does feel a bit dirty. >> >> >> >> One problem with putting a limit is that it basically forces >> >> implementations to use a specific encoding, or pay a hefty price. For >> >> example if we choose a 64K limit, is that of UTF8 data or of UTF16 >> >> data? If it is of UTF8 data, and the implementation uses something >> >> else to store the date, you risk having to convert the data just to >> >> measure the size. Possibly this would be different if we measured size >> >> using UTF16 as javascript more or less enforces that the source string >> >> is UTF16 which means that you can measure utf16 size on the cheap, >> >> even if the stored data uses a different format. >> > >> > That's a very good point. What's your suggestion then? Spec unlimited >> > storage and have non-normative text saying that >> > most implementations will >> > likely have some limit? Maybe we can at least spec a minimum limit in >> > terms >> > of a particular character encoding? 
(Implementations could translate >> > this >> > into the worst case size for their own native encoding and then ensure >> > their >> > limit is higher.) >> >> I'm fine with relying on UTF16 encoding size and specifying a 64K >> limit. Like Shawn points out, this API is fairly geared towards >> JavaScript anyway (and I personally don't think that's a bad thing). >> One thing that I just thought of is that even if implementations use >> other encodings, you can in the vast majority of cases do a worst-case >> estimate and easily see that the key that is used is below 64K. >> >> That said, does having a 64K limit really help anyone? In SQLite we >> can easily store vastly more than that, enough that we don't have to >> specify a limit. And my understanding is that in the Microsoft >> implementation, the limits for what they can store without resorting >> to various tricks, is much lower. So since that implementation will >> have to implement special handling of long keys anyway, is there a >> difference between saying a 64K limit vs. saying unlimited?
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
>> From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow >> Sent: Sunday, February 06, 2011 12:43 PM >> >> On Tue, Dec 14, 2010 at 4:26 PM, Pablo Castro >> wrote: >> >> From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow >> Sent: Tuesday, December 14, 2010 4:23 PM >> >> >> On Wed, Dec 15, 2010 at 12:19 AM, Pablo Castro >> >> wrote: >> >> >> >> From: public-webapps-requ...@w3.org >> >> [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking >> >> Sent: Friday, December 10, 2010 1:42 PM >> >> >> >> >> On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow >> >> >> wrote: >> >> >> > Any more thoughts on this? >> >> >> >> >> >> I don't feel strongly one way or another. Implementation wise I don't >> >> >> really understand why implementations couldn't use keys of unlimited >> >> >> size. I wouldn't imagine implementations would want to use fixed-size >> >> >> allocations for every key anyway, right (which would be a strong >> >> >> reason to keep maximum size down). >> >> I don't have a very strong opinion either. I don't quite agree with the >> >> guideline of "having something working slowly is better than not working >> >> at all"...as having something not work at all sometimes may help >> >> developers hit a wall and think differently about their approach for a >> >> given problem. That said, if folks think this is an instance where we're >> >> better off not having a limit I'm fine with it. >> >> >> >> My only concern is that the developer might not hit this wall, but then >> >> some user (doing things the developer didn't fully anticipate) could hit >> >> that wall. I can definitely see both sides of the argument though. And >> >> elsewhere we've headed more in the direction of forcing the developer to >> >> think about performance, but this case seems a bit more non-deterministic >> >> than any of those. 
>> >> Yeah, that's a good point for this case, avoiding data-dependent errors is >> probably worth the perf hit. >> >> My current thinking is that we should have some relatively large >> limit...maybe on the order of 64k? It seems like it'd be very difficult to >> hit such a limit with any sort of legitimate use case, and the chances of >> some subtle data-dependent error would be much less. But a 1GB key is just >> not going to work well in any implementation (if it doesn't simply oom the >> process!). So despite what I said earlier, I guess I think we should have >> some limit...but keep it an order of magnitude or two larger than what we >> expect any legitimate usage to hit just to keep the system as flexible as >> possible. >> >> Does that sound reasonable to people? I thought we were trying to avoid data-dependent errors and thus shooting for having no limit (which may translate into having very large limits in actual implementations, but not the kind of thing you'd typically hit). Specifying an exact size may be a bit weird...I guess an alternative could be to spec the minimum size UAs need to support. A related problem is what units this is specified in; if it's bytes, then developers need to make assumptions about how strings are stored. -pablo
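Pablo's point about units can be made concrete. A hypothetical sketch (the helper name and key strings are mine, not from the thread) showing that the same key string measures differently depending on which encoding a UA counts against:

```javascript
// Hypothetical helper: measure one key string in the units the thread debates.
function keySizes(key) {
  const utf16CodeUnits = key.length;                      // what JS exposes cheaply
  const utf16Bytes = utf16CodeUnits * 2;                  // 2 bytes per code unit
  const utf8Bytes = new TextEncoder().encode(key).length; // requires a full re-encode
  return { utf16CodeUnits, utf16Bytes, utf8Bytes };
}

// ASCII: UTF-8 is half the byte size of UTF-16.
console.log(keySizes("customer-42")); // { utf16CodeUnits: 11, utf16Bytes: 22, utf8Bytes: 11 }
// Non-ASCII: "é" is 1 UTF-16 code unit but 2 UTF-8 bytes.
console.log(keySizes("café"));        // { utf16CodeUnits: 4, utf16Bytes: 8, utf8Bytes: 5 }
```

A byte-denominated limit therefore means a different maximum string length depending on the key's contents, which is exactly the assumption developers would be forced to make.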
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Mon, Feb 7, 2011 at 3:07 PM, Jeremy Orlow wrote: > On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking wrote: >> >> On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow wrote: >> > On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking wrote: >> >> >> >> On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow >> >> wrote: >> >> > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher >> >> > wrote: >> >> >> >> >> >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote: >> >> >>> >> >> >>> My current thinking is that we should have some relatively large >> >> >>> limitmaybe on the order of 64k? It seems like it'd be very >> >> >>> difficult >> >> >>> to >> >> >>> hit such a limit with any sort of legitimate use case, and the >> >> >>> chances >> >> >>> of >> >> >>> some subtle data-dependent error would be much less. But a 1GB key >> >> >>> is >> >> >>> just >> >> >>> not going to work well in any implementation (if it doesn't simply >> >> >>> oom >> >> >>> the >> >> >>> process!). So despite what I said earlier, I guess I think we >> >> >>> should >> >> >>> have >> >> >>> some limit...but keep it an order of magnitude or two larger than >> >> >>> what >> >> >>> we >> >> >>> expect any legitimate usage to hit just to keep the system as >> >> >>> flexible >> >> >>> as >> >> >>> possible. >> >> >>> >> >> >>> Does that sound reasonable to people? >> >> >> >> >> >> Are we thinking about making this a MUST requirement, or a SHOULD? >> >> >> I'm >> >> >> hesitant to spec an exact size as a MUST given how technology has a >> >> >> way >> >> >> of >> >> >> changing in unexpected ways that makes old constraints obsolete. >> >> >> But >> >> >> then, >> >> >> I may just be overly concerned about this too. >> >> > >> >> > If we put a limit, it'd be a MUST for sure. Otherwise people would >> >> > develop >> >> > against one of the implementations that don't place a limit and then >> >> > their >> >> > app would break on the others. 
>> >> > The reason that I suggested 64K is that it seems outrageously big for >> >> > the >> >> > data types that we're looking at. But it's too small to do much with >> >> > base64 >> >> > encoding binary blobs into it or anything else like that that I could >> >> > see >> >> > becoming rather large. So it seems like a limit that'd avoid major >> >> > abuses >> >> > (where someone is probably approaching the problem wrong) but would >> >> > not >> >> > come >> >> > close to limiting any practical use I can imagine. >> >> > With our architecture in Chrome, we will probably need to have some >> >> > limit. >> >> > We haven't decided what that is yet, but since I remember others >> >> > saying >> >> > similar things when we talked about this at TPAC, it seems like it >> >> > might >> >> > be >> >> > best to standardize it--even though it does feel a bit dirty. >> >> >> >> One problem with putting a limit is that it basically forces >> >> implementations to use a specific encoding, or pay a hefty price. For >> >> example if we choose a 64K limit, is that of UTF8 data or of UTF16 >> >> data? If it is of UTF8 data, and the implementation uses something >> >> else to store the date, you risk having to convert the data just to >> >> measure the size. Possibly this would be different if we measured size >> >> using UTF16 as javascript more or less enforces that the source string >> >> is UTF16 which means that you can measure utf16 size on the cheap, >> >> even if the stored data uses a different format. >> > >> > That's a very good point. What's your suggestion then? Spec unlimited >> > storage and have non-normative text saying that >> > most implementations will >> > likely have some limit? Maybe we can at least spec a minimum limit in >> > terms >> > of a particular character encoding? (Implementations could translate >> > this >> > into the worst case size for their own native encoding and then ensure >> > their >> > limit is higher.) 
>> >> I'm fine with relying on UTF16 encoding size and specifying a 64K >> limit. Like Shawn points out, this API is fairly geared towards >> JavaScript anyway (and I personally don't think that's a bad thing). >> One thing that I just thought of is that even if implementations use >> other encodings, you can in the vast majority of cases do a worst-case >> estimate and easily see that the key that is used is below 64K. >> >> That said, does having a 64K limit really help anyone? In SQLite we >> can easily store vastly more than that, enough that we don't have to >> specify a limit. And my understanding is that in the Microsoft >> implementation, the limits for what they can store without resorting >> to various tricks, is much lower. So since that implementation will >> have to implement special handling of long keys anyway, is there a >> difference between saying a 64K limit vs. saying unlimited? > > As I explained earlier: "The reason that I suggested 64K is that it seems > outrageously big for the data types that we're looking at. But it's too > small to do much with base64 encoding binary blobs into it or anything else > like that that I could see becoming rather large. So it seems like a limit > that'd avoid major abuses (where someone is probably approaching the problem > wrong) but would not come close to limiting any practical use I can imagine."
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking wrote: > On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow wrote: > > On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking wrote: > >> > >> On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow > wrote: > >> > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher > >> > wrote: > >> >> > >> >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote: > >> >>> > >> >>> My current thinking is that we should have some relatively large > >> >>> limitmaybe on the order of 64k? It seems like it'd be very > >> >>> difficult > >> >>> to > >> >>> hit such a limit with any sort of legitimate use case, and the > chances > >> >>> of > >> >>> some subtle data-dependent error would be much less. But a 1GB key > is > >> >>> just > >> >>> not going to work well in any implementation (if it doesn't simply > oom > >> >>> the > >> >>> process!). So despite what I said earlier, I guess I think we > should > >> >>> have > >> >>> some limit...but keep it an order of magnitude or two larger than > what > >> >>> we > >> >>> expect any legitimate usage to hit just to keep the system as > flexible > >> >>> as > >> >>> possible. > >> >>> > >> >>> Does that sound reasonable to people? > >> >> > >> >> Are we thinking about making this a MUST requirement, or a SHOULD? > I'm > >> >> hesitant to spec an exact size as a MUST given how technology has a > way > >> >> of > >> >> changing in unexpected ways that makes old constraints obsolete. But > >> >> then, > >> >> I may just be overly concerned about this too. > >> > > >> > If we put a limit, it'd be a MUST for sure. Otherwise people would > >> > develop > >> > against one of the implementations that don't place a limit and then > >> > their > >> > app would break on the others. > >> > The reason that I suggested 64K is that it seems outrageously big for > >> > the > >> > data types that we're looking at. 
But it's too small to do much with > >> > base64 > >> > encoding binary blobs into it or anything else like that that I could > >> > see > >> > becoming rather large. So it seems like a limit that'd avoid major > >> > abuses > >> > (where someone is probably approaching the problem wrong) but would > not > >> > come > >> > close to limiting any practical use I can imagine. > >> > With our architecture in Chrome, we will probably need to have some > >> > limit. > >> > We haven't decided what that is yet, but since I remember others > saying > >> > similar things when we talked about this at TPAC, it seems like it > might > >> > be > >> > best to standardize it--even though it does feel a bit dirty. > >> > >> One problem with putting a limit is that it basically forces > >> implementations to use a specific encoding, or pay a hefty price. For > >> example if we choose a 64K limit, is that of UTF8 data or of UTF16 > >> data? If it is of UTF8 data, and the implementation uses something > >> else to store the date, you risk having to convert the data just to > >> measure the size. Possibly this would be different if we measured size > >> using UTF16 as javascript more or less enforces that the source string > >> is UTF16 which means that you can measure utf16 size on the cheap, > >> even if the stored data uses a different format. > > > > That's a very good point. What's your suggestion then? Spec unlimited > > storage and have non-normative text saying that most implementations will > > likely have some limit? Maybe we can at least spec a minimum limit in > terms > > of a particular character encoding? (Implementations could translate > this > > into the worst case size for their own native encoding and then ensure > their > > limit is higher.) > > I'm fine with relying on UTF16 encoding size and specifying a 64K > limit. Like Shawn points out, this API is fairly geared towards > JavaScript anyway (and I personally don't think that's a bad thing). 
> One thing that I just thought of is that even if implementations use > other encodings, you can in the vast majority of cases do a worst-case > estimate and easily see that the key that is used is below 64K. > > That said, does having a 64K limit really help anyone? In SQLite we > can easily store vastly more than that, enough that we don't have to > specify a limit. And my understanding is that in the Microsoft > implementation, the limits for what they can store without resorting > to various tricks, is much lower. So since that implementation will > have to implement special handling of long keys anyway, is there a > difference between saying a 64K limit vs. saying unlimited? > As I explained earlier: "The reason that I suggested 64K is that it seems outrageously big for the data types that we're looking at. But it's too small to do much with base64 encoding binary blobs into it or anything else like that that I could see becoming rather large. So it seems like a limit that'd avoid major abuses (where someone is probably approaching the problem wrong) but would not come close to limiting any practical use I can imagine."
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow wrote: > On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking wrote: >> >> On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow wrote: >> > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher >> > wrote: >> >> >> >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote: >> >>> >> >>> My current thinking is that we should have some relatively large >> >>> limitmaybe on the order of 64k? It seems like it'd be very >> >>> difficult >> >>> to >> >>> hit such a limit with any sort of legitimate use case, and the chances >> >>> of >> >>> some subtle data-dependent error would be much less. But a 1GB key is >> >>> just >> >>> not going to work well in any implementation (if it doesn't simply oom >> >>> the >> >>> process!). So despite what I said earlier, I guess I think we should >> >>> have >> >>> some limit...but keep it an order of magnitude or two larger than what >> >>> we >> >>> expect any legitimate usage to hit just to keep the system as flexible >> >>> as >> >>> possible. >> >>> >> >>> Does that sound reasonable to people? >> >> >> >> Are we thinking about making this a MUST requirement, or a SHOULD? I'm >> >> hesitant to spec an exact size as a MUST given how technology has a way >> >> of >> >> changing in unexpected ways that makes old constraints obsolete. But >> >> then, >> >> I may just be overly concerned about this too. >> > >> > If we put a limit, it'd be a MUST for sure. Otherwise people would >> > develop >> > against one of the implementations that don't place a limit and then >> > their >> > app would break on the others. >> > The reason that I suggested 64K is that it seems outrageously big for >> > the >> > data types that we're looking at. But it's too small to do much with >> > base64 >> > encoding binary blobs into it or anything else like that that I could >> > see >> > becoming rather large. 
So it seems like a limit that'd avoid major >> > abuses >> > (where someone is probably approaching the problem wrong) but would not >> > come >> > close to limiting any practical use I can imagine. >> > With our architecture in Chrome, we will probably need to have some >> > limit. >> > We haven't decided what that is yet, but since I remember others saying >> > similar things when we talked about this at TPAC, it seems like it might >> > be >> > best to standardize it--even though it does feel a bit dirty. >> >> One problem with putting a limit is that it basically forces >> implementations to use a specific encoding, or pay a hefty price. For >> example if we choose a 64K limit, is that of UTF8 data or of UTF16 >> data? If it is of UTF8 data, and the implementation uses something >> else to store the date, you risk having to convert the data just to >> measure the size. Possibly this would be different if we measured size >> using UTF16 as javascript more or less enforces that the source string >> is UTF16 which means that you can measure utf16 size on the cheap, >> even if the stored data uses a different format. > > That's a very good point. What's your suggestion then? Spec unlimited > storage and have non-normative text saying that most implementations will > likely have some limit? Maybe we can at least spec a minimum limit in terms > of a particular character encoding? (Implementations could translate this > into the worst case size for their own native encoding and then ensure their > limit is higher.) I'm fine with relying on UTF16 encoding size and specifying a 64K limit. Like Shawn points out, this API is fairly geared towards JavaScript anyway (and I personally don't think that's a bad thing). One thing that I just thought of is that even if implementations use other encodings, you can in the vast majority of cases do a worst-case estimate and easily see that the key that is used is below 64K. That said, does having a 64K limit really help anyone? 
In SQLite we can easily store vastly more than that, enough that we don't have to specify a limit. And my understanding is that in the Microsoft implementation, the limits for what they can store without resorting to various tricks, is much lower. So since that implementation will have to implement special handling of long keys anyway, is there a difference between saying a 64K limit vs. saying unlimited? Pablo: Would love to get your input on the above. / Jonas
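Jonas's worst-case estimate can be sketched as follows (the 64K figure and all names here are assumptions for illustration): a single UTF-16 code unit never expands to more than 3 UTF-8 bytes (BMP characters take at most 3 bytes; supplementary characters take 4 bytes but occupy 2 code units), so a UTF-8-backed store can accept any key whose `length` is at most limit/3 without encoding it at all:

```javascript
const KEY_BYTE_LIMIT = 64 * 1024; // assumed UTF-8 byte limit, per the thread's 64K figure

// Cheap check: 3 bytes is the worst-case UTF-8 expansion per UTF-16 code unit.
function definitelyFits(key) {
  return key.length * 3 <= KEY_BYTE_LIMIT;
}

// Exact check, only needed when the cheap bound is inconclusive.
function fits(key) {
  return definitelyFits(key) || new TextEncoder().encode(key).length <= KEY_BYTE_LIMIT;
}

// 30000 ASCII chars: the cheap bound (90000) is inconclusive, but the
// exact UTF-8 size (30000 bytes) fits.
console.log(fits("a".repeat(30000)));      // true
// 30000 copies of "\u3042" encode to 90000 UTF-8 bytes, over the limit.
console.log(fits("\u3042".repeat(30000))); // false
```

This is why Jonas says implementations with other encodings can "in the vast majority of cases do a worst-case estimate": the exact encode is only needed for keys near the boundary.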
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On 2/7/2011 12:32 AM, Glenn Maynard wrote: Is that a safe assumption to design around? The API might later be bound to other languages fortunate enough not to be stuck in UTF-16. As I recall, we've already made design decisions based on the fact that the primary consumer of this API is going to be JavaScript on the web. (What those decisions were about, I don't recall offhand, however.) Cheers, Shawn
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Mon, Feb 7, 2011 at 2:38 AM, Jonas Sicking wrote: > One problem with putting a limit is that it basically forces > implementations to use a specific encoding, or pay a hefty price. For > example if we choose a 64K limit, is that of UTF8 data or of UTF16 > data? If it is of UTF8 data, and the implementation uses something > else to store the date, you risk having to convert the data just to > measure the size. Possibly this would be different if we measured size > using UTF16 as javascript more or less enforces that the source string > is UTF16 which means that you can measure utf16 size on the cheap, > even if the stored data uses a different format. > Is that a safe assumption to design around? The API might later be bound to other languages fortunate enough not to be stuck in UTF-16. -- Glenn Maynard
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking wrote: > On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow wrote: > > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher > wrote: > >> > >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote: > >>> > >>> My current thinking is that we should have some relatively large > >>> limitmaybe on the order of 64k? It seems like it'd be very > difficult > >>> to > >>> hit such a limit with any sort of legitimate use case, and the chances > of > >>> some subtle data-dependent error would be much less. But a 1GB key is > >>> just > >>> not going to work well in any implementation (if it doesn't simply oom > >>> the > >>> process!). So despite what I said earlier, I guess I think we should > >>> have > >>> some limit...but keep it an order of magnitude or two larger than what > we > >>> expect any legitimate usage to hit just to keep the system as flexible > as > >>> possible. > >>> > >>> Does that sound reasonable to people? > >> > >> Are we thinking about making this a MUST requirement, or a SHOULD? I'm > >> hesitant to spec an exact size as a MUST given how technology has a way > of > >> changing in unexpected ways that makes old constraints obsolete. But > then, > >> I may just be overly concerned about this too. > > > > If we put a limit, it'd be a MUST for sure. Otherwise people would > develop > > against one of the implementations that don't place a limit and then > their > > app would break on the others. > > The reason that I suggested 64K is that it seems outrageously big for the > > data types that we're looking at. But it's too small to do much with > base64 > > encoding binary blobs into it or anything else like that that I could see > > becoming rather large. So it seems like a limit that'd avoid major > abuses > > (where someone is probably approaching the problem wrong) but would not > come > > close to limiting any practical use I can imagine. > > With our architecture in Chrome, we will probably need to have some > limit. 
> > We haven't decided what that is yet, but since I remember others saying > > similar things when we talked about this at TPAC, it seems like it might > be > > best to standardize it--even though it does feel a bit dirty. > > One problem with putting a limit is that it basically forces > implementations to use a specific encoding, or pay a hefty price. For > example if we choose a 64K limit, is that of UTF8 data or of UTF16 > data? If it is of UTF8 data, and the implementation uses something > else to store the date, you risk having to convert the data just to > measure the size. Possibly this would be different if we measured size > using UTF16 as javascript more or less enforces that the source string > is UTF16 which means that you can measure utf16 size on the cheap, > even if the stored data uses a different format. > That's a very good point. What's your suggestion then? Spec unlimited storage and have non-normative text saying that most implementations will likely have some limit? Maybe we can at least spec a minimum limit in terms of a particular character encoding? (Implementations could translate this into the worst case size for their own native encoding and then ensure their limit is higher.) J
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow wrote: > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher wrote: >> >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote: >>> >>> My current thinking is that we should have some relatively large >>> limit... maybe on the order of 64k? It seems like it'd be very difficult >>> to >>> hit such a limit with any sort of legitimate use case, and the chances of >>> some subtle data-dependent error would be much less. But a 1GB key is >>> just >>> not going to work well in any implementation (if it doesn't simply oom >>> the >>> process!). So despite what I said earlier, I guess I think we should >>> have >>> some limit...but keep it an order of magnitude or two larger than what we >>> expect any legitimate usage to hit just to keep the system as flexible as >>> possible. >>> >>> Does that sound reasonable to people? >> >> Are we thinking about making this a MUST requirement, or a SHOULD? I'm >> hesitant to spec an exact size as a MUST given how technology has a way of >> changing in unexpected ways that makes old constraints obsolete. But then, >> I may just be overly concerned about this too. > > If we put a limit, it'd be a MUST for sure. Otherwise people would develop > against one of the implementations that don't place a limit and then their > app would break on the others. > The reason that I suggested 64K is that it seems outrageously big for the > data types that we're looking at. But it's too small to do much with base64 > encoding binary blobs into it or anything else like that that I could see > becoming rather large. So it seems like a limit that'd avoid major abuses > (where someone is probably approaching the problem wrong) but would not come > close to limiting any practical use I can imagine. > With our architecture in Chrome, we will probably need to have some limit. 
> We haven't decided what that is yet, but since I remember others saying > similar things when we talked about this at TPAC, it seems like it might be > best to standardize it--even though it does feel a bit dirty. One problem with putting a limit is that it basically forces implementations to use a specific encoding, or pay a hefty price. For example if we choose a 64K limit, is that of UTF8 data or of UTF16 data? If it is of UTF8 data, and the implementation uses something else to store the data, you risk having to convert the data just to measure the size. Possibly this would be different if we measured size using UTF16 as javascript more or less enforces that the source string is UTF16 which means that you can measure utf16 size on the cheap, even if the stored data uses a different format. / Jonas
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher wrote: > On 2/6/2011 12:42 PM, Jeremy Orlow wrote: > >> My current thinking is that we should have some relatively large >> limit... maybe on the order of 64k? It seems like it'd be very difficult >> to >> hit such a limit with any sort of legitimate use case, and the chances of >> some subtle data-dependent error would be much less. But a 1GB key is >> just >> not going to work well in any implementation (if it doesn't simply oom the >> process!). So despite what I said earlier, I guess I think we should have >> some limit...but keep it an order of magnitude or two larger than what we >> expect any legitimate usage to hit just to keep the system as flexible as >> possible. >> >> Does that sound reasonable to people? >> > Are we thinking about making this a MUST requirement, or a SHOULD? I'm > hesitant to spec an exact size as a MUST given how technology has a way of > changing in unexpected ways that makes old constraints obsolete. But then, > I may just be overly concerned about this too. > If we put a limit, it'd be a MUST for sure. Otherwise people would develop against one of the implementations that don't place a limit and then their app would break on the others. The reason that I suggested 64K is that it seems outrageously big for the data types that we're looking at. But it's too small to do much with base64 encoding binary blobs into it or anything else like that that I could see becoming rather large. So it seems like a limit that'd avoid major abuses (where someone is probably approaching the problem wrong) but would not come close to limiting any practical use I can imagine. With our architecture in Chrome, we will probably need to have some limit. We haven't decided what that is yet, but since I remember others saying similar things when we talked about this at TPAC, it seems like it might be best to standardize it--even though it does feel a bit dirty. J
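A hard (MUST) limit of the kind Jeremy proposes might look like the following minimal sketch. The 64K figure is the one floated in the thread; measuring it in UTF-16 code units, the error type, and the function name are all assumptions for illustration, not spec text.

```javascript
// Hypothetical key validation with a hard limit, as a MUST requirement
// would demand: every implementation rejects the same keys.
const MAX_KEY_SIZE = 64 * 1024; // assumed: measured in UTF-16 code units

function validateKey(key) {
  if (typeof key === "string" && key.length > MAX_KEY_SIZE) {
    // hypothetical error; what an implementation should throw was undecided
    throw new RangeError("key exceeds maximum key size");
  }
  return key;
}
```

The MUST-vs-SHOULD point shows up directly here: if `MAX_KEY_SIZE` were only a SHOULD, an app tested against an unlimited implementation would pass keys that throw on a limited one.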
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On 2/6/2011 12:42 PM, Jeremy Orlow wrote: My current thinking is that we should have some relatively large limit... maybe on the order of 64k? It seems like it'd be very difficult to hit such a limit with any sort of legitimate use case, and the chances of some subtle data-dependent error would be much less. But a 1GB key is just not going to work well in any implementation (if it doesn't simply oom the process!). So despite what I said earlier, I guess I think we should have some limit...but keep it an order of magnitude or two larger than what we expect any legitimate usage to hit just to keep the system as flexible as possible. Does that sound reasonable to people? Are we thinking about making this a MUST requirement, or a SHOULD? I'm hesitant to spec an exact size as a MUST given how technology has a way of changing in unexpected ways that makes old constraints obsolete. But then, I may just be overly concerned about this too. Cheers, Shawn
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Tue, Dec 14, 2010 at 4:26 PM, Pablo Castro wrote: > > From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy > Orlow > Sent: Tuesday, December 14, 2010 4:23 PM > > >> On Wed, Dec 15, 2010 at 12:19 AM, Pablo Castro < > pablo.cas...@microsoft.com> wrote: > >> > >> From: public-webapps-requ...@w3.org [mailto: > public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking > >> Sent: Friday, December 10, 2010 1:42 PM > >> > >> >> On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow > wrote: > >> >> > Any more thoughts on this? > >> >> > >> >> I don't feel strongly one way or another. Implementation wise I don't > >> >> really understand why implementations couldn't use keys of unlimited > >> >> size. I wouldn't imagine implementations would want to use fixed-size > >> >> allocations for every key anyway, right (which would be a strong > >> >> reason to keep maximum size down). > >> I don't have a very strong opinion either. I don't quite agree with the > guideline of "having something working slowly is better than not working at > all"...as having something not work at all sometimes may help developers hit > a wall and think differently about their approach for a given problem. That > said, if folks think this is an instance where we're better off not having a > limit I'm fine with it. > >> > >> My only concern is that the developer might not hit this wall, but then > some user (doing things the developer didn't fully anticipate) could hit > that wall. I can definitely see both sides of the argument though. And > elsewhere we've headed more in the direction of forcing the developer to > think about performance, but this case seems a bit more non-deterministic > than any of those. > > Yeah, that's a good point for this case, avoiding data-dependent errors is > probably worth the perf hit. My current thinking is that we should have some relatively large limit... maybe on the order of 64k? 
It seems like it'd be very difficult to hit such a limit with any sort of legitimate use case, and the chances of some subtle data-dependent error would be much less. But a 1GB key is just not going to work well in any implementation (if it doesn't simply oom the process!). So despite what I said earlier, I guess I think we should have some limit...but keep it an order of magnitude or two larger than what we expect any legitimate usage to hit just to keep the system as flexible as possible. Does that sound reasonable to people? J
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Tuesday, December 14, 2010 4:23 PM >> On Wed, Dec 15, 2010 at 12:19 AM, Pablo Castro >> wrote: >> >> From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] >> On Behalf Of Jonas Sicking >> Sent: Friday, December 10, 2010 1:42 PM >> >> >> On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow wrote: >> >> > Any more thoughts on this? >> >> >> >> I don't feel strongly one way or another. Implementation wise I don't >> >> really understand why implementations couldn't use keys of unlimited >> >> size. I wouldn't imagine implementations would want to use fixed-size >> >> allocations for every key anyway, right (which would be a strong >> >> reason to keep maximum size down). >> I don't have a very strong opinion either. I don't quite agree with the >> guideline of "having something working slowly is better than not working at >> all"...as having something not work at all sometimes may help developers hit >> a wall and think differently about their approach for a given problem. That >> said, if folks think this is an instance where we're better off not having a >> limit I'm fine with it. >> >> My only concern is that the developer might not hit this wall, but then some >> user (doing things the developer didn't fully anticipate) could hit that >> wall. I can definitely see both sides of the argument though. And >> elsewhere we've headed more in the direction of forcing the developer to >> think about performance, but this case seems a bit more non-deterministic >> than any of those. Yeah, that's a good point for this case, avoiding data-dependent errors is probably worth the perf hit. -pc
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Wed, Dec 15, 2010 at 12:19 AM, Pablo Castro wrote: > > From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] > On Behalf Of Jonas Sicking > Sent: Friday, December 10, 2010 1:42 PM > > >> On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow > wrote: > >> > Any more thoughts on this? > >> > >> I don't feel strongly one way or another. Implementation wise I don't > >> really understand why implementations couldn't use keys of unlimited > >> size. I wouldn't imagine implementations would want to use fixed-size > >> allocations for every key anyway, right (which would be a strong > >> reason to keep maximum size down). > > I don't have a very strong opinion either. I don't quite agree with the > guideline of "having something working slowly is better than not working at > all"...as having something not work at all sometimes may help developers hit > a wall and think differently about their approach for a given problem. That > said, if folks think this is an instance where we're better off not having a > limit I'm fine with it. > My only concern is that the developer might not hit this wall, but then some user (doing things the developer didn't fully anticipate) could hit that wall. I can definitely see both sides of the argument though. And elsewhere we've headed more in the direction of forcing the developer to think about performance, but this case seems a bit more non-deterministic than any of those. > >> Pablo, do you know why the back ends you were looking at had such > >> relatively low limits? > > Mostly an implementation thing. Keys (and all other non-blob columns) > typically need to fit in a page. Predictable perf is also nice (no linked > lists, high density/locality, etc.), but not as fundamental as page size. > > -pablo > >
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Friday, December 10, 2010 1:42 PM >> On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow wrote: >> > Any more thoughts on this? >> >> I don't feel strongly one way or another. Implementation wise I don't >> really understand why implementations couldn't use keys of unlimited >> size. I wouldn't imagine implementations would want to use fixed-size >> allocations for every key anyway, right (which would be a strong >> reason to keep maximum size down). I don't have a very strong opinion either. I don't quite agree with the guideline of "having something working slowly is better than not working at all"...as having something not work at all sometimes may help developers hit a wall and think differently about their approach for a given problem. That said, if folks think this is an instance where we're better off not having a limit I'm fine with it. >> Pablo, do you know why the back ends you were looking at had such >> relatively low limits? Mostly an implementation thing. Keys (and all other non-blob columns) typically need to fit in a page. Predictable perf is also nice (no linked lists, high density/locality, etc.), but not as fundamental as page size. -pablo
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow wrote: > Any more thoughts on this? I don't feel strongly one way or another. Implementation wise I don't really understand why implementations couldn't use keys of unlimited size. I wouldn't imagine implementations would want to use fixed-size allocations for every key anyway, right (which would be a strong reason to keep maximum size down). Pablo, do you know why the back ends you were looking at had such relatively low limits? At the same time, I suspect that very few people would run into problems if we set the limit at a K or two of bytes. It's in general a good idea to limit strings somewhere around 2^30 bytes so as to avoid overflow problems, but such limits are large enough that I'm not even convinced they need to be specified. / Jonas
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
Any more thoughts on this? On Mon, Nov 22, 2010 at 12:05 PM, Jeremy Orlow wrote: > Something working (but with degraded performance) is better than not > working at all. Especially when keys will often come from user data/input > and thus simple web apps will likely not handle the exceptions large keys > might generate. Throughout the rest of IndexedDB, we've taken quite a bit > of care to make sure that we don't throw exceptions on hard to anticipate > edge cases, I don't think this is terribly different. > > Storing a prefix and then doing lookups into the actual value seems like a > good way for implementations to handle it, but it's certainly not the only > way. Yes, this will turn into linear performance in the worst case, but in > practice I think you'll find that before the linear performance kills you, > various other issues with using IndexedDB like this will kill you. :-) > > I'm fine with us adding non-normative text reminding people that large keys > will be slow and having a recommended minimum key size that implementations > should try and make sure hits a reasonably fast path. But I think we should > make sure that implementations don't break with big keys. > > J > > > On Sat, Nov 20, 2010 at 10:49 AM, Jonas Sicking wrote: > >> On Fri, Nov 19, 2010 at 8:13 PM, Bjoern Hoehrmann >> wrote: >> > * Jonas Sicking wrote: >> >>The question is in part where the limit for "ridiculous" goes. 1K keys >> >>are sort of ridiculous, though I'm sure it happens. >> > >> > By "ridiculous" I mean that common systems would run out of memory. That >> > is different among systems, and I would expect developers to consider it >> > up to an order of magnitude, but not beyond that. Clearly, to me, a DB >> > system should not fail because I want to store 100 keys á 100KB. >> >> Note that at issue here isn't the total size of keys, but the key size >> of an individual entry. I'm not sure that I'd expect a 100KB key size >> to work. 
>> >> >>> Note that, since JavaScript does not offer key-value dictionaries for >> >>> complex keys, and now that JSON.stringify is widely implemented, it's >> >>> quite common for people to emulate proper dictionaries by using that >> to >> >>> work around this particular JavaScript limitation. Which would likely >> >>> extend to more persistent forms of storage. >> >> >> >>I don't understand what you mean here. >> > >> > I am saying that it's quite natural to want to have string keys that are >> > much, much longer than someone might envision the length of string keys, >> > mainly because their notion of "string keys" is different from the key >> > length you might get from serializing arbitrary objects. >> >> Still not fully sure I follow you. The only issue here is when using >> plain strings as keys, objects are not allowed to be used as keys. Or >> are you saying that people will use the return value from >> JSON.stringify as key? >> >> / Jonas >> >> >
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
Something working (but with degraded performance) is better than not working at all. Especially when keys will often come from user data/input and thus simple web apps will likely not handle the exceptions large keys might generate. Throughout the rest of IndexedDB, we've taken quite a bit of care to make sure that we don't throw exceptions on hard to anticipate edge cases, I don't think this is terribly different. Storing a prefix and then doing lookups into the actual value seems like a good way for implementations to handle it, but it's certainly not the only way. Yes, this will turn into linear performance in the worst case, but in practice I think you'll find that before the linear performance kills you, various other issues with using IndexedDB like this will kill you. :-) I'm fine with us adding non-normative text reminding people that large keys will be slow and having a recommended minimum key size that implementations should try and make sure hits a reasonably fast path. But I think we should make sure that implementations don't break with big keys. J On Sat, Nov 20, 2010 at 10:49 AM, Jonas Sicking wrote: > On Fri, Nov 19, 2010 at 8:13 PM, Bjoern Hoehrmann > wrote: > > * Jonas Sicking wrote: > >>The question is in part where the limit for "ridiculous" goes. 1K keys > >>are sort of ridiculous, though I'm sure it happens. > > > > By "ridiculous" I mean that common systems would run out of memory. That > > is different among systems, and I would expect developers to consider it > > up to an order of magnitude, but not beyond that. Clearly, to me, a DB > > system should not fail because I want to store 100 keys á 100KB. > > Note that at issue here isn't the total size of keys, but the key size > of an individual entry. I'm not sure that I'd expect a 100KB key size > to work. 
> > >>> Note that, since JavaScript does not offer key-value dictionaries for > >>> complex keys, and now that JSON.stringify is widely implemented, it's > >>> quite common for people to emulate proper dictionaries by using that to > >>> work around this particular JavaScript limitation. Which would likely > >>> extend to more persistent forms of storage. > >> > >>I don't understand what you mean here. > > > > I am saying that it's quite natural to want to have string keys that are > > much, much longer than someone might envision the length of string keys, > > mainly because their notion of "string keys" is different from the key > > length you might get from serializing arbitrary objects. > > Still not fully sure I follow you. The only issue here is when using > plain strings as keys, objects are not allowed to be used as keys. Or > are you saying that people will use the return value from > JSON.stringify as key? > > / Jonas > >
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Fri, Nov 19, 2010 at 8:13 PM, Bjoern Hoehrmann wrote: > * Jonas Sicking wrote: >>The question is in part where the limit for "ridiculous" goes. 1K keys >>are sort of ridiculous, though I'm sure it happens. > > By "ridiculous" I mean that common systems would run out of memory. That > is different among systems, and I would expect developers to consider it > up to an order of magnitude, but not beyond that. Clearly, to me, a DB > system should not fail because I want to store 100 keys á 100KB. Note that at issue here isn't the total size of keys, but the key size of an individual entry. I'm not sure that I'd expect a 100KB key size to work. >>> Note that, since JavaScript does not offer key-value dictionaries for >>> complex keys, and now that JSON.stringify is widely implemented, it's >>> quite common for people to emulate proper dictionaries by using that to >>> work around this particular JavaScript limitation. Which would likely >>> extend to more persistent forms of storage. >> >>I don't understand what you mean here. > > I am saying that it's quite natural to want to have string keys that are > much, much longer than someone might envision the length of string keys, > mainly because their notion of "string keys" is different from the key > length you might get from serializing arbitrary objects. Still not fully sure I follow you. The only issue here is when using plain strings as keys, objects are not allowed to be used as keys. Or are you saying that people will use the return value from JSON.stringify as key? / Jonas
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
Just a thought: just because the spec does not limit the key size does not mean the implementation has to index on huge keys. For example you may choose to index only the first 1000 characters of string keys, and then link the values of key collisions together in the storage node. This way things are kept fast and compact for the more normal key size, and there is a sensible limit. As long as the implementation behaves like it admits arbitrary key sizes, it can actually implement things how it likes. Another example would be one index for keys less than size X, and a separate "oversize" key index for keys of size greater than X. These could use a different internal structure and disk layout. Cheers, Keean. On 20 November 2010 04:13, Bjoern Hoehrmann wrote: > * Jonas Sicking wrote: > >The question is in part where the limit for "ridiculous" goes. 1K keys > >are sort of ridiculous, though I'm sure it happens. > > By "ridiculous" I mean that common systems would run out of memory. That > is different among systems, and I would expect developers to consider it > up to an order of magnitude, but not beyond that. Clearly, to me, a DB > system should not fail because I want to store 100 keys á 100KB. > > >> Note that, since JavaScript does not offer key-value dictionaries for > >> complex keys, and now that JSON.stringify is widely implemented, it's > >> quite common for people to emulate proper dictionaries by using that to > >> work around this particular JavaScript limitation. Which would likely > >> extend to more persistent forms of storage. > > > >I don't understand what you mean here. > > I am saying that it's quite natural to want to have string keys that are > much, much longer than someone might envision the length of string keys, > mainly because their notion of "string keys" is different from the key > length you might get from serializing arbitrary objects. 
> -- > Björn Höhrmann · mailto:bjo...@hoehrmann.de · http://bjoern.hoehrmann.de > Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de > 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ > >
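Keean's scheme can be sketched roughly as follows. The 1000-character cutoff comes from his post; the class name, the in-memory `Map`, and the linear collision scan are invented stand-ins for whatever ordered on-disk structure a real engine would use.

```javascript
// Sketch: index only a bounded key prefix, and resolve collisions among
// keys sharing that prefix with a linear scan of a (normally tiny) bucket.
const PREFIX_LEN = 1000; // cutoff suggested in the post

class PrefixIndex {
  constructor() {
    this.buckets = new Map(); // prefix -> array of { key, value } entries
  }
  put(key, value) {
    const prefix = key.slice(0, PREFIX_LEN);
    let bucket = this.buckets.get(prefix);
    if (!bucket) this.buckets.set(prefix, (bucket = []));
    const entry = bucket.find((e) => e.key === key);
    if (entry) entry.value = value;
    else bucket.push({ key, value });
  }
  get(key) {
    const bucket = this.buckets.get(key.slice(0, PREFIX_LEN));
    const entry = bucket && bucket.find((e) => e.key === key);
    return entry ? entry.value : undefined;
  }
}
```

Keys at or below the prefix length stay on the fast path; only oversized keys that share an entire prefix pay the linear scan, which matches the "fast and compact for the more normal key size" goal while still admitting arbitrary key sizes.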
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
* Jonas Sicking wrote: >The question is in part where the limit for "ridiculous" goes. 1K keys >are sort of ridiculous, though I'm sure it happens. By "ridiculous" I mean that common systems would run out of memory. That is different among systems, and I would expect developers to consider it up to an order of magnitude, but not beyond that. Clearly, to me, a DB system should not fail because I want to store 100 keys á 100KB. >> Note that, since JavaScript does not offer key-value dictionaries for >> complex keys, and now that JSON.stringify is widely implemented, it's >> quite common for people to emulate proper dictionaries by using that to >> work around this particular JavaScript limitation. Which would likely >> extend to more persistent forms of storage. > >I don't understand what you mean here. I am saying that it's quite natural to want to have string keys that are much, much longer than someone might envision the length of string keys, mainly because their notion of "string keys" is different from the key length you might get from serializing arbitrary objects.
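Björn's observation about JSON.stringify-as-key can be illustrated with a small hypothetical example: the key's length tracks the size of the serialized object, not anyone's intuition about how long a "string key" should be.

```javascript
// Emulating a dictionary with complex keys via JSON.stringify, a common
// workaround for JS objects only supporting string property names.
const record = {
  user: 12345,
  path: ["settings", "notifications", "email"],
  tags: new Array(50).fill("x"), // a modest payload inflates the key
};

const key = JSON.stringify(record); // easily hundreds of characters long
const dict = {};
dict[key] = "value"; // the "string key" grew with the object, silently
```

If this idiom carries over to IndexedDB, keys far longer than a naive notion of "string keys" become routine, which is exactly why a low hard limit would surprise authors.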
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
On Fri, Nov 19, 2010 at 7:03 PM, Bjoern Hoehrmann wrote: > * Pablo Castro wrote: Just looking at this list, I guess I'm leaning towards _not_ limiting the maximum key size and instead pushing it onto implementations to do the hard work here. If so, we should probably have some normative text about how bigger keys will probably not be handled very efficiently. >> >>I was trying to make up my mind on this, and I'm not sure this is a good idea. >>What would be the options for an implementation? Hashing keys into smaller >>values >>is pretty painful because of sorting requirements (we'd have to index the data >>twice, once for the key prefix that fits within limits, and a second one for a >>hash plus some sort of discriminator for collisions). Just storing a prefix as >>part of the key under the covers obviously won't fly...am I missing some >>other option? >> >>Clearly consistency in these things is important so people don't get caught >>off >>guard. I wonder if we just pick a "reasonable" limit, say 1 K characters >>(yeah, >>trying to do something weird to avoid details of how stuff is actually >>stored), >>and run with it. I looked around at a few databases (from a single vendor :)), >>and they seem to all be well over this but not by orders of magnitude (2KB to >>8KB seems to be the range of upper limits for this in practice). > > No limit would be reasonable, the general, and reasonable, assumption is > that if it works for X it will work for Y, unless Y is ridiculous. There > is also little point in saying for some values of Y performance will be > poor: implementations will cater for what is common, which is usually > not a constant, and when you do unusual things, you already know that it > is not entirely reasonable to expect the "usual" performance. The question is in part where the limit for "ridiculous" goes. 1K keys are sort of ridiculous, though I'm sure it happens. Note that "unusual" performance here means linear search times rather than logarithmic. 
Which in case of a join could easily mean quadratic. So it's quite commonly not "unusual" performance, but "unacceptable". > Note that, since JavaScript does not offer key-value dictionaries for > complex keys, and now that JSON.stringify is widely implemented, it's > quite common for people to emulate proper dictionaries by using that to > work around this particular JavaScript limitation. Which would likely > extend to more persistent forms of storage. I don't understand what you mean here. / Jonas
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
* Pablo Castro wrote: >>> Just looking at this list, I guess I'm leaning towards _not_ limiting the >>> maximum key size and instead pushing it onto implementations to do the hard >>> work here. If so, we should probably have some normative text about how >>> bigger >>> keys will probably not be handled very efficiently. > >I was trying to make up my mind on this, and I'm not sure this is a good idea. >What would be the options for an implementation? Hashing keys into smaller >values >is pretty painful because of sorting requirements (we'd have to index the data >twice, once for the key prefix that fits within limits, and a second one for a >hash plus some sort of discriminator for collisions). Just storing a prefix as >part of the key under the covers obviously won't fly...am I missing some other >option? > >Clearly consistency in these things is important so people don't get caught off >guard. I wonder if we just pick a "reasonable" limit, say 1 K characters (yeah, >trying to do something weird to avoid details of how stuff is actually stored), >and run with it. I looked around at a few databases (from a single vendor :)), >and they seem to all be well over this but not by orders of magnitude (2KB to >8KB seems to be the range of upper limits for this in practice). No limit would be reasonable, the general, and reasonable, assumption is that if it works for X it will work for Y, unless Y is ridiculous. There is also little point in saying for some values of Y performance will be poor: implementations will cater for what is common, which is usually not a constant, and when you do unusual things, you already know that it is not entirely reasonable to expect the "usual" performance. Note that, since JavaScript does not offer key-value dictionaries for complex keys, and now that JSON.stringify is widely implemented, it's quite common for people to emulate proper dictionaries by using that to work around this particular JavaScript limitation. 
Which would likely extend to more persistent forms of storage. 
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
-Original Message- From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of bugzi...@jessica.w3.org Sent: Friday, November 19, 2010 4:16 AM >> Just looking at this list, I guess I'm leaning towards _not_ limiting the >> maximum key size and instead pushing it onto implementations to do the hard >> work here. If so, we should probably have some normative text about how >> bigger >> keys will probably not be handled very efficiently. I was trying to make up my mind on this, and I'm not sure this is a good idea. What would be the options for an implementation? Hashing keys into smaller values is pretty painful because of sorting requirements (we'd have to index the data twice, once for the key prefix that fits within limits, and a second one for a hash plus some sort of discriminator for collisions). Just storing a prefix as part of the key under the covers obviously won't fly...am I missing some other option? Clearly consistency in these things is important so people don't get caught off guard. I wonder if we just pick a "reasonable" limit, say 1 K characters (yeah, trying to do something weird to avoid details of how stuff is actually stored), and run with it. I looked around at a few databases (from a single vendor :)), and they seem to all be well over this but not by orders of magnitude (2KB to 8KB seems to be the range of upper limits for this in practice). Thanks -pablo
[Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11351 Summary: [IndexedDB] Should we have a maximum key size (or something like that)? Product: WebAppsWG Version: unspecified Platform: PC OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Indexed Database API AssignedTo: dave.n...@w3.org ReportedBy: jor...@chromium.org QAContact: member-webapi-...@w3.org CC: m...@w3.org, public-webapps@w3.org Should we have some sort of maximum key size for what's in IndexedDB? Pros: * Most other databases do. * It's very difficult to handle large keys efficiently. * Many backing storage engines have limits. (These could be worked around by an implementation storing just the first part of a particularly big key in the backing engine and then looking up the rest in the value when necessary. This clearly would add a lot of complexity and slow things down.) Cons: * Pushing complexity onto the web developer. * May break web apps in ways authors don't anticipate. There are probably other pros and cons that I'm forgetting (please bring them up if so!). Just looking at this list, I guess I'm leaning towards _not_ limiting the maximum key size and instead pushing it onto implementations to do the hard work here. If so, we should probably have some normative text about how bigger keys will probably not be handled very efficiently.