Re: Growable arrays?
Daniel Hartwig mand...@gmail.com writes: On 11 June 2012 12:37, David Kastrup d...@gnu.org wrote: What is a vlist? vlist is a data type introduced around guile 2.0. You will find it documented in the Guile Reference under Compound Data Types. They are growable and provide vector-like access performances and memory locality. Ah, too bad. This needs to run under 1.8 as well. With these concerns your only options are really vlist or implementing growable vector. Or put them into a list, and reallocate on first access beyond the existing element. That seems rather contorted. You mean “put them into a /vector/”? No, since a vector can't be grown. This would basically switch the data structure between write and read mode, where write mode grows the list of additions, and read mode accesses the vector. Switching from write to read entails creating a newly allocated vector and copying the new additions from the list as well as the old vector into it. I see, that is rather contorted :-) Yes, but then it will actually be quite rare that the structure is extended while it is read rather often. It would probably do fine to just do the extension lazily by exception, but then wrapping a catch around every access is likely to be slower than checking the addition list to be empty. -- David Kastrup
Re: Growable arrays?
() David Kastrup d...@gnu.org () Sat, 09 Jun 2012 14:32:28 +0200 Suggestions? Guile-SDL implements (in C) collections of enums using both a C array (static, used also for init) and a Scheme hash table for backing store: http://git.savannah.gnu.org/cgit/guile-sdl.git/tree/src/sdlenums.c#n66 This is not dynamic (growable) as you specified, but i figure it might be useful as an example building block. The modifications are straightforward. You would have to add a container for the set of arrays, and two (more) indirections to ‘lookup’ (line 97), the first to distinguish between integer/otherwise ‘key’, the second to index into the outer container. Alternatively, keep track of collection size and add entries to the hash table under both the user-given key and the monotonic ‘size - 1’ (two entries). Extremely wasteful spacewise. Less wasteful spacewise but more wasteful timewise would be to combine the above approaches, where the initial (static, before growth) set does not get a double entry, but in turn requires another check (branch) in ‘lookup’. Actually the cut-off index need not be the initial set; it need only correspond to the initial array allocation (possibly greater than the initial set). I suspect some of these considerations have gone into the vlist implementation (though not all since IIRC vlists alloc doubling on overflow; there is not a single threshold), so another idea is to use that, back-porting as necessary. Tangential: I think vlists should be aggressively exercised, debugged and then back-ported to 1.8 and before. It's a nice self-contained design. This requires a change of policy, though.
Re: Growable arrays?
On 11 June 2012 15:25, David Kastrup d...@gnu.org wrote: Yes, but then it will actually be quite rare that the structure is extended while it is read rather often. It would probably do fine to just do the extension lazily by exception, but then wrapping a catch around every access is likely to be slower than checking the addition list to be empty. Certainly, catching exceptions is expensive even for the edge-case of the rare write op. Not sure why you would go that route though I suppose you understand your use case much better :-) For reference, attached is a growable vector I use in several projects, adapted to support the length operation similar to Lua (i.e. first unset numerical index). There is no catching of exceptions here, every access to the data is through the dynvector procedures which check the index and vector size. dynvector.scm Description: Binary data
Re: Growable arrays?
Hi David, You raise an interesting issue. But first, a nitpick :) On Sat 09 Jun 2012 14:32, David Kastrup d...@gnu.org writes: Scheme hashtable To be very pedantic, there are no hashtables in R5RS Scheme. SRFI-69 and R6RS specify them (in different ways), but do not mandate that they grow. Guile's do grow, but that's a detail. The underlying message important: _Scheme_ doesn't give you much in the way of data structures. It does give you some good tools to define new data structures, though. Guile (as an implementation) also provides a few data structures built-in. It could provide more. But, I don't think that table is the right strategy for what you want. Lua (and JS) implementations typically have many different implementation strategies for their table-like objects. For example, V8 has over two dozen. I don't think we want to pull all that complexity into the core of Guile where it's not necessary for idiomatic Guile programming. It would certainly exist in the language-specific runtime, though. Perhaps a module would be more appropriate, though whether something that fundamental could be shared between any two languages is tricky. You might enjoy http://blog.mrale.ph/post/24351748336/explaining-js-vm-in-js. For your particular problem, I would just define a new data type, growable-vector or so, perhaps using srfi-9. Regards, Andy -- http://wingolog.org/
Re: Growable arrays?
On 11 June 2012 17:01, Daniel Hartwig mand...@gmail.com wrote: For reference, attached is a growable vector I use in several projects, adapted to support the length operation similar to Lua (i.e. first unset numerical index). There is no catching of exceptions here, every access to the data is through the dynvector procedures which check the index and vector size. Always test before posting even small changes to existing code :-) Updated attachment to actually run without problems dynvector.scm Description: Binary data
Re: Growable arrays?
Andy Wingo wi...@pobox.com writes: You raise an interesting issue. But first, a nitpick :) On Sat 09 Jun 2012 14:32, David Kastrup d...@gnu.org writes: Scheme hashtable To be very pedantic, there are no hashtables in R5RS Scheme. SRFI-69 and R6RS specify them (in different ways), but do not mandate that they grow. Guile's do grow, but that's a detail. The underlying message important: _Scheme_ doesn't give you much in the way of data structures. Lua gives you one data structure, period. And it is pretty good at making this one work well. Sometimes, the one thing that works well approach beats the give the user choice between several options with different drawbacks approach, since there may be use cases exercising at least one deficiency in every option. Guile (as an implementation) also provides a few data structures built-in. It could provide more. But, I don't think that table is the right strategy for what you want. Tables are a superset of what I need here. I need the growable vector aspect, not the hash part aspect. Guile 1.8 only offers subsets: growable does not come together with vector. Guile hashtables _internally_ use a growable vector, but it is not accessible as such. Lua (and JS) implementations typically have many different implementation strategies for their table-like objects. For example, V8 has over two dozen. Uh what? Lua has one implementation strategy. Array part, and hash part. Lua is not JavaScript. That Lua emulators without access to a reasonably fitting table building block have to make do with substitutes is not much of a surprise. There has been some sort of a language shootout on the LilyPond developer list. With regard to the question Guile/Lua, my respective pro/contra list for suitability for extensions was a) lexically Scheme is _very_ nice to integrate. Scheme reader rules are straightforward. LilyPond basically starts Scheme expressions with #, and so things like #3, #(+ 2 1) #(cond ...) provide a seamless transition from just constants to more complex things, including the ability to plug into scopes. The combination of a straightforward lexer and a basically non-existent parser (you enter the parse trees directly) makes for predictable and extensible behavior and easy manipulation of expressions and pattern matching. b) from the data types, Lua is _very_ nice to integrate because it does not give you choice. You don't need to pick between symbols and strings when mapping your system to the extension language, because there are only interned strings. You don't need to pick between lists and fixed-size arrays because there are just variable size tables. You don't need to pick between arrays and structs because everything is a table. And so on. Since you get only a minimal set of data types and data structures, but those implemented in a manner where they actually cover most use cases nevertheless, you are saved from a lot of decisions that may have different drawbacks. c) for general-purpose programming, Lua is more human-friendly. That comes at the cost of being less computer-friendly by having a parser between input and parse trees. Macro manipulation in Lua is sort of a text processing exercise and not reliable. It is off the charts as a feasible programming technique. Goops is powerful and flexible, but has a bit of a problem in that it is too powerful and flexible. As a result, there is no really established programming style for it: it is more a tool for developing your own OOP technology rather than prescribing one itself. That would already be a mixed blessing. It becomes worse since, while the documentation does not take a lot of choices regarding how to do things or not and leaves the programmer with a lot of freedom, the implementation is optimized for certain uses, and if you want to find out if your plan of mapping your OOP requirement to Goops will actually work reasonably efficiently, you need to read all the internals of the Goops implementation and figure out just _which_ use patterns are supported efficiently, and which patterns aren't. That's somewhat less than fabulous. While the results are reasonably easy to understand and maintain, designing the first implementation requires a grandwizard if performance is not just an academical consideration. If you take a look at Lua, while things do get complex when you dive into metatable territory, the performance implications are rather transparent even without having to analyze the implementation of Lua manually in fine-grained detail. This is mostly a diversion from the original question, but it might still be interesting as a sort of observation. -- David Kastrup
scandir patch
Any thoughts on this patch? From 711ca0a5ed7351d6fde360f9b451600e77403522 Mon Sep 17 00:00:00 2001 From: Andy Wingo wi...@pobox.com Date: Mon, 11 Jun 2012 12:25:24 +0200 Subject: [PATCH] scandir: select? takes basenames, operates on (sub)dirs also * module/ice-9/ftw.scm (scandir): Run the select? procedure on all items, including subdirs and the `.' and `..' entries. Pass it the basename of the file in question instead of the full name. * test-suite/tests/ftw.test (scandir): Adapt expectation for the .test selector. Add test for a selector that rejects everything. --- module/ice-9/ftw.scm | 19 +++ test-suite/tests/ftw.test |7 +-- 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/module/ice-9/ftw.scm b/module/ice-9/ftw.scm index 96422b5..6c9db27 100644 --- a/module/ice-9/ftw.scm +++ b/module/ice-9/ftw.scm @@ -538,26 +538,29 @@ of file names is sorted according to ENTRY?, which defaults to (define (enter? dir stat result) (and stat (string=? dir name))) - (define (leaf name stat result) -(if (select? name) -(and (pair? result) ; must have a . entry - (cons (basename name) result)) + (define (visit basename result) +(if (select? basename) +(cons basename result) result)) + (define (leaf name stat result) +(and result + (visit (basename name) result))) + (define (down name stat result) -(list .)) +(visit . '())) (define (up name stat result) -(cons .. result)) +(visit .. result)) (define (skip name stat result) ;; All the sub-directories are skipped. -(cons (basename name) result)) +(visit (basename name) result)) (define (error name* stat errno result) (if (string=? name name*) ; top-level NAME is unreadable result -(cons (basename name*) result))) +(visit (basename name*) result))) (and= (file-system-fold enter? leaf down up skip error #f name stat) (lambda (files) diff --git a/test-suite/tests/ftw.test b/test-suite/tests/ftw.test index 805c779..33537d0 100644 --- a/test-suite/tests/ftw.test +++ b/test-suite/tests/ftw.test @@ -310,14 +310,17 @@ (pass-if test-suite (let ((select? (cut string-suffix? .test ))) (match (scandir (string-append %test-dir /tests) select?) -((. .. 00-initial-env.test (? select?) ...) +((00-initial-env.test (? select?) ...) #t (pass-if flat file (not (scandir (string-append %test-dir /Makefile.am (pass-if EACCES -(not (scandir /.does-not-exist. +(not (scandir /.does-not-exist.))) + + (pass-if no select +(null? (scandir %test-dir (lambda (_) #f) ;;; Local Variables: ;;; eval: (put 'with-file-tree 'scheme-indent-function 2) -- 1.7.10 -- http://wingolog.org/
Re: Growable arrays?
Hi, On Mon 11 Jun 2012 11:55, David Kastrup d...@gnu.org writes: Tables are a superset of what I need here. I need the growable vector aspect, not the hash part aspect. Guile 1.8 only offers subsets: growable does not come together with vector. Why not just make your own growable vectors, then? It will be just as efficient. Guile's hash tables are not designed to be addressed by index. I guess to summarize: if you want an abstraction like tables, you would build it out of vectors and hash tables. But vectors and hash tables themselves are the lower-level building blocks. Lua (and JS) implementations typically have many different implementation strategies for their table-like objects. For example, V8 has over two dozen. Uh what? Lua has one implementation strategy. Array part, and hash part. LuaJIT doesn't pack unboxed numerics in the array part? I would be surprised. I would also be surprised if it didn't do clever things in the properties part, as V8 also does. There has been some sort of a language shootout on the LilyPond developer list. With regard to the question Guile/Lua, my respective pro/contra list for suitability for extensions was [...] Pretty good observations here. Andy -- http://wingolog.org/
Re: scandir patch
Hi! Andy Wingo wi...@pobox.com skribis: * module/ice-9/ftw.scm (scandir): Run the select? procedure on all items, including subdirs and the `.' and `..' entries. Since the goal was to mimic scandir(3), I double-checked: #include stdlib.h #include unistd.h #include dirent.h int main () { int count; struct dirent **list; count = scandir (/, list, ({ int filter (const struct dirent *e) { return 0; } filter; }), NULL); return count == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } On GNU/Linux, this test succeeds for me, so this change is correct (yeah I should have checked POSIX instead ;-)). Pass it the basename of the file in question instead of the full name. Makes sense too, since in C ‘struct dirent’ contains just the base name. Please apply! :-) Thanks, Ludo’.
Re: Growable arrays?
On 11 June 2012 18:38, David Kastrup d...@gnu.org wrote: Well, considering the cost of dynvector-grow!, doing the growth in a loop rather then just the determination of the new size seems a bit expensive: Only if you are repeatedly setting values at indices far beyond the current capacity. This does not occur in my usage patterns where values are primarily inserted at the tail position (i.e. length), which is amortized O(1) since grow! is geometric. This is only an example, something which fits my needs well. You are free to adapt it or not to your particular needs.
Re: subbytevectors
Hi, Andy Wingo wi...@pobox.com skribis: On Sat 09 Jun 2012 17:16, Thien-Thi Nguyen t...@gnuvola.org writes: If you want to make a case for such a facility, why not show some code, both without (status quo) and with (proposed)? It should be clear what expressiveness is gained, and how. For example, let's say I mmap a big file. (define x (mmap-file /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.13.so)) I did some computation and have figured out that there is a region of interest between bytes 121241 and 121263 that interests me. I would like to be able to pass off that region to some other piece of code, without giving it access to the entire bytevector. I would also like to be able to pass around What about using copying (or rather, copy-on-write) sub-bytevectors to start with? That would avoid the aliasing issue; OTOH COW would make the implementation more complex. Thanks, Ludo’.
Re: Growable arrays?
Andy Wingo wi...@pobox.com writes: Hi, On Mon 11 Jun 2012 11:55, David Kastrup d...@gnu.org writes: Tables are a superset of what I need here. I need the growable vector aspect, not the hash part aspect. Guile 1.8 only offers subsets: growable does not come together with vector. Why not just make your own growable vectors, then? It will be just as efficient. Sure, I will. A native implementation would be able to benefit from storage layout conditions that would, in some cases, allow extending the array without allocating a new memory range, so it _can_ be done. Guile's hash tables are not designed to be addressed by index. Sure they are: the hash _is_ being used as an index. And one could likely provide a hash function doing just that, but it would still be storing a coalescing list in each bucket rather than a single element. Most of the _mechanisms_ are there. Just not user-accessible. I guess to summarize: if you want an abstraction like tables, you would build it out of vectors and hash tables. But vectors and hash tables themselves are the lower-level building blocks. Not low-level enough: they are already specialized in different directions making them equally unsuitable for footing the bill. Lua (and JS) implementations typically have many different implementation strategies for their table-like objects. For example, V8 has over two dozen. Uh what? Lua has one implementation strategy. Array part, and hash part. LuaJIT doesn't pack unboxed numerics in the array part? I would be surprised. I would also be surprised if it didn't do clever things in the properties part, as V8 also does. LuaJIT is not Lua. The point was that basic table usage in Lua (which won't use preallocation) can't be mapped well to Guile's data structures, and that is somewhat embarrassing. In this particular case, Lua is just something used for namedropping since it may be more relevant than my particular application. In either case, having to make the decision _either_ vector addressing _or_ growable is not really forthcoming, since hashtables _do_ use that combination internally, so it is not really technical reasons preventing it. -- David Kastrup
Re: Growable arrays?
Hi David, David Kastrup d...@gnu.org skribis: Scheme/Guile vectors are fixed size. Now I have a situation where I have a basic type lattice with records stored in vectors, and this type lattice may be extended dynamically (which typically happens at the start of a whole file, for potentially multi-file runs). As Daniel put it, vlists would probably be a good match for that (info (guile) VLists). Ludo’.
Re: scandir patch
Fine to me. ;-) On Mon, Jun 11, 2012 at 6:26 PM, Andy Wingo wi...@pobox.com wrote: Any thoughts on this patch? -- http://wingolog.org/
Re: Growable arrays?
David Kastrup d...@gnu.org writes: Andy Wingo wi...@pobox.com writes: Hi, On Mon 11 Jun 2012 11:55, David Kastrup d...@gnu.org writes: Tables are a superset of what I need here. I need the growable vector aspect, not the hash part aspect. Guile 1.8 only offers subsets: growable does not come together with vector. Why not just make your own growable vectors, then? It will be just as efficient. Sure, I will. A native implementation would be able to benefit from storage layout conditions that would, in some cases, allow extending the array without allocating a new memory range, so it _can_ be done. P.S.: I still need to look at vlists. They might already address this issue, though I can't use them in Guile 1.8. -- David Kastrup
Re: Growable arrays?
Hello, vlist is a data type introduced around guile 2.0. You will find it documented in the Guile Reference under Compound Data Types. They are growable and provide vector-like access performances and memory locality. Ah, too bad. This needs to run under 1.8 as well. If you need to support older versions of Guile, then you can't use any data structures we add now anyway, correct? So it seems like you will have to implement growable vectors yourself no matter what. If you want to, though, you could look at the implementation of vlists in Guile 2.0, which I think is all in Scheme. See module/ice-9/vlist.scm. Even if we can't fix it, it is still nice to hear about data structures you wish Guile had, so that in a few more versions this might not be a problem for you. Noah
Re: Growable arrays?
On 11 June 2012 20:00, David Kastrup d...@gnu.org wrote: I guess to summarize: if you want an abstraction like tables, you would build it out of vectors and hash tables. But vectors and hash tables themselves are the lower-level building blocks. Not low-level enough: they are already specialized in different directions making them equally unsuitable for footing the bill. Really? The Implementation of Lua 5.0 [1], section 4 illustrates how Lua tables are constructed from a standard hash table and array (vector). In particular, see Figure 2. The contiguous, numerically indexed slots are stored only in the array, with all other slots stored only in the hash table. This is perfectly able to be implemented in guile using the standard vectors and hash tables. It does require the vectors to be growable, a that capability which has already been demonstrated. [1] http://www.lua.org/doc/jucs05.pdf As Andy points out, Scheme (and guile) provide a toolset of primitive data types out of which you can build the particular abstractions you require. This has the advantage that you can optimize heavily for your own particular needs, is that possible to the same extent with Lua given that it only has tables as a fundamental container? When comparing Lua to guile I would not consider it an issue that guile does not natively provide a particular data type because most data types are simple to implement with the tools provided. Regards
Re: Growable arrays?
David Kastrup d...@gnu.org writes: David Kastrup d...@gnu.org writes: Andy Wingo wi...@pobox.com writes: Hi, On Mon 11 Jun 2012 11:55, David Kastrup d...@gnu.org writes: Tables are a superset of what I need here. I need the growable vector aspect, not the hash part aspect. Guile 1.8 only offers subsets: growable does not come together with vector. Why not just make your own growable vectors, then? It will be just as efficient. Sure, I will. A native implementation would be able to benefit from storage layout conditions that would, in some cases, allow extending the array without allocating a new memory range, so it _can_ be done. P.S.: I still need to look at vlists. They might already address this issue, though I can't use them in Guile 1.8. No, the immutable angle would make them unsuitable again. -- David Kastrup
Re: Growable arrays?
Noah Lavine noah.b.lav...@gmail.com writes: vlist is a data type introduced around guile 2.0. You will find it documented in the Guile Reference under Compound Data Types. They are growable and provide vector-like access performances and memory locality. Ah, too bad. This needs to run under 1.8 as well. If you need to support older versions of Guile, then you can't use any data structures we add now anyway, correct? So it seems like you will have to implement growable vectors yourself no matter what. If you want to, though, you could look at the implementation of vlists in Guile 2.0, which I think is all in Scheme. See module/ice-9/vlist.scm. Even if we can't fix it, it is still nice to hear about data structures you wish Guile had, so that in a few more versions this might not be a problem for you. I don't think I need yet another data structure deficient in some respects. We have vectors that can't grow, hashtables that can grow but only index through a hash function, vlists that can grow but have immutable content... Where is the point in thinking up yet another restriction that will make for a new data structure complementing the others? Why not just have a superset without arbitrary restriction and implement the other structures based on that? Then each one just needs to enforce its personal restrictions in its accessor functions, and otherwise just use what is there in the general mechanism. -- David Kastrup
Re: Growable arrays?
Daniel Hartwig mand...@gmail.com writes: On 11 June 2012 20:00, David Kastrup d...@gnu.org wrote: I guess to summarize: if you want an abstraction like tables, you would build it out of vectors and hash tables. But vectors and hash tables themselves are the lower-level building blocks. Not low-level enough: they are already specialized in different directions making them equally unsuitable for footing the bill. Really? The Implementation of Lua 5.0 [1], section 4 illustrates how Lua tables are constructed from a standard hash table and array (vector). In particular, see Figure 2. Sure. The contiguous, numerically indexed slots are stored only in the array, with all other slots stored only in the hash table. This is perfectly able to be implemented in guile using the standard vectors and hash tables. It does require the vectors to be growable, Cough, cough. Standard vectors are not growable. Which is the original problem of this thread, never mind Lua. a that capability which has already been demonstrated. Interesting. As Andy points out, Scheme (and guile) provide a toolset of primitive data types out of which you can build the particular abstractions you require. Too bad that the existing types all have rather arbitrary seeming limitations. Vectors don't grow, hashtables have additional indirection through hash buckets and coalescing lists, vlists are immutable. It seems like they are all rather similar, and all with a rather arbitrarily chosen additional restriction tacked on making them unsuitable for this task. When comparing Lua to guile I would not consider it an issue that guile does not natively provide a particular data type because most data types are simple to implement with the tools provided. Except that this one isn't. -- David Kastrup
Re: Growable arrays?
On 11 June 2012 20:20, David Kastrup d...@gnu.org wrote: P.S.: I still need to look at vlists. They might already address this issue, though I can't use them in Guile 1.8. No, the immutable angle would make them unsuitable again. Note that vlists are only immutable in the sense that you can not modify the value of a slot already allocated. You can modify the object that a slot points to and you can extend a vlist as much as you like. Multiple vlists can even share tails. For example, this session modifies the object in slot 0 of the vlist: (use-modules (ice-9 vlist)) (list-vlist '((a b) (c d) (e f))) $1 = #vlist ((a b) (c d) (e f)) (vlist-ref $1 0) $2 = (a b) (set-cdr! $2 '(boo)) $1 $3 = #vlist ((a boo) (c d) (e f)) Scheme/Guile vectors are fixed size. Now I have a situation where I have a basic type lattice with records stored in vectors, and this type lattice may be extended dynamically (which typically happens at the start of a whole file, for potentially multi-file runs). From this I gather that your use case only appends to the lattice, if so, vlist is suitable for that task. Cough, cough. Standard vectors are not growable. Which is the original problem of this thread, never mind Lua. True, but a growable vector is a tiny step away from the standard vector. hashtables have additional indirection through hash buckets and coalescing lists This is fairly standard for a hash table. I would be quite surprised if the hash table part of a Lua table did not also use buckets. Except that this one isn't. Why not? You take a vector and a hash table, store your values in them, and grow either as needed. This is not a complicated type.
Re: subbytevectors
On Mon 11 Jun 2012 13:55, l...@gnu.org (Ludovic Courtès) writes: What about using copying (or rather, copy-on-write) sub-bytevectors to start with? That would avoid the aliasing issue; OTOH COW would make the implementation more complex. Not a bad idea. The FFI can still introduce aliasing, though I don't know if that matters. Alignment is still a problem though. Plus it's a bit more complexity, though an optimizer should be able to do something about that. I'll see how much work an implementation would be. Andy -- http://wingolog.org/
Re: Growable arrays?
Daniel Hartwig mand...@gmail.com writes: On 11 June 2012 20:20, David Kastrup d...@gnu.org wrote: P.S.: I still need to look at vlists. They might already address this issue, though I can't use them in Guile 1.8. No, the immutable angle would make them unsuitable again. Note that vlists are only immutable in the sense that you can not modify the value of a slot already allocated. Which makes it useless here. Scheme/Guile vectors are fixed size. Now I have a situation where I have a basic type lattice with records stored in vectors, and this type lattice may be extended dynamically (which typically happens at the start of a whole file, for potentially multi-file runs). From this I gather that your use case only appends to the lattice, if so, vlist is suitable for that task. Wrong. My use case only _allocates_ at the end of the existing type lattice, but the records are not read-only. Cough, cough. Standard vectors are not growable. Which is the original problem of this thread, never mind Lua. True, but a growable vector is a tiny step away from the standard vector. A tiny step if you are modifying the C code. A not so tiny step if you are working with Scheme. hashtables have additional indirection through hash buckets and coalescing lists This is fairly standard for a hash table. I would be quite surprised if the hash table part of a Lua table did not also use buckets. But it is not standard for a growable vector that it only comes with buckets and chains. Except that this one isn't. Why not? You take a vector and a hash table, store your values in them, and grow either as needed. This is not a complicated type. Except that vectors don't grow. Are you even reading what you are replying to? -- David Kastrup
Re: Growable arrays?
Maybe this could be a first stub for a table structure that is uses both an array and a hash-table. I do think that implementing this might need fine tuning in order to have good defaults and so on. So in that sense one probably need to check out reference implementations. But this is for discussion! I Assumed growing here and have no shrinking! I made it nonfunctional. One note to the discussion. It is not just to combine a growable vector with a growable hash in ordder to have a one tool for all cases. The reason is that one need to tackle the issue with sparse tables with integer keys. I tried to add that feature as well in some way. Anyway it shows that you don't need a ton of code to get something pretty functional. On Mon, Jun 11, 2012 at 4:19 PM, David Kastrup d...@gnu.org wrote: Daniel Hartwig mand...@gmail.com writes: On 11 June 2012 20:20, David Kastrup d...@gnu.org wrote: P.S.: I still need to look at vlists. They might already address this issue, though I can't use them in Guile 1.8. No, the immutable angle would make them unsuitable again. Note that vlists are only immutable in the sense that you can not modify the value of a slot already allocated. Which makes it useless here. Scheme/Guile vectors are fixed size. Now I have a situation where I have a basic type lattice with records stored in vectors, and this type lattice may be extended dynamically (which typically happens at the start of a whole file, for potentially multi-file runs). From this I gather that your use case only appends to the lattice, if so, vlist is suitable for that task. Wrong. My use case only _allocates_ at the end of the existing type lattice, but the records are not read-only. Cough, cough. Standard vectors are not growable. Which is the original problem of this thread, never mind Lua. True, but a growable vector is a tiny step away from the standard vector. A tiny step if you are modifying the C code. A not so tiny step if you are working with Scheme. hashtables have additional indirection through hash buckets and coalescing lists This is fairly standard for a hash table. I would be quite surprised if the hash table part of a Lua table did not also use buckets. But it is not standard for a growable vector that it only comes with buckets and chains. Except that this one isn't. Why not? You take a vector and a hash table, store your values in them, and grow either as needed. This is not a complicated type. Except that vectors don't grow. Are you even reading what you are replying to? -- David Kastrup hasharray.scm Description: Binary data
Re: Growable arrays?
On Mon 11 Jun 2012 16:19, David Kastrup d...@gnu.org writes: Are you even reading what you are replying to? Please be civil. People are trying to help you. Thanks, Andy -- http://wingolog.org/
Re: subbytevectors
Hi, Andy Wingo wi...@pobox.com skribis: On Mon 11 Jun 2012 13:55, l...@gnu.org (Ludovic Courtès) writes: What about using copying (or rather, copy-on-write) sub-bytevectors to start with? That would avoid the aliasing issue; OTOH COW would make the implementation more complex. Not a bad idea. The FFI can still introduce aliasing, though I don't know if that matters. Alignment is still a problem though. Plus it's a bit more complexity, though an optimizer should be able to do something about that. Currently alignment is more of a theoretical problem because we are not yet at a point where such an optimization matters (and I think it wouldn’t matter on x86 anyway), no? :-) Though in principle, I agree we should avoid preventing future optimizations. Ludo’.
Re: Growable arrays?
Andy Wingo wi...@pobox.com writes: On Mon 11 Jun 2012 16:19, David Kastrup d...@gnu.org writes: Are you even reading what you are replying to? Please be civil. People are trying to help you. More like telling me off. Of course, I am perfectly able to implement my own moderately efficient version in Guile using already existing data structures, and this is exactly what I will be doing in this case. It is not me that needs help here but rather Guile, since the disjoint members of the set of abstract data structures chosen by Scheme don't include some combinations of features that map reasonably well both to an abstract problem space as well as to the underlying implementation. There is some sense in making use of this feature overlap in reducing the number and increasing the versatility of the underlying primitives, whether at the Scheme or the VM level. Whatever. I think we can all agree that I don't know what I am talking about and move on. Saves a lot of hassle for everybody. -- David Kastrup
Re: Growable arrays?
Hi David, David Kastrup d...@gnu.org writes: I don't think I need yet another data structure deficient in some respects. We have vectors that can't grow, hashtables that can grow but only index through a hash function, vlists that can grow but have immutable content... [...] Why not just have a superset without arbitrary restriction and implement the other structures based on that? Then each one just needs to enforce its personal restrictions in its accessor functions, and otherwise just use what is there in the general mechanism. Simpler data structures can usually be implemented with less memory, shorter code sequences with fewer conditional branches and less space in the instruction cache, which in turn means they can be implemented more efficiently. Therefore, to allow efficient compilation, primitive data structures should be very simple, with more complex structures built on simpler ones instead of the other way around. For example, consider resizable vectors vs fixed-size vectors. A fixed-size vector can be represented as a single memory block that contains both the length and the elements together. A resizable vector requires an additional level of pointer indirection, which inevitably means slower accesses and greater code size. Furthermore, fixed-size vectors allow the possibility of compiling in an unsafe mode where out-of-bounds checks can be skipped. Even if one is willing to pay the price of a relatively heavyweight primitive data structure, Lua's tables are not always a good fit. What if you need a table keyed on large, sparsely distributed integer keys? What if you need a linked list with O(1) insertions? What if you need a doubly-linked list with O(1) deletions? Data structure design involves many tradeoffs, and unfortunately there is no complex one-size-fits-all data structure that is a good universal primitive upon which to build all others. Regards, Mark