Re: Growable arrays?

2012-06-11 Thread David Kastrup
Daniel Hartwig mand...@gmail.com writes:

 On 11 June 2012 12:37, David Kastrup d...@gnu.org wrote:
 What is a vlist?

 vlist is a data type introduced around guile 2.0.  You will find it
 documented in the Guile Reference under Compound Data Types.

 They are growable and provide vector-like access performances and
 memory locality.

Ah, too bad.  This needs to run under 1.8 as well.

 With these concerns your only options are really vlist or implementing
 growable vector.

  Or put them into a
 list, and reallocate on first access beyond the existing element.  That
 seems rather contorted.

 You mean “put them into a /vector/”?

 No, since a vector can't be grown.  This would basically switch the data
 structure between write and read mode, where write mode grows the
 list of additions, and read mode accesses the vector.  Switching from
 write to read entails creating a newly allocated vector and copying the
 new additions from the list as well as the old vector into it.

 I see, that is rather contorted :-)

Yes, but then it will actually be quite rare that the structure is
extended while it is read rather often.  It would probably do fine to
just do the extension lazily by exception, but then wrapping a catch
around every access is likely to be slower than checking the addition
list to be empty.

-- 
David Kastrup




Re: Growable arrays?

2012-06-11 Thread Thien-Thi Nguyen
() David Kastrup d...@gnu.org
() Sat, 09 Jun 2012 14:32:28 +0200

   Suggestions?

Guile-SDL implements (in C) collections of enums using both a C array
(static, used also for init) and a Scheme hash table for backing store:

http://git.savannah.gnu.org/cgit/guile-sdl.git/tree/src/sdlenums.c#n66

This is not dynamic (growable) as you specified, but i figure it
might be useful as an example building block.  The modifications
are straightforward.  You would have to add a container for the
set of arrays, and two (more) indirections to ‘lookup’ (line 97),
the first to distinguish between integer/otherwise ‘key’, the
second to index into the outer container.

Alternatively, keep track of collection size and add entries to
the hash table under both the user-given key and the monotonic
‘size - 1’ (two entries).  Extremely wasteful spacewise.

Less wasteful spacewise but more wasteful timewise would be to
combine the above approaches, where the initial (static, before
growth) set does not get a double entry, but in turn requires
another check (branch) in ‘lookup’.  Actually the cut-off index
need not be the initial set; it need only correspond to the
initial array allocation (possibly greater than the initial set).

I suspect some of these considerations have gone into the vlist
implementation (though not all since IIRC vlists alloc doubling
on overflow; there is not a single threshold), so another idea
is to use that, back-porting as necessary.

Tangential: I think vlists should be aggressively exercised,
debugged and then back-ported to 1.8 and before.  It's a nice
self-contained design.  This requires a change of policy, though.



Re: Growable arrays?

2012-06-11 Thread Daniel Hartwig
On 11 June 2012 15:25, David Kastrup d...@gnu.org wrote:
 Yes, but then it will actually be quite rare that the structure is
 extended while it is read rather often.  It would probably do fine to
 just do the extension lazily by exception, but then wrapping a catch
 around every access is likely to be slower than checking the addition
 list to be empty.

Certainly, catching exceptions is expensive even for the edge-case  of
the rare write op.  Not sure why you would go that route though I
suppose you understand your use case much better :-)

For reference, attached is a growable vector I use in several
projects, adapted to support the length operation similar to Lua (i.e.
first unset numerical index).  There is no catching of exceptions
here, every access to the data is through the dynvector procedures
which check the index and vector size.


dynvector.scm
Description: Binary data


Re: Growable arrays?

2012-06-11 Thread Andy Wingo
Hi David,

You raise an interesting issue.  But first, a nitpick :)

On Sat 09 Jun 2012 14:32, David Kastrup d...@gnu.org writes:

 Scheme hashtable

To be very pedantic, there are no hashtables in R5RS Scheme.  SRFI-69
and R6RS specify them (in different ways), but do not mandate that they
grow.  Guile's do grow, but that's a detail.

The underlying message important:  _Scheme_ doesn't give you much in the
way of data structures.  It does give you some good tools to define new
data structures, though.

Guile (as an implementation) also provides a few data structures
built-in.  It could provide more.

But, I don't think that table is the right strategy for what you want.

Lua (and JS) implementations typically have many different
implementation strategies for their table-like objects.  For example, V8
has over two dozen.  I don't think we want to pull all that complexity
into the core of Guile where it's not necessary for idiomatic Guile
programming.  It would certainly exist in the language-specific runtime,
though.  Perhaps a module would be more appropriate, though whether
something that fundamental could be shared between any two languages is
tricky.

You might enjoy
http://blog.mrale.ph/post/24351748336/explaining-js-vm-in-js.

For your particular problem, I would just define a new data type,
growable-vector or so, perhaps using srfi-9.

Regards,

Andy
-- 
http://wingolog.org/



Re: Growable arrays?

2012-06-11 Thread Daniel Hartwig
On 11 June 2012 17:01, Daniel Hartwig mand...@gmail.com wrote:
 For reference, attached is a growable vector I use in several
 projects, adapted to support the length operation similar to Lua (i.e.
 first unset numerical index).  There is no catching of exceptions
 here, every access to the data is through the dynvector procedures
 which check the index and vector size.

Always test before posting even small changes to existing code :-)

Updated attachment to actually run without problems


dynvector.scm
Description: Binary data


Re: Growable arrays?

2012-06-11 Thread David Kastrup
Andy Wingo wi...@pobox.com writes:

 You raise an interesting issue.  But first, a nitpick :)

 On Sat 09 Jun 2012 14:32, David Kastrup d...@gnu.org writes:

 Scheme hashtable

 To be very pedantic, there are no hashtables in R5RS Scheme.  SRFI-69
 and R6RS specify them (in different ways), but do not mandate that they
 grow.  Guile's do grow, but that's a detail.

 The underlying message important:  _Scheme_ doesn't give you much in the
 way of data structures.

Lua gives you one data structure, period.  And it is pretty good at
making this one work well.  Sometimes, the one thing that works well
approach beats the give the user choice between several options with
different drawbacks approach, since there may be use cases exercising
at least one deficiency in every option.

 Guile (as an implementation) also provides a few data structures
 built-in.  It could provide more.

 But, I don't think that table is the right strategy for what you
 want.

Tables are a superset of what I need here.  I need the growable vector
aspect, not the hash part aspect.  Guile 1.8 only offers subsets:
growable does not come together with vector.

Guile hashtables _internally_ use a growable vector, but it is not
accessible as such.

 Lua (and JS) implementations typically have many different
 implementation strategies for their table-like objects.  For example,
 V8 has over two dozen.

Uh what?  Lua has one implementation strategy.  Array part, and hash
part.  Lua is not JavaScript.  That Lua emulators without access to a
reasonably fitting table building block have to make do with substitutes
is not much of a surprise.

There has been some sort of a language shootout on the LilyPond
developer list.  With regard to the question Guile/Lua, my respective
pro/contra list for suitability for extensions was

a) lexically Scheme is _very_ nice to integrate.  Scheme reader rules
are straightforward.  LilyPond basically starts Scheme expressions with
#, and so things like #3, #(+ 2 1) #(cond ...) provide a seamless
transition from just constants to more complex things, including the
ability to plug into scopes.  The combination of a straightforward lexer
and a basically non-existent parser (you enter the parse trees directly)
makes for predictable and extensible behavior and easy manipulation of
expressions and pattern matching.

b) from the data types, Lua is _very_ nice to integrate because it does
not give you choice.  You don't need to pick between symbols and strings
when mapping your system to the extension language, because there are
only interned strings.  You don't need to pick between lists and
fixed-size arrays because there are just variable size tables.  You
don't need to pick between arrays and structs because everything is a
table.  And so on.  Since you get only a minimal set of data types and
data structures, but those implemented in a manner where they actually
cover most use cases nevertheless, you are saved from a lot of decisions
that may have different drawbacks.

c) for general-purpose programming, Lua is more human-friendly.  That
comes at the cost of being less computer-friendly by having a parser
between input and parse trees.  Macro manipulation in Lua is sort of a
text processing exercise and not reliable.  It is off the charts as a
feasible programming technique.

Goops is powerful and flexible, but has a bit of a problem in that it is
too powerful and flexible.  As a result, there is no really established
programming style for it: it is more a tool for developing your own OOP
technology rather than prescribing one itself.

That would already be a mixed blessing.  It becomes worse since, while
the documentation does not take a lot of choices regarding how to do
things or not and leaves the programmer with a lot of freedom, the
implementation is optimized for certain uses, and if you want to find
out if your plan of mapping your OOP requirement to Goops will actually
work reasonably efficiently, you need to read all the internals of the
Goops implementation and figure out just _which_ use patterns are
supported efficiently, and which patterns aren't.

That's somewhat less than fabulous.  While the results are reasonably
easy to understand and maintain, designing the first implementation
requires a grandwizard if performance is not just an academical
consideration.

If you take a look at Lua, while things do get complex when you dive
into metatable territory, the performance implications are rather
transparent even without having to analyze the implementation of Lua
manually in fine-grained detail.

This is mostly a diversion from the original question, but it might
still be interesting as a sort of observation.

-- 
David Kastrup



scandir patch

2012-06-11 Thread Andy Wingo
Any thoughts on this patch?

From 711ca0a5ed7351d6fde360f9b451600e77403522 Mon Sep 17 00:00:00 2001
From: Andy Wingo wi...@pobox.com
Date: Mon, 11 Jun 2012 12:25:24 +0200
Subject: [PATCH] scandir: select? takes basenames, operates on (sub)dirs also

* module/ice-9/ftw.scm (scandir): Run the select? procedure on all
  items, including subdirs and the `.' and `..' entries.  Pass it the
  basename of the file in question instead of the full name.

* test-suite/tests/ftw.test (scandir): Adapt expectation for the .test
  selector.  Add test for a selector that rejects everything.
---
 module/ice-9/ftw.scm  |   19 +++
 test-suite/tests/ftw.test |7 +--
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/module/ice-9/ftw.scm b/module/ice-9/ftw.scm
index 96422b5..6c9db27 100644
--- a/module/ice-9/ftw.scm
+++ b/module/ice-9/ftw.scm
@@ -538,26 +538,29 @@ of file names is sorted according to ENTRY?, which defaults to
   (define (enter? dir stat result)
 (and stat (string=? dir name)))
 
-  (define (leaf name stat result)
-(if (select? name)
-(and (pair? result)  ; must have a . entry
- (cons (basename name) result))
+  (define (visit basename result)
+(if (select? basename)
+(cons basename result)
 result))
 
+  (define (leaf name stat result)
+(and result
+ (visit (basename name) result)))
+
   (define (down name stat result)
-(list .))
+(visit . '()))
 
   (define (up name stat result)
-(cons .. result))
+(visit .. result))
 
   (define (skip name stat result)
 ;; All the sub-directories are skipped.
-(cons (basename name) result))
+(visit (basename name) result))
 
   (define (error name* stat errno result)
 (if (string=? name name*) ; top-level NAME is unreadable
 result
-(cons (basename name*) result)))
+(visit (basename name*) result)))
 
   (and= (file-system-fold enter? leaf down up skip error #f name stat)
  (lambda (files)
diff --git a/test-suite/tests/ftw.test b/test-suite/tests/ftw.test
index 805c779..33537d0 100644
--- a/test-suite/tests/ftw.test
+++ b/test-suite/tests/ftw.test
@@ -310,14 +310,17 @@
   (pass-if test-suite
 (let ((select? (cut string-suffix? .test )))
   (match (scandir (string-append %test-dir /tests) select?)
-((. .. 00-initial-env.test (? select?) ...)
+((00-initial-env.test (? select?) ...)
  #t
 
   (pass-if flat file
 (not (scandir (string-append %test-dir /Makefile.am
 
   (pass-if EACCES
-(not (scandir /.does-not-exist.
+(not (scandir /.does-not-exist.)))
+
+  (pass-if no select
+(null? (scandir %test-dir (lambda (_) #f)
 
 ;;; Local Variables:
 ;;; eval: (put 'with-file-tree 'scheme-indent-function 2)
-- 
1.7.10


-- 
http://wingolog.org/


Re: Growable arrays?

2012-06-11 Thread Andy Wingo
Hi,

On Mon 11 Jun 2012 11:55, David Kastrup d...@gnu.org writes:

 Tables are a superset of what I need here.  I need the growable vector
 aspect, not the hash part aspect.  Guile 1.8 only offers subsets:
 growable does not come together with vector.

Why not just make your own growable vectors, then?  It will be just as
efficient.  Guile's hash tables are not designed to be addressed by
index.

I guess to summarize: if you want an abstraction like tables, you would
build it out of vectors and hash tables.  But vectors and hash tables
themselves are the lower-level building blocks.

 Lua (and JS) implementations typically have many different
 implementation strategies for their table-like objects.  For example,
 V8 has over two dozen.

 Uh what?  Lua has one implementation strategy.  Array part, and hash
 part.

LuaJIT doesn't pack unboxed numerics in the array part?  I would be
surprised.  I would also be surprised if it didn't do clever things in
the properties part, as V8 also does.

 There has been some sort of a language shootout on the LilyPond
 developer list.  With regard to the question Guile/Lua, my respective
 pro/contra list for suitability for extensions was [...]

Pretty good observations here.

Andy
-- 
http://wingolog.org/



Re: scandir patch

2012-06-11 Thread Ludovic Courtès
Hi!

Andy Wingo wi...@pobox.com skribis:

 * module/ice-9/ftw.scm (scandir): Run the select? procedure on all
   items, including subdirs and the `.' and `..' entries.

Since the goal was to mimic scandir(3), I double-checked:

#include stdlib.h
#include unistd.h
#include dirent.h

int
main ()
{
  int count;
  struct dirent **list;
  count = scandir (/, list,
		   ({
		  int filter (const struct dirent *e) { return 0; }
		  filter;
		   }),
		   NULL);
  return count == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}

On GNU/Linux, this test succeeds for me, so this change is correct
(yeah I should have checked POSIX instead ;-)).

 Pass it the basename of the file in question instead of the full name.

Makes sense too, since in C ‘struct dirent’ contains just the base name.

Please apply!  :-)

Thanks,
Ludo’.


Re: Growable arrays?

2012-06-11 Thread Daniel Hartwig
On 11 June 2012 18:38, David Kastrup d...@gnu.org wrote:
 Well, considering the cost of dynvector-grow!, doing the growth in a
 loop rather then just the determination of the new size seems a bit
 expensive:

Only if you are repeatedly setting values at indices far beyond the
current capacity.  This does not occur in my usage patterns where
values are primarily inserted at the tail position (i.e. length),
which is amortized O(1) since grow! is geometric.

This is only an example, something which fits my needs well.  You are
free to adapt it or not to your particular needs.



Re: subbytevectors

2012-06-11 Thread Ludovic Courtès
Hi,

Andy Wingo wi...@pobox.com skribis:

 On Sat 09 Jun 2012 17:16, Thien-Thi Nguyen t...@gnuvola.org writes:

 If you want to make a case for such a facility, why not
 show some code, both without (status quo) and with (proposed)?
 It should be clear what expressiveness is gained, and how.

 For example, let's say I mmap a big file.

   (define x (mmap-file /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.13.so))

 I did some computation and have figured out that there is a region of
 interest between bytes 121241 and 121263 that interests me.  I would
 like to be able to pass off that region to some other piece of code,
 without giving it access to the entire bytevector.  I would also like to
 be able to pass around  

What about using copying (or rather, copy-on-write) sub-bytevectors to
start with?  That would avoid the aliasing issue; OTOH COW would make
the implementation more complex.

Thanks,
Ludo’.




Re: Growable arrays?

2012-06-11 Thread David Kastrup
Andy Wingo wi...@pobox.com writes:

 Hi,

 On Mon 11 Jun 2012 11:55, David Kastrup d...@gnu.org writes:

 Tables are a superset of what I need here.  I need the growable vector
 aspect, not the hash part aspect.  Guile 1.8 only offers subsets:
 growable does not come together with vector.

 Why not just make your own growable vectors, then?  It will be just as
 efficient.

Sure, I will.  A native implementation would be able to benefit from
storage layout conditions that would, in some cases, allow extending the
array without allocating a new memory range, so it _can_ be done.

 Guile's hash tables are not designed to be addressed by index.

Sure they are: the hash _is_ being used as an index.  And one could
likely provide a hash function doing just that, but it would still be
storing a coalescing list in each bucket rather than a single element.

Most of the _mechanisms_ are there.  Just not user-accessible.

 I guess to summarize: if you want an abstraction like tables, you would
 build it out of vectors and hash tables.  But vectors and hash tables
 themselves are the lower-level building blocks.

Not low-level enough: they are already specialized in different
directions making them equally unsuitable for footing the bill.

 Lua (and JS) implementations typically have many different
 implementation strategies for their table-like objects.  For example,
 V8 has over two dozen.

 Uh what?  Lua has one implementation strategy.  Array part, and hash
 part.

 LuaJIT doesn't pack unboxed numerics in the array part?  I would be
 surprised.  I would also be surprised if it didn't do clever things in
 the properties part, as V8 also does.

LuaJIT is not Lua.  The point was that basic table usage in Lua (which
won't use preallocation) can't be mapped well to Guile's data
structures, and that is somewhat embarrassing.

In this particular case, Lua is just something used for namedropping
since it may be more relevant than my particular application.  In either
case, having to make the decision _either_ vector addressing _or_
growable is not really forthcoming, since hashtables _do_ use that
combination internally, so it is not really technical reasons preventing
it.

-- 
David Kastrup



Re: Growable arrays?

2012-06-11 Thread Ludovic Courtès
Hi David,

David Kastrup d...@gnu.org skribis:

 Scheme/Guile vectors are fixed size.  Now I have a situation where I
 have a basic type lattice with records stored in vectors, and this type
 lattice may be extended dynamically (which typically happens at the
 start of a whole file, for potentially multi-file runs).

As Daniel put it, vlists would probably be a good match for that
(info (guile) VLists).

Ludo’.




Re: scandir patch

2012-06-11 Thread Nala Ginrut
Fine to me. ;-)

On Mon, Jun 11, 2012 at 6:26 PM, Andy Wingo wi...@pobox.com wrote:
 Any thoughts on this patch?



 --
 http://wingolog.org/




Re: Growable arrays?

2012-06-11 Thread David Kastrup
David Kastrup d...@gnu.org writes:

 Andy Wingo wi...@pobox.com writes:

 Hi,

 On Mon 11 Jun 2012 11:55, David Kastrup d...@gnu.org writes:

 Tables are a superset of what I need here.  I need the growable vector
 aspect, not the hash part aspect.  Guile 1.8 only offers subsets:
 growable does not come together with vector.

 Why not just make your own growable vectors, then?  It will be just as
 efficient.

 Sure, I will.  A native implementation would be able to benefit from
 storage layout conditions that would, in some cases, allow extending the
 array without allocating a new memory range, so it _can_ be done.

P.S.: I still need to look at vlists.  They might already address this
  issue, though I can't use them in Guile 1.8.

-- 
David Kastrup




Re: Growable arrays?

2012-06-11 Thread Noah Lavine
Hello,

 vlist is a data type introduced around guile 2.0.  You will find it
 documented in the Guile Reference under Compound Data Types.

 They are growable and provide vector-like access performances and
 memory locality.

 Ah, too bad.  This needs to run under 1.8 as well.

If you need to support older versions of Guile, then you can't use any
data structures we add now anyway, correct? So it seems like you will
have to implement growable vectors yourself no matter what.

If you want to, though, you could look at the implementation of vlists
in Guile 2.0, which I think is all in Scheme. See
module/ice-9/vlist.scm.

Even if we can't fix it, it is still nice to hear about data
structures you wish Guile had, so that in a few more versions this
might not be a problem for you.

Noah



Re: Growable arrays?

2012-06-11 Thread Daniel Hartwig
On 11 June 2012 20:00, David Kastrup d...@gnu.org wrote:
 I guess to summarize: if you want an abstraction like tables, you would
 build it out of vectors and hash tables.  But vectors and hash tables
 themselves are the lower-level building blocks.

 Not low-level enough: they are already specialized in different
 directions making them equally unsuitable for footing the bill.

Really?

The Implementation of Lua 5.0 [1], section 4 illustrates how Lua
tables are constructed from a standard hash table and array (vector).
In particular, see Figure 2.

The contiguous, numerically indexed slots are stored only in the
array, with all other slots stored only in the hash table.  This is
perfectly able to be implemented in guile using the standard vectors
and hash tables.  It does require the vectors to be growable, a that
capability which has already been demonstrated.

[1] http://www.lua.org/doc/jucs05.pdf

As Andy points out, Scheme (and guile) provide a toolset of primitive
data types out of which you can build the particular abstractions you
require.  This has the advantage that you can optimize heavily for
your own particular needs, is that possible to the same extent with
Lua given that it only has tables as a fundamental container?

When comparing Lua to guile I would not consider it an issue that
guile does not natively provide a particular data type because most
data types are simple to implement with the tools provided.

Regards



Re: Growable arrays?

2012-06-11 Thread David Kastrup
David Kastrup d...@gnu.org writes:

 David Kastrup d...@gnu.org writes:

 Andy Wingo wi...@pobox.com writes:

 Hi,

 On Mon 11 Jun 2012 11:55, David Kastrup d...@gnu.org writes:

 Tables are a superset of what I need here.  I need the growable vector
 aspect, not the hash part aspect.  Guile 1.8 only offers subsets:
 growable does not come together with vector.

 Why not just make your own growable vectors, then?  It will be just as
 efficient.

 Sure, I will.  A native implementation would be able to benefit from
 storage layout conditions that would, in some cases, allow extending the
 array without allocating a new memory range, so it _can_ be done.

 P.S.: I still need to look at vlists.  They might already address this
   issue, though I can't use them in Guile 1.8.

No, the immutable angle would make them unsuitable again.

-- 
David Kastrup




Re: Growable arrays?

2012-06-11 Thread David Kastrup
Noah Lavine noah.b.lav...@gmail.com writes:

 vlist is a data type introduced around guile 2.0.  You will find it
 documented in the Guile Reference under Compound Data Types.

 They are growable and provide vector-like access performances and
 memory locality.

 Ah, too bad.  This needs to run under 1.8 as well.

 If you need to support older versions of Guile, then you can't use any
 data structures we add now anyway, correct? So it seems like you will
 have to implement growable vectors yourself no matter what.

 If you want to, though, you could look at the implementation of vlists
 in Guile 2.0, which I think is all in Scheme. See
 module/ice-9/vlist.scm.

 Even if we can't fix it, it is still nice to hear about data
 structures you wish Guile had, so that in a few more versions this
 might not be a problem for you.

I don't think I need yet another data structure deficient in some
respects.  We have vectors that can't grow, hashtables that can grow but
only index through a hash function, vlists that can grow but have
immutable content...

Where is the point in thinking up yet another restriction that will make
for a new data structure complementing the others?

Why not just have a superset without arbitrary restriction and implement
the other structures based on that?  Then each one just needs to enforce
its personal restrictions in its accessor functions, and otherwise just
use what is there in the general mechanism.

-- 
David Kastrup




Re: Growable arrays?

2012-06-11 Thread David Kastrup
Daniel Hartwig mand...@gmail.com writes:

 On 11 June 2012 20:00, David Kastrup d...@gnu.org wrote:
 I guess to summarize: if you want an abstraction like tables, you would
 build it out of vectors and hash tables.  But vectors and hash tables
 themselves are the lower-level building blocks.

 Not low-level enough: they are already specialized in different
 directions making them equally unsuitable for footing the bill.

 Really?

 The Implementation of Lua 5.0 [1], section 4 illustrates how Lua
 tables are constructed from a standard hash table and array (vector).
 In particular, see Figure 2.

Sure.

 The contiguous, numerically indexed slots are stored only in the
 array, with all other slots stored only in the hash table.  This is
 perfectly able to be implemented in guile using the standard vectors
 and hash tables.  It does require the vectors to be growable,

Cough, cough.  Standard vectors are not growable.  Which is the original
problem of this thread, never mind Lua.

 a that capability which has already been demonstrated.

Interesting.

 As Andy points out, Scheme (and guile) provide a toolset of primitive
 data types out of which you can build the particular abstractions you
 require.

Too bad that the existing types all have rather arbitrary seeming
limitations.  Vectors don't grow, hashtables have additional indirection
through hash buckets and coalescing lists, vlists are immutable.

It seems like they are all rather similar, and all with a rather
arbitrarily chosen additional restriction tacked on making them
unsuitable for this task.

 When comparing Lua to guile I would not consider it an issue that
 guile does not natively provide a particular data type because most
 data types are simple to implement with the tools provided.

Except that this one isn't.

-- 
David Kastrup




Re: Growable arrays?

2012-06-11 Thread Daniel Hartwig
On 11 June 2012 20:20, David Kastrup d...@gnu.org wrote:
 P.S.: I still need to look at vlists.  They might already address this
       issue, though I can't use them in Guile 1.8.

 No, the immutable angle would make them unsuitable again.

Note that vlists are only immutable in the sense that you can not
modify the value of a slot already allocated.  You can modify the
object that a slot points to and you can extend a vlist as much as you
like.

Multiple vlists can even share tails.

For example, this session modifies the object in slot 0 of the vlist:

 (use-modules (ice-9 vlist))
 (list-vlist '((a b) (c d) (e f)))
$1 = #vlist ((a b) (c d) (e f))
 (vlist-ref $1 0)
$2 = (a b)
 (set-cdr! $2 '(boo))
 $1
$3 = #vlist ((a boo) (c d) (e f))

 Scheme/Guile vectors are fixed size.  Now I have a situation where I
 have a basic type lattice with records stored in vectors, and this type
 lattice may be extended dynamically (which typically happens at the
 start of a whole file, for potentially multi-file runs).

From this I gather that your use case only appends to the lattice, if
so, vlist is suitable for that task.

 Cough, cough.  Standard vectors are not growable.  Which is the original
 problem of this thread, never mind Lua.

True, but a growable vector is a tiny step away from the standard vector.

 hashtables have additional indirection
 through hash buckets and coalescing lists

This is fairly standard for a hash table.  I would be quite surprised
if the hash table part of a Lua table did not also use buckets.

 Except that this one isn't.

Why not?

You take a vector and a hash table, store your values in them, and
grow either as needed.  This is not a complicated type.



Re: subbytevectors

2012-06-11 Thread Andy Wingo
On Mon 11 Jun 2012 13:55, l...@gnu.org (Ludovic Courtès) writes:

 What about using copying (or rather, copy-on-write) sub-bytevectors to
 start with?  That would avoid the aliasing issue; OTOH COW would make
 the implementation more complex.

Not a bad idea.  The FFI can still introduce aliasing, though I don't
know if that matters.  Alignment is still a problem though.  Plus it's a
bit more complexity, though an optimizer should be able to do something
about that.

I'll see how much work an implementation would be.

Andy
-- 
http://wingolog.org/



Re: Growable arrays?

2012-06-11 Thread David Kastrup
Daniel Hartwig mand...@gmail.com writes:

 On 11 June 2012 20:20, David Kastrup d...@gnu.org wrote:
 P.S.: I still need to look at vlists.  They might already address this
       issue, though I can't use them in Guile 1.8.

 No, the immutable angle would make them unsuitable again.

 Note that vlists are only immutable in the sense that you can not
 modify the value of a slot already allocated.

Which makes it useless here.

 Scheme/Guile vectors are fixed size.  Now I have a situation where I
 have a basic type lattice with records stored in vectors, and this type
 lattice may be extended dynamically (which typically happens at the
 start of a whole file, for potentially multi-file runs).

 From this I gather that your use case only appends to the lattice, if
 so, vlist is suitable for that task.

Wrong.  My use case only _allocates_ at the end of the existing type
lattice, but the records are not read-only.

 Cough, cough.  Standard vectors are not growable.  Which is the
 original problem of this thread, never mind Lua.

 True, but a growable vector is a tiny step away from the standard
 vector.

A tiny step if you are modifying the C code.  A not so tiny step if you
are working with Scheme.

 hashtables have additional indirection
 through hash buckets and coalescing lists

 This is fairly standard for a hash table.  I would be quite surprised
 if the hash table part of a Lua table did not also use buckets.

But it is not standard for a growable vector that it only comes with
buckets and chains.

 Except that this one isn't.

 Why not?

 You take a vector and a hash table, store your values in them, and
 grow either as needed.  This is not a complicated type.

Except that vectors don't grow.  Are you even reading what you are
replying to?

-- 
David Kastrup




Re: Growable arrays?

2012-06-11 Thread Stefan Israelsson Tampe
Maybe this could be a first stub for a table structure that is uses both an
array and a hash-table. I do think that implementing this might need fine
tuning in order
to have good defaults and so on. So in that sense one probably need to
check out reference
implementations. But this is for discussion!

I Assumed growing here and have no shrinking!

I made it nonfunctional.

One note to the discussion. It is not just to combine a growable vector
with a growable hash
in ordder to have a one tool for all cases. The reason is that one need to
tackle the issue with sparse tables with integer keys. I tried to add that
feature as well in some way.

Anyway it shows that you don't need a ton of code to get something pretty
functional.


On Mon, Jun 11, 2012 at 4:19 PM, David Kastrup d...@gnu.org wrote:

 Daniel Hartwig mand...@gmail.com writes:

  On 11 June 2012 20:20, David Kastrup d...@gnu.org wrote:
  P.S.: I still need to look at vlists.  They might already address this
issue, though I can't use them in Guile 1.8.
 
  No, the immutable angle would make them unsuitable again.
 
  Note that vlists are only immutable in the sense that you can not
  modify the value of a slot already allocated.

 Which makes it useless here.

  Scheme/Guile vectors are fixed size.  Now I have a situation where I
  have a basic type lattice with records stored in vectors, and this type
  lattice may be extended dynamically (which typically happens at the
  start of a whole file, for potentially multi-file runs).
 
  From this I gather that your use case only appends to the lattice, if
  so, vlist is suitable for that task.

 Wrong.  My use case only _allocates_ at the end of the existing type
 lattice, but the records are not read-only.

  Cough, cough.  Standard vectors are not growable.  Which is the
  original problem of this thread, never mind Lua.
 
  True, but a growable vector is a tiny step away from the standard
  vector.

 A tiny step if you are modifying the C code.  A not so tiny step if you
 are working with Scheme.

  hashtables have additional indirection
  through hash buckets and coalescing lists
 
  This is fairly standard for a hash table.  I would be quite surprised
  if the hash table part of a Lua table did not also use buckets.

 But it is not standard for a growable vector that it only comes with
 buckets and chains.

  Except that this one isn't.
 
  Why not?
 
  You take a vector and a hash table, store your values in them, and
  grow either as needed.  This is not a complicated type.

 Except that vectors don't grow.  Are you even reading what you are
 replying to?

 --
 David Kastrup





hasharray.scm
Description: Binary data


Re: Growable arrays?

2012-06-11 Thread Andy Wingo
On Mon 11 Jun 2012 16:19, David Kastrup d...@gnu.org writes:

 Are you even reading what you are replying to?

Please be civil.  People are trying to help you.

Thanks,

Andy
-- 
http://wingolog.org/



Re: subbytevectors

2012-06-11 Thread Ludovic Courtès
Hi,

Andy Wingo wi...@pobox.com skribis:

 On Mon 11 Jun 2012 13:55, l...@gnu.org (Ludovic Courtès) writes:

 What about using copying (or rather, copy-on-write) sub-bytevectors to
 start with?  That would avoid the aliasing issue; OTOH COW would make
 the implementation more complex.

 Not a bad idea.  The FFI can still introduce aliasing, though I don't
 know if that matters.  Alignment is still a problem though.  Plus it's a
 bit more complexity, though an optimizer should be able to do something
 about that.

Currently alignment is more of a theoretical problem because we are not
yet at a point where such an optimization matters (and I think it
wouldn’t matter on x86 anyway), no?  :-)

Though in principle, I agree we should avoid preventing future
optimizations.

Ludo’.



Re: Growable arrays?

2012-06-11 Thread David Kastrup
Andy Wingo wi...@pobox.com writes:

 On Mon 11 Jun 2012 16:19, David Kastrup d...@gnu.org writes:

 Are you even reading what you are replying to?

 Please be civil.  People are trying to help you.

More like telling me off.  Of course, I am perfectly able to implement
my own moderately efficient version in Guile using already existing data
structures, and this is exactly what I will be doing in this case.

It is not me that needs help here but rather Guile, since the disjoint
members of the set of abstract data structures chosen by Scheme don't
include some combinations of features that map reasonably well both to
an abstract problem space as well as to the underlying implementation.
There is some sense in making use of this feature overlap in reducing
the number and increasing the versatility of the underlying primitives,
whether at the Scheme or the VM level.

Whatever.  I think we can all agree that I don't know what I am talking
about and move on.  Saves a lot of hassle for everybody.

-- 
David Kastrup




Re: Growable arrays?

2012-06-11 Thread Mark H Weaver
Hi David,

David Kastrup d...@gnu.org writes:
 I don't think I need yet another data structure deficient in some
 respects.  We have vectors that can't grow, hashtables that can grow but
 only index through a hash function, vlists that can grow but have
 immutable content...
[...]
 Why not just have a superset without arbitrary restriction and implement
 the other structures based on that?  Then each one just needs to enforce
 its personal restrictions in its accessor functions, and otherwise just
 use what is there in the general mechanism.

Simpler data structures can usually be implemented with less memory,
shorter code sequences with fewer conditional branches and less space in
the instruction cache, which in turn means they can be implemented more
efficiently.  Therefore, to allow efficient compilation, primitive data
structures should be very simple, with more complex structures built on
simpler ones instead of the other way around.

For example, consider resizable vectors vs fixed-size vectors.  A
fixed-size vector can be represented as a single memory block that
contains both the length and the elements together.  A resizable vector
requires an additional level of pointer indirection, which inevitably
means slower accesses and greater code size.  Furthermore, fixed-size
vectors allow the possibility of compiling in an unsafe mode where
out-of-bounds checks can be skipped.

Even if one is willing to pay the price of a relatively heavyweight
primitive data structure, Lua's tables are not always a good fit.  What
if you need a table keyed on large, sparsely distributed integer keys?
What if you need a linked list with O(1) insertions?  What if you need a
doubly-linked list with O(1) deletions?

Data structure design involves many tradeoffs, and unfortunately there
is no complex one-size-fits-all data structure that is a good
universal primitive upon which to build all others.

Regards,
  Mark