I have decided that man's best friend is 1) a dog, 2) hexdump. ;)
What I did was try out the various prefix classes you mentioned with
the +List, +Ref, +String etc. in different combinations, and just went
and looked at them. More on that below.
> Exactly. But this may not be a good idea if the first file has a
> rather small block size, because the B-Tree nodes are only happy if
> they have enough space to store several key-value pairs. As opposed
> to entity objects (which may occupy more than one block, see above),
> a node splits into two if it needs more space. So you end up with
> many (perhaps too small) nodes.
I'm a bit confused about which is the "entity object" and which are the
"B-Tree nodes". The key-value pair seems like it was in file 1 in any
case, and the index files had values, with a link back to the
object name (?). Thought I had it figured out, then was not so sure.
> In fact, I never used (+Ref +List +String), always found it not
> useful. I would expect the same for (+Key +List +String). Because you
> can always store the whole string instead - without splitting it into
> a list - in a (+Key +String) or (+Ref +String), right?
> What turned out more useful is (+List +Ref +String), i.e. indexing the
> individual words, or, more typical, (+List +Fold +Ref +String).
Concrete examples will help. This is all about language (sorry!!). What
I think I will need eventually is for the Wrds file to hold lemmas, i.e.
objects that are semantically unique. Before, I had used a +List prefix
as a way of holding the article ("Teil" "r") ("Teil" "s") for that
reason. Later there will be verbs which can be reflexive or not. These
are also part of the word semantically, and so are different lemmas.
I don't think I would want to index across these with +List anyway, and
maybe the best solution is just to write them into a single string and
not fold them (e.g. "Teil, r" or "vorstellen, sich"). Then of course
you have cases where the usage (or context) makes a word different
(süß) and the plot thickens! Languages get intricate rather fast.
For my test program here it is not important, and I don't want to get
too bogged down on this, just get it completely done (so with updates
and GUI). But it will be important later, to have lemmas and signs as
the two main data files, with indexes, categories and such.
> In all (+List +String) cases it is a bit tedious to handle these for
> data in the GUI. Note that you can use +ListTextField to map a
> single text field to a list of strings.
This in any case is a must, and if it is easier for the GUI to use and
to reformat a simple string (for display) then I will just do that. So
maybe stick with just (rel wrd (+Key +String))?
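
A minimal sketch of that last option, assuming a class named +Wrd for
the lemma objects (the class name is only a placeholder, nothing settled
above). The article or the reflexive marker stays inside the key string
itself:

```picolisp
# Sketch only: +Wrd is an assumed (placeholder) class name.
(class +Wrd +Entity)

# One unique key per lemma. The article or reflexive marker is part of
# the string, e.g. "Teil, r", "Teil, s", "vorstellen, sich".
(rel wrd (+Key +String))
```

With this, "Teil, r" and "Teil, s" would simply be two different keys,
so there is nothing left for +List to split up.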