On the bind conjunction proposal, one huge benefit of it would be the apply verb.

   '[ + +/@]'& apply 1 2 3
7 8 9

NB. 1 2 3 + 6, because [ used monadically is the same as ]

If we've used apply, we've all tried this:

   2 '[ + +/@]'& apply 1 2 3
31 32 33

which executes '[ + +/@]'& apply(^:2) 1 2 3

so:

   b =: & @: (,&boxopen)
   applybind =: 1 : ' (''('' , m ,'')&>/'') b apply'

   2 '[ + +/@]' applybind 1 2 3
8

NB. dyadic apply application, instead of ^:x

The/an internal apply verb could be optimized upon receiving its x parameter binding as well.

b: is a potentially available builtin name instead of &::

On Monday, February 7, 2022, 11:29:34 p.m. EST, 'Pascal Jasmin' via Programming <programm...@jsoftware.com> wrote:

The best idea I had in previous post:

   b =: & @: (,&boxopen)

   3 b (,&<) 2
┌─┬───┐
│3│┌─┐│
│ ││2││
│ │└─┘│
└─┴───┘

   (,&<) b 3 (b 2) (1 b) 4
┌───────┬─┐
│┌─┬─┬─┐│3│
││1│4│2││ │
│└─┴─┴─┘│ │
└───────┴─┘

The first example binds 3 to x, the 2nd binds 3 to y. A bound function will then assume that the other parameter is a boxed list, and when called dyadically will build up that list from unboxed dyadic parameters. When called monadically, a list of length 1 will be the other (x or y) parameter.

This allows verbs with more than 2 parameters, where 1 is a compound boxed parameter. Once the unboxed parameter is bound/curried, there is a choice of 2 curryings of the boxed compound parameter for the next applications of b: first or last. In the 2nd example above, 2 is bound to the last slot, then 1 is bound to the first slot (b 1 instead of 1 b would bind 1 to the next to last slot, i.e. the last of the remaining slots).

The last expression can always be called dyadically or monadically:

   5 (,&<) b 3 (b 2) ( 1 b) 4
┌─────────┬─┐
│┌─┬─┬─┬─┐│3│
││1│5│4│2││ │
│└─┴─┴─┴─┘│ │
└─────────┴─┘

   bf =: b f.  NB. would provide optimization potential by "expanding constants."

For optimization, either f. or a new f: would "propagate constants". So:

   (3 + ])&2 f.
5"_

   ([ + 3 + ])&2 f.
5 +~ [

By scanning for f@] (when &constant/noun is provided) in a tacit expression, constants can get propagated, and so deep optimizations are possible:

   (U V F@])&N  ->  (F N) V~ U
   (G@] V F@])&N  ->  (G N) V (F N)  ->  (N V N)"_

For the compound parameter, (0 {:: ]) or (_1 {:: ]) could also receive constant propagation optimizations once those positions are set.

f. or f: (this proposal is incompatible with the previous version, so a new f: would avoid changing old uses of f.) could also tunnel into explicit code such that lines that are name =. f y become name =. constant when 4 : 'y =. f y ...'&constant f: is provided.

Getting back to dictionaries, I've been proposing a "thin class" that holds MetaInfo about data without the data (encapsulating data makes it a heavy class). The disadvantage of this approach is that a lot more parameters need to be passed to inverted table functions. While get__myItblMeta does not need meta info passed along to it, a generic get verb does, plus the data, plus the field(s) to get from, plus a possible modifier for which column subset to retrieve. And if there is a column subset modifier to a filter function (subset of records selected), then new metadata must also be returned, as the original column index map is no longer valid.

So, the new bind definition makes all of this manageable when the metadata is "known"/bound first (with optimization potential); then fieldtolookup and data become the dyadic parameters. set has extra parameters for key and newdata.

Another idea: I can make a functional simple kv similar to JP's class that is metadata free/implied (normal programming style). The specs are:

keys are unique symbols. Symbols are more flexible than J variable names: they can include spaces and other chars. Shoehorning any data into a symbol is possible, and guaranteed if shoehorning to a string representation is possible. All J data has string representations.
A further restriction that keys not have any leading or trailing blanks is worthwhile just to avoid access errors. Specialized functionality that does allow leading+trailing blanks in fields can be copy/edited into new functions that remove the code that enforces these constraints.

The access-by-key interface supports/assumes boxed and string descriptions of symbol key (parameters).

keys are stored in (num,1) shape.

The set function is a combination of JP's set or delete functionality with k/q's upsert, with "merge" oriented optimizations for bulk oriented upserts. To get JP's upsert or delete functionality, whenever the value is null, a delete is performed. This corresponds to: keyhasnullvalue get kvdata returns the same when k is not found or when the associated value is null, and so (key;a:) set kvdata provides the same results if the key is deleted as if it were associated with null.

The add functionality first tries to append the value unboxed. If that errors, it then tries to append the boxed value. If that errors, it boxes each existing item of values and appends boxedopen. The set functionality copies the boxing level of the existing value. Once values have been boxed as a result of adding a single boxed value, they won't ever get unboxed, even in the rare case where overwriting a boxed value with an unboxed one would provide a homogeneously typed value list.

The core get functionality is unoptimized key lookup. A utility function is provided for the user to create an optimized keys&i. function, which the user can access after they are done modifying the dictionary, when there are many accesses or the dictionary is long enough that the optimization is useful. Optimized get will still work with y as either a kv or just a value list, though perhaps separate versions of each need to be provided as utilities if we assume that detecting kv vs just-values structure has a relatively high timing penalty.
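The set/get behaviour described above can be sketched in J roughly as follows. This is only an illustration under simplifying assumptions, not the proposed implementation: a kv is taken to be a 2-box list of (symbol keys) ; <(boxed values), keys are kept as a plain list rather than the (num,1) shape mentioned above, values are always boxed (so the unboxed-first add logic is skipped), and the names kv0, kvget and kvset are hypothetical.

```j
NB. Assumption: a kv is a 2-box list, (symbol keys) ; <(boxed values).
NB. kv0, kvget and kvset are hypothetical names used only for this sketch.
kv0 =: (s: ' id qty tag') ; < 1 ; 2 ; 3

kvget =: 4 : 0
'k v' =. y
(k i. s: boxopen x) { v , a:     NB. a missing key indexes the appended null
)

kvset =: 4 : 0
'nk nv' =. x                     NB. x is (boxed key strings) ; <(boxed new values)
'k v' =. y
nk =. s: boxopen nk
li =. nk i: ~. nk                NB. last action per key wins in a bulk update
nk =. li { nk
nv =. li { nv
keep =. -. k e. nk               NB. existing rows untouched by this update
k =. (keep # k) , nk
v =. (keep # v) , nv
live =. -. v = a:                NB. a null value means delete
(live # k) ; < live # v
)

NB. bulk update: qty is set to 20 and then deleted (null), new is inserted as 4
r =: (('qty';'new';'qty') ; < 20 ; 4 ; a:) kvset kv0
```

With this sketch, 'qty' kvget r gives the same null (a:) as looking up a key that never existed, which is the equivalence between "deleted" and "associated with null" that the spec relies on. kvset returns a new kv and leaves kv0 alone, so the caller decides whether to overwrite.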
Utilities are also provided to transform the kv data from a symbol-value inverted table to either boxed string keys or padded string keys and value inverted tables, such that either lookup by value or partial string based queries can be used for access. Courtesy utility functions are also provided to access/query kv data in such string-value inverted table form.

Some other justifications for this spec:

Symbol keys provide optimization even without the keys&i. step, afaik.

Symbol compatible keys go well beyond compatibility with J locale names. Compound keys may allow space or other char separated keys as symbols.

Merge/bulk oriented optimizations are worthwhile. The list of updates can be prescanned into unique constraints and value boxing constraints. Value constraints with nulls are still compatible if deletes (mixed with upserts) are moved to the top as separate action items. The unique filter for key uses i: in order to keep just the last modification action in the bulk list.

This kv data spec is enough to specify inverted table metadata for fields. A variant value field means a dictionary/kv-data implementation to represent unique/sorted/type/boxed provides an easy access pattern to all of the fields. If there are not that many fields, then space inefficiency doesn't matter. If you never have to access fields by one of their attributes (say, all fields that are sorted), then inverted table access efficiencies are not needed. This lua-styled approach works well for small data with key driven access. It provides easy descriptive access:

   'sorted' kv 'myfield' kv kvdata  NB. retrieves the sorted property of myfield key in kvdata.

   'sorted' kv 'myfield' kv 'fields' kv metadata  NB. fields is a collection of dictionaries with each item keyed as a fieldname with dictionary of related attributes.

Collections allow like-typed data to be grouped together, and then "walked through" for processing. In metadata, they are easier to separate from simple properties.
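A minimal sketch of how the chained access above could behave, assuming each dictionary is a 2-box list of (symbol keys) ; <(boxed values) and that kv opens the value it finds so that the result can itself be a dictionary. The verb body and the sample data (myfield, kvdata, metadata and their contents) are hypothetical and only illustrate the access pattern:

```j
NB. Assumption: each dictionary is (symbol keys) ; <(boxed values).
NB. kv is a hypothetical lookup verb: key kv dictionary
kv =: 4 : 0
'k v' =. y
> (k i. s: boxopen x) { v , a:    NB. open the value; a missing key gives an empty result
)

NB. attributes of one field, the collection of fields, and the enclosing metadata
myfield  =: (s: ' sorted unique type') ; < 1 ; 0 ; 'int'
kvdata   =: (s: ' myfield') ; < , < myfield
metadata =: (s: ' fields') ; < , < kvdata

s1 =: 'sorted' kv 'myfield' kv kvdata
t1 =: 'type' kv 'myfield' kv 'fields' kv metadata
```

Because J evaluates right to left, each kv peels off one level of nesting, so the expression reads as a path from the outermost dictionary to the attribute.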
Another metadata-less dictionary structure that could be generally useful, and specifically useful for 1:1 mapping to class definitions, is something I call kve: a 3 column table with keys as symbols, values as strings, and "encoding" as symbols. The encoding associated with each key and value says how to turn the value's string encoding into "native data or functions". I'm not sure there is general use for this, but if class descriptions need to be done, then this kve scheme seems necessary.

On Sunday, February 6, 2022, 04:34:28 p.m. EST, 'Pascal Jasmin' via Programming <programm...@jsoftware.com> wrote:

You covered some of the issues with a data encapsulated class approach like yours. The big issue for me is that your set verb returns 0 0 $0, but even if it returned the object reference, J is poor at compound expressions that operate on an object. You need to pass strings to what effectively becomes a dsl.

New j903 modifier trains get useful, but still messy:

   d=: dict 'abc';1 2 3

   loc_z_=: (,&'_'@[ ,&'_'@, ":@>@])"1 0 boxopen
   in_z_ =: ([. loc ].)~

   d ('gf' in ]: + 'gf' (in d)) 'a'  NB. parameterizing dictionary as an adverb for lhs of fork, and hard coding on rhs
2

But if set returned an object, having a verb that operated on that object would require explicit code (__y will work) to be simple.

Then there is the issue of a set operation that doesn't want a "forced side effect" of permanently altering the object, and instead wants a copy to be used temporarily. Or a filter/query operation that returns multiple "records".

Instead of a data encapsulated class, functions that operate on inverted tables would allow returning a new/subset of the "data". This adds extra work to save the result, but a class needs the extra work of copying it in order to modify only the copy. Predeciding whether you would ever want to overwrite the original dictionary seems above the paygrade of a function operating on inverted tables.
Also remember to destroy the copy in your code when it is supposed to be discarded (actually a hard problem that would need its own dsl to solve all "responsibility combinations"). And then J has unfriendly access problems when operating with an object parameter to a function, if it is not an explicit function.

J's strengths come from its functional approach. Returning a new copy of data is functional. It is very easy in J, especially in the console, to modify the previous line of code such that it assigns a new result value to existing or new variable names. Double checking that the function works properly before overwriting "production" or lesser data is a prudent approach I'd recommend 100% of the time. J's impure functional approach is also the perfect functional approach: pure (never side effect) functions inside, but the last caller/user (outside) decides on what side effects to make.

An inverted table argument makes it easy to write functions that operate on that y argument inverted table. An encapsulated class makes that difficult to extend.

I still think a "keyed table" (multi column dictionary including potential multicolumn keys uniquely identifying a record) is the right approach to a generalized dictionary, and most (90%+) column use cases would be uniformly typed. A defining property of dictionaries is access by full key match, which necessarily brings symbols as an optimization feature of fields. But even for a dictionary/keyed table, general query access is a nice-to-have that you get with inverted tables, along with an ability to convert to/from symbols when "necessary".

A class based approach to keyed tables is possible and easiest to create.

I've mentioned a general datastructure framework, which is metadata about the data in one box, and data in the other. Metadata is a "property dictionary" where values are data or functions.
A string encoding is possible, especially if there is a "class type" field that directs the encoding/decoding, but encoding values as boxed items to distinguish among different types/classes of values and functions is also an option.

There is an easyish 1:1 mapping between a metadata structure about data, and a class definition that references a DATA variable, or better yet, uses data that is expected to conform to the metadata understanding of the data as its y function parameter. This necessarily makes this approach exactly as easy as the first. Write a class, and use it either as a class or as a metadata described structure (data), to be chosen by the user.

A third option, especially if it applies just to keyed tables, is having a dsl/description of the inverted table structure as an adverb parameter. An adverb allows for optimization in the returned verb/modifier. To optimize get (the valuable feature of your dict class), you only need to know the table constraints/definitions. set using a datastructure definition can generate a (pre)validation of input, along with informative descriptions for why elements fail if they do.

A multi column dictionary description dsl would look like:

   key: ... value: ...

NB. where ... is a list of fields with attributes (reserved words not allowed as field names) as follows:

   colname: u(nique): s(orted): type: or b(oxed):

(optional if the first item determines the type, but benefits optimization if provided in the dictionary description)

The single line definition potential is a huge convenience for both copy/edit coding, and console simplicity.

So a generic get (by whole field match) is an adverb that first uses 'keyed table def' get, but then takes a column list (indexes or colnames) that permits an indexing optimization step on that index (m&i. where m is the column parameter). When a single column is passed, all keys in y are used to retrieve records (one for each key passed). When multiple columns are part of the final adverb parameter, y is expected to be boxed values for each column, and all records with a key match are retrieved. It is possible to choose (with an additional (named) adverb) that if only one record is in the dataset, then just raw values instead of the full dictionary structure are returned.

A metadata encoded datastructure seems superior to the adverb dsl processor, in that an adverb dsl processor could, with a preceding adverb, interpret any meta+data parameter with just the metadata portion, which allows it to operate on any other similarly structured/metadata'd data.

The end goal of an approach, IMO, should be to create improvements to J in terms of generic inverted table functions, with some specific improvements already identified in this thread:

   'column list' { meta-described-dictionary  NB. use FIELDS metadata keyword that contains symbol data, to retrieve column indexes (or other potential use of FIELDS duck named variable specific to datastructure) referenced in string.

   &:: =: bind =: (& @: ;)

a new modifier train such that dyadic m&:: f and f &::n are (m&f)(@:;) or (f&n)(@:;). J already has bound =: (f&n) or (m&f) with the special dyadic interpretation of bound^:x y. The above enhancement would allow an interpretation of bound(@:;) which allows writing f for 3 arguments, i.e. a compound of 2 boxed x or y arguments, while allowing the user to provide the compound part as dyadic unboxed arguments.

&:: compounded allows even more arguments. If x takes 3 (boxed) arguments, then arg0&::f&::y applied dyadically has x as arg1 and y as arg2. If applied monadically, then the 3rd x argument (arg2) to f would be missing, and f could/would deal. Compounding &:: calls would increase the arity of functions from 3 to higher than 3 parameters.

This feature would also allow optimizing inside f.
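As a rough illustration of the dyadic case only, the (& @: ;) train can already be written as a user-defined conjunction in a J version with modifier trains (J903 or later, as used elsewhere in this thread). The names bindlink and scale3 are hypothetical; since &:: itself is only a proposal, this is a sketch of the intended behaviour rather than the builtin:

```j
NB. bindlink is a hypothetical name for the (& @: ;) train; assumes J903+ modifier trains.
bindlink =: & @: ;

NB. hypothetical 3-argument verb: x is the bound argument, y is the boxed pair built by ;
scale3 =: 4 : 'x * +/ > y'

NB. dyadic call: runs 10 scale3 (2 ; 3), i.e. 10 * 2 + 3
r3 =: 2 (10 bindlink scale3) 3
```

Only the dyadic call is shown because monadic ; is raze rather than link, so a monadic call of this train would not build a one-item boxed list for f.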
If f is explicit, then any line that is varname =: f x (if m&f is bound) or f y (if f&n is bound) can be optimized, where an ideal structure is x =. f x or y =. f y internally, as proof that the original x can be discarded. If f is implicit, then any u@] or u@[ can be optimized away to a constant based on m&::f or f&::n, and if N V N occurs as a result of that optimization, then that too can be optimized into a constant.

What the above allows, beyond syntax sugar for verbs with more than 2 parameters, is not having to resort to self-written-code optimizations inside adverbs. Verbs can self optimize based on bound parameters (for example, when (m i. ]) has the same optimization as m&i.).

> Lua table references

I've been thinking of k/q as the guiding model. Lua's variant (boxed) key and variant (boxed) values tables have the simplicity of storing every potential scenario, but as a dictionary implementation, would provide a strong incentive to avoid the dictionaries for performance reasons. If you wanted to use a dictionary as a key, in J, you could use a linear representation of that dictionary in order to keep all keys as strings. But, sorry for repeating, a boxed/variant column type can coexist alongside uniformly typed columns.

Metadata (not at all the Lua interpretation) would instead specify types and attributes of inverted table columns in the case of keyed tables, but also (kinda like Lua) include optimized/specified functions related to data.

In general, I'd also say that access_keys_ being limited to valid spaceless J naming conventions is not a huge sacrifice for accessnames. Extending to spaceless unicode strings is not an ease of use problem if the user wants unicode keys, though it would interfere with that 1:1 J locale/classname mapping of datastructure metadata.

On Sunday, February 6, 2022, 09:52:04 a.m.
EST, Jan-Pieter Jacobs <janpieter.jac...@gmail.com> wrote:

Hi Pascal,

I responded inline below:

> A workaround is to optimize SET, ADD, UPDATE, DEL for bulk operations
> (multiple items processed at once (] F..) super useful), and after bulk
> operations, "redefine" (just repeat execution of same definition) GET such
> that any m&i. updates. Also update FILTER functions (GET multiple if they
> gain from static binding optimization.
>

This is, if I get it correctly, exactly what my dict implementation (https://github.com/jpjacobs/types_dict) does: it allows setting/updating/removing multiple keys, and the lookup verbs used are updated only if there is a change in keys.

>
> An approach that just presumes key uniqueness instead of enforcing it, is
> for GET to be based on i: instead of i. and then any ADD with a duplicate
> key effectively will return the last updated/added values.
>

This would gather a lot of garbage and would lose the advantage of in-place updating.

>
> Back to generic datastructure, everything a class can do is possible
> within a datastructure. All administrative "properties" (names) and their
> associated values including functions can be encoded in a dictionary,
> including a string representation dsl for representing "name values" with
> ease as to function/data. What specializes a datastructure over a "mere"
> class is the concept of existential data held by the datastructure that a J
> user would want complete access to that data. In a class based
> implementation, a universal name data =: holds the core data that the J
> programmer would want access to. Usually, it is compound greater than
> atomic data that can be represented as inverted tables of "linked data".
> And part of the data specifying dsl's purpose is to include descriptions
> that permit any possible optimizations that include what k/q's attributes
> do (sorted, unique), but with extensible dsl, any other
> implications/constraints on the data can use/select a specific
> implementation of universally named "accessors"/functions
> So a datastructure contains 2 boxes: 1st holds the name of the
> datastructure class (for lookup value of any metadata of that classname),
> and all administrative properties, and specialized functions for
> GET/ADD/DELETE and other functions expected to have meaning relative to its
> "existential" data, and the 2nd box holds the (likely compound and so extra
> boxed) "data"
>
> An advantage of a compound datastructure over a class is the user gets to
> decide whether to overwrite the "permanent" data while still having access
> to SET/DEL/ADD functionality of their own copy they may want for their
> application/data needs. It is also possible for generic GET/ADD/DELETE to
> query the datastructure as to how it can best accomplish its integral
> functionality, should there not be a specialized version defined in the
> datastructure, and GET as an adverb that takes either '',
> datastructure_name, or a specific instance of datastructure can optimize
> itself as a first step, or one that can be bound to an optimized named
> function, or if '' is the adverb parameter to GET, then the generic verb
> "inspect y for datastructure properties" before selecting implementation is
> returned.
>

I think these ideas are pretty much what Lua implements with its tables (dictionaries that can contain anything as keys and values, joined by their metatables, i.e. tables that can contain functions to override e.g. indexing operations).
These tables do everything: from working as locales (function environments), over separating modules (our addons), to implementing OOP (making liberal use of the __call metamethod, specifying what happens if you call a table as if you were calling a function, and __index, specifying what happens if you try to get a non-existent key in a table).

In my view, the problem with a locale-based dict implementation like mine is currently that you cannot nest dicts without losing generality. As numbered locales are referred to by boxed numbers, you could make a special case for these in your implementation, but would evidently lose the possibility to store boxed numbers. Even when adding checks to whether a boxed number is a locale, one cannot be sure whether the user intended to refer to a locale or actually wanted to store a boxed number.

One could think of using the locales themselves as dicts, but there you'd have the problem that:
- only valid names can be keys
- referring to values is only possible with dict__key, which precludes doing so tacitly.

For such an implementation to work, one could (note, I have no clue about the implementation itself :p):
- make a datatype only for referring to locales
- implement indexing into that type with {:: following more or less the same idea as indexing with {::
- provide a verb to amend along the same lines
- have a conjunction DoneIn that allows something like verb DoneIn mylocale (could be called 'of' as well)
- allow any value as "name" in locales.

Like that, implementing a dict that allows storing arbitrary keys and values, nesting dicts and even self-reference, reference loops etc., using locales would become possible.

In the end, I guess this would end up at about the same functionality as Lua does for tables… so I don't know what's more effort: implementing everything in J/C, or binding Lua.
There's been a time I would have loved to have Lua instead of J's explicit language, but I guess that would end up as a different language :).

Jan-Pieter

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm