RE: [REBOL] Mold, load and Core 2.6

[EMAIL PROTECTED] wrote:
> 
> 
> 
> Here are some comments regarding the recent discussion on mold and load,
> explanations of which of those observations represent bugs and which do not,
> and what is going to change in the next upcoming Core 2.6 release:
> 
> 
> First of all, the *intended* use of load and mold for data is in the following
> way:
> 
> stored data -> load -> data in memory -> mold -> stored data.
> 
> Technically, instead of "load" you should use "first load/all", because
> load/all is more "transparent" in that it does not try to interpret the header
> or remove an outermost block.
> 
> If used in this way, i.e. always starting off with load, there are, as far as
> we are aware, no bugs in mold or load.
> 
> 
> Sometimes people try to use load and mold in a different way:
> 
> data in memory -> mold -> stored data -> load -> data in memory
> 
> i.e. to serialize data. If used for serialization, mold and load have a number
> of limitations, just like any other serialization system in any language. Some
> of the limitations are unavoidable (how do you serialize an open socket
> connection ?), others could be removed by improvements in the implementation.
> 
> We are currently aware of the following issues when using load and mold for
> serialization:
> 
> 1. Serialization means creating literal representations of data. Unfortunately
> not all datatypes *have* literal representations. This leads to two types of
> problems:
> 
> 1.a) Some values, when molded, become words, and are thus indistinguishable
> from regular words. For instance "mold 'none" and "mold none" both result in
> "none", which, when loaded back, becomes the word none, not the value none.
> 
> Comment: Will be addressed in Core 2.6.
> 
> 1.b) Some values, when molded, become a sequence of values that represent
> instructions on how to recreate the value. For instance molding a hash! results in
> something like "make hash! [1 2 3]". The problem with that is that it requires
> the loading script to evaluate the resulting block (which it ordinarily is not
> supposed to do, because other items in the block, e.g. words, cannot be
> evaluated, plus evaluation is a security risk if the data is untrusted).
> 
> Comment: Will be addressed in Core 2.6.
> 
> 2. Series indices are not included in the molded data. For instance a string
> series "abc" with an index of 2 becomes "bc" after mold and load.  Molding drops
> the data before the index.
> 
> Comment: Will be addressed in Core 2.6.
> 
> 3. It is not possible to create an object! without performing an evaluation of
> the value fields, i.e. molding and loading an object! represents a security
> risk, even if the loader is careful and explicitly checks for the words "make"
> and "object!", because the spec block may contain expressions with side
> effects.
> 
> 3.a) A related issue: Objects have to be molded in such a way that the
> resulting object spec block can be evaluated. This causes some problems
> and ambiguities, e.g. objects containing lit-words or set-words as values
> are not correctly molded and loaded, because during evaluation such values
> do not behave as regular values, but have side effects.
> 
> Comment: Will be addressed in Core 2.6.
> 
> 4. In memory it is possible to create values of certain datatypes that include
> characters or expressions that are usually not valid for that particular
> datatype. For instance it is possible to create an email! value which does not
> contain an "@" sign, or an issue which contains a semicolon. mold and load do not
> handle this correctly, because the scanner requires certain hints to identify
> datatypes.
> 
> Comment: Not a bug, will not be addressed. Creating datatypes with invalid
> contents is simply an invalid operation. REBOL allows you to do it (because
> type restrictions are only checked when scanning), but that does not mean that
> it is a valid thing to do. If you want to be able to process arbitrary data
> then you need to use string! and binary! only. Other string series have
> limitations on their structure which have to be complied with for mold and load
> to work.
>

haha. then change import-email. or keep using it, get one mad spammer, and destroy
your archive the next time you save, since it "to-emails" whatever this guy thinks
would make a nice broken address.
i think this is a general problem: nobody checks for proper formatting everywhere.
to-email does the job 99.99% of the time, then some crazy data destroys everything.
remembering {{} . i expect a fix in a year or so?
#[email! "/badguy-hahaha"] would be so easy..
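
For anyone who wants to reproduce issues 1.a, 2 and 4 from the quoted list, here is a small console sketch. It assumes a REBOL/Core 2.x console and plain mold (no /all); the words s and e are just illustrative names. Outputs are left out on purpose, so probe shows what your own version does.

```rebol
; Issue 1.a: the value none and the word none mold to the same text.
probe mold none          ; the none! value
probe mold 'none         ; the word! none
probe type? load "none"  ; datatype of what comes back after loading

; Issue 2: plain mold starts at the current index of a series.
s: next "abc"
probe mold s

; Issue 4: an email! built without an "@" sign cannot be recognized by the scanner.
e: to-email "badguy"
probe type? e
probe type? load mold e
```

Comparing the two type? results for e shows whether the datatype survived the round trip.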
 
> 5. Various issues regarding references (circular or otherwise). This is always
> difficult to handle in the context of serialization. There are three cases:
> 
> 5.a) The data represents a tree, i.e. each item is referenced no more than
> once, and there are no cycles. For this type of data mold and load should work
> without problems, and this is the type of data organization recommended if data
> needs to be serialized.
> 
> 5.b) The data represents a directed, acyclic graph, i.e. there are no cycles,
> but data items can be referenced more than once. For this type of data
> mold and load should still work, but the referenced items may be included in the
> molded data separately for each reference, i.e. after loading the data back the
> references point to separate items.
> 
> 5.c) The data represents a general, directed graph, with cycles. This is
> strongly discouraged :-). mold and load will not work at all with this kind of
> data.
> 
> Comment: No changes are planned regarding this, and the current behavior is not
> considered to be a bug. Serialization of data structures with non-tree-based
> references requires special serialization functions. mold and load are not
> suitable for this.
>
> 6. Word bindings are not preserved by mold.
> 
> Comment: Not a bug. No changes are planned regarding this. It would be pretty
> much impossible to correctly preserve word bindings without saving the complete
> REBOL machine state :-). Load always binds words into the global context.
> 
>

why bind globally?
we get kicked whenever somebody cleverly inserts a paren!.
having load/unbound would be more secure,
as would binding to a restricted set of words in a fresh context, like
[make object! true false none].
(to-block hangs sometimes here, and is not exactly the same)

> As far as mold and save are concerned, the major change for Core 2.6 is the
> /all refinement, which makes mold and save more suitable for serialization.
> The /all refinement has the following effects:
> 
> - (Almost) all data types are molded in a literal form. Datatypes which already
>   have a natural literal form continue to use this form (integer, words etc.)
>   Datatypes which so far have not had a literal form will use a new notation
>   that acts as a pseudo-literal. This notation is "#[type! description]" or,
>   for some datatypes, "#[value]" (without the quotes). For instance:
> 
>   Value-oriented pseudo-literals:
> 
>   true     ->   #[true]
>   false    ->   #[false]
>   none     ->   #[none]
>   unset!   ->   #[unset!]
>  
>   Datatype-oriented pseudo-literals:
> 
>   object   ->   #[object! [a: 1 b: 2 ...]]
>   list     ->   #[list! [a b c ...]]
> 
>   etc.
> 
>   When loading pseudo-literals back no special refinements have to be used
>   with 'load. The 'load function recognizes pseudo-literals just like all
>   other literals.
> 
> - When a series with an index different than the head of the series is molded
>   then the complete series is molded while preserving the index. To do this,
>   the series is molded in its pseudo-literal form, with the index following the
>   content. For instance the string "abc" with an index at position 2 is molded
>   as #[string! "abc" 2]. Loading the string back results in a string "abc"
>   with an index at position 2.
>
> - When object pseudo-literals are loaded the spec block is not treated as a
>   block to be executed under the object's context, but strictly as a name/value
>   pair block. This allows objects containing set-words, lit-words etc. to be
>   loaded correctly. For instance #[object! [a: val: b: 1]] results in the
>   object [
>     a: val:  (a containing the set-word val)
>     b: 1
>   ]
>   instead of
>   object [
>     a: 1
>     val: 1
>     b: 1
>   ]
>   as you would get from make object! [a: val: b: 1].
>

i would think [a: #[val:] b: 1] is more obvious?
 
>   Also, the value items are not evaluated before storing them in the object.
>   They are treated as literals. These changes should make it completely safe
>   to send molded objects across untrusted communication lines and load them
>   back at the receiver.
> 
> Please note that the normal output format of mold and save is not affected
> at all. The changes only affect the output of mold/all and save/all. The
> intended use is:
> 
> - If you start with a string representation or a file, load that file,
>   manipulate the resulting block, and then write the block back into a
>   file, then use mold or save without the /all refinement.
> 
> - If you start with some data structure in memory, mold it for
>   serialization purposes, store it on disk or send it across a network,
>   and then load it back, then use mold or save with the /all refinement.
>
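
A short sketch of the serialization round trip described in the quoted text. It assumes a release that already implements mold/all as announced (Core 2.6 or later); the words s, molded and restored are illustrative names only, and the comments repeat what the announcement says should happen rather than verified output.

```rebol
; Series index: mold/all should keep the whole series plus the index.
s: next "abc"
molded: mold/all s
print molded                 ; announced form: #[string! "abc" 2]
restored: load molded        ; plain load, no refinement needed
print index? restored        ; announced result: index 2
print head restored

; Value-oriented pseudo-literal for none.
probe mold/all none          ; announced form: #[none]
probe type? load mold/all none
```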

all in all sounds great. makes load/mold for serialisation pretty usable.
drawbacks are:
-unparsable data breaks everything -> no use of handy parsers like import-email.
 at least some check while molding would be nice,
  instead of something like [equal?  mold data  mold load mold data] as today..
-global binding -> paren! kills security (or use :this :that everywhere..),
-crazy molded set-words
-oh yes, and if the newline-tag could be set by program..
 i don't like having 4K-lines after reduce, unable to fix it in block-form.
 having to mold everything by hand and reload isn't the best solution..
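
the hand-made check quoted above could be wrapped in a function, roughly like this. round-trip? is a made-up name, not something in REBOL, and this is only a sketch. note that it compares the molded text only, so a value that reloads as a different datatype with the same text would still pass.

```rebol
; Hypothetical helper: true when molding, loading and molding again
; produces the same text as the first mold.
round-trip?: func [data /local molded] [
    molded: mold data
    equal? molded mold load molded
]

probe round-trip? [1 "two" three]
probe round-trip? make hash! [1 2 3]
```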


> -- 
> Holger Kruse
> [EMAIL PROTECTED]
> -- 

-volker 