On Wed, Jun 15, 2011 at 8:29 PM, Eliot Miranda <[email protected]> wrote:

> Hi Martin & Mariano,
>
>     regarding filtering.  Yesterday my colleague Yaron and I successfully
> finished our port of Fuel to Newspeak and are successfully using it to save
> and restore our data sets; thank you, it's a cool framework.
>

Thanks Eliot. These are nice words. We are used to receiving criticism like
"yet another serializer?", so it is good for us to know that you like it and
that you could port it even to Newspeak. That suggests, to some extent, that
Fuel's design is good. And this is good because we spend a lot of time on
design: good class names, good method names, class comments, tests,
benchmarks, etc.


> We had to implement two extensions, the first of which is the ability to save
> and restore Newspeak classes, which is complex because these are
> instantiated classes inside instantiated Newspeak modules, not static
> Smalltalk classes in the Smalltalk dictionary.  The second extension is the
> ability to map specific objects to nil, to prune objects on the way out.  I
> want to discuss this latter extension.
>
> In our data set we have a set of references to objects that are logically
> not persistent and hence not to be saved.  I'm sure that this will be a
> common case.  The requirement is for the pickling system to prune certain
> objects, typically by arranging that when an object graph is pickled,
> references to the pruned objects are replaced by references to nil.  One way
> of doing this is as described below, by specifying per-class lists of
> instance variables whose referents should not be saved.
>

Exactly. This, at least, was implemented this week in Fuel-MartinDias.259.
Basically, you can define a class method #fuelIgnoredInstanceVariableNames,
and all instances of that class will ignore the instance variables it names.
Example:

MyClass class >> fuelIgnoredInstanceVariableNames
    ^ #(instVar1)


> But this can be clumsy; there may be references to objects one wants to
> prune from e.g. more than one class, in which case one may have to provide
> multiple lists of the relevant inst vars; there may be references to objects
> one wants to prune from e.g. collections (e.g. sets and dictionaries) in
> which case the instance variable list approach just doesn't work.
>

+1. Do you have an example? For example, might you not want to serialize
classes of #specialObjectsArray?


>
> Here are two more general schemes.  First, most directly, Fuel could
> provide two filters, implemented in the default mapper, or the core
> analyser.  One is a set of classes whose instances are not to be saved.  Any
> reference to an instance of a class in the toBePrunedClasses set is saved as
> nil.
>

Yes, I really want that. I discussed it with Dale because this is what they
have in GemStone. There, they have DbTransient and they can do "aClass
makeInstancesDbTransient". After that, all instances of aClass will be
ignored.
In addition, they have TransientValue, which is a class that is always
ignored. So you can use an instance of TransientValue to wrap the actual
value you want to be transient, as a kind of value holder.

What do you think about it? It sounds really similar to what you suggest
:)
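
To make the value-holder idea concrete, here is a minimal sketch. The name
FLTransientValue, its selectors and the category are hypothetical (this is
neither existing Fuel code nor GemStone's TransientValue), and it assumes the
#fuelIgnoredInstanceVariableNames hook shown above is what causes the wrapped
value to be skipped. Method headers use the same "Class >> selector"
convention as the rest of this mail:

```smalltalk
"Hypothetical value holder whose content is never serialized."
Object subclass: #FLTransientValue
    instanceVariableNames: 'value'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Fuel-Sketch'

FLTransientValue class >> on: anObject
    "Answer a new holder wrapping anObject."
    ^ self new value: anObject; yourself

FLTransientValue class >> fuelIgnoredInstanceVariableNames
    "Assumption: same hook and literal form as the MyClass example above."
    ^ #(value)

FLTransientValue >> value
    ^ value

FLTransientValue >> value: anObject
    value := anObject
```

An owner would then store "FLTransientValue on: myCache" in an instance
variable or a collection instead of the cache itself. The holder is still
serialized, only its content is dropped, so this covers the "reference from a
collection" case that a per-class instance variable list cannot express.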


> The other is a set of instances that are not to be saved, and also any
> reference to an instance in the toBePruned set is saved as nil.  Why have
> both?  It can be convenient and efficient to filter by class (in our case we
> had many instances of a specific class, all of which should be filtered, and
> finding them could be time consuming), but filtering by class can be too
> inflexible, there may indeed be specific instances to exclude (think for
> example of part of the object graph that functions as a cache; pruning the
> specific objects in the cache is the right thing to do; pruning all
> instances of classes whose instances exist in the cache may prune too much).
>
>
+999. GemStone has exactly that, so I am not surprised that you came to the
same conclusion :)

So...to sum up, we want:

- classes whose instances are always ignored. Say #toBePrunedClasses  or
#makeInstancesDbTransient
- particular references that we want to ignore. Say TransientValue or
something like that.


> As an example here's how we implemented pruning.  Our system is called
> Glue, and we start with a mapper for Glue objects, FLGlueMapper:
>
> FLMapper subclass: #FLGlueMapper
>     instanceVariableNames: 'prunedObjectClasses newspeakClassesCluster modelClasses'
>     classVariableNames: ''
>     poolDictionaries: ''
>     category: 'Fuel-Core-Mappers'
>
> It accepts Newspeak objects and filters instances in the
> prunedObjectClasses set, and as a side-effect collects certain classes that
> we need in a manifest:
>
> FLGlueMapper>>accepts: anObject
>     "Tells if the received object is handled by this analyzer.  We want to
>      hand-off instantiated Newspeak classes to the newspeakClassesCluster,
>      and we want to record other model classes.  We want to filter-out
>      instances of any class in prunedObjectClasses."
>     ^anObject isBehavior
>         ifTrue:
>             [(self isInstantiatedNewspeakClass: anObject)
>                 ifTrue: [true]
>                 ifFalse:
>                     [(anObject inheritsFrom: GlueDataObject) ifTrue:
>                         [modelClasses add: anObject].
>                      false]]
>         ifFalse:
>             [prunedObjectClasses includes: anObject class]
>
> It prunes by mapping instances of the prunedObjectClasses to a special
> cluster.  It can do this in visitObject: since any Newspeak objects it is
> accepting will be visited in its visitClassOrTrait: method (i.e. it's
> implicit that all arguments to visitObject: are instances of the
> prunedObjectClasses set).
>
> FLGlueMapper>>visitObject: anObject
>
>     analyzer
>         mapAndTrace: anObject
>         to: FLPrunedObjectsCluster instance
>         into: analyzer clustersWithBaselevelObjects
>
> FLPrunedObjectsCluster is a specialization of the nil,true,false cluster
> that maps its objects to nil:
>
> FLNilTrueFalseCluster subclass: #FLPrunedObjectsCluster
>     instanceVariableNames: ''
>     classVariableNames: ''
>     poolDictionaries: ''
>     category: 'Fuel-Core-Clusters'
>
> FLPrunedObjectsCluster>>serialize: aPrunedObject on: aWriteStream
>
>     super serialize: nil on: aWriteStream
>
>
I understand. But I have a question. Imagine an object whose class is in
#prunedObjectClasses, so you serialize it as nil. BUT, what happens with the
references held by that object? Are those references still analyzed? You
probably don't want that, so you need to take care of
#referencesOf: anObject do: aBlock.
In this case you are lucky, because FLNilTrueFalseCluster uses the empty
implementation in FLCluster. So... no problem, I think. Martin?

Check what we do in:

FLObjectCluster  >> referencesOf: anObject do: aBlock

    | ignoredInstanceVariableNames |
    ignoredInstanceVariableNames := theClass fuelIgnoredInstanceVariableNames.

    theClass instVarNamesAndOffsetsDo: [:name :index |
        (ignoredInstanceVariableNames includes: name)
            ifFalse: [ aBlock value: (anObject instVarAt: index) ]]

So, of course, we don't follow the ignored instance variables.



> So this would generalize by the analyser having an e.g. FLPruningMapper as
> the first mapper, and this having a prunedObjects and a prunedObjectClasses
> set and going something like this:
>
> FLPruningMapper>>accepts: anObject
>     ^(prunedObjects includes: anObject)
>         or: [prunedObjectClasses includes: anObject class]
>
> FLPruningMapper>>visitObject: anObject
>     analyzer
>         mapAndTrace: anObject
>         to: FLPrunedObjectsCluster instance
>         into: analyzer clustersWithBaselevelObjects
>
> and then one would provide accessors in FLSerializer and/or FLAnalyser to
> add objects and classes to the prunedObjects and prunedObjectClasses set.
>
>
Yes. In fact, when you talk about #toBePrunedClasses you mean something
like this:

Behavior >> toBePrunedClasses
    FLGlueMapper addPrunedClass: self

or something like that. Am I right? What I mean is that
#toBePrunedClasses puts the class in the 'prunedObjectClasses' collection of
FLGlueMapper.


> For efficiency one could arrange that the FLPruningMapper was not added to
> the sequence of mappers unless and until objects or classes were added
> to the prunedObjects and prunedObjectClasses set.
>
>
That's a good idea, because those mappers are evaluated for EVERY single
object. Even if 80% of the time the object ends up being handled by
FLDefaultMapper, we still have to pay the cost of the #accepts: of all the
other mappers.
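
A rough sketch of that lazy installation follows. Everything in it is
hypothetical: FLPruningSketch stands in for whatever object ends up owning the
mapper sequence, and a plain block stands in for a real FLPruningMapper, so
the sketch does not depend on Fuel's actual mapper API:

```smalltalk
"Hypothetical owner of the mapper sequence and of the two pruning sets."
Object subclass: #FLPruningSketch
    instanceVariableNames: 'mappers pruningMapper prunedObjects prunedObjectClasses'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Fuel-Sketch'

FLPruningSketch >> initialize
    super initialize.
    mappers := OrderedCollection new.
    prunedObjects := IdentitySet new.
    prunedObjectClasses := IdentitySet new

FLPruningSketch >> mappers
    ^ mappers

FLPruningSketch >> prune: anObject
    "Register a single instance to be saved as nil."
    prunedObjects add: anObject.
    self ensurePruningMapper

FLPruningSketch >> pruneInstancesOf: aClass
    "Register a class whose instances are all saved as nil."
    prunedObjectClasses add: aClass.
    self ensurePruningMapper

FLPruningSketch >> ensurePruningMapper
    "Install the pruning test first in the sequence, and only once."
    pruningMapper isNil ifFalse: [ ^ self ].
    pruningMapper := [ :anObject |
        (prunedObjects includes: anObject)
            or: [ prunedObjectClasses includes: anObject class ] ].
    mappers addFirst: pruningMapper
```

With this shape, "FLPruningSketch new mappers" stays empty until the first
#prune: or #pruneInstancesOf: call, so a serialization that prunes nothing
never evaluates the extra test.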


> I think both Yaron and I feel the Fuel framework is comprehensible and
> flexible.
>

Again, thanks. This is what people don't see when using other serializers.


> We enjoyed using it and while we took two passes at coming up with the
> pruning scheme we liked (our first was based on not serializing specific inst
> vars and was much more complex than our second, based on pruning instances
> of specific classes) we got there quickly and with very little frustration
> along the way.  Thank you very much.
>

Thanks to you for the wonderful feedback.


>
> Finally, a couple of things.  First, it may be more flexible to implement
> fuelCluster as fuelClusterIn: anFLAnalyser so that if one is trying to
> override certain parts of the mapping framework an implementation can access
> the analyser to find existing clusters, e.g.
>
> MyClass>>fuelClusterIn: anFLAnalyser
> ^self shouldBeInASpecialCluster
>  ifTrue: [anFLAnalyser clusterWithId: MySpecialCluster id]
>  ifFalse: [super fuelClusterIn: anFLAnalyser]
>
> This makes it easier to find a specific unique cluster to handle a group of
> objects specially.
>

I understand. Do you have an example?


>
> Lastly, the class-side cluster ids are a bit of a pain.
>

Yes, we know :(


>  It would be nice to know a) are these byte values or general integer
> values, i.e. can there be more than 256 types of cluster?, and b) is there
> any meaning to the ids?  For example, are clusters ordered by id, or is this
> just an integer tag?
>

Good questions. I will let Martin answer ;)


> Also, some class-side code to assign an unused id would be nice.
>
>
+999


> You might think of virtualizing the id scheme.  For example, if FLCluster
> maintained a weak array of all its subclasses then the id of a cluster could
> be the index in the array, and the array could be cleaned up occasionally.
>  Then each fuel serialization could start with the list of cluster class
> names and ids, so that specific values of ids are specific to a particular
> serialization.
>
>
Thanks, good idea.
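
As I understand the proposal, each serialization would carry its own table of
cluster class names, and an id would just be a position in that table. A
small workspace sketch, assuming the three class names below are the clusters
used in one particular serialization (the table and the variable names are
illustrative only, not Fuel code):

```smalltalk
| clusterClassNames idsByName |

"Hypothetical header table for one serialization."
clusterClassNames := #(#FLObjectCluster #FLNilTrueFalseCluster #FLPrunedObjectsCluster).

"Writing side: the id of a cluster is its index in the table."
idsByName := Dictionary new.
clusterClassNames withIndexDo: [ :name :index |
    idsByName at: name put: index ].

"Reading side: the table read from the header maps an id back to a name."
clusterClassNames at: (idsByName at: #FLNilTrueFalseCluster)
```

Since the table travels with the stream, ids would only need to be unique
within one serialization, and nobody would have to hand-assign class-side ids.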


> again thanks for a great framework.
>
>
Thanks for the feedback. Now, a couple of things I would like to comment:

1) We are preparing a paper right now, so we are concentrating on that instead
of the code. In addition, we have 5 failing tests that we should fix ;)
So... as soon as we finish with that, we will continue with all the things
you are talking about.

2) I will open issues in our bug tracker for everything we discuss in this
thread.

3) Do you have something in mind so that we can ease your port? I mean... if
we continue the development, do you plan to pick up new versions in the
future? How are you going to do that?

Thanks a lot,

Mariano



> best,
> Eliot
>
> On Mon, Jun 13, 2011 at 10:16 AM, Mariano Martinez Peck <
> [email protected]> wrote:
>
>>
>>
>> On Thu, Jun 9, 2011 at 3:35 AM, Eliot Miranda <[email protected]> wrote:
>>
>>> Hi Martin and Mariano,
>>>
>>>     a couple of questions.  What's the right way to exclude certain
>>> objects from the serialization?  Is there a way of excluding certain inst
>>> vars from certain objects?
>>>
>>>
>>
>> Eliot and the rest....Martin implemented this feature in
>> Fuel-MartinDias.258. For the moment, we decided to put
>> #fuelIgnoredInstanceVariableNames at class side.
>>
>> Behavior >> fuelIgnoredInstanceVariableNames
>>     "Indicates which variables have to be ignored during serialization."
>>
>>     ^#()
>>
>>
>> MyClass class >> fuelIgnoredInstanceVariableNames
>>   ^ #('instVar1')
>>
>>
>> The impact in speed is nothing, so this is good. Now....we were thinking
>> if it is common to need that 2 different instances of the same class need
>> different instVars to ignore. Is this common ? do you usually need this ?
>> We checked in SIXX and it is at instance side. Java uses the prefix
>> 'transient' so it is at class side...
>>
>> thanks
>>
>>
>> --
>> Mariano
>> http://marianopeck.wordpress.com
>>
>>
>


-- 
Mariano
http://marianopeck.wordpress.com
