On Wed, Jun 6, 2012 at 3:57 PM,  <[email protected]> wrote:
> Hi Toon,
>
> I greatly appreciate the input - that helped me, thank you.
>
> The basic idea is to have a taint-tracking tag added to the unused 30 bits of
> bit_field3. I also want to pass on that tag when e.g. appending strings (I'm
> just at the start of the project and this is a nice little test) - can you
> point me in the direction of the proper function for that? I added debug
> logging to Heap::AllocateStringFromAscii and Heap::AllocateConsString but

We don't always go into these C++ routines.  The generated code can
create strings.  Search for string_map in the src/ia32 subdirectory to
see examples.

> those did not trigger. The idea of what I want to get can be grasped in the
> JS example below.

The instance descriptors that Toon describes are for JS Objects.
These are 'real objects' in the JS sense that you can attach arbitrary
properties to.

The strings are primitive objects that you cannot attach arbitrary
properties to.  They do not have identity: the == and === operators
just test character-for-character equivalence; they don't tell you
whether two objects have the same object identity, as == and === do on
real JS objects.  The strings have their own maps.  There are currently
a lot of different maps for different kinds of strings:

7-bit vs. 16-bit characters
Sequential, cons, slice and external strings
Symbols and non-symbols

That's 2 x 4 x 2 = 16 different string maps.

You probably want to double that by having a tainted and a non-tainted
map for each.
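
As a rough illustration of the counting above, the following sketch
enumerates the combinations.  The labels are made up for this example
and are not V8's actual map names.

```javascript
// Hypothetical labels for illustration only; V8's real map names differ.
const encodings = ["7-bit", "16-bit"];
const representations = ["sequential", "cons", "slice", "external"];
const symbolKinds = ["symbol", "non-symbol"];
const taintStates = ["untainted", "tainted"];

// Cartesian product of any number of arrays.
function combinations(...axes) {
  return axes.reduce(
    (acc, axis) => acc.flatMap((prefix) => axis.map((v) => [...prefix, v])),
    [[]]
  );
}

const currentMaps = combinations(encodings, representations, symbolKinds);
const mapsWithTaint = combinations(
  encodings, representations, symbolKinds, taintStates
);

console.log(currentMaps.length);   // 16
console.log(mapsWithTaint.length); // 32
```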

Symbols may be tricky for you. They are internally canonicalized so
that there cannot be two different symbols that have the same sequence
of characters from start to end.  When a string is used for a property
name it can be turned into a symbol if there was not already a symbol
with those characters.

So for example:

var key = "x" + tainted_foo;  // key is tainted.

hash[key] = 0;  // key is now a symbol.

// The following now happens in a completely different, unrelated part
// of the program:

var key2 = "x" + "foo";  // Not tainted.

hash2[key2] = 0;  // hash2 uses the existing symbol "xfoo", which is tainted.

for (var k in hash2) {
  do_something(k);  // k is tainted, this may fail.
}
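
The leak can be reproduced in a toy model written in plain JS.  This
is only a sketch: Str, symbolTable, internalize and concat are
invented for the illustration and do not correspond to anything in
V8, and the model assumes the taint flag lives on the string object
itself.

```javascript
// Toy model only: none of these names exist in V8.
class Str {
  constructor(chars, tainted) {
    this.chars = chars;
    this.tainted = tainted;
  }
}

// Maps a character sequence to its single canonical symbol.
const symbolTable = new Map();

// Returns the canonical symbol for s, registering s itself if no
// symbol with the same characters exists yet.
function internalize(s) {
  if (!symbolTable.has(s.chars)) symbolTable.set(s.chars, s);
  return symbolTable.get(s.chars);
}

// Concatenation propagates taint from either operand.
function concat(a, b) {
  return new Str(a.chars + b.chars, a.tainted || b.tainted);
}

const taintedFoo = new Str("foo", true);
const key = concat(new Str("x", false), taintedFoo); // tainted "xfoo"
internalize(key); // used as a property name, so it becomes the symbol

const key2 = concat(new Str("x", false), new Str("foo", false)); // clean
const symbol2 = internalize(key2); // finds the existing, tainted symbol

console.log(key2.tainted);    // false
console.log(symbol2.tainted); // true: the taint leaked via the table
```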

Perhaps you just want to forbid using tainted strings for property
names.  Using them is often a bug anyway, because of things like
hash-collision DoS attacks or untrusted sources of the __proto__ string.
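
A minimal sketch of such a policy, again in plain JS rather than
inside the VM.  TaintedString and setProperty are hypothetical names
used only here; a real implementation would check the taint tag
wherever the VM turns a string into a symbol.

```javascript
// Hypothetical wrapper standing in for a string that carries a taint tag.
class TaintedString {
  constructor(chars) {
    this.chars = chars;
  }
}

// Stores value under key, but refuses tainted keys outright.
function setProperty(obj, key, value) {
  if (key instanceof TaintedString) {
    throw new TypeError("tainted string used as property name: " + key.chars);
  }
  obj[key] = value;
}

const table = Object.create(null);
setProperty(table, "xfoo", 0); // untainted key: accepted

let rejected = false;
try {
  setProperty(table, new TaintedString("__proto__"), 0);
} catch (e) {
  rejected = e instanceof TypeError;
}
console.log(rejected); // true
```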

Note that all 1-character and most 2-character strings are symbols.

>
> a=document.title;
> a+=" some other string is appended";
> document.title=a;
>
> In the DOM I want to be able to see that part of what I wrote to
> document.title actually came from document.title in the first place :-)
>
> Cheers,
>  Ben
>
> On Wednesday, June 6, 2012 10:51:19 UTC+2, Toon Verwaest wrote:
>>
>> Hi Ben,
>>
>> the instance descriptors obviously contain descriptors that describe
>> instances ;-) They (currently) are used to store two different concepts:
>> property descriptors and map transitions.
>>
>> The property descriptors describe what properties look like, and how they
>> are stored, within instances of the current map.
>>
>> Transitions mean that you used an object in a way that its current map did
>> not support, hence we have to transition to a new map. The transitions are
>> stored in the descriptors so we can share maps with the same semantics
>> between instances; which, in addition to the reduced memory overhead, is
>> necessary for effective inline caching. The transitions can be map
>> transitions (added a property), callback transitions (added setters and/or
>> getters) and elements transitions (when storing for example a double in an
>> array that until now only contained Smis).
>>
>> Basically whenever you do something like "obj.property = value" on an
>> object that previously didn't have "property", a new map is created that
>> contains the new property in its descriptor array. The obj will use this map
>> as its map from then on. At the same time, the descriptor array of the old
>> map is modified to contain a map transition to this new map; under the name
>> "property", so that instances similar to the old obj can also use the new
>> map if they get the "property" added. Finally, the new map gets a
>> BackPointer (stored where the prototype transitions are potentially stored)
>> to the old map for incremental marking.
>>
>> Since the descriptor array is stored in the location where bit_field3 is
>> stored, we move bit_field3 into the descriptor array when such an array is
>> present.
>>
>> To support enumeration of properties in the order of addition (for
>> for-in-loops), the descriptor array also keeps track of the order of
>> addition of its properties. For this reason it also contains an enumeration
>> index; and potentially an enumeration cache.
>>
>> If all you want to do is add information to bit_field3, you should be able
>> to do so without knowing much about all this machinery, however.
>>
>> I hope that helps,
>> Toon
>>
>> On Wed, Jun 6, 2012 at 10:14 AM, <> wrote:
>>>
>>> Hi guys,
>>>
>>> can anyone tell me what the instance descriptors are used for and how
>>> they are initialized? I want to store additional information on objects -
>>> that's when I came across bit_field3 which would be fine with me, as it only
>>> needs one bit and not an int (as far as I can grasp). When investigating
>>> further I stumbled across the descriptors where the comment only states that
>>> they store instance descriptors which did not help me all that much :-)
>>>
>>> I appreciate the help,
>>>  Ben
>>>
>>> --
>>> v8-dev mailing list
>>> [email protected]
>>> http://groups.google.com/group/v8-dev
>>
>>



-- 
Erik Corry, Software Engineer
Google Denmark ApS - Frederiksborggade 20B, 1 sal,
1360 København K - Denmark - CVR nr. 28 86 69 84

