Den 24.05.2011 17:39, skrev Artur Siekielski:
Disk access is about 1000x slower than memory access in C, and Python
in the worst case is 50x slower than C, so there is still a huge win
(not to mention that in the common case Python is only a few times
slower).
You can put databases in shared memory.
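Sturla's suggestion can be sketched with today's standard library: `multiprocessing.shared_memory` (Python 3.8+, so long after this thread was written) exposes exactly this kind of raw shared region. A minimal sketch, not from the thread:

```python
from multiprocessing import shared_memory

# Create a 64-byte region backed by shared pages; another process could
# attach to the very same region with SharedMemory(name=shm.name).
shm = shared_memory.SharedMemory(create=True, size=64)
try:
    shm.buf[:5] = b"hello"     # raw byte writes, no pickling involved
    print(bytes(shm.buf[:5]))  # -> b'hello'
finally:
    shm.close()
    shm.unlink()               # only one process should unlink the region
```

Data kept this way is flat bytes, which is precisely why Artur notes elsewhere in the thread that mapping a whole Python object graph into such a region is hard.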
2011/5/24 Stefan Behnel
> Maciej Fijalkowski, 24.05.2011 13:31:
>
> CPython was not designed for CPU cache usage as far as I'm aware.
>>
>
> That's a pretty bold statement to make on this list. Even if it wasn't
> originally "designed" for (efficient?) CPU cache usage, it's certainly been
> aro
On Tue, May 24, 2011 at 8:44 AM, Terry Reedy wrote:
> On 5/24/2011 8:25 AM, Sturla Molden wrote:
>
>> Artur Siekielski is not talking about cache locality, but copy-on-write
>> fork on Linux et al.
>>
>> When reference counts are updated after forking, memory pages marked
>> copy-on-write are copied if they store reference counts. And then he
>> quickly runs out of memory.
On 5/24/2011 8:25 AM, Sturla Molden wrote:
Artur Siekielski is not talking about cache locality, but copy-on-write
fork on Linux et al.
When reference counts are updated after forking, memory pages marked
copy-on-write are copied if they store reference counts. And then he
quickly runs out of memory.
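A minimal sketch of the scenario being described, assuming a POSIX system (`os.fork` is unavailable on Windows). The page duplication itself is invisible to pure Python, so it is only described in the comments:

```python
import os

# Build a large object graph in the parent; after fork() its pages are
# shared copy-on-write between parent and child.
data = [str(i) for i in range(100_000)]

pid = os.fork()  # POSIX-only
if pid == 0:
    # A "read-only" walk still executes Py_INCREF/Py_DECREF on every
    # element, dirtying the pages holding the object headers, so the
    # kernel has to duplicate those pages for the child.
    count = sum(1 for s in data if s)
    os._exit(0 if count == len(data) else 1)
else:
    _, status = os.waitpid(pid, 0)
    print(os.waitstatus_to_exitcode(status))  # -> 0
```

With many worker children, each one dirties its own copy of every page that contains an `ob_refcnt` field, which is how the memory blow-up happens.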
2011/5/24 Sturla Molden :
> Den 24.05.2011 11:55, skrev Artur Siekielski:
>>
>> PYRO/multiprocessing proxies aren't a comparable solution because of
>> ORDERS OF MAGNITUDE worse performance. You are comparing direct memory
>> access with serialization/message passing through sockets/pipes.
> The bottl
On Tue, May 24, 2011 at 10:05 PM, Stefan Behnel wrote:
> Maciej Fijalkowski, 24.05.2011 13:31:
>>
>> CPython was not designed for CPU cache usage as far as I'm aware.
>
> That's a pretty bold statement to make on this list. Even if it wasn't
> originally "designed" for (efficient?) CPU cache usage
Antoine Pitrou, 24.05.2011 14:32:
On Tue, 24 May 2011 14:05:26 +0200, Stefan Behnel wrote:
I doubt that efficient CPU cache usage was a major design goal of PyPy
right from the start. IMHO, the project has changed its objectives way too
many times to claim something like that, especially at the l
On Tue, 24 May 2011 14:05:26 +0200
Stefan Behnel wrote:
>
> I doubt that efficient CPU cache usage was a major design goal of PyPy
> right from the start. IMHO, the project has changed its objectives way too
> many times to claim something like that, especially at the low level where
> the CPU
Den 24.05.2011 11:55, skrev Artur Siekielski:
POSH might be good, but the project has been dead for 8 years. And this
copy-on-write is nice because you don't need changes/restrictions to
your code, or a special garbage collector.
Then I have a solution for you, one that is cheaper than anything else
Den 24.05.2011 13:31, skrev Maciej Fijalkowski:
Not sure what scenario exactly are you discussing here, but storing
reference counts outside of objects has (at least on a single
processor) worse cache locality than inside objects.
Artur Siekielski is not talking about cache locality, but copy-on-write
fork on Linux et al.
Den 24.05.2011 11:55, skrev Artur Siekielski:
PYRO/multiprocessing proxies aren't a comparable solution because of
ORDERS OF MAGNITUDE worse performance. You are comparing direct memory
access with serialization/message passing through sockets/pipes.
The bottleneck is likely the serialization, bu
Maciej Fijalkowski, 24.05.2011 13:31:
CPython was not designed for CPU cache usage as far as I'm aware.
That's a pretty bold statement to make on this list. Even if it wasn't
originally "designed" for (efficient?) CPU cache usage, it's certainly been
around for long enough to have received nu
On Sun, May 22, 2011 at 1:57 AM, Artur Siekielski wrote:
> Hi.
> The problem with reference counters is that they are very often
> incremented/decremented, even for read-only algorithms (like traversal
> of a list). It has two drawbacks:
> 1. CPU cache lines (64 bytes on X86) containing the beginning of a
> PyObject are very often invalidated, resulting in losing many chances
> to use the CPU caches
2011/5/24 Sturla Molden :
>> Oh, and using explicit shared memory or mmap is much harder, because
>> you have to map the whole object graph into bytes.
>
> It sounds like you need PYRO, POSH or multiprocessing's proxy objects.
PYRO/multiprocessing proxies aren't a comparable solution because of
ORDERS OF MAGNITUDE worse performance. You are comparing direct memory
access with serialization/message passing through sockets/pipes.
On Tue, May 24, 2011 at 8:33 AM, Sturla Molden wrote:
> Den 24.05.2011 00:07, skrev Artur Siekielski:
>>
>> Oh, and using explicit shared memory or mmap is much harder, because
>> you have to map the whole object graph into bytes.
>
> It sounds like you need PYRO, POSH or multiprocessing's proxy objects.
Den 24.05.2011 00:07, skrev Artur Siekielski:
Oh, and using explicit shared memory or mmap is much harder, because
you have to map the whole object graph into bytes.
It sounds like you need PYRO, POSH or multiprocessing's proxy objects.
Sturla
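The proxy approach Sturla suggests can be sketched with `multiprocessing.Manager`. Every operation on the proxy is a pickled round-trip to the manager process, which is the serialization cost Artur objects to elsewhere in the thread:

```python
from multiprocessing import Manager, Process

def worker(shared):
    # Each proxy operation is pickled, sent over a pipe to the manager
    # process, executed there, and the result pickled back.
    shared.append(shared[0] * 2)

if __name__ == "__main__":
    with Manager() as manager:
        shared = manager.list([21])  # the data lives in the manager process
        p = Process(target=worker, args=(shared,))
        p.start()
        p.join()
        print(list(shared))  # -> [21, 42]
```

Compared with dereferencing a pointer into shared pages, each of these accesses costs a context switch plus two (de)serializations, which is where the orders-of-magnitude gap comes from.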
2011/5/23 Guido van Rossum :
>> Anyway, I'd like to have working copy-on-write in CPython - in the
>> presence of GIL I find it important to have multiprocess programs
>> optimized (and I think it's a common idiom that a parent process
>> prepares some big data structure, and child "worker" process
On Mon, May 23, 2011 at 1:55 PM, Artur Siekielski wrote:
> Ok, I managed to make a quick but working patch (sufficient to get
> a working interpreter; it segfaults for extension modules). It uses the
> "ememoa" allocator (http://code.google.com/p/ememoa/) which seems a
> reasonable pool allocator. The patch: http://dpaste.org/K8en/.
Ok, I managed to make a quick but working patch (sufficient to get
a working interpreter; it segfaults for extension modules). It uses the
"ememoa" allocator (http://code.google.com/p/ememoa/) which seems a
reasonable pool allocator. The patch: http://dpaste.org/K8en/. The
main obstacle was that ther
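The idea behind the patch — reference counts stored in a separately allocated pool rather than in the object header — can be sketched in pure Python. The class and method names here are purely illustrative; they are not ememoa's API or the patch's actual code:

```python
# Hypothetical sketch of an out-of-band refcount table: counts are packed
# in a side structure keyed by object address, so "read-only" traversals
# never dirty the pages that hold the objects themselves.
class RefcountPool:
    def __init__(self):
        self._counts = {}  # object address -> reference count

    def incref(self, obj):
        self._counts[id(obj)] = self._counts.get(id(obj), 0) + 1

    def decref(self, obj):
        n = self._counts[id(obj)] - 1
        if n:
            self._counts[id(obj)] = n
        else:
            del self._counts[id(obj)]  # count hit zero: object would be freed

pool = RefcountPool()
x = object()
pool.incref(x)
pool.incref(x)
pool.decref(x)
print(pool._counts[id(x)])  # -> 1
```

The pages containing the counts still get dirtied after fork(), but they are a small, dense region instead of every page that holds an object header.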
Den 23.05.2011 06:59, skrev "Martin v. Löwis":
My expectation is that your approach would likely make the issues
worse in a multi-CPU setting. If you put multiple reference counters
into a contiguous block of memory, unrelated reference counters will
live in the same cache line. Consequently,
2011/5/23 "Martin v. Löwis"
> > I'm not a compiler/profiling expert so the main question is if such
> > a design can work, and maybe someone was thinking about something
> > similar?
>
> My expectation is that your approach would likely make the issues
> worse in a multi-CPU setting. If you put mul
> I'm not a compiler/profiling expert so the main question is if such
> a design can work, and maybe someone was thinking about something
> similar?
My expectation is that your approach would likely make the issues
worse in a multi-CPU setting. If you put multiple reference counters
into a contiguous block of memory, unrelated reference counters will
live in the same cache line.
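Martin's false-sharing point can be made concrete with a little cache-line arithmetic, assuming 64-byte lines and 8-byte counters (a `Py_ssize_t` on a 64-bit x86 build):

```python
CACHE_LINE = 64   # bytes per cache line on x86
COUNT_SIZE = 8    # one Py_ssize_t refcount on a 64-bit build

def line_of(i):
    """Index of the cache line holding counter i in a packed array."""
    return (i * COUNT_SIZE) // CACHE_LINE

# Eight unrelated counters share each line: a write to counter 0 on one
# CPU invalidates the line caching counters 0..7 on every other CPU.
print([line_of(i) for i in range(10)])  # -> [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
```

So packing the counters helps the fork/copy-on-write case but pessimizes the multi-CPU case, which is exactly the trade-off the two sides of this subthread are arguing about.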
>> 1. CPU cache lines (64 bytes on X86) containing the beginning of a
>> PyObject are very often invalidated, resulting in losing many chances
>> to use the CPU caches
>
> Mutating data doesn't invalidate a cache line. It just makes it
> necessary to write it back to memory at some point.
>
I think
Hello,
On Sun, 22 May 2011 01:57:55 +0200
Artur Siekielski wrote:
> 1. CPU cache lines (64 bytes on X86) containing the beginning of a
> PyObject are very often invalidated, resulting in losing many chances
> to use the CPU caches
Mutating data doesn't invalidate a cache line. It just makes it
necessary to write it back to memory at some point.
Hi.
The problem with reference counters is that they are very often
incremented/decremented, even for read-only algorithms (like traversal
of a list). It has two drawbacks:
1. CPU cache lines (64 bytes on X86) containing the beginning of a
PyObject are very often invalidated, resulting in losing many chances
to use the CPU caches
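The refcount churn described in point 1 is easy to observe from Python itself. This is CPython-specific: `sys.getrefcount` reports the object's `ob_refcnt` field, and the comments describe CPython's behavior only:

```python
import sys

x = object()
before = sys.getrefcount(x)         # includes the temporary ref made by the call
refs = [x] * 1000                   # the list stores 1000 more references
print(sys.getrefcount(x) - before)  # -> 1000

# Even a read-only traversal writes to each element's ob_refcnt:
# binding `item` does Py_INCREF, and moving to the next iteration
# does the matching Py_DECREF.
for item in refs:
    pass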