Sorry, this mail is too long...

On Thu, Jun 9, 2011 at 9:20 AM, tora - Takamichi Akiyama <t...@openoffice.org> wrote:
> That is why I would like to encourage programmers to take care of the life
> time of data.

I know that that statement is controversial.

On 2011/06/09 18:02, Stephan Bergmann wrote:
> First of, I am doubtful that encouraging manual memory management is a good
> idea.  Errors in manual memory management probably are the cause for the vast
> majority of severe failures in C/C++ programs.

Please note that I am not saying programmers should have to explicitly call
memory management functions such as malloc() or free().

Rather, I would like to suggest thinking about the characteristics of the
data in question.

1. Delegation of the responsibility to choose a type of memory allocator
To achieve both stability and performance at the same time, I would like to
propose: "Don't do all of it in the SAL"; rather, "Delegate certain
responsibility to its users, i.e. programmers."

Who knows the lifetime of the data? Does SAL? No. The programmers do.

Lifetime of data
 (1) data lasting until soffice.bin quits.
 (2) data lasting until a document is closed.
 (3) data lasting until the current thread ends.
 (4) data lasting until a certain task finishes.
 (5) data lasting until the current function call returns.
 (6) data lasting until the current block ends.

Multithread awareness
 (a) data that is shared among more than one thread.
 (b) data that is used only in the current thread.

Asynchronous awareness
 (i) data that is used in an asynchronously called function such as a signal
handler.
 (ii) data that is used in a normal function.
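
To illustrate, the three characteristics above could be written down by the programmer and mapped to an allocation strategy. This is only a sketch: the names Lifetime, Sharing, Context, Strategy, DataTraits and chooseStrategy() are hypothetical, and the mapping rules are my assumption, not anything that exists in the SAL.

```cpp
// Hypothetical description of the data, filled in by the programmer.
enum class Lifetime { Process, Document, Thread, Task, Function, Block };
enum class Sharing  { SharedAcrossThreads, SingleThread };
enum class Context  { SignalHandler, Normal };

// Hypothetical strategies an allocation layer might offer.
enum class Strategy { PreallocatedBuffer, LockedHeap, ThreadLocalArena, UnlockedHeap };

struct DataTraits {
    Lifetime lifetime;
    Sharing  sharing;
    Context  context;
};

// Assumed mapping, for illustration only.
inline Strategy chooseStrategy(const DataTraits& t)
{
    // Code running in a signal handler must not allocate at all.
    if (t.context == Context::SignalHandler)
        return Strategy::PreallocatedBuffer;
    // Data shared between threads needs a thread-safe allocator.
    if (t.sharing == Sharing::SharedAcrossThreads)
        return Strategy::LockedHeap;
    // Short-lived, thread-private data can be discarded wholesale.
    switch (t.lifetime) {
        case Lifetime::Task:
        case Lifetime::Function:
        case Lifetime::Block:
            return Strategy::ThreadLocalArena;
        default:
            return Strategy::UnlockedHeap;
    }
}
```

The point is only that the decision is made from facts the programmer knows, not from anything the SAL could find out by itself.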


2. Potential deadlock
The crash reporter code has a potential deadlock problem.
http://hg.services.openoffice.org/DEV300/file/tip/sal/osl/unx/backtrace.c
Async-signal-unsafe functions such as fprintf() are used in the context of a
signal handler.

Consider this situation:
1. A segmentation violation (SIGSEGV) occurs inside malloc() or free() due to
memory corruption. At that moment the function holds the allocator's global
mutex lock.
2. The signal handler makes its first call to fprintf(), which internally calls
malloc() to obtain a memory area as a text buffer. malloc() waits for the lock
that the interrupted code still holds, so a deadlock occurs.

I will raise a question about that topic later.


> Hence, I would always try to abstract from actual memory as much as possible.
> (Performance considerations are of course valid, but they must be balanced
> against safety and maintainability considerations.)

3. Coming up with exciting measures
There is no need to keep relying on the traditional approaches invented in the
20th century.

From my experience going back to 8-bit processors, I firmly believe that
programmers' awareness of how memory is treated is the crucial factor in
achieving performance, safety, and maintainability at the same time.

I have no objection to your idea of "abstraction," though.

=====
// Slicing cheese and throwing them out at once
// (sketch: memory_page_allocation() etc. stand for OS-level page calls)
#define ALLOCATION_SIZE ( 1024 * 1024 ) // 1MB
#define ALIGNMENT       4

void* SCATTOAO::xmalloc( size_t nSize )
{
    // Round the request up to a multiple of ALIGNMENT
    nSize = ( ( nSize - 1 ) / ALIGNMENT + 1 ) * ALIGNMENT;
    if ( m_nRest < nSize ) {
        // Not enough cheese left: get a new chunk, a multiple of ALLOCATION_SIZE
        size_t nAllocationSize = ( ( nSize - 1 ) / ALLOCATION_SIZE + 1 ) * ALLOCATION_SIZE;
        char* p = (char *) memory_page_allocation( nAllocationSize, PRIVATE|ANONYMOUS );
        m_vector.append( Entry( p, nAllocationSize ) );
        m_pNose = p;
        m_nRest = nAllocationSize;
    }
    char* ret = m_pNose;
    m_pNose += nSize;  // Slice a block of cheese
    m_nRest -= nSize;
    return (void *) ret;
}

void SCATTOAO::xfree( void* )
{
    // do nothing at all
}

SCATTOAO::~SCATTOAO()
{
    if ( Application::IsMemoryCheckRequested() )
        for ( iterator m_vector )  // Turn them to be a trap
            alter_page_attribute( *it, NO_READ_ACCESS|NO_WRITE_ACCESS|NO_EXEC );
    else
        for ( iterator m_vector )  // Throw them at once
            memory_page_deallocation( it->m_pAddress, it->m_nSize );
}
=====

Please have a look at the additional code fragment in the destructor above:

    if ( Application::IsMemoryCheckRequested() )
        for ( iterator m_vector )  // Turn them to be a trap
            alter_page_attribute( *it, NO_READ_ACCESS|NO_WRITE_ACCESS|NO_EXEC );

1. soffice.bin is invoked with a new command line option such as "-memorycheck".
2. Application::IsMemoryCheckRequested() returns TRUE.
3. The memory pages being freed are turned into a trap.
4. Some problematic code mistakenly attempts to read or write data in the
already-freed memory area.
5. The trap is triggered and the OS sends a signal.
6. A signal handler in the SAL catches the signal.
7. A crash report that reveals the exact location of the offending code is
produced.
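
On a POSIX system the allocator and its trap could be realized roughly as follows. This is a minimal sketch under my own assumptions: the class name Arena and its members are hypothetical, mmap()/mprotect()/munmap() stand in for the page calls in the pseudocode, and the bool passed to the constructor plays the role of the -memorycheck option.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <vector>

// Hands out slices of mmap'ed chunks; individual frees are not supported.
// The destructor either unmaps every chunk or turns it into a trap.
class Arena {
public:
    explicit Arena(bool trapOnDestroy = false) : m_trap(trapOnDestroy) {}
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* allocate(std::size_t n)
    {
        if (n == 0) n = 1;
        n = (n + kAlign - 1) / kAlign * kAlign;       // round up to alignment
        if (m_rest < n) {
            // Start a new chunk, sized as a multiple of kChunk.
            std::size_t chunk = (n + kChunk - 1) / kChunk * kChunk;
            void* p = mmap(nullptr, chunk, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED)
                return nullptr;
            m_chunks.push_back({static_cast<char*>(p), chunk});
            m_nose = static_cast<char*>(p);
            m_rest = chunk;
        }
        char* ret = m_nose;
        m_nose += n;                                  // slice a block of cheese
        m_rest -= n;
        return ret;
    }

    std::size_t chunkCount() const { return m_chunks.size(); }

    ~Arena()
    {
        for (const Chunk& c : m_chunks) {
            if (m_trap)
                mprotect(c.addr, c.size, PROT_NONE);  // any later access faults
            else
                munmap(c.addr, c.size);               // throw them out at once
        }
    }

private:
    struct Chunk { char* addr; std::size_t size; };
    static constexpr std::size_t kChunk = 1024 * 1024;
    static constexpr std::size_t kAlign = alignof(std::max_align_t);

    std::vector<Chunk> m_chunks;
    char*       m_nose = nullptr;
    std::size_t m_rest = 0;
    bool        m_trap;
};
```

In trap mode the address range is deliberately never unmapped, so a stale pointer keeps faulting instead of silently hitting memory that has been reused.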

We have been cultivating thousands of test scenarios for more than a decade.
Just leave the qatesttool running for a day and night with the option 
-memorycheck.


4. Utilizing the cutting-edge technology invented in the 21st century.

solaris$ cat attempt-of-accessing-the-already-freed-memory-area.c

#include <stdlib.h>
int main()
{
    char *p = (char *) malloc(10);
    free(p);
    *p = 1;
    return 0;
}

$ cc -g attempt-of-accessing-the-already-freed-memory-area.c

$ LD_PRELOAD=watchmalloc.so.1 MALLOC_DEBUG=WATCH,RW ./a.out
Trace/Breakpoint Trap (core dumped)

$ dbx ./a.out core
...
program terminated by signal TRAP (write access watchpoint trap)
Current function is main
    7       *p = 1;

Isn't that easy enough?

For details
watchmalloc(3MALLOC)
http://download.oracle.com/docs/cd/E19082-01/819-2243/watchmalloc-3malloc/index.html


> But there, it is the language implementation---and not the programmer writing a
> program in that language---that carries out the proof that keeping data in a
> region of memory that is discarded wholesale at a certain point in time is
> sound.

I agree. I have been trying to devise easily understandable, error-proof
programming interfaces for programmers from the perspective of SAL.

For example: at a checkpoint, a default memory allocator is pushed onto a kind
of stack kept in the thread-specific data.
When the time is right, the allocator is popped off the stack. ...
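
Such an interface might look like the sketch below, where an RAII guard does the push at the checkpoint and the pop when the scope ends. All names here (Allocator, HeapAllocator, CountingAllocator, AllocatorScope, currentAllocator) are hypothetical, and thread_local is used as a stand-in for the thread-specific data.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical allocator interface.
struct Allocator {
    virtual void* allocate(std::size_t n) = 0;
    virtual void  release(void* p) = 0;
    virtual ~Allocator() {}
};

// Fallback that simply forwards to malloc()/free().
struct HeapAllocator : Allocator {
    void* allocate(std::size_t n) override { return std::malloc(n); }
    void  release(void* p) override { std::free(p); }
};

// Example of a specialized allocator: counts the requests it serves.
struct CountingAllocator : HeapAllocator {
    int count = 0;
    void* allocate(std::size_t n) override { ++count; return HeapAllocator::allocate(n); }
};

// One stack of allocators per thread.
inline std::vector<Allocator*>& allocatorStack()
{
    thread_local std::vector<Allocator*> stack;
    return stack;
}

// The allocator on top of this thread's stack, or the heap if it is empty.
inline Allocator& currentAllocator()
{
    static HeapAllocator heap;
    std::vector<Allocator*>& stack = allocatorStack();
    return stack.empty() ? heap : *stack.back();
}

// Pushes an allocator at the checkpoint and pops it when the scope ends.
class AllocatorScope {
public:
    explicit AllocatorScope(Allocator& a) { allocatorStack().push_back(&a); }
    ~AllocatorScope() { allocatorStack().pop_back(); }
    AllocatorScope(const AllocatorScope&) = delete;
    AllocatorScope& operator=(const AllocatorScope&) = delete;
};
```

Code inside the scope just calls currentAllocator() and never names a concrete allocator, so the pop cannot be forgotten and cannot happen at the wrong place.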


5. 99.9% of use cases could be the default.

OUString func()
{
    OUString aString("abc");
    return aString;
}

1. The reference counter in aString is initialized to 1.
2. The counter increases from 1 to 2 in preparation for "return", because the
string is copied into the return value.
3. The counter decreases from 2 to 1 in the destructor of aString.

What is going on behind the scenes in step 3 above?

~OUString()
http://hg.services.openoffice.org/DEV300/file/tip/sal/inc/rtl/ustring.hxx#l230

void SAL_CALL IMPL_RTL_STRINGNAME( release )( IMPL_RTL_STRINGDATA* pThis )
http://hg.services.openoffice.org/DEV300/file/tip/sal/rtl/source/strtmpl.c#l1002

internRelease (rtl_uString *pThis)
http://hg.services.openoffice.org/DEV300/file/tip/sal/rtl/source/ustring.c#l852

osl_incrementInterlockedCount:
http://hg.services.openoffice.org/DEV300/file/tip/sal/osl/unx/asm/interlck_x86.s#l32

That is beautiful and admirable!

What it essentially wants to do might be:

    if ( --refCount == 0 )
         free( pThis );

In the case above, i.e. in the typical 99.9% of OpenOffice.org code, I don't
think multithread awareness is required.

Therefore, the current implementation, which takes care of thread safety and
multicore-processor awareness, might be a waste of energy.

Again, I think it would be better to provide a simple, thread-unsafe String
class with no mutex lock involved for the 99.9% of typical use cases.

And also provide highly specialized ones for the particular, critical use cases
that are prone to race conditions.

Programmers should be aware of the characteristics of their own data!

Don't you think so?

Best regards,
Tora






