Hi, thanks for looking at this. Some comments inline:
On 21/11/06, Yang ZHONG <[EMAIL PROTECTED]> wrote:
There's a way of using std::string in Tuscany which might allocate/release
heap and copy memory too frequently.
Could you please verify that's the case and brainstorm an optimization?
std::string is just an example, using object vs. using &/* is the more
generic topic.
This is the code snippet:
typedef std::string SDOString;

SDOString DataFactoryImpl::getFullTypeName(const SDOString& uri,
                                           const SDOString& inTypeName) const
{
    return uri + "#" + inTypeName;
}

void DataFactoryImpl::addType(const char* pccUri, const char* pccTypeName,...)
{
    SDOString fullTypeName = getFullTypeName(pccUri, pccTypeName);
    ...
}
1. getFullTypeName(pccUri,pccTypeName) call will allocate stack for
std::string instance uri and inTypeName.
Since a URI is likely longer than 16 characters (see the std::string
implementation), a piece of heap (5-1) will be allocated.
pccUri and pccTypeName will be copied into uri(7-1) and inTypeName(7-2)
respectively
2. uri+"#" will allocate stack and *heap*(5-2) for a new std::string
instance, and uri will be copied(7-3)
3. ...+inTypeName will allocate stack and *heap*(5-3) for another new
std::string instance, and above(2.) std::string(7-4) and inTypeName(7-5)
will be copied
4. getFullTypeName return will allocate stack and *heap*(5-4) for yet
another new std::string instance, and return value will be copied(7-6)
5. The assignment to local variable fullTypeName, will allocate stack and
*heap*(5-5) for one more new std::string instance, and value will be
copied(7-7)
Could you please verify that's the case?
Possibly! It all depends on the compiler and the std::string implementation.
The return value from getFullTypeName and its assignment to the local
variable may not allocate heap for the actual string buffer. I believe the
std::string implementation can reference count the underlying string buffer
so it is not always allocating/deallocating.
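One way to check this on a particular compiler and standard library is to
count calls to the global operator new around the two lines in question. The
following is only a sketch: g_heapAllocations and countAllocations are names
made up for illustration, getFullTypeName is written as a free function
rather than the DataFactoryImpl member, and the number it reports will differ
between toolchains and optimization levels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

typedef std::string SDOString;

// Hypothetical counter, incremented by the replacement operator new below.
static std::size_t g_heapAllocations = 0;

// Replacing the global allocation functions lets us observe every heap
// request that std::string makes through the default allocator.
void* operator new(std::size_t size)
{
    ++g_heapAllocations;
    if (void* p = std::malloc(size != 0 ? size : 1))
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Free-function stand-in for DataFactoryImpl::getFullTypeName.
SDOString getFullTypeName(const SDOString& uri, const SDOString& inTypeName)
{
    return uri + "#" + inTypeName;
}

// Returns how many times operator new was called while running the
// equivalent of the first line of addType. The string built is handed
// back through 'out' so that the work cannot be optimized away.
std::size_t countAllocations(const char* pccUri, const char* pccTypeName,
                             SDOString& out)
{
    assert(pccUri != nullptr && pccTypeName != nullptr);

    const std::size_t before = g_heapAllocations;
    SDOString fullTypeName = getFullTypeName(pccUri, pccTypeName);
    const std::size_t after = g_heapAllocations;

    out = fullTypeName;  // copied after measuring, so it is not counted
    return after - before;
}
```

Calling countAllocations with a URI longer than the small-string buffer, once
in a debug build and once in an optimized build, should show how much copy
elision and the library's string representation change the picture.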
If true, that is too much: two simple lines of code allocate/release heap
*five* times and copy memory *seven* times.
Could you please brainstorm an optimization?
This might be a start:
SDOString& DataFactoryImpl::getFullTypeName(SDOString& stringBuffer,
                                            const char* inTypeName) const
{
    stringBuffer += "#";
    return stringBuffer += inTypeName;
}

void DataFactoryImpl::addType(const char* pccUri, const char* pccTypeName,...)
{
    SDOString fullTypeName = pccUri;
    getFullTypeName(fullTypeName, pccTypeName);
    ...
}
It allocates/releases heap 4 fewer times and copies memory 5 fewer times.
In general, we may want to consider & and * when using an object.
A big difference for a Java developer to notice when programming in C++ is
that a Java object variable is a reference (&/* in C++), while a C++ object
variable is a *value*.
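To make the by-value versus by-reference point concrete, here is a small
sketch. The function names appendByValue, appendByReference and
appendByPointer are invented for this illustration and do not exist in
Tuscany.

```cpp
#include <cassert>
#include <string>

// Receives its own copy of the string (the copy is made at the call site),
// so the caller's object is left untouched.
void appendByValue(std::string s)
{
    s += "#Type";
}

// Receives an alias for the caller's object: no copy is made and the
// caller sees the change. This is closest to a Java object variable.
void appendByReference(std::string& s)
{
    s += "#Type";
}

// Same effect as the reference version, but the caller has to pass an
// address and the callee has to cope with a null pointer.
void appendByPointer(std::string* s)
{
    if (s != nullptr)
        *s += "#Type";
}
```

After appendByValue(name) the caller's name is unchanged; after
appendByReference(name) or appendByPointer(&name) it ends in "#Type".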
Passing in an "INOUT" parameter to hold the return string will improve
efficiency. This particular case has addType taking char* as parameters.
This will be changed so the input parameters are const std::string& as we
are trying to get away from using the potentially unsafe char*.
So...
SDOString& DataFactoryImpl::getFullTypeName(SDOString& stringBuffer,
                                            const SDOString& inTypeUri,
                                            const SDOString& inTypeName) const
{
    stringBuffer = inTypeUri + "#" + inTypeName;
    return stringBuffer;
}

void DataFactoryImpl::addType(const SDOString& pccUri,
                              const SDOString& pccTypeName,...)
{
    SDOString fullTypeName;
    getFullTypeName(fullTypeName, pccUri, pccTypeName);
    ...
}
... is a possibility.
In your example the caller of getFullTypeName is making an assumption about
how the name is constructed, i.e. you are assuming that it will be of the
form uri + some other stuff. The point of factoring out the construction of
the fullTypeName into a separate method is to allow the form of that name to
be changed at any time without affecting the caller.
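For what it's worth, the two goals need not conflict. A rough sketch,
assuming a free function in place of the const member and assuming the caller
never passes stringBuffer as one of the inputs: the form of the name stays
inside getFullTypeName, and the temporaries created by operator+ are avoided
by reserving once and appending.

```cpp
#include <cassert>
#include <string>

typedef std::string SDOString;

// Free-function stand-in for DataFactoryImpl::getFullTypeName.
SDOString& getFullTypeName(SDOString& stringBuffer,
                           const SDOString& inTypeUri,
                           const SDOString& inTypeName)
{
    // Precondition (an assumption of this sketch): the output buffer is a
    // different object from both inputs.
    assert(&stringBuffer != &inTypeUri && &stringBuffer != &inTypeName);

    stringBuffer.clear();
    // Reserve room for the complete result up front, so the appends below
    // do not need to grow the buffer again.
    stringBuffer.reserve(inTypeUri.size() + 1 + inTypeName.size());
    stringBuffer += inTypeUri;
    stringBuffer += '#';
    stringBuffer += inTypeName;
    return stringBuffer;
}
```

The caller still writes getFullTypeName(fullTypeName, pccUri, pccTypeName)
and knows nothing about the "#" separator. Whether the saving matters is
something to measure rather than assume.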
So... yes, it can be improved by passing an object as an out parameter rather
than returning it by value. However, the readability of the code is also
important, so it is a balance between having a simple "client api" and having
a super efficient implementation. Each case should be looked at on its own to
decide whether it is a significant performance issue, taking into account
things like how often the method is actually called.
--
Yang ZHONG
Cheers,
--
Pete