Re: [Harbour] Re: Intresting corner of info

Przemyslaw Czerpak Mon, 23 Feb 2009 04:31:55 -0800

On Mon, 23 Feb 2009, Massimo Belgrano wrote:

Hi,


> Follow prev
> Multithreading as the ability to execute code in different code paths
> has been a feature of modern OSes for decades. The problem with MT is
> that it adds another dimension of complexity to the developer's task.
> While with single-threaded apps the developer needs only to think in a
> more or less sequential way, with MT each execution path adds a new
> dimension to the equation of program complexity. Development
> languages such as Delphi, .NET (C#, VB) or Harbour and xHarbour
> support MT, that's correct, but they do not remove the burden
> of correctness from the programmer. It is the sole responsibility
> of the programmer to ensure program correctness in two different
> areas: data consistency and algorithm isolation.

I agree.

> The problem of data consistency occurs as soon as more than one thread
> is accessing the same data - such as a simple string or an array.
> Besides nuances in terms of single or multiple readers/writers the
> consistency of the data must be ensured, so developers are forced to
> use mutex-semaphores or other higher level concepts such as monitors,
> guards... to ensure data-consistency.

Yes, usually they are, though different languages give some additional
protection mechanisms here, so user-level synchronization is not always
necessary.

> Algorithm isolation is somewhat related to data-consistency, it
> becomes obvious that a linked-list accessed from multiple threads must
> be protected, otherwise dangling pointers occur. But what about a
> table/relation of a database? The problem here is that concurrency
> inside the process can be resolved - but this type of "isolation" does
> break the semantics of the isolation principles which are already
> provided by the underlying dbms (sql-isolation-levels, record or file
> locks, transactions). Therefore algorithm isolation/correctness is a
> completely different beast as it is located at a very high semantic
> level of the task.

Yes, it is.


> Alaska Software has put an enormous amount of research efforts into
> that area and we have more than a decade of practical experience with
> that area based on real world customers and real world applications.
> From that point of view I would like to reiterate my initial statement
> "As of today there is still no tool available in the market which
> provides that clean and easy to use way of multithreading".

I did not put in such an "enormous amount of research efforts" ;-)
I simply looked for a good balance between performance, basic
protection and flexibility for programmers.

> Let's start with xHarbour, its MT implementation is not well thought out,
> as it provides MT features to the programmer without any model, just
> the features. xHarbour even allows the usage of a workarea from
> different threads which is a violation of fundamental dbms isolation
> principles. In fact xHarbour is just a system language in the sense
> of MT and makes life not really easier compared with other system
> languages. Therefore there is no value in it besides being able to do MT.
> Also keep in mind due to the historical burden of the VM and RT core
> the MT feature is implemented in a way making it impossible to scale
> in future multi-core scenarios (see later-note).

I agree. Giving unprotected access to workareas is asking for
trouble. It can create very serious problems (e.g. data corruption
in tables) and gives programmers nothing, because they have to use
their own protection mechanisms to access the tables, so the final
application is reduced to the same level as one using
dbRequest()/dbRelease() to lock/unlock the table. The only difference
is that in such a model the programmer has to implement everything himself.
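
To make the dbRequest()/dbRelease() style of access concrete, here is a
minimal sketch of handing one open workarea from thread to thread, so that
only one thread owns it at any time. It uses hb_dbDetach() and
hb_dbRequest(), which as far as I know are the native Harbour counterparts
of the xbase++ names above. The table name "demo", its single field and the
Reader() procedure are invented for this illustration only, and the program
needs an MT build of Harbour.

```harbour
// Sketch only: pass an open workarea between threads instead of sharing it.
// "demo", the ID field and Reader() are hypothetical names for illustration.
PROCEDURE Main()

   LOCAL pThread

   dbCreate( "demo", { { "ID", "N", 6, 0 } } )   // scratch table
   USE demo NEW EXCLUSIVE
   dbAppend()
   demo->ID := 1
   dbCommit()

   hb_dbDetach( "demo" )              // give the workarea up; no thread owns it now
   pThread := hb_threadStart( @Reader() )
   hb_threadJoin( pThread )

   IF hb_dbRequest( "demo", , , .T. ) // take the workarea back, waiting if needed
      ? "main sees:", demo->ID
      dbCloseArea()
   ENDIF
   FErase( "demo.dbf" )

   RETURN

PROCEDURE Reader()

   // .T. as the 4th parameter waits until the workarea becomes available
   IF hb_dbRequest( "demo", , , .T. )
      ? "reader sees:", demo->ID
      demo->ID := 2                   // safe: this thread owns the workarea now
      hb_dbDetach()                   // hand it back for the next requester
   ENDIF

   RETURN
```

The point of this design is that the table itself is never touched by two
threads at once, so no extra user-level lock around the workarea is needed.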

> Harbour is better here because it follows more the principles of
> Xbase++, while I am not sure if the Harbour people have decided to
> adapt the Xbase++ model for compatibility reasons or not I am glad to
> see that they followed our model's point of view. The issue with
> Harbour however is that it suffers from the shortcomings of its runtime
> in general, the VM design and of course the way how datatypes - the
> blood of a language - are handled. It is still in a 1980 architectural
> style centered around the original concept how Clipper did it. This is
> also true for xHarbour, so both suffer from the fact that MT was added
> I think in 2007, while the VM and RT core is from 1999 - without
> having MT in mind.

Here I can agree only partially.
1-st, Harbour does not follow the xbase++ model. With the exception of the
xbase++ emulation level (xbase++ sync and thread classes, thread
functions and sync methods), the whole code is the result of my own
ideas. The only idea I partially borrowed is the dbRequest()/dbRelease()
semantics. Personally I wanted to introduce many workarea holders
(not only a single zero area zone) and dbDetach()/dbAttach() functions.
Later I heard about the xbase++ implementation and found the cargo
codeblock attaching a very nice feature, so I implemented it, but
internally it operates on workarea sets from my original idea, and
it's still possible to introduce support for multiple WA zones if
we decide to add a .prg level API for it. In some cases it may be useful.
The internal WA isolation in native RDDs is also different. For POSIX
systems it's necessary to introduce file handle sharing, and this
mechanism is already in use, so now we can easily extend it by adding
support for a pseudo-exclusive mode (other threads will be able to access
tables opened in exclusive mode, which is exclusive only for external
programs) or add caches common to aliased WAs.
Of course Harbour also supports other xbase++ extensions, but they were
added rather for compatibility with xbase++, for xbase++ users, and
internally they use the basic Harbour MT API.

2-nd, this old API from the 1980s is a real problem in some places and it
would probably be good to change it. But I also do not see the xbase++ API
as the one and only final solution. Harbour gives full protection for read
access to complex items. The user has to protect only write access,
and only if he wants to change exactly the same item, not
a member of a complex item, e.g. this code:
   aVal[ threadID() ] += aVal[ threadID() ] * 2 + 100
is MT safe in Harbour even if the same aVal is used by many different
threads. What matters is that each thread operates on different
aVal items and aVal is not resized. Otherwise it may cause data corruption.
But when complex items can be resized, we usually need additional
protection in xbase++ too, because user code performs many operations which
have to be atomic in some logical sense. So in most cases there is
only one difference here between Harbour and xbase++: in xbase++, with
full internal protection and missing user protection, an RT error is
generated. In Harbour it may cause internal data corruption. I agree that
this is a very important difference, but in most such cases we are talking
about wrong user code which needs additional user protection in both
languages.
And here we have one fundamental question:
   What is the cost of internal protection for scalability?
and whether we can accept it. My personal feeling is that the cost
will be high, even very high, but I haven't made any tests myself, though
some xbase++ users have confirmed that it's a problem in xbase++.
I'm really interested in some scalability tests of xbase++ and Harbour.
They could give a few very important answers. If some xbase++ user can port
tests/speedtst.prg to xbase++, it will be very helpful.
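
The rule described above (no lock when each thread writes only its own
element of a fixed-size array, explicit lock as soon as a shared array is
resized) can be sketched as follows. This is only an illustration under
assumptions: the loop counter n stands in for threadID(), and Worker(), aLog
and N_THREADS are names made up for the example. It needs an MT build of
Harbour (e.g. hbmk2 with the -mt switch).

```harbour
// Sketch only: own-slot updates without a lock, resizing under a mutex.
// Worker(), aLog and N_THREADS are hypothetical names for illustration.
#define N_THREADS 4

PROCEDURE Main()

   LOCAL aVal := Array( N_THREADS )   // sized once, never resized later
   LOCAL aLog := {}                   // grows at run time, so it needs a lock
   LOCAL pMtx := hb_mutexCreate()
   LOCAL aThreads := {}
   LOCAL n

   AFill( aVal, 1 )
   FOR n := 1 TO N_THREADS
      // n plays the role of threadID() in the one-line example
      AAdd( aThreads, hb_threadStart( @Worker(), aVal, n, aLog, pMtx ) )
   NEXT
   AEval( aThreads, {| pThread | hb_threadJoin( pThread ) } )

   AEval( aVal, {| x, i | QOut( "slot", i, "=", x ) } )
   ? "log entries:", Len( aLog )

   RETURN

PROCEDURE Worker( aVal, nSlot, aLog, pMtx )

   LOCAL i

   FOR i := 1 TO 5
      // Each thread touches only its own element: no user-level lock.
      aVal[ nSlot ] += aVal[ nSlot ] * 2 + 100
   NEXT

   // AAdd() resizes the shared array, so serialize it explicitly.
   hb_mutexLock( pMtx )
   AAdd( aLog, nSlot )
   hb_mutexUnlock( pMtx )

   RETURN
```

Removing the mutex around AAdd() is exactly the kind of wrong user code
meant above: the array would be resized by several threads at once.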

Of course it's possible that I missed something here, but I've never used
xbase++ and I cannot see its source code, so I can only guess how some
things are implemented in this language.

> This is in fact one of the biggest differences between Xbase++ and the
> "Harbours" from a pure architectural point of view, we designed a
> runtime architecture from the beginning to be MT/MP and Distributed,
> they designed a runtime based on the DOS Clipper blueprint.
> In fact, I could argue on and on, specifically when it comes to
> dedicated implementations of the Harbour runtime core or the Harbour
> VM but sharing these type of technical details is of course
> definitively not what I am paid for -;) Anyway allow me to make it
> clear in a general terms.

See above. It's not as clear as you said.
I think you will find users who say that the cost in
scalability is definitely not what they are paid for. Especially
when missing user protection is also a problem for xbase++ and
only the bad results differ. For sure an RT error is much better
than internal data corruption, but how much can users pay for such
functionality?

> First, any feature/functionality of Xbase++ is reentrant there is not
> a single exception of this rule. Second, any datatype and storage type
> is thread-safe regardless of its complexity so there is no way to
> crash an Xbase++ process using multithreading. Third, the runtime
> guarantees that there is no possibility of a deadlock in terms of its
> internal state regardless what you are doing in different threads.
> There is a clean isolation and inheritance relationship of settings
> between different threads. In practical terms that means, you can
> output to the console from different threads without any additional
> code, you can execute methods or access state of GUI (XbasePARTS)
> objects from different threads, you can create a codeblock which
> detaches a local variable and pass it to another thread, you are
> performing file I/O or executing a remote procedure call and in the
> meanwhile the async. garbage collector cleans up your memory - and
> the list goes on... But in Xbase++ you can do all that without the
> need to think about MT or ask a question such as "Is the ASort()
> function thread safe" or can I change the caption of a GUI control
> from another thread. That's all a given, no restrictions apply, the
> runtime does it all automatically for you.

Most of the above is also true in Harbour, with the exception of the
missing GUI components and obligatory internal item storage protection.
But that's the subject of the efficiency discussed above.
Let's make some scalability tests and then we can decide whether we want
to pay the same cost as xbase++ users.

> Anyway, I like Harbour more than xHarbour in terms of MT support.
> However the crux is still there, no real architecture around the
> product, leading to the fact that MT is supported from a technical
> point of view but not from a 4GL therefore leading to a potential of
> unnecessary burden for the average programmers, and of course that was
> and is still not the idea of Clipper as a tool.

The only fundamental difference between Harbour and xbase++ in the
above is obligatory internal item protection, at least as far as I can
see now, and as I said the cost of such functionality may not be
acceptable for users. But let's make some real tests to see how big a
problem it creates in real life.

> Btw, the same is true for VO or so, they left the idea of the language
> and moved to something more like a system language, while I can
> understand that somewhat I strongly disagree with that type of
> language design for a simple reason; it's not practical in the long
> term - we will see that in the following years as more and more multi
> core systems will find their way in the mainstream and developers need
> to make use of them for performance and scalability reasons. In 10 -
> 15 years from now we will have 100 if not thousands of cores per die -
> handling multithreading, synchronisation issues by hand becomes then
> impossible, the same is true for offloading tasks for performance
> reasons. So there is a need for a clean model in terms of the language
> - that's at least what we believe at Alaska Software. It goes even
> further, the current attempts by MS in terms of multicore support with
> VS2010 or NET 4.0 are IMO absolutely wrong, as they force the
> developer to write code depending on the underlying execution
> infrastructure alias cores available. In other words, infrastructure
> related code/algorithms get mixed with the original algorithm the
> developer writes and of course the developer gets paid for. That's a
> catastrophic path which for sure does not contribute to increased
> productivity and reliability of software solutions.

I agree with you only partially. Above some reasonable cost limit
MT programming stops being usable, and it is much more efficient,
safer and easier to use separate processes. The cost of exchanging
data between them will simply be smaller than the cost of obligatory
internal MT synchronization. So why use MT mode? For marketing
reasons?

> Funnily enough, the most critical, and most difficult aspect in that
> area; getting performance gains from multi core usage is even not
> touched with my technical arguments right now. However it adds another
> dimension of complexity to the previous equation as it needs to take
> into account the memory hierarchy which must be handled by a 4GL
> runtime totally different as it is with the simple approach of
> Harbour/xHarbour. Their RT core and VM needs a more or less complete
> rewrite and redesign to go that path.

I do not see bigger problems with Harbour core code modifications.
If we decide that it's worth it, then I'll implement it.
Probably the real problem will be forcing a different API on 3-rd party
developers. Here we should probably choose something close to the xbase++
C API so as not to introduce additional problems for 3-rd party developers
who have to create code for both projects, and to have some basic
compatibility, e.g. at C preprocessor level.
Anyhow, I'm still not sure I want to pay the cost of full item
access serialization.

> In other words, Xbase++ has been playing in the Multithreading ballpark
> for a decade. Harbour is still finding its way into the MT ballpark
> while xHarbour is in that context at a dead-end.
> I would bet that Xbase++ will play in the multicore ballpark while the
> Harbours are still with their MT stuff.

And it's highly possible that it will happen. But Harbour is a free
project, and if we decide that adding full item protection at the
cost of speed is a valuable feature, then maybe we will implement it.
It's also possible that we will add such functionality as an alternative
VM library. Just like now we have hbvm and hbvmmt, we would have hbvmpmt
(protected mt).

> In a more theoretical sense, it is important to understand that a
> programming language and its infrastructure shall not adapt any
> technical feature, requirement or hype. Because then the language and
> infrastructure are getting more and more complicated up to a point of
> lost control. Also backward compatibility and therefore protection of
> existing investments becomes more and more a mess with Q&A costs going
> through the roof.

_FULLY_AGREE_. Things should be as simple as possible. Any hacks or
workarounds for single features create serious problems in the longer
term and block further development.
For me this was the main xHarbour problem when I was working on that
project.

> Nor is it a good idea to provide software developers any freedom - the
> point here is, a good MT model does smoothly guide the developer
> through the hurdles and most of the time is even not in the awareness
> of the developer. The contrary is providing the developer all freedom,
> but this leads to letting him first build the gun-powder, then the gun
> to finally shoot a bullet -;)

:-)

> Therefore let me rephrase my initial statement to be more specific;
> 
> As of today there is still no tool available in the market which
> provides that clean and easy to use way of multithreading, however
> there are other tools which support MT - but they support it just as
> a technical feature without a model and that's simply wrong as it
> leads to additional dimensions in code complexity - finally ending in
> applications with lesser reliability and overall quality.
> Just my point of view on that subject - enough said

Thank you very much for this very interesting text.
I hope that the main internal difference between Harbour and xbase++
is now clearly visible for users.
To the above we should still add tests/speedtst.prg results to compare
scalability, so we will know the real cost, which is an important part of
the above description.
I'm very interested in real life results and I hope that some xbase++
users will port tests/speedtst.prg to xbase++ so we can compare the
results.

best regards,
Przemek
_______________________________________________
Harbour mailing list
[email protected]
http://lists.harbour-project.org/mailman/listinfo/harbour
