On Mon, 23 Feb 2009, Massimo Belgrano wrote:

Hi,
> Follow prev
> Multithreading as the ability to execute code in different code paths
> is a feature of modern OS since decades. The problem with MT is that
> it adds another dimension of complexity to the developers task. While
> with single threaded apps. the developer needs only to think in a more
> or less sequential way with MT each execution path adds a new
> dimension to the equation of program complexity. Development
> languages supporting MT such as Delphi, .NET (C#,VB) or Harbour and
> xHarbour support MT that's correct, but they do not remove the burden
> of correctness from the programmer. It is in the sole responsibility
> of the programmer to ensure program correctness in two different
> areas; data-consistency and algorithm isolation.

I agree.

> The problem of data consistency occurs as soon as more than one thread
> is accessing the same data - such as a simple string or an array.
> Besides nuances in terms of single or multiple readers/writers the
> consistency of the data must be ensured, so developers are forced to
> use mutex-semaphores or other higher level concepts such monitors,
> guards... to ensure data-consistency.

Yes, usually they are, though different languages give some additional
protection mechanisms here, so user-level synchronization is not always
necessary.

> Algorithm isolation is somewhat related to data-consistency, it
> becomes obvious that a linked-list accessed from multiple threads must
> be protected otherwise dangling pointer occurs. But what about a
> table/relation of a database. The problem here is that concurrency
> inside the process can be resolved - but this type of "isolation" does
> break the semantics of the isolation principles which are already
> provided by the underlying dbms (sql-isolation-levels, record or file
> locks, transactions). Therefore algorithm isolation/correctness is a
> complete different beast as it is located at a very high semantic
> level of the task.

Yes, it is.
> Alaska Software has put an enormous amount of research efforts into
> that area and we have more than a decade of practical experience with
> that area based on real world customers and real world applications.
> From that point of view I would like to reiterate my initial statement
> "As of today there is still no tool available in the market which
> provides that clean and easy to use way of multithreading".

I did not make such an "enormous amount of research efforts" ;-)
I simply looked for a good balance between performance, basic
protection and flexibility for programmers.

> Let's start with xHarbour, its MT implementation is not well thought,
> as it provides MT features to the programmer without any model, just
> the features. xHarbour even allows the usage of a workarea from
> different threads which is a violation of fundamental dbms isolation
> principles. In fact xHarbour is just a system language in the sense
> of MT and makes life not really easier compared with other system
> languages. Therefore there is no value in besides being able to do MT.
> Also keep in mind due to the historical burden of the VM and RT core
> the MT feature is implemented in a way making it impossible to scale
> in future multi-core scenarios (see later-note).

I agree. Giving unprotected access to workareas is asking for trouble.
It can create very serious problems (e.g. data corruption in tables)
and it gives nothing to programmers, because they have to use their own
protection mechanisms to access the tables, so the final application is
reduced to the same level as using dbRequest()/dbRelease() to
lock/unlock the table. The only difference is that in such a model the
programmer has to implement everything himself.

> Harbour is better here because it follows more the principles of
> Xbase++, while I am not sure if the Harbour people have decided to
> adapt the Xbase++ model for compatibility reasons or not I am glad to
> see that they followed our models point of view.
> The issues with
> Harbour however is that it suffers from the shortcoming of its runtime
> in general, the VM design and of course the way how datatypes - the
> blood of a language - are handled. It is still in a 1980 architectural
> style centered around the original concept how Clipper did it. This is
> also true for xHarbour, so both suffer from the fact that MT was added
> I think in 2007, while the VM and RT core is from 1999 - without
> having MT in mind.

Here I can agree only partially.

1st, Harbour does not follow the xbase++ model. With the exception of
the xbase++ emulation level (xbase++ sync and thread classes, thread
functions and sync methods), the whole code is the result of my own
ideas. The only idea I partially borrowed is the
dbRequest()/dbRelease() semantics. Personally I wanted to introduce
many workarea holders (not only a single zero area zone) and
dbDetach()/dbAttach() functions. Later I heard about the xbase++
implementation and found the cargo codeblock attaching a very nice
feature, so I implemented it, but internally it operates on workarea
sets from my original idea, and it's still possible to introduce
support for multiple WA zones if we decide to add a .prg level API for
it. In some cases it may be useful.
Also the internal WA isolation in native RDDs is different. For POSIX
systems it's necessary to introduce file handle sharing, and this
mechanism is already used, so now we can easily extend it by adding
support for a pseudo exclusive mode (other threads will be able to
access tables opened in exclusive mode, which is then exclusive only
for external programs) or by adding caches shared by aliased WAs.
Of course Harbour also supports other xbase++ extensions, but they were
added mainly for compatibility with xbase++, for xbase++ users, and
internally they use the basic Harbour MT API.

2nd, this old API from 1980 is a real problem in some places and it
will probably be good to change it. But I also do not see the xbase++
API as the one final solution.
Harbour gives full protection for read access to complex items. The
user has to protect only write access, and only when threads change
exactly the same item rather than different members of a complex item,
e.g. this code:

   aVal[ threadID() ] += aVal[ threadID() ] * 2 + 100

is MT safe in Harbour even if the same aVal is used by many different
threads. What matters is that each thread operates on different aVal
items and aVal is not resized. Otherwise it may cause data corruption.
But when complex items can be resized then we usually need additional
protection in xbase++ too, because user code makes many operations
which have to be atomic in some logical sense, so in most cases there
is only one difference here between Harbour and xbase++: in xbase++,
with full internal protection and missing user protection, an RT error
is generated. In Harbour it may cause internal data corruption.
I agree that this is a very important difference, but in most such
cases we are talking about wrong user code which needs additional user
protection in both languages.

And here we have one fundamental question: what is the cost of internal
protection for scalability, and can we accept it?
My personal feeling is that the cost will be high, even very high, but
I haven't made any tests myself, though some xbase++ users have
confirmed that it's a problem in xbase++. I'm really interested in some
scalability tests of xbase++ and Harbour. They could give a few very
important answers. If some xbase++ user can port tests/speedtst.prg to
xbase++ it will be very helpful.
Of course it's possible that I missed something here, but I've never
used xbase++ and I cannot see its source code, so I can only guess how
some things are implemented in this language.
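To make the two cases above concrete, here is a minimal sketch. It is
written in Python rather than Harbour, purely as an analogy: the names
update_own_slot, append_result and run_demo are made up for this
illustration, and Python's runtime gives different low-level guarantees
than the Harbour VM. It only shows the shape of the rule: per-slot
updates of a fixed-size array without a lock, and a user-level lock
once threads resize the shared array.

```python
import threading

N_THREADS = 4


def update_own_slot(values, slot):
    # Analogue of aVal[ threadID() ] += aVal[ threadID() ] * 2 + 100:
    # each thread reads and writes only its own element, and the
    # list is never resized while the threads are running.
    values[slot] += values[slot] * 2 + 100


def append_result(values, lock, item):
    # Resizing a shared container is the case that needs user-level
    # protection, so the append happens under a lock.
    with lock:
        values.append(item)


def run_demo():
    values = list(range(N_THREADS))  # [0, 1, 2, 3]

    # Case 1: one thread per slot, no lock.
    workers = [threading.Thread(target=update_own_slot, args=(values, i))
               for i in range(N_THREADS)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    per_slot = list(values)

    # Case 2: all threads grow the same list, guarded by one lock.
    lock = threading.Lock()
    workers = [threading.Thread(target=append_result,
                                args=(values, lock, 1000 + i))
               for i in range(N_THREADS)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return per_slot, values
```

Calling run_demo() returns the per-slot results and the grown list; the
order of the appended items depends on thread scheduling.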
> This is in fact one of the biggest differences between Xbase++ and the
> "Harbours" from a pure architectural point of view, we designed a
> runtime architecture from the beginning to be MT/MP and Distributed,
> they designed a runtime based on the DOS Clipper blueprint.
> In fact, I could argue on and on, specifically if it comes to
> dedicated implementations of the Harbour runtime core or the Harbour
> VM but sharing these type of technical details is of course
> definitively not what I am paid for -;) Anyway allow me to make it
> clear in a general terms.

See above. It's not as clear as you said. I think you will find users
who will say that the cost in scalability is definitely not what they
want to pay for. Especially when missing user protection is also a
problem in xbase++ and only the bad results are different. For sure an
RT error is much better than internal data corruption, but how much can
users pay for such functionality?

> First, any feature/functionality of Xbase++ is reentrant there is not
> a single exception of this rule. Second, any datatype and storage type
> is thread-safe regardless of its complexity so there is no way to
> crash an Xbase++ process using multithreading. Third, the runtime
> guarantees that there is no possibility of a deadlock in terms of its
> internal state regardless what you are doing in different threads.
> There is a clean isolation and inheritance relationship of settings
> between different threads. In practical terms that means, you can
> output to the console from different threads without any additional
> code, you can execute methods or access state of GUI (XbasePARTS)
> objects from different threads, you can create a codeblock which
> detaches a local variable and pass it to another thread, you are
> performing file I/O or executing a remote procedure call and in the
> meanwhile the async. garbage collector cleans up your memory - and
> the list goes on...
> But in Xbase++ you can do all that without the
> need to think about MT or ask a question such as "Is the ASort()
> function thread safe" or can I change the caption of a GUI control
> from another thread. That's all a given, no restrictions apply, the
> runtime does it all automatically for you.

Most of the above is also true in Harbour, with the exception of the
missing GUI components and obligatory internal item storage protection.
But that is the subject of efficiency discussed above. Let's make some
scalability tests and then we can decide whether we want to pay the
same cost as xbase++ users.

> Anyway, I like Harbour more than xHarbour in terms of MT support.
> However the crux is still there, no real architecture around the
> product, leading to the fact that MT is supported from a technical
> point of view but not from a 4GL therefore leading to a potential of
> unnecessary burden for the average programmers, and of course that was
> and is still not the idea of Clipper as a tool.

The only fundamental difference between Harbour and xbase++ in the
above is obligatory internal item protection. At least that is what is
visible to me now, and as I said the cost of such functionality may not
be acceptable for users. But let's make some real tests to see how big
a problem it creates in real life.

> Btw, the same is true for VO or so, they left the idea of the language
> and moved to something more like a system -language, while I can
> understand that somewhat I strongly disagree with that type of
> language design for a simple reasons; it's not practical in the long
> term - we will see that in the following years as more and more multi
> core system will find their way in the mainstream and developers need
> to make use of them for performance and scalability reasons. In 10 -
> 15 years from now we will have 100 if not thousands cores per die -
> handling multithreading, synchronisation issues by hand becomes then
> impossible, the same is true for offloading tasks for performance
> reasons.
> So there is a need for a clean model in terms of the language
> - that's at least into what we believe at Alaska Software. It goes even
> further, the current attempts by MS in terms of multicore support with
> VS2010 or NET 4.0 are IMO absolutely wrong, as they force the
> developer to write code depending on the underlying execution
> infrastructure alias cores available. In other words, infrastructure
> related code/algorithms get mixed with the original algorithm the
> developers writes and of course the developer gets paid for. That's a
> catastrophic path which for sure does not contribute to increased
> productivity and reliability of software solutions.

I agree with you only partially. Above some reasonable cost limit MT
programming stops being useful, and it is much more efficient, safer
and easier to use separate processes. The cost of data exchange between
them will simply be smaller than the cost of obligatory internal MT
synchronization. So why use MT mode? For marketing reasons?

> Funnily enough, the most critical, and most difficult aspect in that
> area; getting performance gains from multi core usage is even not
> touched with my technical arguments right now. However it adds another
> dimension of complexity to the previous equation as it needs to take
> into account the memory hierarchy which must be handled by a 4GL
> runtime totally different as it is with the simple approach of
> Harbour/xHarbour. Their RT core and VM needs a more or less complete
> rewrite and redesign to go that path.

I do not see bigger problems with Harbour core code modifications.
If we decide that it's worth it then I'll implement it. Probably the
real problem will be forcing a different API on 3rd party developers.
Here we should probably choose something close to the xbase++ C API, so
as not to create additional problems for 3rd party developers who have
to write code for both projects, and to have some basic compatibility,
e.g. at C preprocessor level.
Anyhow, I'm still not sure I want to pay the cost of full item access
serialization.

> In other words, Xbase++ is playing in the Multithreading ballpark
> since a decade. Harbour is still finding its way into the MT ballpark
> while xHarbour is in that context at a dead-end.
> I would bet that Xbase++ will play in the multicore ballpark while the
> Harbours are still with their MT stuff.

And it's highly possible that it will happen. But Harbour is a free
project, and if we decide that adding full item protection at the cost
of speed is a valuable feature then maybe we will implement it. It's
also possible that we will add such functionality as an alternative VM
library. Just as we now have hbvm and hbvmmt, we would have hbvmpmt
(protected MT).

> In a more theoretical sense, it is important to understand that a
> programming language and its infrastructure shall not adapt any
> technical feature, requirement or hype. Because then the language and
> infrastructure are getting more and more complicated up to an point of
> lost control. Also backward compatibility and therefore protection of
> existing investments becomes more and more a mess with Q&A costs going
> through the roof.

_FULLY_AGREE_.
Things should be as simple as possible. Any hacks or workarounds for
single features create serious problems in the longer term and block
further development. For me this was the main problem of xHarbour when
I was working on that project.

> Nor is it a good idea to provide software developers any freedom - the
> point here is, a good MT model does smoothly guide the developer
> through the hurdles and most of the time is even not in the awareness
> of the developer.
> The contrary is providing the developer all freedom,
> but this leads to letting him first build the gun-powder, then the gun
> to finally shoot a bullet -;)

:-)

> Therefore let me rephrase my initial statement to be more specific;
>
> As of today there is still no tool available in the market which
> provides that clean and easy to use way of multithreading, however
> there are other tools which support MT - but they support it just as
> an technical feature without a model and that's simply wrong as it
> leads to additional dimensions in code complexity - finally ending in
> applications with lesser reliability and overall quality.
> Just my point of view on that subject - enough said

Thank you very much for this very interesting text.
I hope that now the main internal difference between Harbour and
xbase++ is well visible for users.
To the above we should still add tests/speedtst.prg results to compare
scalability, so we will know the real cost, which is an important part
of the above description. I'm very interested in real life results and
I hope that some xbase++ users will port tests/speedtst.prg to xbase++
so we can compare the results.

best regards,
Przemek
_______________________________________________
Harbour mailing list
http://lists.harbour-project.org/mailman/listinfo/harbour
